@badnack badnack commented Jun 27, 2025

Issue
Syzkaller often breaks dependencies across syscalls (expressed through resources in Syzkaller programs). As a result, it struggles to build fuzzing inputs (i.e., programs) that exercise deeper paths in the drivers. Syzkaller breaks syscall dependencies for three reasons: 1) mutation may remove random calls (and the resources they produce); 2) resource generation is itself stochastic: with a certain probability P, Syzkaller purposefully disregards syscall dependencies in an attempt to increase randomness; and 3) program minimization: Syzkaller minimizes the programs it generates, because they are initially quite large.

Patch
The patch addresses all these issues by adding a Boolean (called PromoteDep) to the validation module of Syzkaller. If the Boolean is set to true, measures are taken to ensure that IOCTL dependencies are respected in every generated program. In particular: mutation does not break dependencies; resource generation is no longer stochastic, so dependencies across syscalls are never disregarded; and a minimized program with broken dependencies is discarded. The patch provides two ways to enable PromoteDep: 1) The configuration flag promote_syscalls_dependency. If enabled, Syzkaller is instructed not to break dependencies. 2) The flag dynamic_promote_syscalls_dependency, which holds a time in minutes (e.g., 30). When the manager starts, it sets a timer to this value. When the timer reaches 0, PromoteDep is toggled (i.e., false becomes true, and vice versa) and the timer restarts. This alternation lets us introduce more randomness into the generated programs.

All flags are optional, and they don't have to be set.
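
A minimal sketch of the timer-driven toggle described above, assuming a mutex-protected global flag. The names `depMode`, `Toggle`, and `runToggler` are hypothetical and are not taken from the patch or from syzkaller.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// depMode is a hypothetical stand-in for the global PromoteDep state.
type depMode struct {
	mu      sync.Mutex
	promote bool
}

// Get reports whether dependencies are currently enforced.
func (m *depMode) Get() bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.promote
}

// Toggle flips the flag and returns the new value.
func (m *depMode) Toggle() bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.promote = !m.promote
	return m.promote
}

// runToggler flips the flag every interval until ctx is cancelled.
func runToggler(ctx context.Context, m *depMode, interval time.Duration) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			m.Toggle()
		}
	}
}

func main() {
	m := &depMode{}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	// 10 minutes is only an example value for the interval.
	go runToggler(ctx, m, 10*time.Minute)
	fmt.Println("enforcing dependencies:", m.Get())
}
```

Every reader of the flag has to take the lock, which is the cost of keeping this state global.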

@dvyukov
Collaborator

dvyukov commented Jun 30, 2025

Hi Nilo,

How/when should these tunables be set? And when shouldn't they be set? Have you done any benchmarking? What are the results? Can this always be enabled by default?

We specifically tried to avoid lots of obscure knobs that nobody knows how to set right.

@badnack
Author

badnack commented Jul 1, 2025

Hello Dmitry,

Nice to meet you! Thanks for your message. Let me answer each of your questions by topic:

Benchmarks

We ran experiments during our fuzzing campaigns here at Qualcomm. In particular, we ran syzkaller with and without the proposed patch for a week on three different drivers. With the option "dynamic_promote_syscalls_dependency" enabled and set to 10 (minutes), syzkaller covered ~47% of functions, against ~42% without the patch (measured with GCOV), an average improvement in function coverage of ~5 percentage points.

How/when to set the tunables

You can set the flag "dynamic_promote_syscalls_dependency=10" and leave it always enabled; that's what we do in our fuzzing campaigns. What that does is: 1) run syzkaller with the flag "PromoteDeps" set to true for 10 minutes, then 2) run syzkaller with the flag set to false for 10 minutes, and 3) repeat. In phase 1, Syzkaller generates fuzzing programs that do not contain any broken dependencies, thus triggering deeper states in the fuzzed driver. During phase 2, Syzkaller behaves normally and does not enforce any dependency. Phase 2 allows syzkaller to take the programs generated in phase 1 and apply mutation and its other usual operations to them. By switching "PromoteDeps" on and off we are able to generate a more complex and thorough fuzzing corpus. The knob could be set by default and never touched again. There are not really any cases in which the knob should not be set.

All this said, I'd be happy to change/update the patch, if you think there is a better design to fix the issue of broken dependencies. Please let me know of any other questions/feedback.

@dvyukov
Collaborator

dvyukov commented Jul 2, 2025

Ok, this is good. Thanks.

  1. Do we need to switch it globally? Can we just choose the mode locally for each program with some probability?
    The global mutex-protected state goes very much against the design of the prog package. Choosing a mode randomly could simplify things and would be completely in line with the current design.
    Since we also change Fuzzer.genFuzz, it may be a reasonable place to choose a mode and pass it to the prog package.

  2. I think we should try to enable it by default w/o tunables if it's an improvement.

  3. mutateRate = 0.3 is a pretty significant change (70% of generated programs instead of 5%). So I wonder if it accounts for the main improvement, and the rest is maybe less important.

@dvyukov dvyukov requested a review from a-nogikh July 2, 2025 09:19
@badnack
Author

badnack commented Jul 2, 2025

  1. The global switch is not strictly necessary, and we can achieve similar results by choosing the mode locally with some probability, as you suggested. I can definitely make this change.
  2. For sure, no problem.
  3. I wouldn't say this is the main one, because even if we never mutate a program (for example), syscall dependencies are still very likely to be broken because of: a) program minimization; b) the fact that resource generation is itself stochastic, so a resource might not be used even if it was previously created; and c) a hard limit on the number of recursions allowed while generating a resource, which hinders the correct generation of programs whose syscall dependencies form longer chains.
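
To illustrate the recursion-limit point, here is a toy sketch (not syzkaller code) in which each resource is created by a call that may itself need another resource. The resource names, the `ioctl$...` call names, and the `generate` function are all made up for the example.

```go
package main

import "fmt"

// producer describes the hypothetical call that creates a resource and the
// resource that call needs as input ("" if it needs none).
type producer struct {
	call  string
	needs string
}

// producers models a dependency chain: fd <- session <- buffer <- mapping.
var producers = map[string]producer{
	"fd":      {call: "open", needs: ""},
	"session": {call: "ioctl$CREATE_SESSION", needs: "fd"},
	"buffer":  {call: "ioctl$ALLOC_BUFFER", needs: "session"},
	"mapping": {call: "ioctl$MAP_BUFFER", needs: "buffer"},
}

// generate returns the calls needed to create res, in execution order.
// It gives up (second result false) once depth exceeds maxDepth.
func generate(res string, depth, maxDepth int) ([]string, bool) {
	if depth > maxDepth {
		return nil, false
	}
	p, found := producers[res]
	if !found {
		return nil, false
	}
	var calls []string
	if p.needs != "" {
		prefix, ok := generate(p.needs, depth+1, maxDepth)
		if !ok {
			return nil, false
		}
		calls = append(calls, prefix...)
	}
	return append(calls, p.call), true
}

func main() {
	for _, limit := range []int{1, 3} {
		calls, ok := generate("mapping", 0, limit)
		fmt.Printf("limit=%d ok=%v calls=%v\n", limit, ok, calls)
	}
}
```

With a limit of 1 the four-call chain cannot be completed, while a limit of 3 is just enough. A generator that continues after such a failure would emit the last call without its prerequisites.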

@badnack badnack force-pushed the master branch 2 times, most recently from 411a602 to b34b265 July 30, 2025 23:37
…scall dependencies

Syzkaller often breaks dependencies across syscalls (e.g., due to minimization, stochastic resource
generation, and mutation) when generating programs, thus failing to build fuzzing inputs that
exercise deep states in the target program.

This patch adds the logic to check whether syscall dependencies are broken in a given program.
Every time a new program is to be generated, we set a flag called EnforceDeps to true with
a certain probability. If this flag is set, we enforce that the dependencies of each syscall
used in the program are respected.
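
As a rough illustration of such a check, the sketch below models a program as a list of calls that produce and consume named resources. It is a toy model under that assumption: the `call` type, `brokenDeps`, `accept`, and the `ioctl$...` names are hypothetical and do not reflect how the patch or the prog package represents programs.

```go
package main

import "fmt"

// call is a toy model of one syscall in a program: the resources it creates
// and the resources it expects earlier calls to have created.
type call struct {
	name     string
	produces []string
	consumes []string
}

// brokenDeps returns the indices of calls that consume a resource which no
// earlier call in the program produced.
func brokenDeps(prog []call) []int {
	available := map[string]bool{}
	var broken []int
	for i, c := range prog {
		for _, r := range c.consumes {
			if !available[r] {
				broken = append(broken, i)
				break
			}
		}
		for _, r := range c.produces {
			available[r] = true
		}
	}
	return broken
}

// accept rejects a program with any broken dependency, but only when
// enforceDeps is set; otherwise every program is accepted.
func accept(prog []call, enforceDeps bool) bool {
	return !enforceDeps || len(brokenDeps(prog)) == 0
}

func main() {
	prog := []call{
		{name: "open", produces: []string{"fd"}},
		{name: "ioctl$SETUP", consumes: []string{"fd"}, produces: []string{"ctx"}},
		{name: "ioctl$SUBMIT", consumes: []string{"fd", "ctx"}},
	}
	fmt.Println("intact program, broken calls:", brokenDeps(prog))

	// Dropping the middle call, as mutation or minimization might, leaves
	// ioctl$SUBMIT without the "ctx" resource it depends on.
	mutated := []call{prog[0], prog[2]}
	fmt.Println("after removing ioctl$SETUP, broken calls:", brokenDeps(mutated))
	fmt.Println("accepted with enforcement:", accept(mutated, true))
}
```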
@badnack
Author

badnack commented Jul 31, 2025

I applied all the requested changes. I removed the EnforceDeps flag from the configuration file and global memory. The flag is now automatically enabled and disabled with a certain probability.

@badnack badnack requested review from dvyukov and a-nogikh August 5, 2025 22:22
@a-nogikh
Collaborator

a-nogikh commented Aug 7, 2025

Thanks!

I've set up a syz-testbed-based experiment to see the effects of the changes. We'll see some results in the coming days.

@badnack
Author

badnack commented Aug 8, 2025

Awesome, thank you!

@a-nogikh
Collaborator

Hi Nilo,

So:
20 runs of mainline syzkaller run for 24 hours (~28M execs each)
20 runs of syzkaller from this PR for 24 hours (~28M execs each)

I'd say that the syz-testbed results are very similar:

| Parameter | Mainline | This PR |
|---|---|---|
| crash types (median) | 133 | 136 |
| crash types (union of all runs) | 370 | 371 |
| crashes (median) | 3130 | 3085 |
| coverage (median) | 306863 | 306842 |
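
To put the numbers above in relative terms, the following sketch computes the change from mainline to the PR as a percentage of the mainline value. The `row` type and `relDelta` helper are introduced only for this illustration. The input values are copied from the results above.

```go
package main

import "fmt"

// row holds one line of the comparison.
type row struct {
	name     string
	mainline float64
	pr       float64
}

// relDelta returns the change from mainline to pr as a percentage of mainline.
func relDelta(mainline, pr float64) float64 {
	return (pr - mainline) / mainline * 100
}

func main() {
	rows := []row{
		{"crash types (median)", 133, 136},
		{"crash types (union of all runs)", 370, 371},
		{"crashes (median)", 3130, 3085},
		{"coverage (median)", 306863, 306842},
	}
	for _, r := range rows {
		fmt.Printf("%-32s %+.2f%%\n", r.name, relDelta(r.mainline, r.pr))
	}
}
```

Each metric differs from mainline by less than about 2.3%, and median coverage by about 0.01%.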

Do you observe the improvements when you run the current code of the PR on the benchmarks you mentioned in #6131 (comment)?

@badnack
Author

badnack commented Aug 12, 2025

Hello @a-nogikh,

Thanks for getting back to me. With the previous version of the patch, we ran fuzzing campaigns on a code base of ~75,000 functions and recorded an improvement of ~3.7% in function coverage (i.e., ~2,775 new functions). I am still running tests with the current version of the patch on the same codebase to compare results.
