Skip to content

Grammar-Aware Fuzzing

At a Glance

Category Grammar-Aware Fuzzing
Key Tools Nautilus, Fuzzilli, Domato, FormatFuzzer, Superion
Maturity Growing
Primary Use Fuzzing targets that consume structured inputs by generating mutations that respect input grammar

Overview

Many high-value fuzzing targets (compilers, interpreters, database engines, protocol parsers, file format handlers) expect inputs that conform to a specific grammar or structure. A JavaScript engine will reject x#@$%^ immediately at the parser stage; an XML processor will discard anything that is not well-formed markup. For these targets, coverage-guided fuzzing with random byte-level mutations is profoundly inefficient: the vast majority of generated inputs are syntactically invalid and exercise only error-handling code.

Grammar-aware fuzzing addresses this problem by encoding the input structure into the mutation engine. Rather than flipping bits in a flat byte stream, grammar-aware fuzzers operate on abstract syntax trees (ASTs), production rules, or template-based generators. Mutations occur at the structural level (swapping subtrees, replacing non-terminal symbols with alternative productions, adjusting field lengths) producing inputs that are syntactically valid but semantically diverse.

The tradeoff is clear: grammar-aware fuzzers require upfront investment in specifying the input grammar, but in return they reach code paths that random mutation would take astronomically long to discover. For targets like JavaScript engines, where grammar complexity makes random mutation essentially useless beyond the parser, grammar-aware approaches are not merely beneficial; they are necessary.

Modern grammar-aware fuzzers increasingly combine structural awareness with coverage feedback, creating a hybrid that preserves input validity while still being guided by code coverage. This combination (grammar-aware mutation with greybox feedback) represents the current state of the art for fuzzing structured-input targets.

Key Tools

Nautilus

Nautilus is a grammar-based, coverage-guided fuzzer that uses context-free grammars to generate and mutate structured inputs. Presented at NDSS 2019, Nautilus combines the structural awareness of grammar-based generation with the feedback-driven corpus evolution of coverage-guided fuzzing.

Architecture

Nautilus takes a context-free grammar as input, specified in a JSON format defining production rules and terminal symbols. The fuzzer generates inputs by deriving strings from the grammar's start symbol, building an AST that it can then mutate structurally. Mutations include subtree replacement (swapping one subtree for another derived from the same non-terminal), random regeneration of subtrees, and splicing between corpus entries.

Coverage feedback is collected via AFL-style instrumentation, allowing Nautilus to identify which structurally valid inputs reach new code. The corpus evolves just as in traditional coverage-guided fuzzing, but every input in the corpus is guaranteed to conform to the specified grammar.

Nautilus also supports chunks; learned input fragments that are discovered during fuzzing and reused in future mutations. This allows the fuzzer to learn domain-specific patterns beyond what the grammar explicitly encodes.

Strengths

  • Combines grammar awareness with coverage-guided feedback
  • Structural mutations preserve input validity by construction
  • Chunk-based learning augments the grammar with discovered patterns
  • Effective for compilers, interpreters, and language processors
  • Open-source with reproducible research results

Weaknesses

  • Requires manual grammar specification (can be labor-intensive)
  • Grammar must be context-free; context-sensitive constraints (e.g., variable scoping) need workarounds
  • Performance depends heavily on grammar quality and completeness
  • Smaller community than general-purpose fuzzers like AFL++

Use Cases

  • Fuzzing programming language interpreters and compilers
  • Testing XML/JSON/HTML parsers with structurally valid but diverse inputs
  • Fuzzing configuration file parsers
  • Any target where input validity is defined by a context-free grammar

Community & Maintenance

Nautilus is maintained by researchers at Ruhr University Bochum. The project is available on GitHub with documentation and example grammars. Active development continues, with the tool being used in ongoing research on grammar-aware fuzzing techniques.


Fuzzilli

Fuzzilli is a JavaScript engine fuzzer developed by Google Project Zero researcher Samuel Gross. It takes a unique approach to grammar-aware fuzzing by operating on a custom intermediate language (FuzzIL) rather than directly generating JavaScript source code.

Architecture

Fuzzilli's key innovation is its intermediate representation. Rather than defining mutations over JavaScript's complex and context-sensitive grammar, Fuzzilli defines a simpler intermediate language (FuzzIL) where every program is valid by construction. FuzzIL programs are then lifted to JavaScript for execution against the target engine. This approach sidesteps the difficulty of maintaining semantic validity in a complex language.

Mutations in Fuzzilli operate on FuzzIL programs: inserting new instructions, mutating operands, splicing program fragments, and performing type-aware transformations. Because FuzzIL enforces invariants like proper variable scoping and type consistency, the resulting JavaScript programs are not only syntactically valid but also semantically meaningful; they are unlikely to throw immediate ReferenceError or TypeError exceptions.

Fuzzilli uses coverage feedback from the target JavaScript engine to guide corpus evolution, combining structural awareness with greybox fuzzing principles. It supports distributed fuzzing across multiple machines.

Strengths

  • Generates semantically valid JavaScript programs, not just syntactically correct ones
  • FuzzIL intermediate language avoids the complexity of direct JS grammar manipulation
  • Coverage-guided corpus evolution for systematic exploration
  • Distributed fuzzing support for scaling across machines
  • Proven track record: discovered numerous security-critical bugs in V8, JavaScriptCore, and SpiderMonkey
  • Active development by Google Project Zero

Weaknesses

  • Specialized for JavaScript engines, not a general-purpose grammar fuzzer
  • FuzzIL design decisions may bias toward certain program patterns
  • Limited documentation for extending to new targets
  • Requires significant compute resources for effective campaigns

Use Cases

  • Security testing of JavaScript engines (V8, JavaScriptCore, SpiderMonkey)
  • Finding JIT compiler bugs, type confusion, and memory corruption in JS runtimes
  • Regression testing for JavaScript engine development

Community & Maintenance

Fuzzilli is maintained by Google Project Zero and available on GitHub. It has an active development history and is used internally at Google for JavaScript engine security. The project has inspired similar approaches for other language runtimes.


Domato

Domato is a DOM fuzzer developed by Google Project Zero's Ivan Fratric. It generates random HTML, CSS, and JavaScript to test browser DOM implementations. Domato uses a template-based grammar approach, where grammars define how to generate random but structurally valid web content.

Strengths

  • Purpose-built for browser DOM testing
  • Template grammar is human-readable and extensible
  • Proven effectiveness; found multiple browser vulnerabilities
  • Lightweight and easy to run

Weaknesses

  • Not coverage-guided; relies on generation breadth rather than feedback
  • Specialized for browser testing only
  • Grammar maintenance requires browser DOM expertise

Use Cases

  • Browser DOM implementation testing
  • Finding rendering engine bugs and memory corruption in browsers

Community & Maintenance

Maintained by Google Project Zero. The grammar files are periodically updated to cover new DOM APIs. Domato is widely cited in browser security research.


FormatFuzzer

FormatFuzzer generates and parses binary file formats based on format specifications written in the 010 Editor binary template language. By understanding the exact structure of binary formats (PNG, PDF, ELF, ZIP), FormatFuzzer can produce valid files with controlled mutations at specific fields.

Strengths

  • Leverages existing 010 Editor format specifications; large library of formats available
  • Generates structurally valid binary files by construction
  • Can target specific fields for mutation (e.g., corrupt a checksum but keep the rest valid)
  • Bidirectional: can both generate and parse format-conforming files

Weaknesses

  • Requires a 010 Editor template for the target format
  • Limited coverage feedback integration
  • Performance overhead from format-aware generation

Use Cases

  • Fuzzing image processors, document parsers, and file format handlers
  • Testing decoders for media formats (PNG, JPEG, MP4)
  • Targeting specific fields in binary protocols

Community & Maintenance

FormatFuzzer is developed by researchers at Saarland University (CISPA). It is available on GitHub and actively maintained as a research tool.


Superion

Superion extends AFL with grammar-aware mutations for structured text formats. It uses ANTLR grammars to parse inputs into ASTs and applies tree-based mutations (subtree replacement, recombination) while maintaining syntactic validity. Superion demonstrated improved coverage over vanilla AFL on targets like JavaScript engines and XML parsers.

Strengths

  • Builds on AFL's proven coverage-guided framework
  • Uses widely available ANTLR grammars
  • Tree-based mutations maintain structural validity

Weaknesses

  • Research prototype with limited maintenance
  • ANTLR grammar parsing adds overhead
  • Superseded by more recent tools (Nautilus, Fuzzilli)

Use Cases

  • Grammar-aware greybox fuzzing for targets with ANTLR-compatible grammars
  • Research baseline for comparing grammar-aware fuzzing strategies

Community & Maintenance

Superion is an academic research prototype. While the code is available, active development has slowed. Its ideas have been incorporated into subsequent tools.

Example Grammar

Grammar-aware fuzzers typically require an input grammar that defines the structure of valid inputs. Here is a simplified example of a context-free grammar for arithmetic expressions, in a format similar to what Nautilus accepts:

{
  "rules": {
    "<start>": [["<expr>"]],
    "<expr>": [
      ["<expr>", " + ", "<expr>"],
      ["<expr>", " * ", "<expr>"],
      ["(", "<expr>", ")"],
      ["<number>"]
    ],
    "<number>": [
      ["<digit>"],
      ["<digit>", "<number>"]
    ],
    "<digit>": [
      ["0"], ["1"], ["2"], ["3"], ["4"],
      ["5"], ["6"], ["7"], ["8"], ["9"]
    ]
  }
}

From this grammar, the fuzzer can generate structurally valid expressions like (3 + 42) * 7 and then mutate them by swapping subtrees; for example, replacing 42 with (1 + 2 * 9), while guaranteeing the result remains a valid arithmetic expression.

Grammar Investment

Writing a complete grammar for a complex format is a significant upfront cost, but it pays dividends. A well-crafted grammar allows the fuzzer to bypass the parser immediately and explore the deeper semantic logic of the target, where the interesting bugs live.

Comparison Matrix

Tool License Target Domain Grammar Format Coverage-Guided Maturity
Nautilus AGPL-3.0 General (structured text) JSON CFG Yes Growing
Fuzzilli Apache 2.0 JavaScript engines FuzzIL (custom IR) Yes Mature
Domato Apache 2.0 Browser DOM Template grammar No Mature
FormatFuzzer GPL-2.0 Binary file formats 010 Editor templates Limited Growing
Superion Apache 2.0 General (ANTLR grammars) ANTLR Yes Research

Grammar Specification Burden

The biggest barrier to adoption of grammar-aware fuzzing is the cost of writing and maintaining input grammars. Automated grammar inference (learning the grammar from example inputs or from the target's parser code) is an active research area. Tools that can automatically extract grammars from source code or binary parsers would dramatically lower the barrier to entry. See LLM Integration for how large language models might help address this challenge.


tags: - glossary


Glossary

Term Definition
AFL American Fuzzy Lop, coverage-guided fuzzer
ASan AddressSanitizer, memory error detector
CVE Common Vulnerabilities and Exposures
AFL++ Community-maintained successor to AFL, the de facto standard coverage-guided fuzzer
AEG Automatic Exploit Generation, automated creation of working exploits from vulnerability information
ANTLR ANother Tool for Language Recognition, parser generator used by grammar-aware fuzzers like Superion
AST Abstract Syntax Tree, tree representation of source code structure used by static analyzers
BOF Buffer Overflow, writing data beyond allocated memory bounds, a common memory safety vulnerability
CFG Control Flow Graph, directed graph representing all possible execution paths through a program
CGC Cyber Grand Challenge, DARPA competition for autonomous vulnerability detection and patching
ClusterFuzz Google's distributed fuzzing infrastructure that powers OSS-Fuzz
CodeQL GitHub's query-based static analysis engine that treats code as a queryable database
Concolic Concrete + Symbolic, execution that runs concrete values while tracking symbolic constraints
Corpus Collection of seed inputs used by a coverage-guided fuzzer as the basis for mutation
Coverity Synopsys commercial static analysis platform with deep interprocedural analysis
CPG Code Property Graph, unified representation combining AST, CFG, and data-flow graph, used by Joern
CVSS Common Vulnerability Scoring System, standard for rating vulnerability severity
CWE Common Weakness Enumeration, categorization of software weakness types
DAST Dynamic Application Security Testing, testing running applications for vulnerabilities
DBI Dynamic Binary Instrumentation, modifying program behavior at runtime without recompilation
DFG Data Flow Graph, graph representing how data values propagate through a program
DPA Differential Power Analysis, extracting cryptographic keys by analyzing power consumption variations
Frida Dynamic instrumentation toolkit for injecting scripts into running processes
Harness Glue code connecting a fuzzer to its target, defining how fuzzed input is delivered
HWASAN Hardware-assisted AddressSanitizer, ARM-based variant of ASan with lower overhead
IAST Interactive Application Security Testing, combines elements of SAST and DAST during testing
Infer Meta's open-source static analyzer based on separation logic and bi-abduction
KLEE Symbolic execution engine built on LLVM for automatic test generation
LLM Large Language Model, neural network trained on text/code, used for bug detection and code generation
LSAN LeakSanitizer, detector for memory leaks, often used alongside AddressSanitizer
Meltdown CPU vulnerability exploiting out-of-order execution to read kernel memory from user space
MITRE Non-profit organization that maintains CVE, CWE, and ATT&CK frameworks
MSan MemorySanitizer, detector for reads of uninitialized memory
NVD National Vulnerability Database, NIST-maintained repository of vulnerability data
NIST National Institute of Standards and Technology, US agency maintaining security standards and NVD
OSS-Fuzz Google's free continuous fuzzing service for open-source software
OWASP Open Worldwide Application Security Project, community producing security guides and tools
RCE Remote Code Execution, vulnerability allowing an attacker to run arbitrary code on a target system
RL Reinforcement Learning, ML paradigm where agents learn through reward-based feedback
S2E Selective Symbolic Execution, whole-system analysis platform combining QEMU with KLEE
SARIF Static Analysis Results Interchange Format, standard for exchanging static analysis findings
SAST Static Application Security Testing, analyzing source code for vulnerabilities without execution
SCA Software Composition Analysis, identifying known vulnerabilities in third-party dependencies
Seed Initial input provided to a fuzzer as the starting point for mutation
Semgrep Lightweight open-source static analysis tool using pattern-matching rules
Side-channel Attack vector exploiting physical implementation artifacts rather than algorithmic flaws
SMT Satisfiability Modulo Theories, solver used by symbolic execution to find inputs satisfying path constraints
Spectre Family of CPU vulnerabilities exploiting speculative execution to leak data across security boundaries
SQLi SQL Injection, injecting malicious SQL into queries via unsanitized user input
SSRF Server-Side Request Forgery, tricking a server into making requests to unintended destinations
SymCC Compilation-based symbolic execution tool that is 2--3 orders of magnitude faster than KLEE
Taint analysis Tracking the flow of untrusted data from sources to security-sensitive sinks
TOCTOU Time-of-Check-Time-of-Use, race condition between validating a resource and using it
TSan ThreadSanitizer, detector for data races in multithreaded programs
UAF Use-After-Free, accessing memory after it has been deallocated
UBSan UndefinedBehaviorSanitizer, detector for undefined behavior in C/C++
Valgrind Dynamic binary instrumentation framework for memory debugging and profiling
XSS Cross-Site Scripting, injecting malicious scripts into web pages viewed by other users
Fine-tuning Adapting a pre-trained ML model to a specific task using additional training data
Abstract interpretation Mathematical framework for approximating program behavior using abstract domains
Dataflow analysis Tracking how values propagate through a program to detect bugs like taint violations