Dynamic Analysis
At a Glance
Category: Dynamic Analysis
Key Tools: AddressSanitizer, MemorySanitizer, ThreadSanitizer, UBSan, Valgrind, Frida, DynamoRIO, Intel Pin
Maturity: Mature
Core Value: Detect bugs by observing actual program execution. False positives are rare, because a reported error reflects behavior the program really did exhibit.
Overview
Dynamic analysis discovers defects by instrumenting and observing programs as they execute. Unlike static analysis, which reasons about all possible program behaviors, dynamic analysis examines concrete executions, making it precise (low false positive rates) but inherently incomplete (it can only find bugs triggered by the inputs it runs). This trade-off makes dynamic analysis a powerful complement to static techniques.
The field divides into two broad instrumentation strategies. Compile-time instrumentation modifies the program during compilation, inserting checks directly into the generated machine code. This approach, exemplified by the sanitizer family (ASan, MSan, TSan, UBSan), produces fast, tightly integrated instrumentation but requires access to source code and a recompilation step. Runtime instrumentation modifies a program's behavior at execution time, either by running it under a specialized virtual machine (Valgrind) or by injecting code into a running process (Frida, DynamoRIO, Intel Pin). Runtime instrumentation works on precompiled binaries but typically incurs higher overhead.
The choice between these approaches depends on the use case. Developers working with their own source code generally prefer compile-time sanitizers for their speed and precision. Reverse engineers, malware analysts, and security researchers working with closed-source software rely on runtime instrumentation tools that can operate without source code.
flowchart TB
subgraph CT["Compile-Time Instrumentation"]
A["Source Code"] --> B["Compiler + Sanitizer Flags"]
B --> C["Instrumented Binary"]
C --> D["Execute with Test Inputs"]
D --> E["Runtime Error Reports"]
end
subgraph RT["Runtime Instrumentation"]
F["Binary (no source)"] --> G["Instrumentation Engine"]
G --> H["Dynamic Binary Translation"]
H --> I["Execute Instrumented Code"]
I --> J["Analysis Results"]
end
style CT fill:#0f3460,stroke:#16213e,color:#e0e0e0
style RT fill:#533483,stroke:#16213e,color:#e0e0e0
Key Tools
Sanitizer Family (ASan, MSan, TSan, UBSan)
The sanitizers are a suite of compile-time instrumentation tools built into the LLVM/Clang and GCC compiler toolchains. Developed primarily by Google engineers (notably Konstantin Serebryany and colleagues), they have become the de facto standard for detecting memory errors, threading bugs, and undefined behavior in C/C++ code. Their integration into the compiler itself makes them fast, precise, and easy to use.
AddressSanitizer (ASan) detects out-of-bounds accesses (heap, stack, and global), use-after-free, use-after-return, use-after-scope, and double-free bugs. It works by maintaining a shadow memory, a compressed mapping that tracks the validity of every byte of application memory. On each memory access, the compiler-inserted instrumentation checks the shadow memory to determine whether the access is valid. ASan typically imposes about a 2x slowdown and 2--3x memory overhead, which is low enough for regular use in testing and CI. ASan has been instrumental in finding thousands of bugs in major projects including Chrome, LLVM itself, and (through its kernel counterpart, KernelAddressSanitizer) the Linux kernel.
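The shadow-memory check described above can be modeled in a few lines. This is a toy sketch, not ASan's implementation: it assumes the commonly described encoding in which one shadow byte covers 8 application bytes (0 means all 8 are addressable, 1 to 7 means only that many leading bytes are, a negative value means the whole granule is poisoned). The class name, the marker constants, and the flat list standing in for memory are all illustrative.

```python
"""Toy model of an ASan-style shadow memory check (illustrative only)."""

GRANULE = 8    # one shadow entry describes 8 application bytes
REDZONE = -6   # illustrative marker for padding / never-allocated memory
FREED = -3     # illustrative marker for memory that has been freed


class ToyShadow:
    def __init__(self, app_size):
        # Everything starts poisoned, like memory nobody has allocated yet.
        self.shadow = [REDZONE] * ((app_size + GRANULE - 1) // GRANULE)

    def allocate(self, addr, size):
        """Mark [addr, addr + size) addressable; addr must be 8-byte aligned."""
        assert addr % GRANULE == 0
        full, tail = divmod(size, GRANULE)
        first = addr // GRANULE
        for i in range(full):
            self.shadow[first + i] = 0          # all 8 bytes are valid
        if tail:
            self.shadow[first + full] = tail    # only the first `tail` bytes are valid

    def free(self, addr, size):
        """Poison a freed block so that later accesses are reported."""
        first = addr // GRANULE
        for i in range((size + GRANULE - 1) // GRANULE):
            self.shadow[first + i] = FREED

    def is_valid(self, addr, size):
        """Check an access of `size` bytes that stays inside one granule."""
        k = self.shadow[addr >> 3]
        if k == 0:
            return True
        # k < 0: granule fully poisoned. k in 1..7: only the first k bytes are valid.
        return (addr & 7) + size <= k


shadow = ToyShadow(64)
shadow.allocate(16, 13)              # a 13-byte object at address 16
in_bounds = shadow.is_valid(24, 4)   # bytes 24..27 lie inside the object
overflow = shadow.is_valid(28, 4)    # bytes 29..31 lie past its end
shadow.free(16, 13)
use_after_free = shadow.is_valid(16, 1)
```

The real tool performs the equivalent of `is_valid` inline before each load and store, which is why the per-access cost stays small.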
MemorySanitizer (MSan) detects reads of uninitialized memory. It tracks the initialization state of every bit of memory using shadow memory and reports when an uninitialized value is used in a way that affects program behavior, such as a conditional branch or a pointer dereference; merely copying uninitialized bytes is not reported. MSan is particularly valuable for security because uninitialized memory reads can leak sensitive data or cause unpredictable behavior. It imposes roughly 3x slowdown and 2x memory overhead.
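As a rough illustration of bit-level tracking, the sketch below pairs each 32-bit value with a poison mask (a set bit means "uninitialized") and only reports when a poisoned value decides a branch. The names `Tracked`, `bit_and`, `add`, and `branch_on` are hypothetical, and the rule used for addition is a deliberately crude approximation chosen for brevity rather than a description of MSan's internals.

```python
"""Toy bit-precise tracking of uninitialized data (illustrative only)."""
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


class UninitializedUse(Exception):
    """Raised when a poisoned value influences control flow."""


@dataclass(frozen=True)
class Tracked:
    value: int    # concrete 32-bit value
    poison: int   # bit i set => bit i of `value` was never initialized


def initialized(value):
    return Tracked(value & MASK32, 0)


def uninitialized():
    return Tracked(0, MASK32)


def bit_and(a, b):
    # A result bit is poisoned unless both inputs are clean
    # or one of the inputs is a clean 0 (which forces the result to 0).
    poison = (a.poison & b.poison) | (a.value & b.poison) | (a.poison & b.value)
    return Tracked(a.value & b.value, poison)


def add(a, b):
    # Approximation: any poisoned input bit poisons the same result bit.
    return Tracked((a.value + b.value) & MASK32, a.poison | b.poison)


def branch_on(cond):
    # Copies and arithmetic never report; a branch on poisoned data does.
    if cond.poison:
        raise UninitializedUse(f"branch depends on poisoned bits {cond.poison:#x}")
    return cond.value != 0


x = uninitialized()
copied = add(x, initialized(1))         # propagates poison, no report
masked = bit_and(x, initialized(0))     # AND with a clean 0 is fully defined
taken = branch_on(masked)               # safe: no poisoned bits remain
```

Calling `branch_on(bit_and(x, initialized(0xFF)))` would raise, because the low eight bits are still poisoned when the branch is evaluated.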
ThreadSanitizer (TSan) detects data races in multithreaded C/C++ and Go programs. It implements a happens-before based race detector that tracks synchronization operations and memory accesses across threads. TSan imposes approximately 5--15x slowdown and 5--10x memory overhead, making it more expensive than ASan but still practical for targeted testing. It has found thousands of concurrency bugs in real-world software, including race conditions that would be extremely difficult to find through manual review.
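The happens-before idea can be sketched with vector clocks: two conflicting accesses race when neither is ordered before the other by synchronization. The following is a minimal model under simplifying assumptions (only lock acquire/release creates ordering, and the full access history is kept per address); `ToyRaceDetector` and its method names are illustrative and do not reflect TSan's actual data structures.

```python
"""Toy happens-before race detector using vector clocks (illustrative only)."""
from collections import defaultdict


class ToyRaceDetector:
    def __init__(self):
        self.clocks = {}                  # thread -> {thread: logical time}
        self.lock_clocks = {}             # lock -> clock published at last release
        self.history = defaultdict(list)  # addr -> [(thread, epoch, is_write)]
        self.races = []                   # (addr, earlier_thread, later_thread)

    def _clock(self, thread):
        vc = self.clocks.setdefault(thread, {})
        vc.setdefault(thread, 1)          # a thread's own time starts at 1
        return vc

    def acquire(self, thread, lock):
        # Acquiring a lock imports everything that happened before its release.
        vc = self._clock(thread)
        for other, time in self.lock_clocks.get(lock, {}).items():
            vc[other] = max(vc.get(other, 0), time)

    def release(self, thread, lock):
        # Publish the current clock on the lock, then start a new epoch.
        vc = self._clock(thread)
        self.lock_clocks[lock] = dict(vc)
        vc[thread] += 1

    def access(self, thread, addr, is_write):
        vc = self._clock(thread)
        for other, epoch, was_write in self.history[addr]:
            conflicting = is_write or was_write
            ordered = epoch <= vc.get(other, 0)   # earlier access happens-before us
            if other != thread and conflicting and not ordered:
                self.races.append((addr, other, thread))
        self.history[addr].append((thread, vc[thread], is_write))


# Unsynchronized writes from two threads: reported as a race.
racy = ToyRaceDetector()
racy.access("T1", "counter", is_write=True)
racy.access("T2", "counter", is_write=True)

# The same writes protected by a mutex: ordered, so nothing is reported.
safe = ToyRaceDetector()
for t in ("T1", "T2"):
    safe.acquire(t, "m")
    safe.access(t, "counter", is_write=True)
    safe.release(t, "m")
```

Keeping every past access is what makes this sketch simple but impractical; a production detector bounds the per-address state, which is one source of the memory overhead quoted above.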
UndefinedBehaviorSanitizer (UBSan) detects various forms of undefined behavior in C/C++, including signed integer overflow, null pointer dereference, misaligned pointer access, and type confusion. UBSan is lightweight (often adding less than 20% overhead) and can be combined with other sanitizers. Its checks catch bugs that may appear to work correctly on one platform but fail catastrophically on another.
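To make the kind of check concrete, here is a small model of two conditions UBSan looks for in C code: signed 32-bit overflow on addition, and the two undefined cases of integer division. Python integers are unbounded, so the range test is exact. The function names and messages are illustrative and are not UBSan's own diagnostics.

```python
"""Toy versions of two UBSan-style arithmetic checks (illustrative only)."""

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


class UndefinedBehavior(Exception):
    """Stands in for a UBSan runtime report."""


def checked_add_i32(a, b):
    result = a + b  # exact, because Python ints do not wrap
    if not INT32_MIN <= result <= INT32_MAX:
        raise UndefinedBehavior(f"signed overflow: {a} + {b} does not fit in int32")
    return result


def checked_div_i32(a, b):
    if b == 0:
        raise UndefinedBehavior("division by zero")
    if a == INT32_MIN and b == -1:
        raise UndefinedBehavior("signed overflow: INT32_MIN / -1 does not fit in int32")
    # C truncates toward zero, whereas Python's // rounds toward negative infinity.
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


small_sum = checked_add_i32(1, 2)
truncated = checked_div_i32(-7, 2)
```

A compiler-inserted check is conceptually the same comparison placed in front of the arithmetic instruction, which helps explain why the overhead is small.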
Integration and Workflow. Enabling sanitizers is as simple as adding compiler flags: -fsanitize=address, -fsanitize=memory (Clang only), -fsanitize=thread, or -fsanitize=undefined. This simplicity has driven widespread adoption. Google uses ASan extensively in continuous testing, and the OSS-Fuzz project combines sanitizers with fuzzing to find bugs in hundreds of open-source projects automatically.
Sanitizer Combinations
ASan and UBSan can be used together (-fsanitize=address,undefined), but ASan and MSan cannot be combined in a single build because they use conflicting shadow memory schemes. TSan also requires a separate build. Plan your CI matrix accordingly.
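One way to plan the CI matrix is to encode the compatibility rule once and derive the build flags from it. The sketch below assumes that ASan, MSan, and TSan are mutually exclusive while UBSan can ride along with any of them; the build names and helper functions are hypothetical.

```python
"""Sketch of a sanitizer build matrix with a compatibility check."""

KNOWN = {"address", "memory", "thread", "undefined"}
# At most one of these can be enabled in a single build.
MUTUALLY_EXCLUSIVE = {"address", "memory", "thread"}


def sanitizer_flag(sanitizers):
    """Return the -fsanitize= flag for a set of sanitizers, or raise ValueError."""
    chosen = set(sanitizers)
    if not chosen:
        raise ValueError("no sanitizer selected")
    unknown = chosen - KNOWN
    if unknown:
        raise ValueError(f"unknown sanitizer(s): {sorted(unknown)}")
    clash = chosen & MUTUALLY_EXCLUSIVE
    if len(clash) > 1:
        raise ValueError(f"cannot combine in one build: {sorted(clash)}")
    return "-fsanitize=" + ",".join(sorted(chosen))


def ci_matrix():
    """Three builds that together cover all four sanitizers."""
    builds = {
        "asan-ubsan": ["address", "undefined"],
        "msan": ["memory"],
        "tsan": ["thread"],
    }
    return {name: sanitizer_flag(parts) for name, parts in builds.items()}


matrix = ci_matrix()
```

Each value in `matrix` would be added to both the compile and link flags of the corresponding CI job.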
Strengths:
- Very low false positive rate; findings represent real bugs
- Low overhead (especially ASan and UBSan), suitable for CI/CD
- Built into major compilers (Clang, GCC), no external dependencies
- Detailed error reports with stack traces and memory state
- Proven at massive scale (Google, Microsoft, Apple)
Weaknesses:
- Require source code and recompilation
- ASan, MSan, and TSan are mutually exclusive, so full coverage requires several separate builds
- MSan and TSan have higher overhead, limiting use in production
- Cannot detect logic bugs or specification violations
- Primarily target C/C++ (TSan also backs Go's race detector); no support for managed languages such as Java or C#
Use Cases: Continuous integration testing, fuzz testing harnesses (ASan + fuzzer is the standard combination), pre-release quality assurance, security vulnerability discovery in C/C++ codebases.
Community & Maintenance: Developed primarily by Google within the LLVM project. Actively maintained with regular improvements. The sanitizer infrastructure is also used as a foundation for other tools (e.g., KernelAddressSanitizer for Linux kernel testing).
Valgrind
Valgrind is a dynamic binary instrumentation framework that runs programs on a synthetic CPU, enabling deep analysis without requiring source code or recompilation. First released in 2002, Valgrind has been a cornerstone of C/C++ debugging for over two decades.
Memcheck is Valgrind's flagship tool and most common use case. It detects memory errors including use of uninitialized values, reads and writes of freed memory, reads and writes past the end of allocated blocks, memory leaks, and mismatched use of malloc/free vs new/delete. Memcheck tracks every byte of memory and every memory operation, providing detailed diagnostics when errors occur. Its thoroughness comes at a cost: programs typically run 10--30x slower under Memcheck.
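A typical Memcheck run needs no changes to the program, only a wrapper command. The helper below is a sketch that builds such a command line from Python; the options shown (`--tool=memcheck`, `--leak-check`, `--track-origins`, `--error-exitcode`) are standard Valgrind options, while the function name and the `./app` path are placeholders.

```python
"""Sketch: build a Valgrind Memcheck command line for a test run."""
import shlex


def memcheck_command(program, args=(), leak_check="full", error_exitcode=1):
    """Return the argv list for running `program` under Memcheck."""
    return [
        "valgrind",
        "--tool=memcheck",
        f"--leak-check={leak_check}",          # report details for each leaked block
        "--track-origins=yes",                 # show where uninitialized values came from
        f"--error-exitcode={error_exitcode}",  # make errors fail the calling script
        program,
        *args,
    ]


cmd = memcheck_command("./app", ["--input", "sample.bin"])
printable = shlex.join(cmd)  # shell-quoted form, handy for logging
```

Passing `cmd` to `subprocess.run(cmd)` executes the check; because of `--error-exitcode`, a nonzero return code then signals that Memcheck reported at least one error.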
Helgrind and DRD are Valgrind's thread error detectors. Both are built around happens-before analysis (similar in concept to TSan), but they are independent implementations with different performance and memory trade-offs. Both detect data races, lock ordering violations, and misuse of the POSIX pthreads API. They are useful for debugging concurrency issues in programs where recompilation with TSan is not feasible.
Callgrind is a profiling tool that records function call graphs and instruction counts. Combined with the KCachegrind visualization tool, it provides detailed performance analysis. While not a bug-finding tool per se, Callgrind is valuable for understanding program behavior and identifying performance bottlenecks.
Massif profiles heap memory usage over time, helping developers understand memory consumption patterns and find opportunities to reduce memory footprint.
Strengths:
- Works on unmodified binaries; no source code or recompilation required
- Extremely thorough memory error detection (Memcheck)
- Rich ecosystem of analysis tools (profiling, threading, cache simulation)
- Mature, well-documented, and widely trusted
- Free and open-source (GPL v2)
Weaknesses:
- High overhead (10--30x slowdown) limits use in CI/CD
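- No native Windows support; runs mainly on Linux, with ports to FreeBSD, Solaris, and Android, and macOS support that lags behind current OS releases
- Limited to the CPU architectures it has been ported to (including x86, x86-64, ARM, ARM64, PowerPC, s390x, and MIPS)
- Cannot be combined with compile-time sanitizers; a sanitizer-instrumented binary should be run natively, not under Valgrind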
- Some false positives with complex C++ code (custom allocators, placement new)
Use Cases: Debugging memory errors in development, memory leak investigation, profiling and performance analysis, testing legacy codebases where recompilation is impractical.
Community & Maintenance: Maintained by a small but dedicated team of core developers. Releases are infrequent but stable. Valgrind remains widely used despite the rise of compile-time sanitizers, particularly for binary-only analysis and profiling.
Frida
Frida is a dynamic instrumentation toolkit that allows security researchers and developers to inject JavaScript (or other scripting languages) into running processes on Windows, macOS, Linux, iOS, Android, and QNX. Created by Ole André Vadla Ravnås, Frida has become one of the most important tools in the mobile security and reverse engineering ecosystem.
How It Works. Frida operates by injecting a JavaScript engine (based on V8 or QuickJS) into a target process. Researchers write scripts that can intercept function calls, modify arguments and return values, trace execution, read and write process memory, and enumerate loaded modules and their exports. This scriptable approach provides extraordinary flexibility; anything that can be expressed programmatically can be instrumented.
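The workflow described above can be sketched with Frida's Python bindings: Python attaches to the process and loads a JavaScript payload, and the payload reports events back through `send()`. This is a sketch under assumptions: it requires the third-party `frida` package, and the export lookup helper (`Module.getGlobalExportByName` here) has changed names across Frida releases, so it may need adjusting for the installed version. The function names `build_hook_script` and `trace` are illustrative.

```python
"""Sketch: trace calls to one exported native function with Frida."""
import json


def build_hook_script(function_name):
    """Return JavaScript that reports every entry to and exit from an export."""
    name = json.dumps(function_name)  # safely quoted JavaScript string literal
    return f"""
const target = Module.getGlobalExportByName({name});  // assumption: newer Frida API
Interceptor.attach(target, {{
  onEnter(args) {{ send({{ event: 'enter', fn: {name} }}); }},
  onLeave(retval) {{ send({{ event: 'leave', fn: {name}, ret: retval.toString() }}); }}
}});
"""


def trace(process, function_name, on_message=print):
    """Attach to a running process (name or PID) and start tracing."""
    import frida  # imported lazily so the script builder works without Frida installed

    session = frida.attach(process)
    script = session.create_script(build_hook_script(function_name))
    script.on("message", lambda message, data: on_message(message))
    script.load()
    return session  # call session.detach() to remove the instrumentation


source = build_hook_script("open")
```

With Frida installed, `trace("target-app", "open")` would print one message per call until the session is detached; `target-app` is a placeholder process name.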
Cross-Platform and Mobile Focus. While Frida supports desktop platforms, it is particularly popular for mobile application security testing. On Android and iOS, Frida can hook into Java/Kotlin methods (via the Java bridge), Objective-C methods (via the ObjC bridge), and native code simultaneously. This makes it invaluable for analyzing mobile apps that mix managed and native code.
Stalker Engine. Frida includes a code-tracing engine called Stalker that can follow execution at the instruction level, providing coverage information and enabling advanced analysis techniques such as in-memory fuzzing and code coverage collection for closed-source targets.
Strengths:
- Cross-platform with excellent mobile support (Android, iOS)
- Scriptable with JavaScript; low barrier to entry
- Can attach to running processes without restarting them
- Hooks at multiple levels: native, Java, Objective-C, Swift
- Active community with extensive tooling built on top (Objection, r2frida)
Weaknesses:
- Detectable by anti-instrumentation techniques in hardened apps
- Performance overhead can be significant for heavy instrumentation
- Requires root/jailbreak for some mobile platform features
- Not designed for automated bug finding; primarily a manual analysis tool
Use Cases: Mobile application security testing, reverse engineering proprietary protocols, bypassing client-side security controls, runtime API monitoring, malware analysis.
Community & Maintenance: Actively developed with regular releases. Strong community producing tools and scripts. Frida CodeShare provides a repository of community-contributed instrumentation scripts.
DynamoRIO
DynamoRIO is a runtime code manipulation system that supports code transformations on any part of a program while it executes. Originally developed at MIT and Hewlett-Packard, it provides a framework for building custom dynamic analysis tools.
DynamoRIO's most notable derived tool is Dr. Memory, a memory debugging tool similar to Valgrind's Memcheck but with Windows support. DynamoRIO also serves as the instrumentation backend for various code coverage, profiling, and security analysis tools. It supports x86, x86-64, ARM, and AArch64 on Windows, Linux, macOS, and Android.
Strengths:
- Cross-platform including Windows support
- Flexible API for building custom analysis tools
- Lower overhead than Valgrind for some use cases
- Dr. Memory provides Memcheck-like analysis on Windows
Weaknesses:
- Smaller community and ecosystem than Valgrind or Frida
- Documentation can be sparse for advanced use cases
- Building custom tools requires significant expertise
Use Cases: Windows memory debugging (Dr. Memory), custom binary analysis tools, code coverage collection for closed-source targets, program tracing.
Community & Maintenance: Maintained by Google with contributions from the open-source community. Active development with regular releases.
Intel Pin
Intel Pin is a dynamic binary instrumentation framework developed by Intel. It allows researchers to build custom analysis tools (called Pintools) that instrument x86 and x86-64 binaries at runtime.
Pin provides a rich API for instrumenting at the instruction, basic block, and function levels. It handles the complexities of code relocation and register spilling transparently, allowing Pintool authors to focus on their analysis logic. Pin has been widely used in academic research for tasks such as cache simulation, branch prediction analysis, and taint tracking.
Strengths:
- Mature and well-documented framework
- Rich API with multiple instrumentation granularities
- Widely used in academic research with extensive publications
- Handles complex x86 instruction set details transparently
Weaknesses:
- Intel x86/x86-64 only; no ARM support
- Proprietary license limits redistribution
- Higher overhead than compile-time approaches
- Less active community engagement compared to Frida or DynamoRIO
Use Cases: Academic research on program analysis, custom security analysis tools, hardware simulation and modeling, performance analysis.
Community & Maintenance: Developed and maintained by Intel. Regular releases aligned with new processor architectures. Widely cited in academic literature.
Comparison Matrix
| Tool | Approach | Overhead | Detection Types | Platforms | Integration |
|---|---|---|---|---|---|
| ASan | Compile-time shadow memory | ~2x slowdown | Memory errors (OOB, UAF, double-free) | Linux, macOS, Windows, Android | Clang/GCC flag |
| MSan | Compile-time shadow memory | ~3x slowdown | Uninitialized memory reads | Linux | Clang flag |
| TSan | Compile-time happens-before | 5--15x slowdown | Data races, deadlocks | Linux, macOS | Clang/GCC flag |
| UBSan | Compile-time checks | <20% slowdown | Undefined behavior (overflow, null deref) | Linux, macOS, Windows | Clang/GCC flag |
| Valgrind | Binary translation (synthetic CPU) | 10--30x slowdown | Memory errors, leaks, threading, profiling | Linux, macOS | Run under valgrind |
| Frida | Process injection + JIT | Variable | Custom (scriptable hooks) | Linux, macOS, Windows, iOS, Android | Scripting API |
| DynamoRIO | Dynamic binary translation | 2--5x typical | Custom (framework-dependent) | Linux, macOS, Windows, Android | C/C++ API |
| Intel Pin | Dynamic binary translation | 2--10x typical | Custom (Pintool-dependent) | Linux, macOS, Windows | C/C++ API |
When to Use What
The dynamic analysis landscape offers distinct tools for distinct workflows. Choosing the right tool depends on whether you have source code, what types of bugs you are hunting, and your performance constraints.
For development and testing with source code, the sanitizer family is the clear first choice. ASan should be a standard part of any C/C++ CI/CD pipeline: its low overhead (about 2x) and high detection rate make it practical for running on every commit. Combine ASan with UBSan for minimal additional cost. Use TSan in a separate build configuration to catch concurrency bugs, and MSan to detect uninitialized memory reads. The sanitizers pair exceptionally well with fuzz testing; see Coverage-Guided Fuzzing for how tools like AFL++ and libFuzzer use sanitizers to amplify bug detection.
For debugging specific issues without recompilation, Valgrind remains the tool of choice. Its Memcheck tool provides the most thorough memory error detection available for binary-only analysis on Linux. Valgrind's profiling tools (Callgrind, Massif) are also valuable for performance optimization, even when source code is available.
For reverse engineering and security research, Frida's scriptable instrumentation is unmatched. Its cross-platform support (especially on mobile) and JavaScript-based scripting API make it accessible to a broad range of researchers. For tasks like analyzing mobile apps, bypassing certificate pinning, or tracing API calls in proprietary software, Frida is the standard tool.
For building custom analysis frameworks, DynamoRIO and Intel Pin provide low-level instrumentation APIs that enable bespoke analysis tools. DynamoRIO's cross-platform support (including Windows) gives it an edge for practical tool building, while Pin's mature API and extensive academic ecosystem make it a common choice for research prototypes.
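The guidance in this section can be condensed into a small lookup. The function below is only a restatement of the recommendations above as code; the goal labels (`"bug-finding"`, `"profiling"`, `"reverse-engineering"`, `"custom-tooling"`) are hypothetical names chosen for this sketch.

```python
"""Sketch: the tool-selection guidance expressed as a function."""


def recommend_tools(has_source, goal):
    """Map a workflow to the tools suggested in this section."""
    if goal == "reverse-engineering":
        return ["Frida"]
    if goal == "custom-tooling":
        return ["DynamoRIO", "Intel Pin"]
    if goal == "profiling":
        return ["Valgrind (Callgrind, Massif)"]
    if goal == "bug-finding":
        if has_source:
            # Sanitizers first: one combined build plus two separate ones.
            return ["ASan + UBSan", "TSan (separate build)", "MSan (separate build)"]
        return ["Valgrind (Memcheck)"]
    raise ValueError(f"unknown goal: {goal!r}")


with_source = recommend_tools(True, "bug-finding")
binary_only = recommend_tools(False, "bug-finding")
```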
Binary-Only Analysis Gap
A persistent pain point in dynamic analysis is the gap between source-available and binary-only tooling. Sanitizers are fast and precise but require source code. Binary instrumentation tools work on any binary but are slow and less precise. Closing this gap (through techniques like binary rewriting or hardware-assisted instrumentation) remains an active area of research.
Related Pages
- Static Analysis: compile-time approaches that complement dynamic analysis
- Hybrid Approaches: combining static and dynamic techniques
- Coverage-Guided Fuzzing: dynamic testing that leverages sanitizers for bug detection
Glossary
| Term | Definition |
|---|---|
| AFL | American Fuzzy Lop, coverage-guided fuzzer |
| ASan | AddressSanitizer, memory error detector |
| CVE | Common Vulnerabilities and Exposures |
| AFL++ | Community-maintained successor to AFL, the de facto standard coverage-guided fuzzer |
| AEG | Automatic Exploit Generation, automated creation of working exploits from vulnerability information |
| ANTLR | ANother Tool for Language Recognition, parser generator used by grammar-aware fuzzers like Superion |
| AST | Abstract Syntax Tree, tree representation of source code structure used by static analyzers |
| BOF | Buffer Overflow, writing data beyond allocated memory bounds, a common memory safety vulnerability |
| CFG | Control Flow Graph, directed graph representing all possible execution paths through a program |
| CGC | Cyber Grand Challenge, DARPA competition for autonomous vulnerability detection and patching |
| ClusterFuzz | Google's distributed fuzzing infrastructure that powers OSS-Fuzz |
| CodeQL | GitHub's query-based static analysis engine that treats code as a queryable database |
| Concolic | Concrete + Symbolic, execution that runs concrete values while tracking symbolic constraints |
| Corpus | Collection of seed inputs used by a coverage-guided fuzzer as the basis for mutation |
| Coverity | Synopsys commercial static analysis platform with deep interprocedural analysis |
| CPG | Code Property Graph, unified representation combining AST, CFG, and data-flow graph, used by Joern |
| CVSS | Common Vulnerability Scoring System, standard for rating vulnerability severity |
| CWE | Common Weakness Enumeration, categorization of software weakness types |
| DAST | Dynamic Application Security Testing, testing running applications for vulnerabilities |
| DBI | Dynamic Binary Instrumentation, modifying program behavior at runtime without recompilation |
| DFG | Data Flow Graph, graph representing how data values propagate through a program |
| DPA | Differential Power Analysis, extracting cryptographic keys by analyzing power consumption variations |
| Frida | Dynamic instrumentation toolkit for injecting scripts into running processes |
| Harness | Glue code connecting a fuzzer to its target, defining how fuzzed input is delivered |
| HWASAN | Hardware-assisted AddressSanitizer, variant of ASan (mainly for AArch64) that uses pointer tagging to reduce memory overhead |
| IAST | Interactive Application Security Testing, combines elements of SAST and DAST during testing |
| Infer | Meta's open-source static analyzer based on separation logic and bi-abduction |
| KLEE | Symbolic execution engine built on LLVM for automatic test generation |
| LLM | Large Language Model, neural network trained on text/code, used for bug detection and code generation |
| LSAN | LeakSanitizer, detector for memory leaks, often used alongside AddressSanitizer |
| Meltdown | CPU vulnerability exploiting out-of-order execution to read kernel memory from user space |
| MITRE | Non-profit organization that maintains CVE, CWE, and ATT&CK frameworks |
| MSan | MemorySanitizer, detector for reads of uninitialized memory |
| NVD | National Vulnerability Database, NIST-maintained repository of vulnerability data |
| NIST | National Institute of Standards and Technology, US agency maintaining security standards and NVD |
| OSS-Fuzz | Google's free continuous fuzzing service for open-source software |
| OWASP | Open Worldwide Application Security Project, community producing security guides and tools |
| RCE | Remote Code Execution, vulnerability allowing an attacker to run arbitrary code on a target system |
| RL | Reinforcement Learning, ML paradigm where agents learn through reward-based feedback |
| S2E | Selective Symbolic Execution, whole-system analysis platform combining QEMU with KLEE |
| SARIF | Static Analysis Results Interchange Format, standard for exchanging static analysis findings |
| SAST | Static Application Security Testing, analyzing source code for vulnerabilities without execution |
| SCA | Software Composition Analysis, identifying known vulnerabilities in third-party dependencies |
| Seed | Initial input provided to a fuzzer as the starting point for mutation |
| Semgrep | Lightweight open-source static analysis tool using pattern-matching rules |
| Side-channel | Attack vector exploiting physical implementation artifacts rather than algorithmic flaws |
| SMT | Satisfiability Modulo Theories, solver used by symbolic execution to find inputs satisfying path constraints |
| Spectre | Family of CPU vulnerabilities exploiting speculative execution to leak data across security boundaries |
| SQLi | SQL Injection, injecting malicious SQL into queries via unsanitized user input |
| SSRF | Server-Side Request Forgery, tricking a server into making requests to unintended destinations |
| SymCC | Compilation-based symbolic execution tool, reported by its authors to be up to three orders of magnitude faster than KLEE |
| Taint analysis | Tracking the flow of untrusted data from sources to security-sensitive sinks |
| TOCTOU | Time-of-Check-Time-of-Use, race condition between validating a resource and using it |
| TSan | ThreadSanitizer, detector for data races in multithreaded programs |
| UAF | Use-After-Free, accessing memory after it has been deallocated |
| UBSan | UndefinedBehaviorSanitizer, detector for undefined behavior in C/C++ |
| Valgrind | Dynamic binary instrumentation framework for memory debugging and profiling |
| XSS | Cross-Site Scripting, injecting malicious scripts into web pages viewed by other users |
| Fine-tuning | Adapting a pre-trained ML model to a specific task using additional training data |
| Abstract interpretation | Mathematical framework for approximating program behavior using abstract domains |
| Dataflow analysis | Tracking how values propagate through a program to detect bugs like taint violations |