Under-Researched Targets

At a Glance

This page identifies critical software that scores Medium or High on the prioritization framework but has received disproportionately little security research attention. These targets represent high-opportunity areas for vulnerability discovery, where the gap between potential impact and actual research coverage creates outsized returns for focused investment.

Identification Criteria

Several signals indicate that a software target is under-researched relative to its security importance:

Few published CVEs despite codebase complexity: Large or complex codebases with sparse CVE histories suggest a lack of scrutiny rather than inherent security.
Limited or no fuzzing harness availability: The absence of publicly available fuzzing harnesses or seed corpora signals minimal automated testing by the research community.
No OSS-Fuzz integration: Projects not enrolled in Google's OSS-Fuzz lack the continuous, automated fuzzing coverage that has proven highly effective for enrolled projects.
Complex codebases with minimal public security audit history: Projects that have never undergone a published third-party security audit are more likely to harbor systemic vulnerabilities.
Proprietary or hard-to-access source code: Closed-source targets attract fewer independent researchers due to the higher cost of analysis (reverse engineering, licensing).
Specialized hardware requirements for testing: Targets that require specific hardware (PLCs, ECUs, embedded dev boards) impose a significant barrier to entry for most researchers.
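
The criteria above can be treated as a simple checklist. The following is a minimal sketch, assuming each signal is recorded as a yes/no answer and that all signals count equally; the class name, field names, and example values are hypothetical and are not part of the prioritization framework itself.

```python
from dataclasses import dataclass, fields


@dataclass
class ResearchSignals:
    """One boolean per identification criterion; True means the signal is present."""
    few_cves_for_complexity: bool
    no_public_fuzz_harness: bool
    not_in_oss_fuzz: bool
    no_published_audit: bool
    closed_source: bool
    needs_special_hardware: bool


def count_signals(signals: ResearchSignals) -> int:
    """Return how many under-research signals are present (equal weighting assumed)."""
    return sum(bool(getattr(signals, f.name)) for f in fields(signals))


# Hypothetical example values, not measurements of any real project.
example = ResearchSignals(
    few_cves_for_complexity=True,
    no_public_fuzz_harness=True,
    not_in_oss_fuzz=True,
    no_published_audit=False,
    closed_source=False,
    needs_special_hardware=False,
)
print(count_signals(example))  # 3
```

A plain count is the simplest possible aggregation. A real assessment would likely weight the signals differently, since closed source and hardware requirements raise research cost while sparse CVE history mainly indicates low scrutiny.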

Under-Researched Target Catalog

EDK II / UEFI Firmware

Why under-researched: UEFI firmware executes before the operating system loads, operating outside the visibility of most security tools. The EDK II reference implementation is open source, but the vendor-specific firmware built on top of it is typically proprietary. Testing requires specialized tooling (QEMU with OVMF, or physical hardware with SPI flash programmers), and the firmware development community is small relative to OS or application-layer security research.

Composite score: ~46/70 (High). High privilege level (pre-OS, ring -2 equivalent), moderate deployment scale (most x86 systems), moderate input exposure (UEFI variables, network boot, capsule updates).

Research difficulty: Medium-High. Open-source base (EDK II) is accessible, but vendor forks require reverse engineering. Hardware-level debugging (JTAG, serial consoles) is sometimes necessary.

Potential impact: Persistent compromise that survives OS reinstallation. UEFI rootkits (such as LoJax, CosmicStrand, BlackLotus) demonstrate real-world exploitation with extremely high severity.

OPC UA Implementations

Why under-researched: OPC UA is one of the most widely deployed machine-to-machine communication protocols in industrial automation and SCADA systems. Open-source implementations (open62541, Eclipse Milo, node-opcua) exist but have received few public security audits. The protocol specification is complex (over 1,200 pages), and testing requires familiarity with industrial automation concepts.

Composite score: ~42/70 (High). High protocol exposure (network-facing, parses complex binary and XML structures), high privilege context (controls physical processes), moderate deployment scale (industrial environments).
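
As a small illustration of the binary structures an OPC UA server must parse, the sketch below hand-encodes the first message of an OPC UA TCP connection (the Hello message). The field order reflects the OPC UA binary transport as commonly documented and should be checked against the specification before use; the function name, the buffer size, and the endpoint URL are assumptions chosen for illustration.

```python
import struct


def opcua_hello(endpoint_url: str, buffer_size: int = 65536) -> bytes:
    """Encode an OPC UA TCP Hello message (little-endian fields)."""
    url = endpoint_url.encode("utf-8")
    # Body: protocol version, receive buffer size, send buffer size,
    # max message size (0 = no limit), max chunk count (0 = no limit).
    body = struct.pack("<IIIII", 0, buffer_size, buffer_size, 0, 0)
    # Endpoint URL is a length-prefixed string (signed 32-bit length).
    body += struct.pack("<i", len(url)) + url
    # Header: message type "HEL", reserved byte "F", total message size
    # including the 8-byte header.
    header = b"HEL" + b"F" + struct.pack("<I", 8 + len(body))
    return header + body


# Hypothetical endpoint used only to show the encoding.
msg = opcua_hello("opc.tcp://localhost:4840")
print(len(msg))  # 56
```

Length-prefixed fields such as the endpoint URL are natural starting points for mutation, because the declared length and the actual payload can be made to disagree.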

Research difficulty: Medium. Open-source implementations are available. No specialized hardware is strictly required, though realistic testing benefits from simulated PLC environments.

Potential impact: Compromise of industrial control systems, potential for physical damage to manufacturing processes, energy infrastructure, or water treatment systems.

Modbus/DNP3 Protocol Implementations

Why under-researched: Modbus and DNP3 are foundational protocols in power grid, water, and manufacturing control systems. Both protocols were designed without authentication or encryption. Implementations are scattered across proprietary PLCs and RTUs with minimal public source code. The small community of ICS security researchers concentrates on higher-profile targets like Siemens S7.
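
To make the absence of authentication concrete, the following is a minimal sketch of a Modbus TCP request built by hand. The layout follows the standard Modbus TCP framing (an MBAP header followed by the PDU); the function name and the example values are illustrative assumptions and do not come from any particular library.

```python
import struct


def modbus_read_holding_registers(transaction_id: int, unit_id: int,
                                  start_addr: int, quantity: int) -> bytes:
    """Build a Modbus TCP 'Read Holding Registers' (function 0x03) request."""
    # PDU: function code, starting address, number of registers (big-endian).
    pdu = struct.pack(">BHH", 0x03, start_addr, quantity)
    # MBAP header: transaction id, protocol id (always 0), length of the
    # remaining bytes (unit id + PDU), unit id.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


# Hypothetical request: read 2 registers starting at address 0 from unit 1.
frame = modbus_read_holding_registers(transaction_id=1, unit_id=1,
                                      start_addr=0, quantity=2)
print(frame.hex())  # 000100000006010300000002
```

The whole request is 12 bytes, and none of them carries a credential, signature, or session token. Any host that can reach the device's Modbus port can issue the same request.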

Composite score: ~38/70 (Medium). High privilege context (controls physical processes), moderate deployment (critical infrastructure), lower input complexity (simpler protocol structures than OPC UA).

Research difficulty: High. Most implementations are proprietary firmware on embedded devices. Open-source libraries (libmodbus, OpenDNP3) are accessible but represent a fraction of deployed code. Hardware acquisition adds cost.

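Potential impact: Direct manipulation of physical processes in power grids, water treatment, and manufacturing. Industroyer/CrashOverride demonstrated this class of attack against grid equipment, using related industrial protocols (IEC 60870-5-101/104, IEC 61850, OPC DA).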

Automotive CAN Bus Software

Why under-researched: The Controller Area Network (CAN) bus remains the primary in-vehicle communication backbone. CAN lacks authentication, and ECU firmware is proprietary. The growing connectivity of vehicles (telematics, V2X, OTA updates) expands the remote attack surface, but the research community remains small due to hardware costs and legal uncertainty around vehicle security research.
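
The lack of authentication is visible in the frame format itself. The sketch below packs and unpacks a classic CAN frame using the memory layout of Linux SocketCAN's `struct can_frame`; the helper names, the CAN ID, and the payload are hypothetical values chosen for illustration.

```python
import struct
from typing import Tuple

# Layout of Linux SocketCAN's struct can_frame for classic CAN:
# 32-bit CAN ID, 8-bit data length, 3 padding bytes, 8 data bytes.
CAN_FRAME_FMT = "=IB3x8s"


def pack_can_frame(can_id: int, data: bytes) -> bytes:
    """Pack a classic CAN frame; payloads are limited to 8 bytes."""
    if len(data) > 8:
        raise ValueError("classic CAN carries at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def unpack_can_frame(frame: bytes) -> Tuple[int, bytes]:
    """Return (can_id, payload) with the padding stripped."""
    can_id, length, data = struct.unpack(CAN_FRAME_FMT, frame)
    return can_id, data[:length]


# Hypothetical frame: the ID and payload do not correspond to any real vehicle.
frame = pack_can_frame(0x123, b"\x01\x02")
print(unpack_can_frame(frame))  # (291, b'\x01\x02')
```

The frame carries an identifier, a length, and data, but no sender field and no integrity protection beyond the bus-level CRC. Receivers act on the identifier alone, so any node on the bus can emit a frame with any ID.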

Composite score: ~40/70 (High). High privilege (safety-critical vehicle functions), growing deployment scale (billions of ECUs shipped annually), moderate input exposure (expanding via connected vehicle interfaces).

Research difficulty: High. Requires vehicle hardware or expensive simulators, specialized tools (Vector CANoe, PCAN), and reverse engineering of proprietary ECU firmware. Legal concerns under the DMCA and vehicle safety regulations add friction.

Potential impact: Safety-critical consequences including brake, steering, and powertrain manipulation. Remote exploitation demonstrated by Miller and Valasek's Jeep Cherokee research (2015) and subsequent work.

Zephyr RTOS

Why under-researched: Zephyr is a Linux Foundation-backed RTOS with growing adoption in IoT devices, wearables, and industrial sensors. While FreeRTOS has received substantial security attention (including Amazon's investment post-acquisition), Zephyr's security research community is smaller despite its expanding deployment footprint and more complex feature set (native networking stack, Bluetooth, USB).

Composite score: ~36/70 (Medium). Moderate deployment scale (growing IoT adoption), moderate privilege (embedded device control), moderate input exposure (Bluetooth, IP networking, USB).

Research difficulty: Medium. Fully open source with good documentation. QEMU support enables testing without physical hardware for many board targets. Requires embedded systems familiarity.

Potential impact: Compromise of IoT devices at scale, potential for botnet recruitment or pivoting into connected networks. Supply chain impact through Zephyr's use as a reference platform.

systemd-resolved / systemd-networkd

Why under-researched: These systemd components increasingly handle DNS resolution and network configuration on Linux systems, displacing traditional tools. While BIND, Unbound, and dnsmasq have received decades of security scrutiny, systemd's network components are relatively new and have attracted less focused research despite processing untrusted network input at a privileged level.

Composite score: ~42/70 (High). High deployment scale (default on most major Linux distributions), high input exposure (DNS parsing, DHCP, network configuration), moderate privilege level (by default both daemons run as dedicated system users rather than root, with systemd-networkd retaining network-administration capabilities).

Research difficulty: Low-Medium. Fully open source, runs on standard Linux systems, standard debugging tools apply. Fuzzing harnesses for systemd components are limited but buildable with moderate effort.
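
Building a harness usually starts with a handful of well-formed seed inputs. As a minimal sketch, assuming a harness that accepts raw DNS messages, the snippet below encodes a single-question DNS query in RFC 1035 wire format; the function name, the transaction ID, and the queried name are illustrative assumptions.

```python
import struct


def build_dns_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Encode a single-question DNS query (RFC 1035 wire format)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack(">HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # Question: QNAME, QTYPE (1 = A), QCLASS (1 = IN).
    return header + qname + struct.pack(">HH", qtype, 1)


seed = build_dns_query("example.com")
print(len(seed))  # 29
```

A resolver mostly parses responses rather than queries, so a practical corpus would also include response messages with answer records and compressed names. The query form is shown here only because it is the shortest valid message.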

Potential impact: DNS cache poisoning, remote code execution on Linux servers and desktops, network-level compromise affecting cloud infrastructure and containerized environments.

libarchive

Why under-researched: libarchive handles reading and writing of tar, cpio, zip, 7-Zip, ISO 9660, and dozens of other archive formats. While individual compression libraries (zlib, xz, zstd) have received focused fuzzing attention, libarchive's format-parsing logic spans a much larger attack surface that has received comparatively less coverage. It is used by FreeBSD's pkg, macOS Archive Utility, CMake, and numerous other tools.

Composite score: ~40/70 (High). High dependency footprint (embedded in OS package managers and build tools), high input exposure (parses untrusted archive files from the network), moderate privilege (often runs in privileged contexts during package installation).

Research difficulty: Low. Fully open source, pure C, runs on standard hardware. Has some OSS-Fuzz coverage, but the breadth of supported formats means many code paths remain under-exercised.
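
One inexpensive way to reach more format-specific code paths is to seed a fuzzer with at least one small, valid archive per format. The following is a minimal sketch using only Python's standard library, which covers a few tar variants and zip; the file names and payload are hypothetical, and formats such as cpio, 7-Zip, or ISO 9660 would need other tooling.

```python
import io
import tarfile
import zipfile
from typing import Dict

PAYLOAD = b"hello from the seed corpus\n"


def make_tar_seed(fmt: int) -> bytes:
    """Return a one-file tar archive in the given tarfile format."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt) as tar:
        info = tarfile.TarInfo(name="seed.txt")
        info.size = len(PAYLOAD)
        tar.addfile(info, io.BytesIO(PAYLOAD))
    return buf.getvalue()


def make_zip_seed() -> bytes:
    """Return a one-file zip archive using DEFLATE compression."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("seed.txt", PAYLOAD)
    return buf.getvalue()


def build_seed_corpus() -> Dict[str, bytes]:
    """Map a seed file name to its archive bytes, one entry per format variant."""
    return {
        "ustar.tar": make_tar_seed(tarfile.USTAR_FORMAT),
        "gnu.tar": make_tar_seed(tarfile.GNU_FORMAT),
        "pax.tar": make_tar_seed(tarfile.PAX_FORMAT),
        "deflate.zip": make_zip_seed(),
    }


corpus = build_seed_corpus()
print(sorted(corpus))
```

Keeping each seed to a single tiny file keeps mutation fast while still exercising the header parsing that differs between format variants.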

Potential impact: Supply chain attacks through malicious archives, remote code execution via crafted archive files in email attachments, web downloads, or package repositories.

Service Mesh Data Planes (Linkerd-proxy, Cilium)

Why under-researched: Service mesh proxies sit on the critical path of all inter-service communication in Kubernetes environments. Linkerd-proxy (Rust-based) and Cilium's eBPF-based data plane are being adopted rapidly but lack the extensive vulnerability research history of older proxies like Envoy. Their relative novelty and the Kubernetes-specific expertise required to test them limit the researcher pool.

Composite score: ~38/70 (Medium). High deployment in cloud-native infrastructure, high input exposure (proxies all network traffic), moderate privilege (network namespace control, eBPF program loading).

Research difficulty: Medium. Open source, but realistic testing requires Kubernetes cluster setup and familiarity with service mesh architecture. Cilium's eBPF components require kernel-level understanding.

Potential impact: Lateral movement across microservice architectures, traffic interception and manipulation in production cloud environments, potential for cluster-wide compromise.

Knowledge Gap

Quantitative vulnerability data for service mesh data planes is limited due to their short deployment history. Composite scores rely on architectural analysis rather than historical CVE data.

Container Image Registries (distribution/distribution)

Why under-researched: The CNCF distribution project (formerly Docker Registry v2) underpins Docker Hub, GitHub Container Registry, and most private container registries. Despite its role as a supply chain chokepoint, it has received limited public security research. The registry API handles image manifests, layer uploads, and content-addressable storage, all of which process untrusted input.
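
Content-addressable storage means every blob is named by the hash of its bytes, and a manifest refers to its config and layers by those digests. The sketch below builds a minimal OCI-style image manifest and checks a blob against its descriptor; the helper names and the placeholder blob contents are hypothetical, and the placeholder layer is not a real gzip-compressed tarball.

```python
import hashlib
import json
from typing import Dict, List


def sha256_digest(data: bytes) -> str:
    """Return the digest string used to address a blob."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def build_manifest(config: bytes, layers: List[bytes]) -> bytes:
    """Serialize a minimal OCI-style image manifest for the given blobs."""
    manifest = {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "config": {
            "mediaType": "application/vnd.oci.image.config.v1+json",
            "digest": sha256_digest(config),
            "size": len(config),
        },
        "layers": [
            {
                "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
                "digest": sha256_digest(layer),
                "size": len(layer),
            }
            for layer in layers
        ],
    }
    return json.dumps(manifest, sort_keys=True).encode("utf-8")


def blob_matches(descriptor: Dict, blob: bytes) -> bool:
    """Check that a blob has the size and digest its descriptor claims."""
    return descriptor["size"] == len(blob) and descriptor["digest"] == sha256_digest(blob)


# Placeholder bytes standing in for real config and layer blobs.
config_blob = b"{}"
layer_blob = b"placeholder layer bytes"
manifest = json.loads(build_manifest(config_blob, [layer_blob]))
print(blob_matches(manifest["layers"][0], layer_blob))         # True
print(blob_matches(manifest["layers"][0], layer_blob + b"!"))  # False
```

Manifests whose declared digests, sizes, or media types disagree with the uploaded blobs are the kind of untrusted input a registry has to reject, which makes these fields natural candidates for API fuzzing.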

Composite score: ~39/70 (Medium-High). High dependency footprint (foundational to container supply chain), high input exposure (HTTP API processing untrusted manifests and blobs), moderate deployment scale.

Research difficulty: Low-Medium. Fully open source (Go), runs on standard hardware, straightforward to deploy locally. API fuzzing and manifest parsing are accessible research targets.

Potential impact: Supply chain compromise affecting all images distributed through a vulnerable registry. Image substitution, manifest manipulation, or denial of service against container deployment pipelines.

WebAssembly Runtimes (Wasmtime, Wasmer)

Why under-researched: WebAssembly is expanding beyond browsers into server-side (Cloudflare Workers, Fastly Compute), plugin systems (Envoy, Zed editor), and edge computing. Runtimes like Wasmtime and Wasmer implement complex compilation pipelines (Cranelift, LLVM backends) and sandbox enforcement. The security research community has focused primarily on browser-hosted Wasm engines (V8, SpiderMonkey), leaving standalone runtimes with less coverage.

Composite score: ~36/70 (Medium). Growing deployment scale, moderate input exposure (executes untrusted Wasm modules), moderate privilege (sandbox boundary enforcement is the critical security property).

Research difficulty: Low-Medium. Fully open source (Rust), runs on standard hardware, Wasm module generation for fuzzing is well-supported. Cranelift compiler internals require compiler engineering expertise.
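
To show what a small fuzzing seed for a Wasm runtime looks like, the following sketch hand-encodes a module in the WebAssembly binary format that exports one function returning the constant 42. The helper names are hypothetical. The code only produces bytes and does not load them into Wasmtime or Wasmer, so running the module is left to whichever runtime is under test.

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def section(section_id: int, payload: bytes) -> bytes:
    """Wrap a payload as a module section: id, byte length, contents."""
    return bytes([section_id]) + uleb128(len(payload)) + payload


def minimal_module() -> bytes:
    """Return a module exporting function "f" of type () -> i32 that returns 42."""
    header = b"\x00asm" + b"\x01\x00\x00\x00"               # magic + version 1
    type_sec = section(1, b"\x01" + b"\x60\x00\x01\x7f")    # one type: () -> i32
    func_sec = section(3, b"\x01" + b"\x00")                # one function, type 0
    export_sec = section(7, b"\x01" + b"\x01" + b"f" + b"\x00\x00")  # "f" = func 0
    body = b"\x00" + b"\x41\x2a" + b"\x0b"                  # no locals; i32.const 42; end
    code_sec = section(10, b"\x01" + uleb128(len(body)) + body)
    return header + type_sec + func_sec + export_sec + code_sec


module = minimal_module()
print(len(module))  # 34
```

Structure-aware generators produce far richer modules than this, but a tiny hand-built seed makes it easy to see which bytes a mutation touched when a runtime rejects or mishandles the result.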

Potential impact: Sandbox escape allowing untrusted Wasm modules to execute arbitrary code on host systems. Given deployment in edge computing and serverless platforms, a single runtime vulnerability could affect millions of workloads.
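
The composite scores and research-difficulty ratings quoted in the catalog entries above can be collected into one small table for comparison. The sketch below is one way to do that; the scores are the approximate values stated in each entry, while the type name, the shortened target names, and the "accessible" cutoff (Low or Low-Medium difficulty) are assumptions made for illustration.

```python
from typing import NamedTuple


class Target(NamedTuple):
    name: str
    score: int        # approximate composite score out of 70, as quoted above
    difficulty: str   # research difficulty, as quoted above


CATALOG = [
    Target("EDK II / UEFI firmware", 46, "Medium-High"),
    Target("OPC UA implementations", 42, "Medium"),
    Target("Modbus/DNP3 implementations", 38, "High"),
    Target("Automotive CAN bus software", 40, "High"),
    Target("Zephyr RTOS", 36, "Medium"),
    Target("systemd-resolved / systemd-networkd", 42, "Low-Medium"),
    Target("libarchive", 40, "Low"),
    Target("Service mesh data planes", 38, "Medium"),
    Target("Container image registries", 39, "Low-Medium"),
    Target("WebAssembly runtimes", 36, "Low-Medium"),
]

# Highest composite score first; ties keep catalog order because sorted() is stable.
by_score = sorted(CATALOG, key=lambda t: t.score, reverse=True)

# Targets a researcher can start on with open source and standard hardware.
accessible = [t.name for t in CATALOG if t.difficulty in ("Low", "Low-Medium")]

for target in by_score:
    print(f"{target.score}/70  {target.difficulty:<12} {target.name}")
```

Sorting by score alone ignores research coverage, so this view complements the catalog entries without replacing their reasoning.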

Barriers to Research

Several structural factors explain why high-value targets remain under-researched:

Proprietary source code: Closed-source targets (vendor UEFI firmware, PLC firmware, automotive ECU software) require reverse engineering, which can increase time-to-finding by an order of magnitude compared to open-source targets.
Specialized hardware requirements: Researching automotive CAN bus, industrial protocols, or firmware targets often requires purchasing specific hardware (development boards, protocol analyzers, vehicles), creating a financial barrier.
Niche expertise required: Industrial control protocols, firmware reverse engineering, compiler internals, and eBPF verification each demand domain-specific knowledge that few security researchers possess.
Legal concerns: DMCA provisions, computer fraud statutes, and vendor hostility toward security researchers create legal risk, particularly for automotive and IoT device research. See the Gaps & Opportunities overview for broader discussion.
Lack of established tooling: Many of these targets lack publicly available fuzzing harnesses, seed corpora, or sanitizer support, requiring researchers to build infrastructure before beginning vulnerability discovery.
Small researcher communities: Domains like ICS security, firmware security, and service mesh internals have small, specialized researcher populations, limiting the rate of vulnerability discovery through sheer numbers.

Recommendations

Highest Impact (Score Relative to Coverage)

EDK II / UEFI firmware: The combination of pre-OS privilege, persistence across reinstallation, and minimal fuzzing coverage makes this the single highest-impact under-researched target category. Investment in UEFI fuzzing harnesses would yield disproportionate returns.
systemd-resolved / systemd-networkd: High composite score, default deployment on most major Linux distributions, and relatively low research coverage compared to legacy DNS resolvers.
OPC UA implementations: Complex protocol with direct industrial control system impact and few published security audits.

Most Accessible (Open Source, Standard Hardware)

libarchive: Pure C, open source, standard hardware, broad format coverage that creates many under-tested code paths. An accessible entry point for researchers new to this target set.
WebAssembly runtimes: Open source (Rust), standard hardware, strong community documentation, and well-supported Wasm module generation for fuzzing input.
Container image registries: Open source (Go), simple local deployment, HTTP-based API suitable for standard web fuzzing techniques.

Strategic Importance (Critical Infrastructure, Supply Chain)

Modbus/DNP3 implementations: Direct critical infrastructure impact (power grids, water treatment), minimal built-in security, and very few active researchers. See Embedded & IoT for related target analysis.
Automotive CAN bus software: Safety-critical with expanding remote attack surface. Connected vehicle growth makes this increasingly urgent.
Container image registries and service mesh data planes: Supply chain chokepoints for cloud-native infrastructure. See Cloud Infrastructure for related analysis.

For broader context on tooling gaps and research opportunities across the vulnerability research landscape, see Gaps & Opportunities.


Glossary

Term	Definition
AFL	American Fuzzy Lop, coverage-guided fuzzer
ASan	AddressSanitizer, memory error detector
CVE	Common Vulnerabilities and Exposures
AFL++	Community-maintained successor to AFL, the de facto standard coverage-guided fuzzer
AEG	Automatic Exploit Generation, automated creation of working exploits from vulnerability information
ANTLR	ANother Tool for Language Recognition, parser generator used by grammar-aware fuzzers like Superion
AST	Abstract Syntax Tree, tree representation of source code structure used by static analyzers
BOD	Binding Operational Directive, mandatory cybersecurity directives issued by CISA
BOF	Buffer Overflow, writing data beyond allocated memory bounds, a common memory safety vulnerability
CFG	Control Flow Graph, directed graph representing all possible execution paths through a program
CGC	Cyber Grand Challenge, DARPA competition for autonomous vulnerability detection and patching
ClusterFuzz	Google's distributed fuzzing infrastructure that powers OSS-Fuzz
CodeQL	GitHub's query-based static analysis engine that treats code as a queryable database
CFAA	Computer Fraud and Abuse Act, US federal law governing computer security violations
CNA	CVE Numbering Authority, organization authorized to assign CVE IDs
CNNVD	China National Vulnerability Database of Information Security
CNVD	China National Vulnerability Database
Concolic	Concrete + Symbolic, execution that runs concrete values while tracking symbolic constraints
Corpus	Collection of seed inputs used by a coverage-guided fuzzer as the basis for mutation
Coverity	Synopsys commercial static analysis platform with deep interprocedural analysis
CPG	Code Property Graph, unified representation combining AST, CFG, and data-flow graph, used by Joern
CVSS	Common Vulnerability Scoring System, standard for rating vulnerability severity
CWE	Common Weakness Enumeration, categorization of software weakness types
DAST	Dynamic Application Security Testing, testing running applications for vulnerabilities
DBI	Dynamic Binary Instrumentation, modifying program behavior at runtime without recompilation
DFG	Data Flow Graph, graph representing how data values propagate through a program
DPA	Differential Power Analysis, extracting cryptographic keys by analyzing power consumption variations
Frida	Dynamic instrumentation toolkit for injecting scripts into running processes
Harness	Glue code connecting a fuzzer to its target, defining how fuzzed input is delivered
HWASan	Hardware-assisted AddressSanitizer, AArch64-focused variant of ASan with lower memory overhead
IAST	Interactive Application Security Testing, combines elements of SAST and DAST during testing
Infer	Meta's open-source static analyzer based on separation logic and bi-abduction
JVN	Japan Vulnerability Notes, Japanese vulnerability information portal
KLEE	Symbolic execution engine built on LLVM for automatic test generation
LLM	Large Language Model, neural network trained on text/code, used for bug detection and code generation
LSan	LeakSanitizer, detector for memory leaks, often used alongside AddressSanitizer
Meltdown	CPU vulnerability exploiting out-of-order execution to read kernel memory from user space
MITRE	Non-profit organization that maintains CVE, CWE, and ATT&CK frameworks
MTTR	Mean Time to Remediate, average duration from vulnerability disclosure to patch deployment
MSan	MemorySanitizer, detector for reads of uninitialized memory
NVD	National Vulnerability Database, NIST-maintained repository of vulnerability data
NIST	National Institute of Standards and Technology, US agency maintaining security standards and NVD
OpenSSF	Open Source Security Foundation, Linux Foundation project for open-source security
OSS-Fuzz	Google's free continuous fuzzing service for open-source software
OWASP	Open Worldwide Application Security Project, community producing security guides and tools
RCE	Remote Code Execution, vulnerability allowing an attacker to run arbitrary code on a target system
RL	Reinforcement Learning, ML paradigm where agents learn through reward-based feedback
S2E	Selective Symbolic Execution, whole-system analysis platform combining QEMU with KLEE
SARIF	Static Analysis Results Interchange Format, standard for exchanging static analysis findings
SAST	Static Application Security Testing, analyzing source code for vulnerabilities without execution
SCA	Software Composition Analysis, identifying known vulnerabilities in third-party dependencies
Seed	Initial input provided to a fuzzer as the starting point for mutation
Semgrep	Lightweight open-source static analysis tool using pattern-matching rules
Side-channel	Attack vector exploiting physical implementation artifacts rather than algorithmic flaws
SMT	Satisfiability Modulo Theories, solver used by symbolic execution to find inputs satisfying path constraints
Spectre	Family of CPU vulnerabilities exploiting speculative execution to leak data across security boundaries
SQLi	SQL Injection, injecting malicious SQL into queries via unsanitized user input
SSRF	Server-Side Request Forgery, tricking a server into making requests to unintended destinations
SymCC	Compilation-based symbolic execution tool, reported by its authors to be up to three orders of magnitude faster than KLEE
Taint analysis	Tracking the flow of untrusted data from sources to security-sensitive sinks
VDP	Vulnerability Disclosure Program, formal process for receiving vulnerability reports
TOCTOU	Time-of-Check-Time-of-Use, race condition between validating a resource and using it
TSan	ThreadSanitizer, detector for data races in multithreaded programs
UAF	Use-After-Free, accessing memory after it has been deallocated
UBSan	UndefinedBehaviorSanitizer, detector for undefined behavior in C/C++
Valgrind	Dynamic binary instrumentation framework for memory debugging and profiling
XSS	Cross-Site Scripting, injecting malicious scripts into web pages viewed by other users
Fine-tuning	Adapting a pre-trained ML model to a specific task using additional training data
AUTOSAR	Automotive Open System Architecture, standardized software framework for automotive ECUs
CAN	Controller Area Network, vehicle bus standard for microcontroller communication
DNP3	Distributed Network Protocol, used in SCADA and utility systems
EDK II	EFI Development Kit II, open-source UEFI firmware development environment
OPC UA	Open Platform Communications Unified Architecture, industrial automation protocol
RTOS	Real-Time Operating System, OS designed for real-time applications with deterministic timing
Abstract interpretation	Mathematical framework for approximating program behavior using abstract domains
Dataflow analysis	Tracking how values propagate through a program to detect bugs like taint violations