Under-Researched Targets

At a Glance

This page identifies critical software that scores Medium or High on the prioritization framework but has received disproportionately little security research attention. These targets represent high-opportunity areas for vulnerability discovery, where the gap between potential impact and actual research coverage creates outsized returns for focused investment.

Identification Criteria

Several signals indicate that a software target is under-researched relative to its security importance:

  • Few published CVEs despite codebase complexity: Large or complex codebases with sparse CVE histories suggest a lack of scrutiny rather than inherent security.
  • Limited or no fuzzing harness availability: The absence of publicly available fuzzing harnesses or seed corpora signals minimal automated testing by the research community.
  • No OSS-Fuzz integration: Projects not enrolled in Google's OSS-Fuzz lack continuous, automated fuzzing coverage that has proven highly effective for enrolled projects.
  • Complex codebases with minimal public security audit history: Projects that have never undergone a published third-party security audit are more likely to harbor systemic vulnerabilities.
  • Proprietary or hard-to-access source code: Closed-source targets attract fewer independent researchers due to the higher cost of analysis (reverse engineering, licensing).
  • Specialized hardware requirements for testing: Targets that require specific hardware (PLCs, ECUs, embedded dev boards) impose a significant barrier to entry for most researchers.
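As a rough illustration, the six signals above can be treated as a checklist and tallied per target. The sketch below is only one way to do that; the class name, field names, and example values are hypothetical and are not part of the prioritization framework itself.

```python
from dataclasses import dataclass, fields


@dataclass
class ResearchSignals:
    """One boolean per signal in the list above (field names are illustrative)."""
    sparse_cve_history: bool
    no_public_fuzz_harness: bool
    no_oss_fuzz_integration: bool
    no_published_audit: bool
    closed_source: bool
    needs_special_hardware: bool


def signal_count(signals: ResearchSignals) -> int:
    """Count how many under-research signals apply to a target."""
    return sum(bool(getattr(signals, f.name)) for f in fields(signals))


# Hypothetical values for illustration, not measurements of a real project.
example = ResearchSignals(
    sparse_cve_history=True,
    no_public_fuzz_harness=True,
    no_oss_fuzz_integration=False,
    no_published_audit=True,
    closed_source=False,
    needs_special_hardware=False,
)
print(signal_count(example))
```

A plain count weights every signal equally, which is a simplification; a real assessment would likely weight closed source and hardware requirements differently from a missing OSS-Fuzz integration.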

Under-Researched Target Catalog

EDK II / UEFI Firmware

Why under-researched: UEFI firmware executes before the operating system loads, operating outside the visibility of most security tools. The EDK II reference implementation is open source, but the vendor-specific firmware built on top of it is typically proprietary. Testing requires specialized tooling (QEMU with OVMF, or physical hardware with SPI flash programmers), and the firmware development community is small relative to OS or application-layer security research.

Composite score: ~46/70 (High). High privilege level (pre-OS, ring -2 equivalent), moderate deployment scale (most x86 systems), moderate input exposure (UEFI variables, network boot, capsule updates).

Research difficulty: Medium-High. Open-source base (EDK II) is accessible, but vendor forks require reverse engineering. Hardware-level debugging (JTAG, serial consoles) is sometimes necessary.
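For the open-source base, a QEMU plus OVMF setup can be scripted without physical hardware. The sketch below only assembles a command line; the firmware file names (`OVMF_CODE.fd`, `OVMF_VARS.fd`) are assumptions, since their names and locations vary between distributions and EDK II builds, and the helper name is hypothetical.

```python
import shlex


def ovmf_qemu_command(code_fd: str, vars_fd: str) -> list[str]:
    """Build a QEMU argv that boots an OVMF firmware image headlessly."""
    return [
        "qemu-system-x86_64",
        "-machine", "q35",
        "-m", "512",
        # Firmware code image, mapped read-only as the first flash device.
        "-drive", f"if=pflash,format=raw,unit=0,readonly=on,file={code_fd}",
        # Writable copy of the UEFI variable store (second flash device).
        "-drive", f"if=pflash,format=raw,unit=1,file={vars_fd}",
        # Serial console on stdio, no display, no network device.
        "-nographic",
        "-net", "none",
    ]


# File names are assumptions; use a private copy of the variable store
# so that experiments do not modify the system-wide template.
command = ovmf_qemu_command("OVMF_CODE.fd", "my_OVMF_VARS.fd")
print(shlex.join(command))
```

Keeping the variable store as a separate writable file makes it easy to snapshot or reset UEFI variable state between test runs.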

Potential impact: Persistent compromise that survives OS reinstallation. UEFI implants and bootkits such as LoJax, CosmicStrand, and BlackLotus show that this class of attack has already been used in the wild.

OPC UA Implementations

Why under-researched: OPC UA is the dominant machine-to-machine communication protocol in industrial automation and SCADA systems. Open-source implementations (open62541, Eclipse Milo, node-opcua) exist but have received few public security audits. The protocol specification is complex (over 1,200 pages), and testing requires familiarity with industrial automation concepts.

Composite score: ~42/70 (High). High protocol exposure (network-facing, parses complex binary and XML structures), high privilege context (controls physical processes), moderate deployment scale (industrial environments).
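To give a sense of the binary framing a parser has to handle, the sketch below builds and checks an OPC UA TCP Hello message, the first message a client sends. The field layout reflects my reading of the OPC UA binary transport mapping (little-endian fields, an 8-byte header) and should be verified against the specification before use; the function names and the endpoint URL are illustrative.

```python
import struct


def build_hello(endpoint_url: str, buffer_size: int = 65536) -> bytes:
    """Build an OPC UA TCP Hello message (layout assumed from the spec)."""
    url = endpoint_url.encode("utf-8")
    # ProtocolVersion, ReceiveBufferSize, SendBufferSize,
    # MaxMessageSize (0 = no limit), MaxChunkCount (0 = no limit).
    body = struct.pack("<IIIII", 0, buffer_size, buffer_size, 0, 0)
    # Strings are a signed 32-bit length followed by UTF-8 bytes.
    body += struct.pack("<i", len(url)) + url
    # Header: 3-byte message type, 1 reserved byte, total message size.
    header = b"HEL" + b"F" + struct.pack("<I", 8 + len(body))
    return header + body


def parse_header(message: bytes) -> tuple[str, int]:
    """Return (message type, declared size), rejecting inconsistent sizes."""
    if len(message) < 8:
        raise ValueError("message shorter than the 8-byte header")
    message_type = message[:3].decode("ascii")
    (declared_size,) = struct.unpack_from("<I", message, 4)
    if declared_size != len(message):
        raise ValueError("declared size does not match actual length")
    return message_type, declared_size


hello = build_hello("opc.tcp://localhost:4840")
print(parse_header(hello))
```

Length fields such as the declared message size and the string length are natural places to start mutating when building test inputs, because the receiver must reconcile them with the bytes actually received.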

Research difficulty: Medium. Open-source implementations are available. No specialized hardware is strictly required, though realistic testing benefits from simulated PLC environments.

Potential impact: Compromise of industrial control systems, potential for physical damage to manufacturing processes, energy infrastructure, or water treatment systems.

Modbus/DNP3 Protocol Implementations

Why under-researched: Modbus and DNP3 are foundational protocols in power grid, water, and manufacturing control systems. Both protocols were originally designed without authentication or encryption (DNP3 Secure Authentication was added later as an optional extension). Most implementations live in proprietary PLC and RTU firmware, so little source code is publicly available. The small community of ICS security researchers concentrates on higher-profile targets like Siemens S7.
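The absence of authentication is visible in the wire format itself. The sketch below packs a Modbus/TCP "Read Holding Registers" request; the function name and example values are illustrative, but the frame layout (a 7-byte MBAP header followed by the PDU) is the standard one, and no field carries a credential or signature.

```python
import struct

READ_HOLDING_REGISTERS = 0x03  # standard Modbus function code


def read_holding_registers_request(
    transaction_id: int, unit_id: int, start_address: int, quantity: int
) -> bytes:
    """Build a Modbus/TCP request frame (all fields big-endian)."""
    # PDU: function code, starting register address, number of registers.
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_address, quantity)
    # MBAP header: transaction id, protocol id (always 0),
    # length of everything after the length field (unit id + PDU), unit id.
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


frame = read_holding_registers_request(
    transaction_id=1, unit_id=1, start_address=0, quantity=2
)
print(frame.hex())  # 000100000006010300000002
```

Any host that can reach the device's Modbus port can construct such a frame, which is why network segmentation carries so much of the security burden in these environments.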

Composite score: ~38/70 (Medium). High privilege context (controls physical processes), moderate deployment (critical infrastructure), lower input complexity (simpler protocol structures than OPC UA).

Research difficulty: High. Most implementations are proprietary firmware on embedded devices. Open-source libraries (libmodbus, OpenDNP3) are accessible but represent a fraction of deployed code. Hardware acquisition adds cost.

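Potential impact: Direct manipulation of physical processes in power grids, water treatment, and manufacturing. Industroyer/CrashOverride demonstrated this kind of protocol-level attack against grid equipment by abusing comparable unauthenticated ICS protocols (IEC 60870-5-104, IEC 61850).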

Automotive CAN Bus Software

Why under-researched: The Controller Area Network (CAN) bus remains the primary in-vehicle communication backbone. CAN lacks authentication, and ECU firmware is proprietary. The growing connectivity of vehicles (telematics, V2X, OTA updates) expands the remote attack surface, but the research community remains small due to hardware costs and legal uncertainty around vehicle security research.
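The lack of authentication follows from how little a CAN frame contains. The sketch below packs and unpacks a classic CAN frame using the 16-byte layout that Linux SocketCAN uses for `struct can_frame`; the arbitration ID and payload are made-up values, and the helper names are hypothetical.

```python
import struct

# SocketCAN classic frame: 32-bit CAN ID, 8-bit data length,
# 3 padding bytes, then 8 data bytes.
CAN_FRAME_FORMAT = "=IB3x8s"


def pack_can_frame(can_id: int, data: bytes) -> bytes:
    """Pack an arbitration ID and up to 8 data bytes into a frame."""
    if len(data) > 8:
        raise ValueError("classic CAN frames carry at most 8 data bytes")
    return struct.pack(CAN_FRAME_FORMAT, can_id, len(data), data.ljust(8, b"\x00"))


def unpack_can_frame(frame: bytes) -> tuple[int, bytes]:
    """Recover the arbitration ID and the payload from a packed frame."""
    can_id, length, data = struct.unpack(CAN_FRAME_FORMAT, frame)
    return can_id, data[:length]


# 0x123 and the payload are hypothetical example values.
frame = pack_can_frame(0x123, b"\x01\x02")
print(unpack_can_frame(frame))
```

The frame holds an identifier and a payload, with no sender address and no message authentication code, so a receiving ECU cannot tell which node on the bus actually transmitted a given ID.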

Composite score: ~40/70 (High). High privilege (safety-critical vehicle functions), growing deployment scale (billions of ECUs shipped annually), moderate input exposure (expanding via connected vehicle interfaces).

Research difficulty: High. Requires vehicle hardware or expensive simulators, specialized tools (Vector CANoe, PCAN), and reverse engineering of proprietary ECU firmware. Legal concerns under DMCA and vehicle safety regulations add friction.

Potential impact: Safety-critical consequences including brake, steering, and powertrain manipulation. Remote exploitation demonstrated by Miller and Valasek's Jeep Cherokee research (2015) and subsequent work.

Zephyr RTOS

Why under-researched: Zephyr is a Linux Foundation-backed RTOS with growing adoption in IoT devices, wearables, and industrial sensors. While FreeRTOS has received substantial security attention (including Amazon's investment after it took over stewardship of the project), Zephyr's security research community is smaller despite its expanding deployment footprint and more complex feature set (native networking stack, Bluetooth, USB).

Composite score: ~36/70 (Medium). Moderate deployment scale (growing IoT adoption), moderate privilege (embedded device control), moderate input exposure (Bluetooth, IP networking, USB).

Research difficulty: Medium. Fully open source with good documentation. QEMU support enables testing without physical hardware for many board targets. Requires embedded systems familiarity.

Potential impact: Compromise of IoT devices at scale, potential for botnet recruitment or pivoting into connected networks. Supply chain impact through Zephyr's use as a reference platform.

systemd-resolved / systemd-networkd

Why under-researched: These systemd components increasingly handle DNS resolution and network configuration on Linux systems, displacing traditional tools. While BIND, Unbound, and dnsmasq have received decades of security scrutiny, systemd's network components are relatively new and have attracted less focused research despite processing untrusted network input at a privileged level.

Composite score: ~42/70 (High). High deployment scale (enabled by default on several major Linux distributions, including Ubuntu and Fedora), high input exposure (DNS parsing, DHCP, network configuration), moderate privilege level (long-running system services that typically run as dedicated users but retain network-related capabilities).

Research difficulty: Low-Medium. Fully open source, runs on standard Linux systems, standard debugging tools apply. Fuzzing harnesses for systemd components are limited but buildable with moderate effort.
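Whatever harness is used, a DNS packet parser benefits from a few structurally valid seed inputs. The sketch below builds plain DNS query packets in the standard wire format; the function names, transaction ID, and the choice of record types are illustrative assumptions, not part of any systemd tooling.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = 1, txid: int = 0x1234) -> bytes:
    """Build a DNS query with one question (class IN, recursion desired)."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack(">HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    return header + encode_name(name) + struct.pack(">HH", qtype, 1)


def seed_corpus(name: str = "example.com") -> dict[str, bytes]:
    """Return a few seeds keyed by file name: A, AAAA, and TXT queries."""
    return {
        "query_a.bin": build_query(name, qtype=1),
        "query_aaaa.bin": build_query(name, qtype=28),
        "query_txt.bin": build_query(name, qtype=16),
    }


for filename, packet in seed_corpus().items():
    print(filename, len(packet))
```

Queries are only a starting point; since a stub resolver mostly parses responses, a fuller corpus would also include answer sections and name compression pointers.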

Potential impact: DNS cache poisoning, remote code execution on Linux servers and desktops, network-level compromise affecting cloud infrastructure and containerized environments.

libarchive

Why under-researched: libarchive handles reading and writing of tar, cpio, zip, 7-zip, ISO 9660, and dozens of other archive formats. While individual compression libraries (zlib, xz, zstd) have received focused fuzzing attention, libarchive's format-parsing logic spans a much larger attack surface that has received comparatively less coverage. It is used by FreeBSD's pkg, macOS Archive Utility, CMake, and numerous other tools.

Composite score: ~40/70 (High). High dependency footprint (embedded in OS package managers and build tools), high input exposure (parses untrusted archive files from the network), moderate privilege (often runs in privileged contexts during package installation).

Research difficulty: Low. Fully open source, pure C, runs on standard hardware. Has some OSS-Fuzz coverage, but the breadth of supported formats means many code paths remain under-exercised.
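Because format breadth is the main issue, one low-cost step is to seed a fuzzer with small valid archives in several formats. The sketch below generates a few in memory using only the Python standard library; it covers tar variants and zip only, the file and function names are hypothetical, and formats such as 7-zip or ISO 9660 would need other tools.

```python
import io
import tarfile
import zipfile


def make_tar(tar_format: int, payload: bytes) -> bytes:
    """Create a one-file tar archive in the requested tar dialect."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tar_format) as tar:
        info = tarfile.TarInfo(name="seed.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def make_zip(payload: bytes) -> bytes:
    """Create a one-file zip archive using DEFLATE compression."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("seed.txt", payload)
    return buffer.getvalue()


def seed_corpus(payload: bytes = b"hello") -> dict[str, bytes]:
    """Return small valid archives keyed by a suggested file name."""
    return {
        "ustar.tar": make_tar(tarfile.USTAR_FORMAT, payload),
        "gnu.tar": make_tar(tarfile.GNU_FORMAT, payload),
        "pax.tar": make_tar(tarfile.PAX_FORMAT, payload),
        "deflate.zip": make_zip(payload),
    }


for filename, data in seed_corpus().items():
    print(filename, len(data))
```

Small seeds keep mutation fast, and having one seed per dialect gives a coverage-guided fuzzer an entry point into each format handler instead of leaving it to discover the magic bytes on its own.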

Potential impact: Supply chain attacks through malicious archives, remote code execution via crafted archive files in email attachments, web downloads, or package repositories.

Service Mesh Data Planes (Linkerd-proxy, Cilium)

Why under-researched: Service mesh proxies sit on the critical path of all inter-service communication in Kubernetes environments. Linkerd-proxy (Rust-based) and Cilium's eBPF-based data plane are being adopted rapidly but lack the extensive vulnerability research history of older proxies like Envoy. Their relative novelty and the Kubernetes-specific expertise required to test them limit the researcher pool.

Composite score: ~38/70 (Medium). High deployment in cloud-native infrastructure, high input exposure (proxies all network traffic), moderate privilege (network namespace control, eBPF program loading).

Research difficulty: Medium. Open source, but realistic testing requires Kubernetes cluster setup and familiarity with service mesh architecture. Cilium's eBPF components require kernel-level understanding.

Potential impact: Lateral movement across microservice architectures, traffic interception and manipulation in production cloud environments, potential for cluster-wide compromise.

Knowledge Gap

Quantitative vulnerability data for service mesh data planes is limited due to their short deployment history. Composite scores rely on architectural analysis rather than historical CVE data.

Container Image Registries (distribution/distribution)

Why under-researched: The CNCF distribution project (formerly Docker Registry v2) underpins Docker Hub, GitHub Container Registry, and most private container registries. Despite its role as a supply chain chokepoint, it has received limited public security research. The registry API handles image manifests, layer uploads, and content-addressable storage, all of which process untrusted input.
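Content-addressable storage means a blob is named by the hash of its bytes, so the integrity check a registry or client must get right is small. The sketch below shows that check for the common `sha256:<hex>` digest form; the function names are hypothetical and this is not code from the distribution project.

```python
import hashlib


def blob_digest(blob: bytes) -> str:
    """Return the content address of a blob in 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def verify_blob(expected_digest: str, blob: bytes) -> bool:
    """Check that a blob's bytes match the digest it was requested under."""
    algorithm, _, expected_hex = expected_digest.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    return hashlib.sha256(blob).hexdigest() == expected_hex.lower()


layer = b"example layer bytes"
digest = blob_digest(layer)
print(digest)
print(verify_blob(digest, layer))         # matching content
print(verify_blob(digest, layer + b"!"))  # tampered content
```

Bugs in this area tend to be about where the check is skipped or applied to the wrong bytes (for example, a manifest that references a digest the server never re-verifies) rather than about the hash itself.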

Composite score: ~39/70 (Medium-High). High dependency footprint (foundational to container supply chain), high input exposure (HTTP API processing untrusted manifests and blobs), moderate deployment scale.

Research difficulty: Low-Medium. Fully open source (Go), runs on standard hardware, straightforward to deploy locally. API fuzzing and manifest parsing are accessible research targets.

Potential impact: Supply chain compromise affecting all images distributed through a vulnerable registry. Image substitution, manifest manipulation, or denial of service against container deployment pipelines.

WebAssembly Runtimes (Wasmtime, Wasmer)

Why under-researched: WebAssembly is expanding beyond browsers into server-side (Cloudflare Workers, Fastly Compute), plugin systems (Envoy, Zed editor), and edge computing. Runtimes like Wasmtime and Wasmer implement complex compilation pipelines (Cranelift, LLVM backends) and sandbox enforcement. The security research community has focused primarily on browser-hosted Wasm engines (V8, SpiderMonkey), leaving standalone runtimes with less coverage.

Composite score: ~36/70 (Medium). Growing deployment scale, moderate input exposure (executes untrusted Wasm modules), moderate privilege (sandbox boundary enforcement is the critical security property).

Research difficulty: Low-Medium. Fully open source (Rust), runs on standard hardware, Wasm module generation for fuzzing is well-supported. Cranelift compiler internals require compiler engineering expertise.
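To illustrate why Wasm input generation is approachable, the sketch below emits a minimal module by hand: the 8-byte header followed by a single custom section. The helper names and the section name "seed" are hypothetical; real fuzzing setups usually rely on dedicated module generators rather than hand-built bytes.

```python
WASM_HEADER = b"\x00asm" + b"\x01\x00\x00\x00"  # magic bytes + version 1


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("uleb128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def custom_section(name: str, payload: bytes) -> bytes:
    """Build a custom section: id 0, byte length, name, then payload."""
    encoded_name = name.encode("utf-8")
    body = uleb128(len(encoded_name)) + encoded_name + payload
    return b"\x00" + uleb128(len(body)) + body


def minimal_module(payload: bytes = b"") -> bytes:
    """Return a module consisting of the header and one custom section."""
    return WASM_HEADER + custom_section("seed", payload)


print(minimal_module(b"ab").hex())
```

Custom sections are ignored by execution, so this module exercises only the decoder; reaching the compiler and sandbox logic requires type, function, and code sections as well.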

Potential impact: Sandbox escape allowing untrusted Wasm modules to execute arbitrary code on host systems. Given deployment in edge computing and serverless platforms, a single runtime vulnerability could affect millions of workloads.

Barriers to Research

Several structural factors explain why high-value targets remain under-researched:

  • Proprietary source code: Closed-source targets (vendor UEFI firmware, PLC firmware, automotive ECU software) require reverse engineering, which substantially increases time-to-finding compared to open-source targets.
  • Specialized hardware requirements: Researching automotive CAN bus, industrial protocols, or firmware targets often requires purchasing specific hardware (development boards, protocol analyzers, vehicles), creating a financial barrier.
  • Niche expertise required: Industrial control protocols, firmware reverse engineering, compiler internals, and eBPF verification each demand domain-specific knowledge that few security researchers possess.
  • Legal concerns: DMCA provisions, computer fraud statutes, and vendor hostility toward security researchers create legal risk, particularly for automotive and IoT device research. See the Gaps & Opportunities overview for broader discussion.
  • Lack of established tooling: Many of these targets lack publicly available fuzzing harnesses, seed corpora, or sanitizer support, requiring researchers to build infrastructure before beginning vulnerability discovery.
  • Small researcher communities: Domains like ICS security, firmware security, and service mesh internals have small, specialized researcher populations, limiting the rate of vulnerability discovery through sheer numbers.

Recommendations

Highest Impact (Score Relative to Coverage)

  1. EDK II / UEFI firmware: The combination of pre-OS privilege, persistence across reinstallation, and minimal fuzzing coverage makes this the single highest-impact under-researched target category. Investment in UEFI fuzzing harnesses would yield disproportionate returns.
  2. systemd-resolved / systemd-networkd: High composite score, expanding default deployment across major Linux distributions, and relatively low research coverage compared to legacy DNS resolvers.
  3. OPC UA implementations: Complex protocol with direct industrial control system impact and few published security audits.
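As a cross-check, the composite scores quoted in the catalog can be sorted directly. The sketch below uses only the approximate scores stated on this page; research coverage is not quantified here, so it ranks by score alone, and the alphabetical tie-break is an arbitrary choice that may order tied entries differently from the list above.

```python
# Approximate composite scores (out of 70) as quoted in the catalog.
CATALOG_SCORES = {
    "EDK II / UEFI firmware": 46,
    "OPC UA implementations": 42,
    "Modbus/DNP3 implementations": 38,
    "Automotive CAN bus software": 40,
    "Zephyr RTOS": 36,
    "systemd-resolved / systemd-networkd": 42,
    "libarchive": 40,
    "Service mesh data planes": 38,
    "Container image registries": 39,
    "WebAssembly runtimes": 36,
}


def top_targets(scores: dict[str, int], n: int = 3) -> list[tuple[str, int]]:
    """Return the n highest-scoring targets, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


for name, score in top_targets(CATALOG_SCORES):
    print(f"{score}/70  {name}")
```

The same three targets come out on top, which suggests the "highest impact" list is driven mainly by composite score, with coverage acting as a tie-breaker.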

Most Accessible (Open Source, Standard Hardware)

  1. libarchive: Pure C, open source, standard hardware, broad format coverage that creates many under-tested code paths. An accessible entry point for researchers new to this target set.
  2. WebAssembly runtimes: Open source (Rust), standard hardware, strong community documentation, and well-supported Wasm module generation for fuzzing input.
  3. Container image registries: Open source (Go), simple local deployment, HTTP-based API suitable for standard web fuzzing techniques.

Strategic Importance (Critical Infrastructure, Supply Chain)

  1. Modbus/DNP3 implementations: Direct critical infrastructure impact (power grids, water treatment), minimal built-in security, and very few active researchers. See Embedded & IoT for related target analysis.
  2. Automotive CAN bus software: Safety-critical with expanding remote attack surface. Connected vehicle growth makes this increasingly urgent.
  3. Container image registries and service mesh data planes: Supply chain chokepoints for cloud-native infrastructure. See Cloud Infrastructure for related analysis.

For broader context on tooling gaps and research opportunities across the vulnerability research landscape, see Gaps & Opportunities.




Glossary

Term | Definition
AFL | American Fuzzy Lop, coverage-guided fuzzer
ASan | AddressSanitizer, memory error detector
CVE | Common Vulnerabilities and Exposures
AFL++ | Community-maintained successor to AFL, the de facto standard coverage-guided fuzzer
AEG | Automatic Exploit Generation, automated creation of working exploits from vulnerability information
ANTLR | ANother Tool for Language Recognition, parser generator used by grammar-aware fuzzers like Superion
AST | Abstract Syntax Tree, tree representation of source code structure used by static analyzers
BOD | Binding Operational Directive, mandatory cybersecurity directives issued by CISA
BOF | Buffer Overflow, writing data beyond allocated memory bounds, a common memory safety vulnerability
CFG | Control Flow Graph, directed graph representing all possible execution paths through a program
CGC | Cyber Grand Challenge, DARPA competition for autonomous vulnerability detection and patching
ClusterFuzz | Google's distributed fuzzing infrastructure that powers OSS-Fuzz
CodeQL | GitHub's query-based static analysis engine that treats code as a queryable database
CFAA | Computer Fraud and Abuse Act, US federal law governing computer security violations
CNA | CVE Numbering Authority, organization authorized to assign CVE IDs
CNNVD | China National Vulnerability Database of Information Security
CNVD | China National Vulnerability Database
Concolic | Concrete + Symbolic, execution that runs concrete values while tracking symbolic constraints
Corpus | Collection of seed inputs used by a coverage-guided fuzzer as the basis for mutation
Coverity | Synopsys commercial static analysis platform with deep interprocedural analysis
CPG | Code Property Graph, unified representation combining AST, CFG, and data-flow graph, used by Joern
CVSS | Common Vulnerability Scoring System, standard for rating vulnerability severity
CWE | Common Weakness Enumeration, categorization of software weakness types
DAST | Dynamic Application Security Testing, testing running applications for vulnerabilities
DBI | Dynamic Binary Instrumentation, modifying program behavior at runtime without recompilation
DFG | Data Flow Graph, graph representing how data values propagate through a program
DPA | Differential Power Analysis, extracting cryptographic keys by analyzing power consumption variations
Frida | Dynamic instrumentation toolkit for injecting scripts into running processes
Harness | Glue code connecting a fuzzer to its target, defining how fuzzed input is delivered
HWASAN | Hardware-assisted AddressSanitizer, ARM-based variant of ASan with lower overhead
IAST | Interactive Application Security Testing, combines elements of SAST and DAST during testing
Infer | Meta's open-source static analyzer based on separation logic and bi-abduction
JVN | Japan Vulnerability Notes, Japanese vulnerability information portal
KLEE | Symbolic execution engine built on LLVM for automatic test generation
LLM | Large Language Model, neural network trained on text/code, used for bug detection and code generation
LSAN | LeakSanitizer, detector for memory leaks, often used alongside AddressSanitizer
Meltdown | CPU vulnerability exploiting out-of-order execution to read kernel memory from user space
MITRE | Non-profit organization that maintains CVE, CWE, and ATT&CK frameworks
MTTR | Mean Time to Remediate, average duration from vulnerability disclosure to patch deployment
MSan | MemorySanitizer, detector for reads of uninitialized memory
NVD | National Vulnerability Database, NIST-maintained repository of vulnerability data
NIST | National Institute of Standards and Technology, US agency maintaining security standards and NVD
OpenSSF | Open Source Security Foundation, Linux Foundation project for open-source security
OSS-Fuzz | Google's free continuous fuzzing service for open-source software
OWASP | Open Worldwide Application Security Project, community producing security guides and tools
RCE | Remote Code Execution, vulnerability allowing an attacker to run arbitrary code on a target system
RL | Reinforcement Learning, ML paradigm where agents learn through reward-based feedback
S2E | Selective Symbolic Execution, whole-system analysis platform combining QEMU with KLEE
SARIF | Static Analysis Results Interchange Format, standard for exchanging static analysis findings
SAST | Static Application Security Testing, analyzing source code for vulnerabilities without execution
SCA | Software Composition Analysis, identifying known vulnerabilities in third-party dependencies
Seed | Initial input provided to a fuzzer as the starting point for mutation
Semgrep | Lightweight open-source static analysis tool using pattern-matching rules
Side-channel | Attack vector exploiting physical implementation artifacts rather than algorithmic flaws
SMT | Satisfiability Modulo Theories, solver used by symbolic execution to find inputs satisfying path constraints
Spectre | Family of CPU vulnerabilities exploiting speculative execution to leak data across security boundaries
SQLi | SQL Injection, injecting malicious SQL into queries via unsanitized user input
SSRF | Server-Side Request Forgery, tricking a server into making requests to unintended destinations
SymCC | Compilation-based symbolic execution tool that is 2 to 3 orders of magnitude faster than KLEE
Taint analysis | Tracking the flow of untrusted data from sources to security-sensitive sinks
VDP | Vulnerability Disclosure Program, formal process for receiving vulnerability reports
TOCTOU | Time-of-Check-Time-of-Use, race condition between validating a resource and using it
TSan | ThreadSanitizer, detector for data races in multithreaded programs
UAF | Use-After-Free, accessing memory after it has been deallocated
UBSan | UndefinedBehaviorSanitizer, detector for undefined behavior in C/C++
Valgrind | Dynamic binary instrumentation framework for memory debugging and profiling
XSS | Cross-Site Scripting, injecting malicious scripts into web pages viewed by other users
Fine-tuning | Adapting a pre-trained ML model to a specific task using additional training data
AUTOSAR | Automotive Open System Architecture, standardized software framework for automotive ECUs
CAN | Controller Area Network, vehicle bus standard for microcontroller communication
DNP3 | Distributed Network Protocol, used in SCADA and utility systems
EDK II | EFI Development Kit II, open-source UEFI firmware development environment
OPC UA | Open Platform Communications Unified Architecture, industrial automation protocol
RTOS | Real-Time Operating System, OS designed for real-time applications with deterministic timing
Abstract interpretation | Mathematical framework for approximating program behavior using abstract domains
Dataflow analysis | Tracking how values propagate through a program to detect bugs like taint violations