Posts by Collection

portfolio

publications

Systematically detecting packet validation vulnerabilities in embedded network stacks

Published in the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2023

Embedded Network Stacks (ENS) enable low-resource devices to communicate with the outside world, facilitating the development of Internet of Things and Cyber-Physical Systems. Some defects in ENS are thus high-severity cybersecurity vulnerabilities: they are remotely triggerable and can impact the physical world. While prior research has shed light on the characteristics of defects in many classes of software systems, no study has described the properties of ENS defects nor identified a systematic technique to expose them. The most common automated approach to detecting ENS defects is feedback-driven randomized dynamic analysis (“fuzzing”), a costly and unpredictable technique. This paper provides the first systematic characterization of cybersecurity vulnerabilities in ENS. We analyzed 61 vulnerabilities across 6 open-source ENS. Most of these ENS defects are concentrated in the transport and network layers of the network stack, require reaching different states in the network protocol, and can be triggered by only 1–2 modifications to a single packet. We therefore propose a novel systematic testing framework that focuses on the transport and network layers, uses seeds that cover a network protocol’s states, and systematically modifies packet fields. We evaluated this framework on 4 ENS and replicated 12 of the 14 reported IP/TCP/UDP vulnerabilities. On recent versions of these ENS, it discovered 7 novel defects (6 assigned CVEs) during a bounded systematic test that covered all protocol states and made up to 3 modifications per packet. We found defects in 3 of the 4 ENS we tested that had not been found by prior fuzzing research. Our results suggest that fuzzing should be deferred until after systematic testing is employed.

Recommended citation: Paschal C. Amusuo, Ricardo Andrés Calvo Méndez, Zhongwei Xu, Aravind Machiry, and James C. Davis. “Systematically Detecting Packet Validation Vulnerabilities in Embedded Network Stacks.” 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2023.
Download Paper
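
The abstract describes the framework only at a high level. As a rough illustration of the “systematic modification” idea, the Python sketch below enumerates every combination of up to N boundary-value edits over a seed packet; running it over one seed per protocol state would cover the state machine the way the abstract describes. The field names and boundary values are hypothetical stand-ins, not the paper’s actual mutation dictionary.

```python
import itertools

# Hypothetical boundary values per TCP header field; the paper's actual
# mutation dictionary is not reproduced here.
FIELD_VALUES = {
    "data_offset": [0, 15],       # header length in 32-bit words
    "flags":       [0x00, 0x3F],  # no flags set / all six flags set
    "window":      [0, 0xFFFF],
    "seq":         [0, 0xFFFFFFFF],
}

def systematic_mutations(seed, max_mods=3):
    """Yield copies of `seed` with up to `max_mods` fields set to boundary values.

    `seed` is a dict of header-field values captured while the stack sits in a
    particular protocol state (LISTEN, SYN-RECEIVED, ESTABLISHED, ...), so one
    seed per state covers the protocol's state machine.
    """
    names = sorted(FIELD_VALUES)
    for k in range(1, max_mods + 1):
        for fields in itertools.combinations(names, k):
            for values in itertools.product(*(FIELD_VALUES[f] for f in fields)):
                mutated = dict(seed)
                mutated.update(zip(fields, values))
                yield mutated

# Example: a hypothetical ESTABLISHED-state seed, mutated in at most 2 fields.
seed = {"data_offset": 5, "flags": 0x18, "window": 1024, "seq": 1}
print(sum(1 for _ in systematic_mutations(seed, max_mods=2)))  # 32 test packets
```

Unlike coverage-guided fuzzing, this test budget is bounded and predictable: the number of generated packets is fixed by the field dictionary and the modification limit, which is what makes a “bounded systematic test” that runs to completion possible.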

FalseCrashReducer: Mitigating False Positive Crashes in OSS-Fuzz-Gen Using Agentic AI

Published as arXiv preprint arXiv:2510.02185, 2025

Fuzz testing has become a cornerstone technique for identifying software bugs and security vulnerabilities, with broad adoption in both industry and open-source communities. Directly fuzzing a function requires fuzz drivers, which translate random fuzzer inputs into valid arguments for the target function. Given the cost and expertise required to manually develop fuzz drivers, methods exist that leverage program analysis and Large Language Models to automatically generate these drivers. However, the generated fuzz drivers frequently lead to false positive crashes, especially for functions with highly structured inputs and complex state requirements. This problem is especially acute in industry-scale fuzz driver generation efforts like OSS-Fuzz-Gen, as reporting false positive crashes to maintainers impedes trust in both the system and the team. This paper presents two AI-driven strategies to reduce false positives in OSS-Fuzz-Gen, a multi-agent system for automated fuzz driver generation. First, constraint-based fuzz driver generation proactively enforces constraints on a function’s inputs and state to guide driver creation. Second, context-based crash validation reactively analyzes function callers to determine whether reported crashes are feasible from program entry points. Using 1,500 benchmark functions from OSS-Fuzz, we show that these strategies reduce spurious crashes by up to 8%, cut reported crashes by more than half, and demonstrate that frontier LLMs can serve as reliable program analysis agents. Our results highlight the promise and challenges of integrating AI into large-scale fuzzing pipelines.

Recommended citation: Paschal C. Amusuo, Dongge Liu, Ricardo Andrés Calvo Méndez, Jonathan Metzman, Oliver Chang, and James C. Davis. “FalseCrashReducer: Mitigating False Positive Crashes in OSS-Fuzz-Gen Using Agentic AI.” arXiv preprint arXiv:2510.02185, 2025.
Download Paper
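
For intuition, here is a minimal Python sketch of the constraint-based idea. OSS-Fuzz-Gen itself generates drivers for C/C++ libraries, and `parse_record` and its documented precondition below are hypothetical; the point is only the contrast between a naive driver and one that enforces the target’s input contract.

```python
import struct

def parse_record(buf: bytes) -> bytes:
    """Hypothetical target: expects a 4-byte big-endian length prefix equal to
    the payload size. Violating that documented contract is a caller bug, not
    a library bug."""
    (length,) = struct.unpack(">I", buf[:4])
    assert length == len(buf) - 4, "caller violated the documented contract"
    return buf[4:]

def naive_driver(data: bytes):
    # Feeds raw fuzzer bytes straight in. Almost every input violates the
    # precondition, so nearly every crash reported here is a false positive.
    parse_record(data)

def constrained_driver(data: bytes):
    # Constraint-based generation: shape the random bytes so the documented
    # precondition always holds, so the fuzzer explores only inputs that a
    # legitimate caller could actually pass.
    parse_record(struct.pack(">I", len(data)) + data)

constrained_driver(b"hello")  # never trips the contract check
```

Context-based crash validation is the reactive complement: given a crash from a driver like `naive_driver`, an analysis agent inspects the target’s real callers to decide whether any program entry point could produce the crashing input before the crash is reported to maintainers.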

How Do Agents Perform Code Optimization? An Empirical Study

Published as arXiv preprint arXiv:2512.21757, 2025

Performance optimization is a critical yet challenging aspect of software development, often requiring a deep understanding of system behavior, algorithmic tradeoffs, and careful code modifications. Although recent advances in AI coding agents have accelerated code generation and bug fixing, little is known about how these agents perform on real-world performance optimization tasks. We present the first empirical study comparing agent- and human-authored performance optimization commits, analyzing 324 agent-generated and 83 human-authored PRs from the AIDev dataset across adoption, maintainability, optimization patterns, and validation practices. We find that AI-authored performance PRs are less likely to include explicit performance validation than human-authored PRs (45.7% vs. 63.6%, p = 0.007). In addition, AI-authored PRs largely use the same optimization patterns as humans. We further discuss limitations and opportunities for advancing agentic code optimization.

Recommended citation: Huiyun Peng, Antonio Zhong, Ricardo Andrés Calvo Méndez, Kelechi G. Kalu, and James C. Davis. “How Do Agents Perform Code Optimization? An Empirical Study.” arXiv preprint arXiv:2512.21757, 2025.
Download Paper
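
The abstract reports the validation-rate comparison (45.7% vs. 63.6%, p = 0.007) without naming the statistical test. As a hedged reconstruction, the sketch below runs Fisher’s exact test on contingency counts derived from the reported percentages and PR totals; the derived counts are approximate, so the p-value only lands in the same ballpark as the reported one, and the paper may have used a different test.

```python
from scipy.stats import fisher_exact

# Counts reconstructed from the abstract (approximate):
# 45.7% of 324 agent-authored PRs ~ 148 with explicit performance validation;
# 63.6% of  83 human-authored PRs ~  53 with explicit performance validation.
table = [
    [148, 324 - 148],  # agent: [validated, not validated]
    [53,  83 - 53],    # human: [validated, not validated]
]

odds_ratio, p_value = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")
```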

talks

teaching
