Case Study

Adversarial Patch Generator

A reproducible adversarial patch research framework built to test physical-world robustness with Expectation Over Transformation (EoT) augmentation and standardized evaluation categories.

Python · PyTorch · YOLO · Distributed Data Parallel
GitHub repository ↗

Role

Research engineer

Timeline

2024–2025

Organization

JHU APL (CLDP / DoD)

Status

Research toolkit complete

Research Context

Adversarial patch research often suffers from a reproducibility problem. Papers describe high-level methods, but implementation details around augmentation policy, scheduler behavior, and multi-GPU scaling are inconsistent across labs. That makes it difficult to compare results or to understand whether a patch is genuinely robust or just overfit to a narrow digital setting.

This project was developed in collaboration with the Johns Hopkins Applied Physics Laboratory under the Cyber Leader Development Program, where repeatable experimentation mattered as much as raw attack success. The objective was a practical framework that supports serious iteration, not a one-off script tied to a single hardware setup.

Technical Approach

The framework targets YOLO detectors and supports multiple adversarial objectives, including hiding and misclassification paths. A key design principle is that physical-world robustness must be modeled during training, not bolted on afterward. To support that, the pipeline applies Expectation Over Transformation (EoT)-style augmentation so patches are optimized across varied viewpoint, scale, and distortion conditions.
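The EoT idea can be sketched as a small PyTorch routine that renders one patch under many randomly sampled rotation and scale transforms, so the optimizer sees an expectation over viewing conditions rather than a single digital rendering. The function name, sample count, and transform ranges below are illustrative assumptions, not code from the repository:

```python
import math

import torch
import torch.nn.functional as F


def eot_apply(patch, n_samples=8, max_angle=20.0, scale_range=(0.8, 1.2)):
    """Render one patch under several random rotation/scale transforms
    (Expectation Over Transformation). Returns a batch of transformed
    views; a detector loss averaged over this batch optimizes the patch
    across viewing conditions instead of one fixed rendering."""
    c, h, w = patch.shape
    batch = patch.unsqueeze(0).expand(n_samples, c, h, w)

    # Sample one rotation angle and one scale factor per view.
    angles = torch.empty(n_samples).uniform_(-max_angle, max_angle) * math.pi / 180.0
    scales = torch.empty(n_samples).uniform_(*scale_range)

    # Build 2x3 affine matrices: rotation combined with inverse scaling
    # (dividing by the scale enlarges the rendered patch).
    cos, sin = torch.cos(angles) / scales, torch.sin(angles) / scales
    theta = torch.zeros(n_samples, 2, 3)
    theta[:, 0, 0], theta[:, 0, 1] = cos, -sin
    theta[:, 1, 0], theta[:, 1, 1] = sin, cos

    grid = F.affine_grid(theta, batch.shape, align_corners=False)
    return F.grid_sample(batch, grid, align_corners=False)
```

In a training loop, the patch would typically be pasted onto scene images after this step and the adversarial loss averaged over all `n_samples` views.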

The project includes three complementary training modes: normal optimization for baseline attack behavior, a style-constrained covert mode, and a procedural covert mode driven by Perlin-noise-based structure. Loss components are explicit and configurable so the tradeoff between adversarial force and visual plausibility is measurable.
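As an illustration of the explicit, configurable loss design described above, here is a hedged sketch of how weighted components (a detector objective, a total-variation smoothness term, and an optional style penalty) might be combined. The function names and default weights are assumptions for illustration, not the project's actual API:

```python
import torch


def total_variation(patch):
    """Smoothness penalty on a (C, H, W) patch: discourages
    high-frequency noise that tends not to survive printing
    and camera capture."""
    dh = (patch[:, 1:, :] - patch[:, :-1, :]).abs().mean()
    dw = (patch[:, :, 1:] - patch[:, :, :-1]).abs().mean()
    return dh + dw


def patch_loss(det_score, patch, style_penalty=None,
               w_det=1.0, w_tv=0.1, w_style=0.5):
    """Weighted sum of explicit loss terms. Exposing the weights makes
    the tradeoff between adversarial force (det_score) and visual
    plausibility (TV + style) directly measurable and tunable."""
    loss = w_det * det_score + w_tv * total_variation(patch)
    if style_penalty is not None:
        loss = loss + w_style * style_penalty
    return loss
```

A covert mode would supply a nonzero `style_penalty` (e.g. distance to a reference texture), while a baseline attack would omit it.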

Implementation

Implementation is split across specialized scripts so experiments can scale with available compute. train_patch.py supports single-node workflows and multi-GPU runs on one machine. train_patch_MG.py provides a dedicated DDP path launched through torchrun for higher-throughput distributed training. automate_tuning.py executes grid-based parameter sweeps for noise and optimization settings, making early-stage exploration faster and more systematic.
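A grid-based sweep in the spirit of automate_tuning.py can be sketched with the standard library alone: expand a dictionary of candidate values into one concrete configuration per combination. The parameter names here are hypothetical, not the script's actual options:

```python
from itertools import product


def grid_sweep(param_grid):
    """Expand {name: [values...]} into one dict per combination,
    yielding concrete run configurations for a systematic sweep.
    Sorting the keys keeps the enumeration order deterministic,
    which matters for reproducible experiment logs."""
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        yield dict(zip(keys, values))


# Hypothetical sweep over learning rate and TV weight: 2 x 2 = 4 runs.
configs = list(grid_sweep({"lr": [0.01, 0.03], "tv_weight": [0.05, 0.1]}))
```

Each emitted config would then be passed to a training entry point, with the config dict also written into the run's output artifacts so the sweep is repeatable.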

Evaluation is handled by evaluate_patch.py, which measures impact using a clear taxonomy: hidden detections, misclassified detections, and disrupted detections. That categorization is important because different failure types carry different operational implications. The project also includes a central configuration model, checkpointing, and run artifacts in a structured output layout so experiments can be reviewed and repeated.
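The three-way taxonomy can be illustrated by matching each clean detection against the patched run and classifying its fate. The IoU and confidence-drop thresholds below are illustrative assumptions, not the project's evaluation settings, and the extra "unaffected" bucket simply labels detections the patch failed to change:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0


def categorize(clean, patched, iou_thr=0.5, conf_drop=0.2):
    """Classify each clean detection (box, cls, conf) under the patch:
    'hidden' (no surviving match), 'misclassified' (matched box, wrong
    class), 'disrupted' (same class, confidence degraded), or
    'unaffected'. Thresholds are illustrative."""
    outcomes = []
    for box, cls, conf in clean:
        match = max(patched, key=lambda d: iou(box, d[0]), default=None)
        if match is None or iou(box, match[0]) < iou_thr:
            outcomes.append("hidden")
        elif match[1] != cls:
            outcomes.append("misclassified")
        elif conf - match[2] >= conf_drop:
            outcomes.append("disrupted")
        else:
            outcomes.append("unaffected")
    return outcomes
```

Separating these buckets matters operationally: a hidden object and a confidently mislabeled one demand different defensive responses, even if both count as "attack success" in aggregate metrics.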

Results

The main outcome is a reproducible research workflow that can run across laptop-scale experiments, workstation-scale multi-GPU runs, and distributed server training without changing the conceptual pipeline. It improved the ability to compare patch strategies under consistent evaluation criteria and made parameter tuning less ad hoc. The framework now functions as a practical baseline for future adversarial patch studies where methodological clarity is required alongside performance.