Over 117 million Americans are currently indexed in police facial recognition networks. That is roughly one in every three adults. Most never consented, were never informed, and have no way to opt out. Their faces were scraped from driver's license photos, mugshot databases, and public records, then fed into AI systems that can identify them in real time from any camera feed.
These systems do not fail equally. Independent audits have consistently found 10 to 100 times higher false match rates for darker-skinned individuals compared to lighter-skinned ones. That is not a minor calibration issue. It means the people who face the most aggressive policing are also the people most likely to be wrongly identified by the technology justifying that policing.
In Kansas City alone, a single square mile contains over 125 smart streetlights with cameras, 25+ dual-camera kiosks, and 200+ building perimeter cameras. Walk down a city block and you are indexed roughly every 10 feet. Thousands of police departments across the United States now use networked surveillance platforms that track protesters, target ethnic minorities, and provide immigration enforcement agencies with access to location data.
This is the world that already exists. Not a dystopian projection, not a policy debate about what might happen. The cameras are installed. The databases are populated. The AI models are running. You did not opt in, and you cannot opt out. Until now.
Bill Swearingen has been a hacker his entire life. Former Chief Information Security Officer at a major telecommunications company, red team leader for NSA contractors, and speaker at DEF CON and Black Hat. He founded SecKC, the world's largest monthly cybersecurity meetup, right here in Kansas City. When he started researching adversarial patterns for clothing, it was not an academic exercise. It was a hacker building a countermeasure.
The result was noRecognition, a distributed research network designed to brute-force the problem at scale. The concept: generate a visual pattern, apply it to simulated clothing, run it through production-grade AI surveillance models, and record whether the person was detected. Then do it again. Millions of times.
It started as a local fuzzer on a single MacBook, testing 3 models. 2 million local tests. 435,000 anomalies. One in five random patterns caused at least one model to fail. Then it went distributed. Volunteer GPU workers around the world ran the inference on their own hardware. The gauntlet expanded from 3 models to 10, spanning architecturally distinct families: YOLO and SSD person detectors; the InsightFace, FaceNet, MTCNN, and RetinaFace face detectors; and ArcFace recognition.
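The core fuzzing loop can be sketched as follows. Everything here is a toy stand-in: the real gauntlet runs ONNX models against rendered images of people wearing the pattern, while these "detectors" just threshold mean intensity so the sketch is self-contained.

```python
import random

random.seed(1)

def random_pattern(n=4):
    """A random grayscale pattern: n*n intensity values in [0, 1]."""
    return [random.random() for _ in range(n * n)]

def make_detector(threshold):
    """Stand-in detector: still 'sees' the person unless mean intensity exceeds its threshold."""
    def detect(pattern):
        return sum(pattern) / len(pattern) <= threshold
    return detect

# Three toy models with different sensitivity, standing in for the real ONNX gauntlet.
models = {"P1": make_detector(0.60), "P2": make_detector(0.55), "F1": make_detector(0.50)}

anomalies = []
for trial in range(2000):
    p = random_pattern()
    defeated = [name for name, detect in models.items() if not detect(p)]
    if defeated:                      # at least one model failed to detect -> log the anomaly
        anomalies.append((trial, defeated))

anomaly_rate = len(anomalies) / 2000
```

The real loop is the same shape: generate, apply, run the gauntlet, record which models failed, repeat millions of times.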
The breeding pool concentrated the best patterns found across all 31.7 million tests. Five patterns achieved PERSON_STEALTH, defeating all 4 person detectors simultaneously. But no pattern ever achieved TOTAL_STEALTH, defeating all 10 models at once. The brute-force approach was finding needles in a haystack by examining each straw individually.
Then came the finding that changed the direction of the project.
The pattern recursive_face_tile caused one model to hallucinate 38 phantom face detections where only one face existed. That is not evasion. That is injection. The pattern did not hide the person. It made the AI see things that were not there.
On March 21, 2026, the noRecognition data was used to train a new deep learning neural network, codenamed "DarkCogswell." The goal: stop testing patterns randomly and train an AI that understands why patterns work.
First, I consolidated everything. Test results were scattered across Google Cloud Storage, Firestore databases, and legacy local storage. 82 GB of training data consolidated onto a single server. After deduplication: 5.7 million unique labeled training examples. Each one says "this recipe of patterns, applied to this person, defeated these specific models." Five million labeled experiments that had never been used for machine learning.
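The deduplication step can be sketched as content-hashing each experiment record over the fields that define a unique experiment. The field names below are illustrative, since the actual record schema isn't shown in the text.

```python
import hashlib
import json

def record_key(record):
    """Stable content hash over the fields that make an experiment unique.
    Field names ('recipe', 'persona', 'results') are illustrative stand-ins."""
    canonical = json.dumps(
        {"recipe": record["recipe"], "persona": record["persona"], "results": record["results"]},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

def deduplicate(records):
    """Keep the first occurrence of each unique experiment."""
    seen, unique = set(), []
    for r in records:
        k = record_key(r)
        if k not in seen:
            seen.add(k)
            unique.append(r)
    return unique

records = [
    {"recipe": ["A@overlay:0.7"], "persona": "p01", "results": {"P1": True}},
    {"recipe": ["A@overlay:0.7"], "persona": "p01", "results": {"P1": True}},  # exact duplicate
    {"recipe": ["A@overlay:0.7"], "persona": "p02", "results": {"P1": False}},
]
unique = deduplicate(records)
```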
The first question: is the signal learnable at all? I trained XGBoost (a fast, well-understood algorithm) on flat features: which patterns, what blend mode, what opacity. Mean AUC-ROC of 0.77. The signal exists. Patterns are learnable.
But the most important feature was not a pattern. It was the persona. Which person was in the photo mattered more than which pattern was applied. Some people are inherently more vulnerable to adversarial patterns than others. This has profound implications for deployment: the same shirt that makes one person invisible might not work on another.
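A rough sketch of that baseline experiment on synthetic data. scikit-learn's gradient boosting stands in for XGBoost to keep the sketch dependency-light, and the synthetic label deliberately leans on a per-persona vulnerability feature, mirroring the finding above; none of this is the production feature set.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 4000

# Flat features: pattern-presence flags, blend mode, opacity, plus a persona vulnerability score.
pattern_flags = rng.integers(0, 2, size=(n, 8)).astype(float)  # which of 8 patterns are present
blend_mode = rng.integers(0, 3, size=(n, 1)).astype(float)     # 0=overlay, 1=multiply, 2=add
opacity = rng.uniform(0.3, 1.0, size=(n, 1))
persona = rng.uniform(0.0, 1.0, size=(n, 1))                   # dominant signal by construction

X = np.hstack([pattern_flags, blend_mode, opacity, persona])
logit = 3.0 * persona[:, 0] + 0.8 * pattern_flags[:, 0] + 0.5 * opacity[:, 0] - 2.5
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)  # 1 = model defeated

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])

# Persona (last column, index 10) should carry the most importance, echoing the article.
top_feature = int(np.argmax(clf.feature_importances_))
```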
So I built a transformer. Instead of treating recipes as flat ingredient lists, the transformer reads each recipe as a sequence: "first apply pattern A with overlay blending at 70% opacity, then pattern B with multiply blending at 50%." It learns that order matters, that certain patterns amplify each other, and that persona vulnerability is model-specific.
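A minimal PyTorch sketch of such a sequence model. The hyperparameters, field layout, and per-model output head are assumptions for illustration, not the production architecture; the point is that each recipe step (pattern, blend mode, opacity) becomes a token, and positional embeddings let the encoder learn that order matters.

```python
import torch
import torch.nn as nn

class RecipeTransformer(nn.Module):
    """Scores an ordered recipe against each gauntlet model (illustrative sizes)."""
    def __init__(self, n_patterns=64, n_blends=4, n_models=10, d=64, max_steps=16):
        super().__init__()
        self.pattern_emb = nn.Embedding(n_patterns, d)
        self.blend_emb = nn.Embedding(n_blends, d)
        self.opacity_proj = nn.Linear(1, d)
        self.pos_emb = nn.Embedding(max_steps, d)     # order matters: positional embedding
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, n_models)            # one defeat logit per gauntlet model

    def forward(self, patterns, blends, opacity):
        # patterns, blends: (batch, steps) int64; opacity: (batch, steps, 1) float
        pos = torch.arange(patterns.size(1), device=patterns.device)
        x = (self.pattern_emb(patterns) + self.blend_emb(blends)
             + self.opacity_proj(opacity) + self.pos_emb(pos))
        h = self.encoder(x).mean(dim=1)               # pool over recipe steps
        return self.head(h)                           # (batch, n_models)

model = RecipeTransformer()
logits = model(torch.randint(0, 64, (2, 3)), torch.randint(0, 4, (2, 3)), torch.rand(2, 3, 1))
```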
| Model | XGBoost AUC | Transformer AUC | Delta |
|---|---|---|---|
| Person Detectors | | | |
| P1 (YOLOv8n) | 0.635 | 0.764 | +0.129 |
| P2 (YOLOv5s) | 0.705 | 0.722 | +0.017 |
| P3 (SSD-MobileNet) | 0.698 | 0.677 | -0.021 |
| P4 (ResNet-SSD) | 0.725 | 0.700 | -0.025 |
| Face Detectors | | | |
| F1 (InsightFace) | 0.907 | 0.993 | +0.086 |
| F2 (FaceNet) | N/A | N/A | Never defeated |
| F3 (MTCNN) | 0.750 | 0.870 | +0.120 |
| F4 (RetinaFace) | 0.906 | 0.969 | +0.063 |
| Recognition Models | | | |
| R1 (ArcFace) | 0.844 | 0.949 | +0.105 |
| R2 (FaceNet recog) | 0.760 | 0.812 | +0.052 |
| Mean | 0.770 | 0.829 | +0.059 |
The transformer wins on 7 of 9 models. The biggest gains are face detection and recognition, where recipe interactions matter: defeating InsightFace requires specific pattern combinations in specific configurations, not just having the right patterns present. The two models where XGBoost wins (P3, P4) are both SSD-family detectors, where "which patterns are present" is the dominant signal. This tells us SSD vulnerability is pattern-level while face/recognition vulnerability is recipe-level.
But the breeding pool test told a sobering story. The transformer's breeding pool correlation: 0.009. Essentially zero. Worse than XGBoost. Two architecturally different systems could both separate "defeats something" from "defeats nothing," yet both were completely blind to what makes the best patterns the best.
Before pivoting entirely to RL, I ran analysis experiments on the breeding pool. The key finding: defeat rates vary by 100x across personas (from 0.000 to 0.113). The v1 transformer knew personas only as opaque IDs. I computed 34 continuous persona features (per-model defeat rates, overall detectability, model variance) and built a v2 transformer that receives these features directly.
The result:
| Model | V1 Correlation | V2 Correlation |
|---|---|---|
| F1 (InsightFace) | -0.028 | 0.835 |
| F4 (RetinaFace) | -0.026 | 0.836 |
| R1 (ArcFace) | -0.043 | 0.657 |
| P4 (ResNet-SSD) | -0.042 | 0.560 |
| P3 (SSD-MobileNet) | -0.071 | 0.513 |
| P1 (YOLOv8n) | 0.089 | 0.207 |
| Mean | 0.009 | 0.460 (51x) |
Mean breeding pool correlation: 0.460. A 51x improvement. F1 and F4 hit 0.835 correlation, meaning the model can now rank elite face-defeating recipes almost perfectly. The signal was in the historical data all along. It was in the persona-level context the model was missing.
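The persona featurization described above can be sketched as follows. The exact 34-feature set isn't specified, so this computes the three named ingredients (per-model defeat rates, an overall detectability proxy, and cross-model variance) from an illustrative log format.

```python
import numpy as np

def persona_features(results, model_ids):
    """Continuous persona features from historical test logs.
    `results`: iterable of (persona_id, model_id, defeated) tuples — format is illustrative."""
    personas = sorted({p for p, _, _ in results})
    feats = {}
    for p in personas:
        rates = []
        for m in model_ids:
            outcomes = [d for pp, mm, d in results if pp == p and mm == m]
            rates.append(float(np.mean(outcomes)) if outcomes else 0.0)
        rates = np.array(rates)
        feats[p] = np.concatenate([rates,            # per-model defeat rates
                                   [rates.mean()],   # overall detectability proxy
                                   [rates.var()]])   # cross-model variance
    return feats

logs = [("p01", "F1", 1), ("p01", "F1", 0), ("p01", "P1", 0),
        ("p02", "F1", 0), ("p02", "P1", 0)]
feats = persona_features(logs, ["P1", "F1"])
```

The v2 transformer concatenates a vector like `feats["p01"]` onto its input instead of an opaque persona ID.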
The transformer can rank existing recipes but cannot discover new ones. It knows "recipe A is better than recipe B for persona X on model F1," but it cannot propose recipe C that beats both. That requires active experimentation. Enter reinforcement learning.
I built a full PPO (Proximal Policy Optimization) system with the real production gauntlet. Not surrogates. Not mocks. The same 10 ONNX models running on volunteer workers around the world, loaded onto an RTX 3060 alongside the policy network. Every recipe the agent proposes is tested against the actual detection gauntlet.
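The heart of PPO is the clipped surrogate objective. A minimal NumPy version with toy values is below; the production system wraps this around a recipe-proposing policy network, with the gauntlet's shaped reward feeding the advantages.

```python
import numpy as np

def ppo_clip_loss(new_logp, old_logp, advantages, eps=0.2):
    """PPO clipped surrogate objective (returned as a loss to minimize)."""
    ratio = np.exp(new_logp - old_logp)              # policy probability ratio
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - eps, 1 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))  # pessimistic bound, negated

# Toy batch: log-probs of three sampled recipes under the old and updated policy.
old_logp = np.array([-1.0, -0.5, -2.0])
new_logp = np.array([-0.8, -0.6, -1.5])
adv = np.array([1.0, -0.5, 2.0])                     # gauntlet-derived advantages
loss = ppo_clip_loss(new_logp, old_logp, adv)
```

The clipping is what makes testing against a slow, real gauntlet practical: each expensive batch of gauntlet results can safely drive several gradient steps without the policy running away from the data.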
Once the system was running against real production models, the results were immediate. Within 128 steps (two minutes of training), the agent was defeating 8.2 out of 10 models on average. By step 384, the best recipe achieved 70% stealth across 62 of 88 personas. The system found 42 perfect recipes (all 10 models defeated) in the first 2,500 steps.
The most striking result: F2 (FaceNet). The production noRecognition system tested more than 31 million recipes across four months and never defeated FaceNet. Not once. The RL agent defeated it at a 35% rate within 2,000 recipes by discovering pattern and blend combinations the brute-force search never explored.
The key discovery: blend mode is not a cosmetic parameter. It is the attack vector.
The breeding pool's most effective recipes used overlay (71.7%), multiply (22.6%), and add (5.7%). Overlay blending darkens dark regions and lightens light regions simultaneously. Applied to a pattern on clothing, it does not just overlay an image. It modifies the visual structure of the body contour. Person detectors that look for the boundary between a person and their background see a disrupted boundary. The pattern does not hide the person. It restructures how the person's silhouette appears to the model.
This is why overlay and multiply defeat person detectors while simple opacity-based blending cannot. Opacity blending adds a semi-transparent pattern that preserves the underlying body shape. Overlay blending changes the apparent shape itself.
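The standard blend-mode formulas make the difference concrete. These are the conventional compositing definitions (for values normalized to [0, 1]), not code from the project:

```python
import numpy as np

def blend_multiply(base, pattern):
    """Multiply: always darkens; dark base regions stay dark."""
    return base * pattern

def blend_overlay(base, pattern):
    """Overlay: darkens where the base is dark, lightens where it is light,
    so it reshapes local contrast instead of uniformly tinting it."""
    return np.where(base < 0.5,
                    2.0 * base * pattern,
                    1.0 - 2.0 * (1.0 - base) * (1.0 - pattern))

def blend_opacity(base, pattern, alpha):
    """Plain alpha compositing: the underlying shape stays visible underneath."""
    return (1 - alpha) * base + alpha * pattern

base = np.array([0.2, 0.8])      # a dark and a light region of the garment
pattern = np.array([0.9, 0.9])   # the same bright pattern pixel applied to both
over = blend_overlay(base, pattern)        # -> [0.36, 0.96]: contrast amplified
mult = blend_multiply(base, pattern)       # -> [0.18, 0.72]: structure preserved, darker
opac = blend_opacity(base, pattern, 0.5)   # -> [0.55, 0.85]: both pulled toward the pattern
```

With the same pattern value, overlay pushes the dark region to 0.36 and the light region to 0.96: the silhouette's edge contrast changes, which is exactly what boundary-hunting detectors key on. Opacity blending moves both regions the same direction and leaves the body shape legible.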
Then I saw a YouTube video where an AI learned to walk using deep reinforcement learning. The researchers gave a simulated creature a body and a reward signal ("move forward"), and let a neural network figure out locomotion on its own. No hand-designed strategies. No pre-programmed gaits. The AI discovered balance, momentum, and recovery from stumbles through pure trial and error.
I immediately understood how that same approach could solve this challenge.
Recipe-based RL hit a ceiling at 23% stealth because it still selected from 64 pre-built pattern generators. I shifted to neural pattern generation: a ConvTranspose2d decoder that produces patterns directly from a 256-dimensional continuous latent space. Instead of selecting from a menu, the AI learns to paint from scratch.
The genetic algorithm explored a discrete space of roughly 10^12 recipe combinations. The neural generator operates in a 256-dimensional continuous space, which is effectively infinite. Patterns that the genetic algorithm could never represent (smooth gradients between multiple strategies, subtle frequency-domain perturbations, adaptive structures that respond to pose context) are all reachable.
This is the difference between searching a library for the right book and learning to write.
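A minimal PyTorch sketch of such a decoder. The text specifies only the ConvTranspose2d architecture and the 256-dimensional latent; the layer widths and the 32x32 output resolution below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class PatternGenerator(nn.Module):
    """Decode a 256-dim latent vector into an RGB pattern (illustrative sizes)."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 128 * 4 * 4)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # 4x4  -> 8x8
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),   # 8x8  -> 16x16
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),    # 16x16 -> 32x32
            nn.Sigmoid(),                                         # pixel values in [0, 1]
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 128, 4, 4)
        return self.decoder(x)

gen = PatternGenerator()
pattern = gen(torch.randn(1, 256))   # one latent vector -> one RGB pattern
```

Because every point in the latent space decodes to a valid pattern, PPO can optimize the latent (and the decoder weights) with smooth gradients instead of hopping between 64 discrete generators.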
Every candidate pattern must survive the Gauntlet: 10 production-grade AI models running as ONNX inference sessions on GPU. These models represent the three stages of real-world surveillance: detecting that a person is present, locating their face, and identifying who they are. A pattern only achieves "stealth" when it defeats all 10 models simultaneously.
| ID | Model | Architecture | Role |
|---|---|---|---|
| Person Detectors (detect whether a human body is present) | | | |
| P1 | YOLOv8n | Single-shot detector, anchor-free | Fastest. Widely deployed in consumer cameras. |
| P2 | YOLOv5s | Single-shot detector, anchor-based | Balanced speed/accuracy. Industry standard. |
| P3 | SSD-MobileNetV2 | SSD with depthwise separable convolutions | Lightweight. Edge devices and mobile. |
| P4 | ResNet34-SSD | SSD with ResNet-34 backbone | Highest accuracy. High-security installations. |
| Face Detectors (locate faces within detected persons) | | | |
| F1 | InsightFace (Buffalo_L) | Multi-task CNN with large backbone | State-of-the-art. Robust to partial occlusion. |
| F2 | FaceNet-MTCNN | Two-stage cascaded detector | Classic detector. Widely researched. |
| F3 | MTCNN | Multi-task cascaded CNN (P-Net, R-Net, O-Net) | Fast and lightweight. Three-stage cascade. |
| F4 | RetinaFace | Anchor-based with feature pyramid network | Aggressive NMS. Strong small-face detection. |
| Recognition Models (identify WHO the person is) | | | |
| R1 | ArcFace | Angular margin loss, ResNet backbone | Discriminative face embeddings. Industry leader. |
| R2 | FaceNet | Triplet loss embedding network | 128-dim embeddings. Broadly deployed. |
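Classifying one gauntlet run into the stealth tiers named earlier can be sketched as a set check. The tier names and model groupings come from the text; the helper itself is hypothetical (production runs the 10 ONNX sessions and collects these booleans first).

```python
PERSON = ["P1", "P2", "P3", "P4"]   # YOLOv8n, YOLOv5s, SSD-MobileNetV2, ResNet34-SSD
FACE   = ["F1", "F2", "F3", "F4"]   # InsightFace, FaceNet-MTCNN, MTCNN, RetinaFace
RECOG  = ["R1", "R2"]               # ArcFace, FaceNet
ALL_MODELS = PERSON + FACE + RECOG

def gauntlet_verdict(detections):
    """Classify one test run.
    `detections`: model_id -> bool (True = the model still saw/identified the person)."""
    defeated = {m for m, seen in detections.items() if not seen}
    if set(ALL_MODELS) <= defeated:
        return "TOTAL_STEALTH"       # all 10 models defeated simultaneously
    if set(PERSON) <= defeated:
        return "PERSON_STEALTH"      # all 4 person detectors defeated
    return f"{len(defeated)}/10 defeated"

# A run where only the person detectors were fooled:
run = {m: (m not in PERSON) for m in ALL_MODELS}
verdict = gauntlet_verdict(run)      # -> "PERSON_STEALTH"
```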
Raw defeat counts are multiplied by criticality bonuses that encode strategic priorities, and the category bonuses stack.
The reward landscape spans more than two orders of magnitude from worst to best. A pattern defeating zero models scores about 1.0. A pattern achieving total stealth with stacked category bonuses can score over 150.
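A sketch of that reward shape. The multiplicative-bonus mechanism and the roughly 1.0-to-150+ range come from the text; every coefficient below is an illustrative stand-in, not a production value.

```python
PERSON = {"P1", "P2", "P3", "P4"}
FACE   = {"F1", "F2", "F3", "F4"}
RECOG  = {"R1", "R2"}

def reward(defeated):
    """Raw defeat count scaled by stacking category bonuses (illustrative coefficients)."""
    r = 1.0 + len(defeated)          # defeating nothing scores ~1.0
    if PERSON <= defeated:
        r *= 2.0                     # every person detector down
    if FACE <= defeated:
        r *= 3.0                     # every face detector down
    if RECOG <= defeated:
        r *= 1.5                     # both recognition models down
    if len(defeated) == 10:
        r *= 2.0                     # total-stealth bonus on top
    return r

full_stealth = PERSON | FACE | RECOG
# reward(set()) -> 1.0; reward(full_stealth) stacks every bonus and lands well above 150.
```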
When the production hardware arrives, the pattern generator will not start from scratch. It will be pre-trained from five knowledge sources that represent months of accumulated GPU compute, 169 million model inferences, and thousands of validated adversarial examples. This dataset is DarkCogswell's competitive moat. A competitor starting from zero would need to build and run an entire distributed research network to reproduce it.
DarkCogswell's architecture was validated on modest development hardware. The production system will unlock the full training pipeline at 15-20x throughput.
Training is ongoing. The live dashboard shows real-time progress including per-model defeat rates, stealth percentages, reward curves, and system utilization.