I wrote a “marketing-trick post” on this blog to lay out the public record. It comes now with Anthropic researcher Carlini’s messages to me with confirmations. I have pointed to Calif.io’s four-hour Opus 4.6 exploit, AISLE’s eight-of-eight detection across commodity open-weight models, and the Firefox 4.4% collapse on page 52.
I also wrote a market analysis “cartel post“, a technical “Mozilla 271-versus-3 post“, an industry look-in-the-mirror “SANS amplifier post“, and the “Esage Chrome post” refutation.
When Carlini wrote in to confirm the parts that matter, I felt convergence toward my “boy who cried Mythos post” as the goal posts shrunk. What I had not done myself was run the audits.
Carlini’s point to me has been that their Mythos pitch gives two layers to evaluate.
Discovery. Find the bug in the source. Anthropic’s own red team admits Opus 4.6 had near-zero success at autonomous exploit development, but Carlini and his colleagues used it to find 500-plus validated high-severity vulnerabilities in their February paper. That’s where AISLE comes in, confirming eight of eight open-weight models detect the FreeBSD showcase bug, one at eleven cents per million tokens. Vidoc reproduced it on public Opus 4.6 and on GPT-5.4. Steamedhams reproduced it in three generic prompts and found two extra bugs the Mythos writeup missed. Discovery has clear evidence of being a commodity, repeatedly being demonstrated.
Exploit development. Take a discovered bug and build a working exploit. This is where the 20-gadget FreeBSD ROP chain landed, the four-vulnerability browser sandbox escape, and the 181 Firefox JIT heap-spray exploits. Anthropic claims this as the novel Mythos differentiator, priced at five times Opus. Yet Calif.io built a working exploit on Opus 4.6 in four hours, which is exploit development on commodity inference at one-fifth the price.
Glasswing’s framing rests on the discovery layer being scarce, which it provably is not. So the question becomes whether the exploit-development layer defends five times Opus pricing, and whether it buys something anyone outside a high-priced consortium can verify.
This is an economic structure known as Akerlof’s lemons problem, although it’s been inverted here. In the classic case, a seller knows quality and the buyer does not, and the market collapses toward low quality. Anthropic has structured the market so that quality is unmeasurable to anyone, including the seller’s own external auditors, because the artifacts that would let you measure are not produced. The 20-gadget FreeBSD ROP chain has no public exploit code to review. The browser sandbox escape doesn’t seem to have any CVE, let alone a technical writeup, or independent verification. The system card itself says Mythos “worked with” the red team to escalate severity, which is human-assisted, not autonomous. The 181 Firefox JIT exploits exist as a benchmark number with no replayable harness attached. Mozilla disputed the underlying bugs as “high,” refusing “critical.” NVD then assigned 9.8, in a typical vendor-dispute situation.
That reads to me like someone in a very isolated room of Silicon Valley has a market design in mind. The opacity is desired, like a guarded mansion in a gated community. It is what is being sold.
The same structure is operating at the inference layer. Anthropic has acknowledged intermittent model-quality degradation on their availability/outages blog while denying intent.
We take reports about degradation very seriously. We never intentionally degrade our models…
A denial about intent is a red flag. It is not being honest about degradation. Quality is adjustable by the provider without disclosure, and the buyer cannot independently verify per-call effort allocation. This is the same information structure as Mythos, applied to general inference. Anthropic is arguing the buyer pays for output they opaquely control and the buyer cannot independently verify. That’s a centuries-old mindset of taxation without representation, if you remember what happened to King Charles. In neither case has Anthropic shipped the instrumentation that would let a buyer evaluate what they received against what they paid for. Token waste and tainted outputs, causing harms to a buyer and anyone around them, translates only into Anthropic profit.
I got tired of waiting for better and open instrumentation to push back on monarchist management of models. So I built one, like everyone should. Same code from the launch blog, same public API, my harness.
Cogito ergo hackito.

Lyrik is built on top of my Wirken agentic switchboard. It runs the discovery and scoring pipeline that Mythos presented at the discovery layer. It does not attempt exploit development. The point is the price and the receipt.
Seventy-five cents. That’s it.
Lyrik is free and open-source on GitHub. I have laid out this concept in my talks and podcasts since at least 2018. The repo provides free caching, multiple agents, structured output, and a hash-chained audit log. Given the Anthropic system card itself advises Mythos was not as good as their earlier models on general work, I deployed Haiku-4-5 for recon and then radioed in Sonnet-4-6 for close support.
I am a BIG fan of Haiku. Arguably one of the best engineering models. It easily handled recon, which made the Sonnet targeted bombing runs look generous.
Lyrik dropped eight findings in two minutes. Total mission spend: $0.745. I call that seventy-five cents because I’m all out of half-pennies.
Two of the eight matched bugs the Mythos showcase identified. The other six came up unverified. I am not claiming zero-days here, especially as some may triage out in the fog of false positives. More on that in another post later. Mind you, Lyrik is model agnostic. I frequently use a TEE-based provider, when I’m not running Ollama for the unmistakable smell of my hardware.

The discovery side of the bill is now visible at commodity prices, with chain of custody. The exploit-development side remains the thing in the box you cannot open. Operators are paying five times Opus pricing for a layer that has produced no replayable artifact for any of its headline claims. The launch blog does not produce one. The system card does not produce one. Glasswing does not produce one. The July 6 report is a promise of a document, not of transparent instrumentation.
My cartel post made the obvious case that Glasswing is a private classification regime granting the largest incumbents early access to a capability while tainting disclosure timelines. Set that aside. Even if the velvet-rope consortium did not amount to being a cartel, it points at the wrong adversary.
If code is the asset, then whoever holds the inference has the asset. The Glasswing setup does not move that one inch in the right direction. The code leaves the operator’s boundary in plaintext, and the inference provider reads every line on their compute within a price-gated consortium. Anthropic gets your cleartext codebase, sets the timeline for what gets surfaced, and decides which consortium members see it first.
You wouldn’t pay five times market rate to send your source to your competitor. Have you seen who got seats in the velvet rope consortium? Microsoft. Apple. Google. Amazon. Companies competing against you. They are now inside the team that reads your code on the compute that runs theirs.
The provider has always been the threat. Take it from someone who spent years on the inside hunting and killing backdoor habits.
Lyrik runs on the Wirken abstraction of models for exactly this reason. TEE-based providers can give confidential inference, with a local proxy handling attestation before any code crosses the boundary. Attestation is no guarantee. TEE bypasses are part of life too. What attestation does is raise the cost of attack on the provider, which is the actual threat.
Every phase boundary, every model call, every prompt, every output block in the Lyrik run is hash-linked and signed at the gateway. Anyone holding an artifact can replay the run to verify the chain offline. It is not screenshots. It is not an “Anthropic says” play. It is not a 23MB PDF that uses the word “thousands” once with no verification chain for any individual or aggregate finding.
The PGP signature on the FreeBSD advisory exists for the same reason the Lyrik audit log does. It is an integrity check. The Mythos showcase has nothing equivalent at either layer. A finding without a verifiable chain of custody is mythology in denial of RFC 1305 and the lessons of Monty Python.

Wirken is at wirken.ai. Lyrik is a Wirken skill at lyrik.wirken.ai. Running Wirken 1.0.2 with an Anthropic API key and a checkout, the harness reads code on your machine, with a TEE-based LLM handling inference if you do not want the provider seeing source. Everything the run produces is offline and verifiable.
Discovery has been and is still a commodity. Exploit development is being pitched to us as unverifiable by design. Someone built a pricing model for access behind a velvet rope, not for a capability that anyone outside the rope can check. Anthropic is designing a market so the buyer cannot measure what they paid for.
Call that what it is.
No cartel, thanks.
No evil maid, thanks.








