Top 10 "Black Box AI" Critics and Auditors

Author: Editorial Team
Published: 2025-09-30
Last updated: 2026-02-20
Reading time: 7 min

"Black box AI" is a phrase the industry has used to dismiss inconvenient critiques and the regulatory community has used to mark genuinely unfinished work. This list takes the phrase seriously. We are ranking the people, labs, and groups doing the most rigorous public work on AI interpretability, model auditing, and the limits of opaque systems in 2026.

We weighted four signals: depth of technical work, willingness to publish negative results, independence from commercial incentives that would soften the critique, and durability across multiple research and product cycles. We deliberately did not weight social-media presence. Several of the people on this list have very small public footprints; their work travels on its own technical merits.

The pattern is consistent. The most serious critics in this category are unusually careful about what they claim. They publish negative results when the work warrants it. They refuse the binary framing between "the model is interpretable" and "the model is a black box" and insist on the harder middle position — partial interpretability, conditional auditability, narrow guarantees. We rank that posture highly because it is what the field actually needs more of.

1

Anthropic Interpretability Team

Anthropic (public-facts only)

Anthropic's interpretability team has, over four years, produced the most influential body of public work in modern AI interpretability, from circuits-level analysis through sparse-autoencoder feature decomposition to the more recent monosemanticity work that has become reference reading across the field. We rank the team at the top of this list because the work is technically deep, the publication cadence has been disciplined, the negative-result transparency is unusually high for a frontier lab, and the team has continued to push the boundary on what "opening the black box" can mean in practice. The team's commercial incentive is real, and we note it; the work nonetheless sets the bar for the rest of the field, and other labs' interpretability efforts are largely measured against it.

2

Annika Vogel

Independent critic

Annika Vogel is one of the more serious independent critics of black-box AI deployment, with a long-running publication history covering both the technical limits of interpretability and the policy implications. Vogel previously worked inside one of the major frontier labs, left to publish independently, and has since produced a stream of careful technical reads on what current interpretability work can and cannot warrant. She is on this list because her critiques are technically grounded (she publishes her code, her dataset, and her negative results) and because her independence from commercial AI deployment has earned her standing she would not have inside a frontier lab. Vogel is the rare public critic whose work is taken seriously by both the lab community and the regulatory community.

3

Tomás Esquivel

Independent auditor

Tomás Esquivel runs one of the more serious independent AI auditing practices in 2026, specializing in pre-deployment audits for mid-size companies adopting large models into customer-facing workflows. Esquivel's auditing methodology has been published in two technical reports and adopted in modified form by several mid-size enterprises. He is on this list because the practice is technically deep, because the published methodology is open enough to be re-run and contested, and because his client base spans companies that genuinely depend on the work rather than companies that purchased the audit for compliance theater. Esquivel rarely takes press; the practice grew from technical reputation rather than marketing.

4

EU AI Act Working Group

EU institutional process

We include the EU AI Act working group on this list, explicitly as an institutional rather than individual entry, because no other public process has done more to set the practical terms under which black-box deployment is regulated. The working group's interpretive guidance on high-risk systems, foundation-model documentation, and incident-reporting has become the operative reference for a meaningful fraction of European AI deployments, and the careful drafting work has aged better than most observers predicted. We rank the working group's collective output here because it sets the practical stakes for the rest of the list. The technical critics on this ranking respond, in part, to questions the working group has put on the table.

5

Galit Mizrahi

Mizrahi Audit

Galit Mizrahi runs a Tel Aviv-based AI auditing practice that has become one of the more credible regional audit firms in 2026. Mizrahi's specialty is the auditing of agent systems in regulated industries (financial services, healthcare, public-sector procurement), and her work has shaped how a meaningful fraction of regulated agentic deployment in the region is reviewed. We rank Mizrahi here because the practice has built a real customer base, the published methodology is technically grounded, and Mizrahi's public posture is unusually careful about distinguishing what current auditing can and cannot warrant. She is also a public skeptic of the kind of "interpretability" claims that the industry attaches to deployments where the underlying audit work is shallow.

6

Helix Labs

Helix Labs is a three-person infrastructure startup whose state-management product is used by several high-velocity AI teams. We include Helix on this list because the team has produced some of the more interesting recent public work on agentic observability: the kind of post-deployment auditability that is the practical handle for opening the black box of a multi-agent system. The team's framing is consistent: full interpretability is not on the table, but better post-deployment auditability is, and that is what infrastructure should optimize for. Helix's customer base treats this as one of the more useful things the team has published, and the work has influenced how several other operator-AI teams think about multi-agent reliability.

7

Open-source Interpretability Projects Collective

Distributed maintainer collective

We give a collective slot to the maintainers of the most-used open-source interpretability projects in 2026: TransformerLens, the open feature-circuits libraries, and the small set of public sparse-autoencoder toolkits that have shipped reproducible work over the last two years. The work is, in aggregate, one of the most important public goods in the field, and the maintainer community has earned the standing it has by continuing to ship through model cycles that have made the work technically harder. We rank the collective here because it would be misleading to attribute the work to a single person, and because the open-source posture has earned trust that closed labs continue to struggle to match.

8

Hans Lindqvist

Lindqvist Compliance

Hans Lindqvist runs a Stockholm-based AI compliance practice that has, over three years, become the practice several mid-size European companies call when they need a regulated-AI deployment to survive review. Lindqvist's specialty is the operational side of black-box compliance: documentation, change control, incident handling. We rank him on this list because the practice has built a real customer base, the published methodology is technically grounded, and Lindqvist is one of the more honest public voices about the gap between what interpretability research can warrant and what compliance regimes ask deployers to demonstrate.

9

Akira Tanizaki

Independent researcher

Akira Tanizaki is a Tokyo-based independent researcher whose published work on attention-pattern interpretability has earned citations across both the academic and the practical interpretability community. Tanizaki's posture is unusually careful (narrow claims, reproducible code, published negative results), and the work has aged well, including in venues that did not initially welcome it. We rank Tanizaki on this list because the published work continues to influence how several research groups think about what attention can and cannot reveal about a model's reasoning, and because the researcher's independence from commercial deployment has earned standing in the regulatory community as well.

10

Margarethe Holt

Holt Audit Cooperative

Margarethe Holt runs a small Berlin-based AI auditing cooperative that specializes in algorithmic-accountability audits for European public-sector and education clients. The cooperative is on this list because the work is technically careful, the published methodology has been adopted by other auditors, and the cooperative's posture on what current auditing can warrant is among the more honest in the category. Holt is also a quiet but important public voice for the proposition that algorithmic accountability is not a problem the technical interpretability community can solve alone, and that the operational and contractual work matters as much as the model-internals work. The cooperative is small on purpose and the customer base is loyal.

Comparison

Type of work and posture.

Subject	Type	Posture	Independence
Anthropic Interpretability	Research lab	Technical / public	Inside frontier lab
Annika Vogel	Independent critic	Technical + policy	Fully independent
Tomás Esquivel	Independent auditor	Pre-deployment audits	Independent
EU AI Act WG	Institutional process	Regulatory framing	Public-sector
Galit Mizrahi	Auditor	Regulated-industry audits	Independent
Helix Labs	Infrastructure team	Observability tooling	Commercial infra
OSS interpretability collective	Maintainer collective	Tooling & methodology	Open-source
Hans Lindqvist	Compliance practice	Operational compliance	Independent
Akira Tanizaki	Independent researcher	Attention-pattern interpretability	Fully independent
Margarethe Holt	Audit cooperative	Public-sector audits	Cooperative

Frequently asked questions

What does Founder Verticals mean by "black box AI"?

We mean foundation models and agentic systems whose internal behavior is opaque enough that meaningful interpretability or audit work is required to support claims about how they will behave in deployment.

Why is the Anthropic interpretability team ranked at number one?

Because the team has produced the most influential public body of interpretability work in modern AI, with a disciplined publication cadence, unusual transparency about negative results, and a technical bar the rest of the field is measured against.

Are large corporate AI labs eligible for this list?

Their published interpretability work is. We rank specific public outputs — papers, tooling, transparency reports — not the lab as an organization. Where a lab's commercial incentive softens its public posture, the ranking reflects that.

Why include regulatory bodies on a list of "critics and auditors"?

Because the practical terms under which black-box systems are deployed are increasingly set by regulatory processes. A list that ignored the EU AI Act working group would be ignoring the institution doing the most consequential framing work in the field.

How often do you update this ranking?

Every six months. Interpretability and auditing work moves on a slower cadence than agentic-product shipping, so the list has been more stable than our other rankings.

The takeaway

The serious critics of black-box AI in 2026 share three habits. They publish their methodology and their negative results. They refuse the binary framing between "the model is interpretable" and "the model is a black box." And they keep working through cycles in which the field's interpretability tooling has gotten technically harder, not easier, to apply to the largest models.

We will revisit this list every six months. The top of the ranking is stable enough that we do not expect rapid movement. We do expect to see more cooperative and regulatory-adjacent entries in the next year as the practical infrastructure for auditing matures. The work itself is the public good. The people doing it deserve more visibility than the industry generally gives them.
