# Six-Person Startup Standard Intelligence Raises $75 Million From Sequoia to Build Computer Use AI

Two teenage founders who dropped out of college have built a foundation model trained on 11 million hours of screen recordings, landing a half-billion-dollar valuation before most startups have shipped a product.

---

Standard Intelligence, a San Francisco-based AI research company with just six employees, has raised $75 million in a round co-led by Sequoia Capital and Spark Capital, valuing the company at $500 million post-money. The deal, announced April 30, is one of the starkest illustrations yet of how aggressively top-tier venture firms are pricing bets on AI agents that can operate software the way humans do -- by looking at the screen.

The round also drew angel investment from Andrej Karpathy, the OpenAI co-founder and former Tesla AI director whose backing carries outsize weight in the research community. For a company with no publicly available product, no revenue, and a headcount that could fit in a minivan, the numbers are remarkable: roughly $12.5 million raised per employee, and a valuation that places Standard Intelligence among the most richly valued seed-to-Series-A companies in AI history.

## The Teenage Founders

Standard Intelligence was founded by Galen Mead and Devansh Pandey, who met at the Atlas Fellowship in 2022 -- a selective program for high-school students interested in AI alignment and existential risk. Both subsequently enrolled in undergraduate programs, then dropped out to pursue what they describe as the goal of training and aligning the first models that can learn like humans do in full generality.

Their thesis is unconventional: rather than wrapping large language models in increasingly elaborate tool-use harnesses, Standard Intelligence is betting that the best path to general computer agents runs through raw video pretraining. Feed a model the continuous pixel stream of someone using a computer, scale the data aggressively, and let the capability emerge.

## FDM-1: Learning in Pixel Space

The company’s first major output is FDM-1, a foundation model trained on an 11-million-hour dataset of computer usage videos -- a corpus Standard Intelligence says is multiple orders of magnitude larger than the next best open-source alternative. The model processes pixels rather than relying on APIs, accessibility trees, or brittle scripts that break whenever a developer moves a button.

Two technical innovations underpin the approach. First, Standard Intelligence replaced human-crafted annotations with an inverse dynamics model (IDM), a neural network that automatically annotates recordings with the actions taken on screen, dramatically reducing the cost of labeling action data. Second, FDM-1 uses a masked compression objective in its video encoder that strips unimportant parts of footage, reducing memory requirements without degrading data quality. The company claims the encoder is 100 times more efficient than OpenAI’s alternative.
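To make the labeling idea concrete, here is a minimal sketch of inverse-dynamics pseudo-labeling in general terms. It is not Standard Intelligence's implementation: in the usual formulation an inverse dynamics model looks at consecutive observations and infers the action that happened between them, and here a hand-written rule stands in for the neural network. The toy frame format and the names `find_cursor`, `infer_action`, `pseudo_label`, and `make_frame` are all hypothetical.

```python
from typing import List, Tuple

# Toy "screenshot": a grid where 1 marks the cursor and 0 is background.
# (Assumption for illustration; real frames would be full-resolution pixels.)
Frame = List[List[int]]


def find_cursor(frame: Frame) -> Tuple[int, int]:
    """Return (row, col) of the single cursor pixel in a toy frame."""
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if value == 1:
                return r, c
    raise ValueError("frame has no cursor pixel")


def infer_action(prev: Frame, nxt: Frame) -> str:
    """Hand-written stand-in for a learned IDM: infer the action between two frames."""
    (r0, c0), (r1, c1) = find_cursor(prev), find_cursor(nxt)
    dr, dc = r1 - r0, c1 - c0
    if (dr, dc) == (0, 0):
        return "noop"
    # Report the dominant axis of movement; ties go to the vertical axis.
    if abs(dr) >= abs(dc):
        return "down" if dr > 0 else "up"
    return "right" if dc > 0 else "left"


def pseudo_label(frames: List[Frame]) -> List[str]:
    """Label every consecutive frame pair with an inferred action."""
    return [infer_action(a, b) for a, b in zip(frames, frames[1:])]


def make_frame(row: int, col: int, size: int = 3) -> Frame:
    """Build a size x size toy frame with the cursor at (row, col)."""
    return [[1 if (r, c) == (row, col) else 0 for c in range(size)]
            for r in range(size)]


# An unlabeled four-frame "recording": move right, move down, then stay put.
video = [make_frame(0, 0), make_frame(0, 1), make_frame(1, 1), make_frame(1, 1)]
labels = pseudo_label(video)
print(labels)  # ['right', 'down', 'noop']
```

The point of the pattern is that no human annotates the recording: the labels come entirely from comparing adjacent frames, which is what makes it plausible to label very large video corpora cheaply.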

The results are striking in their breadth. FDM-1 can construct a gear in Blender, identify bugs in software applications, use CAD programs, scan for security vulnerabilities, and -- after a short period of fine-tuning -- even drive a real car through San Francisco using arrow keys in a web interface.

“Burst compute is necessary for us to leverage our capital to train frontier models, by using it at times in our research process where it’s most valuable and getting fast feedback loops for our largest runs,” said Devansh Pandey, co-founder of Standard Intelligence. “Flexible Reservations gives us that freedom, and we’re excited to use it at scale.”

## Sequoia’s Conviction Bet

Sequoia’s decision to co-lead the round fits a broader pattern. The firm recently raised a $7 billion fund to expand its AI bets and published an essay titled “2026 -- This Is AGI,” reflecting its conviction that general-purpose AI systems are arriving sooner than most industry observers expected.

In a post accompanying the investment, Sequoia wrote that Standard Intelligence is pursuing “a new pre-training paradigm: feed the model the raw stream of computer use, scale it aggressively, and let the generality emerge from the data.” The firm added that “Galen and Devansh stand out for their combination of taste, scrappiness, technical courage, and ambition. It shows up in the product thinking, in the research direction, and in the FDM-1 report itself.”

## The Competitive Landscape

Standard Intelligence enters a crowded and fast-moving field. Anthropic’s Claude already offers computer use capabilities. OpenAI has been building agentic tools into ChatGPT. Google DeepMind has demonstrated similar pixel-level interaction models. But Standard Intelligence’s bet is that purpose-built models trained specifically for visual computer interaction will outperform general-purpose models that bolt on this capability as an afterthought.

The $425 million pre-money valuation -- roughly $71 million per employee -- reflects venture capital’s increasing willingness to fund research-stage AI companies at extraordinary multiples. It also signals that the computer-use layer of the AI stack is emerging as a distinct and investable category, separate from the foundation model race and the application layer above it.
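The per-employee figures quoted in this article follow from simple arithmetic on the reported round. The short sketch below reproduces them; the variable names are illustrative, and it assumes only the reported numbers ($75 million raised, $500 million post-money, six employees) and the standard relation that pre-money equals post-money minus the amount raised.

```python
# Reported figures from the round.
raised = 75_000_000        # new capital in the round
post_money = 500_000_000   # valuation including the new capital
employees = 6

# Pre-money valuation is the post-money valuation minus the cash raised.
pre_money = post_money - raised

# Two different "per employee" figures appear in coverage of the deal.
raised_per_employee = raised / employees
pre_money_per_employee = pre_money / employees

print(f"Pre-money valuation:    ${pre_money / 1e6:.0f}M")
print(f"Raised per employee:    ${raised_per_employee / 1e6:.1f}M")
print(f"Pre-money per employee: ${pre_money_per_employee / 1e6:.1f}M")
```

The two per-employee numbers measure different things: about $12.5 million is capital raised per head, while about $71 million is the pre-money valuation per head.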

## Analysis: The Neolab Premium

Standard Intelligence is the latest beneficiary of what The Information has called “neolab fervor” -- a willingness among top-tier VCs to write enormous checks for tiny research teams pursuing ambitious technical agendas. The logic is that a small group of exceptional researchers can produce outsized returns if the underlying insight is correct, and that the cost of being wrong on any single bet is dwarfed by the cost of missing the next foundational company.

The risk is real. FDM-1 is impressive in demos, but the step from research breakthrough to reliable, commercially deployable product is where many AI startups have historically failed. Standard Intelligence will need to demonstrate that its video-pretrained models can handle the messy, unpredictable reality of enterprise software environments -- not just curated benchmarks.

Still, the signal from this round is hard to ignore. When Sequoia, Spark Capital, and Andrej Karpathy collectively decide that six people and a novel training paradigm are worth half a billion dollars, the market is telling you something about where the next wave of AI capability is heading: straight through the pixels on your screen.
