
CellType
The agentic drug company. We simulate human biology.
About
CellType is building an agentic drug company: AI agents that run the full drug discovery pipeline on top of biological foundation models that simulate human biology. Our core technology, developed with Google DeepMind, has already discovered and validated a new cancer treatment signal. We’re working with top-10 pharma companies.
Founders
Co-Founder & CEO
Co-Founder & CEO of CellType (W26). Yale Professor, 11,000+ citations. Built biological foundation models with Google. Building the agentic drug company.
Co-Founder
Co-Founder @ CellType (W26). ML @ Yale & EPFL. Building the next-generation agentic drug company. Co-developed our core technology Cell2Sentence at Yale and published at ICML. Previously led large-scale foundation model training at a biotech startup and built software to control CERN's Large Hadron Collider.
AI Research Report
Problem & Solution
Problem / Solution Report
The Core Problem
Drug discovery faces a critical "translational gap" where preclinical models—such as animal studies or simplified in vitro systems—fail to accurately predict human responses. This leads to high failure rates in clinical trials, wasted research budgets, and significant delays in bringing life-saving treatments to market. Current methods often struggle to account for human-specific biological complexities, patient heterogeneity, and organ-specific toxicities.
CellType’s Solution
CellType addresses this by building biological foundation models trained on massive datasets of human cell profiles, including transcriptomics, methylation, and proteomics. Their flagship technology, C2S-Scale, is a 27B-parameter model trained on over 1 billion tokens. These models power agentic AI systems designed to simulate drug responses in human biology with high fidelity. By using human-centric data from the start, CellType aims to provide more accurate predictions of how a drug will behave in a living person.
Value Proposition
The primary value of CellType’s platform lies in de-risking the drug discovery process. By predicting human efficacy and toxicity earlier, pharmaceutical companies can prioritize the most promising compounds and avoid costly late-stage failures. This approach offers significant time and cost savings by identifying biomarkers and responder populations before heavy wet-lab investment. Ultimately, the use of multimodal human data aims to improve the translational fidelity of in-silico experiments, leading to higher success rates in clinical development.
Market & Competitors
Market and Competitors Report
Market Landscape and Trends
CellType operates at the convergence of AI-driven drug discovery, single-cell analytics, and biomarker discovery. Key industry trends include the shift toward biological foundation models, increased pharmaceutical investment in AI partnerships, and the integration of multi-omics data for precision medicine. There is a growing industry-wide emphasis on reducing preclinical attrition by improving the predictive accuracy of translational models.
Key Competitors
The competitive landscape includes several categories of players:
- AI-Native Drug Discovery: Companies like Insilico Medicine, Exscientia, Recursion Pharmaceuticals, and Insitro focus on target discovery and virtual screening.
- Single-Cell/Multi-Omics Platforms: 10x Genomics and Mission Bio provide the experimental assays and data pipelines that generate the data CellType uses.
- Translational Service Providers: Established firms like Charles River and Evotec offer preclinical services and may compete for pharma partnerships.
- Big Tech Labs: Google DeepMind and other major tech research divisions are also active in developing biological foundation models.
Competitive Advantages and Disadvantages
CellType’s primary advantages include its strong academic provenance, with peer-reviewed publications in venues like ICML and Nature Methods. The founders' combination of deep biological expertise and large-scale ML engineering experience provides a significant technical edge. Furthermore, early validation from Google Research and participation in Y Combinator (W26) provide high visibility and access to networks.
However, as a very early-stage company with only two employees as of early 2026, CellType faces significant operational risks. It must compete against well-funded, established AI drug discovery firms and navigate the cautious adoption cycles of large pharmaceutical companies. Building the necessary enterprise sales infrastructure and securing proprietary data partnerships will be critical challenges for the company's growth.
Total Addressable Market
Quantitative and TAM Report
Market Figures and Sources
The global AI in Drug Discovery market was estimated at approximately USD 2.35 billion in 2025 and is projected to reach USD 13.77 billion by 2033, representing a CAGR of 24.8%. Other industry estimates, such as those from GMInsights, suggest even higher valuations, reflecting the rapid growth and varying definitions of the sector. The broader drug discovery market, including wet-lab and CRO services, is significantly larger, estimated at USD 71.96 billion in 2025.
Adjacent markets also provide substantial opportunities. The single-cell analysis market is expected to grow from USD 3.81 billion in 2025 to USD 7.56 billion by 2030. Additionally, the global biomarkers market was valued at USD 46.4 billion in 2023 and is projected to reach USD 134.2 billion by 2033. These figures highlight the scale of the data generation and diagnostic sectors that CellType's technology leverages.
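The growth rates implied by these endpoints can be checked with the standard compound annual growth rate formula. The sketch below is illustrative only: the function names are hypothetical, and it assumes each projection compounds annually between the stated start and end years. Under that assumption the AI in drug discovery endpoints imply roughly 24.7% per year over eight years, close to the quoted 24.8% (the small gap may reflect rounding or a different base year in the source report), while single-cell analysis and biomarkers come out near 14.7% and 11.2% respectively.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate, returned as a fraction (0.25 == 25%)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# (label, start USD bn, end USD bn, start year, end year), taken from the text above
MARKETS = [
    ("AI in drug discovery", 2.35, 13.77, 2025, 2033),
    ("Single-cell analysis", 3.81, 7.56, 2025, 2030),
    ("Biomarkers", 46.4, 134.2, 2023, 2033),
]


def implied_growth(markets=MARKETS):
    """Map each market label to the CAGR implied by its two endpoints."""
    return {
        label: cagr(start, end, end_year - start_year)
        for label, start, end, start_year, end_year in markets
    }


for label, rate in implied_growth().items():
    print(f"{label}: {rate:.1%} per year")
```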
Methodology for TAM Estimation
CellType’s Total Addressable Market (TAM) is defined by its core addressable domains: (A) AI-in-drug-discovery platforms for virtual screening and toxicity prediction, (B) single-cell analysis and informatics for model training, and (C) biomarker discovery for clinical stratification. The TAM is derived by triangulating published market reports across these adjacent sectors to capture both the immediate platform market and the larger long-term opportunity in translational services.
Estimates and Interpretation
In the near term (2025–2027), the core software platform TAM is estimated at USD 2–4 billion. When addressable adjacent markets such as single-cell analytics and biomarker discovery contracts are included, the near-term addressable market expands to approximately USD 5–8 billion. This represents the immediate revenue opportunity for AI models that de-risk preclinical stages.
In the long term (by 2030–2035), as biological foundation models gain wider adoption, the combined addressable opportunity could expand into the tens of billions. With AI in drug discovery alone projected to reach nearly USD 14 billion and significant growth in biomarkers, a defensible long-run TAM could range from USD 20 billion to over USD 100 billion, depending on the extent to which model-driven services capture traditional discovery spend.
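The sensitivity of the long-run range to the capture assumption can be sketched with simple arithmetic. This is an illustration, not the report's actual method: it counts the projected AI in drug discovery market in full and adds a share of the projected single-cell and biomarker markets. The capture shares are hypothetical values chosen for illustration, and the projections mix 2030 and 2033 horizons. Under these assumptions a 5% capture share gives roughly USD 21 billion and a 65% share gives roughly USD 106 billion.

```python
# Projected market sizes in USD billions, taken from the figures in this report.
AI_DRUG_DISCOVERY_2033 = 13.77
SINGLE_CELL_2030 = 7.56
BIOMARKERS_2033 = 134.2


def long_run_tam(adjacent_capture_share):
    """Core AI market plus a captured share of the two adjacent markets.

    adjacent_capture_share is a hypothetical fraction between 0 and 1;
    it is an illustration parameter, not a figure from the report.
    """
    if not 0.0 <= adjacent_capture_share <= 1.0:
        raise ValueError("adjacent_capture_share must be between 0 and 1")
    adjacent = SINGLE_CELL_2030 + BIOMARKERS_2033
    return AI_DRUG_DISCOVERY_2033 + adjacent_capture_share * adjacent


for share in (0.05, 0.25, 0.65):
    print(f"capture share {share:.0%}: USD {long_run_tam(share):.1f} billion")
```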
Founder Analysis
Founders and Background Report
Founders
- David van Dijk, PhD — Co-Founder & CEO
- Ivan Vrkic — Co-Founder & CTO
Professional Backgrounds
David van Dijk is an academic founder who serves as a Yale Professor of Medicine & Computer Science. He possesses a significant academic footprint with over 11,000 citations and has published in top-tier journals and machine learning conferences such as Nature, Cell, NeurIPS, and ICML. His research includes the development of foundational methods like Cell2Sentence, which translates gene-expression data into language-like representations, allowing large language models (LLMs) to learn complex biological patterns. He is credited as the inventor of the core technology that transitioned from Yale into CellType.
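The idea of turning gene-expression data into a language-like representation can be sketched as follows. This is a minimal illustration, assuming a cell is given as a mapping from gene symbol to expression count and that the "sentence" is the list of expressed gene names ordered from highest to lowest expression. The function name, the tie-breaking rule, and the toy data are hypothetical, and the actual Cell2Sentence preprocessing (normalization, gene filtering, truncation) may differ.

```python
def cell_to_sentence(expression, top_k=None):
    """Convert one cell's expression profile into a space-separated gene 'sentence'.

    expression: dict mapping gene symbol -> expression value for a single cell.
    top_k: optional cap on the number of genes kept (hypothetical parameter).
    """
    # Drop genes that are not expressed in this cell.
    expressed = [(gene, value) for gene, value in expression.items() if value > 0]
    # Order by decreasing expression; break ties alphabetically for determinism.
    ranked = sorted(expressed, key=lambda pair: (-pair[1], pair[0]))
    genes = [gene for gene, _ in ranked]
    if top_k is not None:
        genes = genes[:top_k]
    return " ".join(genes)


# Toy profile with made-up counts, for illustration only.
toy_cell = {"CD3E": 12, "GAPDH": 40, "MS4A1": 0, "ACTB": 35, "IL7R": 12}
print(cell_to_sentence(toy_cell))
print(cell_to_sentence(toy_cell, top_k=2))
```

The resulting text can be tokenized like any other sentence, which is what allows a language model to be trained on it.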
Ivan Vrkic is the co-developer of the Cell2Sentence foundational method and brings a background in engineering and machine learning from Yale and EPFL. His professional experience includes leading large-scale foundation model training at a biotech startup and systems engineering work, notably building software to control the CERN Large Hadron Collider. Together, the founders combine deep domain expertise in single-cell biology with the large-scale engineering experience needed for production-grade biological models.
Their collaborative work is rooted in well-cited academic contributions, including the C2S concept presented at ICML 2024 and a 27B-parameter foundation model (C2S-Scale) trained on over 1 billion tokens of transcriptomic data. Public records from Y Combinator (W26 cohort) and featured coverage by Google Research/DeepMind corroborate their backgrounds and the early recognition of their technical approach.
In summary, the founders' profiles indicate strong domain credibility through academic publications and practical engineering experience at scale. This combination supports the company's technical claims regarding multimodal biological foundation models and positions them as credible leaders in the field of AI-driven drug discovery.