H-Optimus

Discover complex signatures, find biomarkers, predict gene mutations, determine protein expression and more from routine H&E slides

A Pathology Foundation Model to Power Your Research

H-Optimus is built on a dataset and architecture that make it an ideal backbone for any use-case leveraging digital pathology.

1 Million+ Slides Trained

Trained on one of the largest pathology datasets available

1.1 Billion Parameters

Built on a Vision Transformer (ViT-g/14) architecture

800,000+ Patients

Ensuring diverse and robust real-world data representation

Over 1 Million Downloads

Thousands of users, trusted discovery across many uses

50 Organs Covered

Providing broad applicability across numerous disease areas

4000+ Clinical Practices

Used by thousands of practices around the globe

H-Optimus-1: 6.06
Virchow2: 6.34
H-Optimus-0: 6.86
UNI2: 7.10
mSTAR: 7.65

*Chart shows overall rank across all tasks, PathBench (lower is better)

Ranked #1 in benchmarks across 229 tasks.

H-Optimus-1 reaches state-of-the-art performance in a variety of benchmarks and in downstream applications, such as biomarker prediction, spatial gene expression, or survival prediction.

How H-Optimus-1 helps your research

Biomarker Discovery & Patient Stratification
01

Identify novel predictive biomarkers for targeted therapies and precision medicine.

Enhance patient stratification for clinical trials, ensuring the right patients are selected based on histopathological and molecular features.

AI-Powered Drug Discovery & Preclinical Validation
02

Analyze drug-tissue interactions at scale, improving toxicology assessments and response predictions.

Predict drug efficacy across different cancer subtypes, reducing early-stage attrition rates.

Accelerating Clinical Trials & Regulatory Approvals
03

Automate histopathological grading & disease progression tracking to enhance trial endpoints.

Support AI-assisted clinical decision-making by integrating H-Optimus-1 with real-world histology data.

Facilitate regulatory submissions with AI-powered standardization of pathology image assessments, ensuring compliance with FDA and EMA requirements.

Enhancing Digital Pathology & AI-Assisted Diagnosis
04

Leverage AI for automated annotation and analysis of whole-slide images (WSI) to reduce pathologist workload and improve consistency.

Enable real-time pathology analysis in research hospitals to support faster decision-making.

Frequently Asked Questions

Can H-Optimus be used to automate or assist in generating pathology reports?

H-Optimus has been used as a visual backbone for automated slide-to-report systems. For example, researchers at the Institute for Cancer Genetics and Informatics (ICGI) used H-Optimus-1 to power NARWHAL, an AI system that generates standardized clinical reports directly from gigapixel whole-slide images. Demonstrating its robustness across diverse scanners and tissue types, the H-Optimus-backed NARWHAL system recently won first place in the global REG2025 challenge for clinical alignment and linguistic quality. Read the case study.

Importantly, these systems are designed to generate pre-structured drafts to assist pathologists and accelerate workflows, rather than replace expert human validation.

Can H-Optimus be used to predict spatial gene expression or biological pathways?

Yes, researchers are leveraging H-Optimus to infer molecular landscapes directly from standard H&E slides. For example, researchers at The University of Manchester recently developed Deep Pathway, a computational framework that uses H-Optimus-0 to predict pathway-level expression from H&E images. Using the model, they successfully mapped complex pathways (like Androgen Response) in prostate cancer and predicted hypoxia signatures in glioblastoma. Crucially, these AI-derived hypoxia predictions showed strong visual concordance with actual PIMO staining (the clinical ground-truth). Read the paper.

While highly effective for hypothesis generation and cohort stratification, these predictive maps are exploratory tools and do not replace definitive molecular testing.

Can H-Optimus analyze Immunohistochemistry (IHC), Immunofluorescence (IF), or other special stains, or is it strictly for H&E?

The H-Optimus models were pre-trained exclusively on massive, highly diverse datasets of H&E (Hematoxylin and Eosin) stained whole-slide images. However, researchers are successfully adapting H-Optimus for non-H&E analysis in exploratory settings. For example, a recent study published in Laboratory Investigation demonstrated the model's adaptability for immunofluorescence (IF). Researchers at MedStar Georgetown University Hospital utilized the H-Optimus vision transformer to analyze kidney IF images, fine-tuning the model to automatically screen and classify whole-slide images for immune reactants.

Because the model was pre-trained exclusively on H&E, we recommend that clinical teams rigorously validate its performance when adapting it for non-H&E tasks.

Is H-Optimus available for academic research?

Yes, the H-Optimus family of models is available for academic research and can be accessed directly via Hugging Face.

To support different computational needs and research goals, we offer three distinct models. It is important to note the differences in their capabilities and licensing:

H-Optimus-0: Our original 1.1 billion parameter foundation model, trained on over 500,000 histology slides. This model is fully open-source and released under the permissive Apache 2.0 license, allowing for broad academic and research use.

H-Optimus-1: Our state-of-the-art 1.1 billion parameter model, trained on a massive, highly diverse dataset of over 1 million slides from more than 800,000 patients. This model is available under a CC-BY-NC-ND 4.0 license, meaning it is strictly available for non-commercial, academic research purposes. Any commercial use or monetization requires a separate licensing agreement.

H0-mini: A lightweight, highly efficient model developed in collaboration with Owkin. It was distilled from H-Optimus-0 to deliver comparable performance to larger foundation models but at a significantly reduced computational (inference) cost. Like H-Optimus-1, H0-mini is released under the CC-BY-NC-ND 4.0 license for non-commercial, academic research.

What data was used to train H-Optimus?

Both H-Optimus models were trained using self-supervised learning on massive, proprietary datasets of routine H&E-stained whole-slide images (WSIs). To ensure the models generalize well across different laboratory environments, the training data was intentionally curated for high patient, disease, and technical diversity.

During pretraining, these whole-slide images are converted into billions of small, standardized image tiles (specifically, 224×224 pixel tiles extracted at approximately 0.5 microns-per-pixel) to teach the model the fundamental visual language of histology.
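The tiling geometry described above can be sketched in a few lines. This is a minimal illustration, not the actual preprocessing code: the function name `tile_grid`, the non-overlapping grid layout, and the example slide dimensions are assumptions, while the 224-pixel tile size and the approximate 0.5 microns-per-pixel target come from the text. It computes how large a region to read at the scanner's native resolution so that the region covers one 224×224 tile after resampling.

```python
from typing import List, Tuple

TILE_PX = 224      # tile edge length in pixels (from the text)
TARGET_MPP = 0.5   # approximate target resolution, microns per pixel (from the text)


def tile_grid(width_px: int, height_px: int,
              native_mpp: float) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (read_px, origins) for a non-overlapping tile grid.

    read_px is the edge length, in native-resolution pixels, of the region
    that covers one 224-pixel tile once resampled to TARGET_MPP.
    origins lists the top-left corner of every full tile that fits.
    """
    if native_mpp <= 0:
        raise ValueError("native_mpp must be positive")
    read_px = round(TILE_PX * TARGET_MPP / native_mpp)
    origins = [
        (x, y)
        for y in range(0, height_px - read_px + 1, read_px)
        for x in range(0, width_px - read_px + 1, read_px)
    ]
    return read_px, origins


# Hypothetical slide region scanned at 0.25 MPP.
read_px, origins = tile_grid(width_px=1000, height_px=900, native_mpp=0.25)
print(read_px, len(origins))  # 448 4
```

In practice the grid would be combined with a tissue mask so that background regions are skipped before any tile is read.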

Here is the specific breakdown of the training cohorts for each model:

H-Optimus-1: Trained on an extensive collection of over 1 million H&E slides from more than 800,000 patients. To ensure robustness to real-world variability, this dataset spans over 50 different organs and was digitized using 3 different scanner types across more than 4,000 clinical centers. Read more.

H-Optimus-0: Trained on over 500,000 histopathology slides sourced from 4,000 clinical practices, yielding several hundred million training tiles. Read more.

By training on such a vast and diverse corpus of real-world clinical data, the models learn rich, generalizable biological features designed to be highly robust to the typical staining, tissue preparation, and scanning variations encountered across diverse clinical centers.

How do I integrate H-Optimus into my lab?

H-Optimus acts as a foundational "embedding layer" for your digital pathology pipeline. It serves as the computational backbone for your data science teams to build clinical-development applications, rather than acting as an out-of-the-box diagnostic.

A standard integration follows three main phases:

  1. Data Preparation: Scan H&E slides into standard Whole-Slide Image (WSI) formats. Preprocess these by masking and extracting tissue tiles that match the model’s specs (e.g., 224×224 pixels at ~0.5 MPP for H-Optimus-1).
  2. Model Deployment: Deploy the model based on your regulatory and infrastructure needs. For commercial use and sensitive trial data, you can deploy securely within your own environment via AWS SageMaker. For academic research, the model is available via Hugging Face for local hardware deployment. Both setups allow you to generate embeddings across multiple slides in a batch.
  3. Downstream Training & Validation: Feed the model's output (embeddings) into lightweight, task-specific models (like MIL heads) to predict your specific clinical endpoints. Before prospective use, rigorously validate this custom pipeline on your lab's retrospective cohorts to account for local scanner and staining variability.

Read more in our model documentation.
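The three phases above can be sketched end to end in plain Python. This is a toy illustration under stated assumptions, not the H-Optimus API: tiles are tiny grayscale lists rather than real WSI regions, `placeholder_embedder` is a hypothetical stand-in for the foundation model, the near-white threshold of 220 is arbitrary, and the attention parameters `V` and `w` are fixed by hand where a real MIL head would learn them from labeled slides.

```python
import math
from typing import Callable, List, Sequence, Tuple

Tile = List[List[int]]   # toy grayscale tile: rows of 0-255 pixel values
Vector = List[float]


def tissue_fraction(tile: Tile, white_threshold: int = 220) -> float:
    """Fraction of pixels darker than the near-white slide background."""
    pixels = [p for row in tile for p in row]
    return sum(p < white_threshold for p in pixels) / len(pixels)


def placeholder_embedder(batch: Sequence[Tile]) -> List[Vector]:
    """Hypothetical stand-in for the foundation model (NOT H-Optimus).

    Emits a 2-dimensional vector per tile so the sketch runs anywhere.
    """
    return [
        [sum(map(sum, tile)) / (255 * len(tile) * len(tile[0])),
         tissue_fraction(tile)]
        for tile in batch
    ]


def embed_slide(tiles: Sequence[Tile],
                embedder: Callable[[Sequence[Tile]], List[Vector]],
                batch_size: int = 2,
                min_tissue: float = 0.5) -> List[Vector]:
    """Phases 1-2: drop background tiles, then embed the rest in batches."""
    kept = [t for t in tiles if tissue_fraction(t) >= min_tissue]
    bag: List[Vector] = []
    for start in range(0, len(kept), batch_size):
        bag.extend(embedder(kept[start:start + batch_size]))
    return bag


def attention_pool(bag: List[Vector], V: List[Vector],
                   w: Vector) -> Tuple[Vector, Vector]:
    """Phase 3, forward pass only: attention-weighted average of a bag.

    score_k = w . tanh(V h_k), attention = softmax(scores).
    """
    scores = [
        sum(wi * math.tanh(sum(v * x for v, x in zip(row, h)))
            for wi, row in zip(w, V))
        for h in bag
    ]
    top = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    pooled = [sum(a * h[d] for a, h in zip(attention, bag))
              for d in range(len(bag[0]))]
    return pooled, attention


background = [[255] * 4 for _ in range(4)]
dark = [[60] * 4 for _ in range(4)]
light = [[180] * 4 for _ in range(4)]

bag = embed_slide([background, dark, light, background], placeholder_embedder)
pooled, attention = attention_pool(bag, V=[[1.0, 0.0], [0.0, 1.0]], w=[-1.0, 0.0])
print(len(bag), attention[0] > attention[1])  # 2 True
```

The pooled vector is what a slide-level classifier or regressor would consume. Keeping the embedder behind a plain callable means the same loop works whether embeddings come from a local model or a hosted endpoint.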

Is H-Optimus suitable for clinical development?

H-Optimus is suitable for clinical development, specifically as a powerful analytical tool for translational and exploratory research.

While H-Optimus delivers state-of-the-art predictive precision, it is not currently cleared as a regulated diagnostic or Companion Diagnostic (CDx). If your intended use involves making direct, patient-level clinical decisions, the model would serve as the foundational backbone. You would still need to assemble a regulatory-grade validation package (including external validation on independent cohorts matching your trial's specific parameters, pre-specified endpoints, and prospective bridge studies) while keeping confirmatory molecular assays as the gold standard for decision-making.

If you have a specific use case in mind (e.g., enrichment pre-screening, exploratory endpoints, or CDx ambitions), please contact us so we can assess the exact suitability and validation steps required for your program.

How was H-Optimus validated?

The H-Optimus foundation models are validated through a rigorous, standardized benchmarking protocol designed to provide an objective comparison across the entire model family. In these evaluations, the foundation model serves as a fixed feature extractor, and lightweight models are trained and evaluated on top of it using a controlled, highly repeatable protocol.

Key components of our validation approach include:

Slide-Level Downstream Tasks: For clinical endpoints like biomarker prediction and classification, we train multiple attention-based MIL (ABMIL) heads on frozen H-Optimus features. To ensure high robustness and reproducibility, training steps are selected via 5-fold cross-validation, and the process is repeated across multiple random seeds. Final reported metrics are averages across 50 trained heads. We also utilize tile subsampling to efficiently manage compute resources.

Tile-Level Tasks: For tissue and tumor classification tasks, we evaluate the models using linear probing. Similar to our slide-level evaluations, this involves cross-validation and repeated runs across different seeds to produce reliable top-1 accuracy metrics.

External Cohort Validation: We rigorously test our models on independent, external retrospective cohorts that were not seen during training, to confirm that they generalize.

Public Benchmarks: For public benchmarks like HEST, we evaluate our models using the exact, established procedures defined by external researchers. This adherence to independent testing standards ensures fair, transparent, and unbiased industry comparisons. Read more.
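The cross-validation and multi-seed protocol described above can be sketched as follows. This is a minimal illustration, not the evaluation code behind the reported numbers: a nearest-centroid classifier stands in for the linear probe or ABMIL head, the feature vectors are toy data standing in for frozen embeddings, and pairing 5 folds with 10 seeds to reach 50 runs is an assumption (the text gives the fold count and the total of 50 heads, not the seed count). How fold results feed into training-step selection is not reproduced here.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def kfold_indices(n: int, k: int, seed: int) -> List[List[int]]:
    """Shuffle indices with a fixed seed and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(xs: Sequence[Vector], ys: Sequence[int]) -> Dict[int, Vector]:
    """Stand-in for the lightweight head: one mean vector per class."""
    centroids: Dict[int, Vector] = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids: Dict[int, Vector], x: Vector) -> int:
    """Return the label of the closest centroid (squared Euclidean distance)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)))


def top1_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def repeated_cv(xs: Sequence[Vector], ys: Sequence[int],
                k: int = 5, seeds: Sequence[int] = tuple(range(10))
                ) -> Tuple[float, int]:
    """Train one head per (seed, fold) on frozen features; average the scores."""
    scores: List[float] = []
    for seed in seeds:
        for held_out in kfold_indices(len(xs), k, seed):
            held = set(held_out)
            train = [i for i in range(len(xs)) if i not in held]
            head = fit_centroids([xs[i] for i in train], [ys[i] for i in train])
            preds = [predict(head, xs[i]) for i in held_out]
            scores.append(top1_accuracy(preds, [ys[i] for i in held_out]))
    return mean(scores), len(scores)


# Toy, well-separated "embeddings" for two classes.
features = ([[0.1 * i, 0.0] for i in range(10)]
            + [[10.0 + 0.1 * i, 10.0] for i in range(10)])
labels = [0] * 10 + [1] * 10

accuracy, n_heads = repeated_cv(features, labels)
print(accuracy, n_heads)  # 1.0 50
```

Averaging over every seed and fold, rather than reporting a single run, is what makes comparisons between feature extractors less sensitive to one lucky or unlucky split.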

Still have questions?

Our team is available to discuss validation, partnerships, academic access, or technical details. Get in touch to start the conversation.


One Model. Every Scale.

Bioptimus bridges the gap between biological layers. Building on the industry-leading performance of H-Optimus-1, our new M-Optimus model integrates multiple data modalities to provide the definitive multi-scale view of biology.