ICGI researchers build a winning pathology report generation model with H-optimus-1
Executive summary
Researchers from the Institute for Cancer Genetics and Informatics (ICGI) at Oslo University Hospital won the REG2025 challenge, a three-month competition run alongside the 28th International Conference on Medical Image Computing and Computer Assisted Intervention. The task required end-to-end systems to generate standardized pathology reports directly from gigapixel whole-slide images, with evaluation emphasizing clinical alignment, linguistic quality, and generalization across ethnically diverse data from institutions in five countries.
The winning system, developed by Audun Ljone Henriksen and Sepp De Raedt of the ICGI Team, combined a toolkit for image preprocessing with H‑optimus‑1 as the visual backbone to produce robust tile‑level representations. A tailored component was trained to create a slide-level feature vector from the tile-level representations and generate standardized clinical reports.
This case study highlights the advantages of using H-optimus-1 as a backbone for histopathology and WSI analysis, bringing key benefits to pathologists and their diagnostic workflows.
The challenge
The REG2025 challenge is the first global challenge for end-to-end slide-to-report systems: given a gigapixel whole-slide image (WSI), models must produce a clinically meaningful pathology report that mirrors real-world structure and terminology while demonstrating strong slide-level understanding [1]. The aim is to advance practical systems that can be integrated into diagnostic workflows across sites and scanners, rather than generic image captioning.
Systems were evaluated on both linguistic quality and clinical relevance. The REG2025 composite ranking score combines standard natural language processing metrics (ROUGE and BLEU), clinical keyword coverage (KEY, computed as a Jaccard index), and semantic similarity (EMB, the cosine similarity between embeddings extracted from a biomedical LLM), with the following weights: 0.15 × (ROUGE + BLEU) + 0.4 × KEY + 0.3 × EMB [1].
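As a rough illustration of how these weights combine, the minimal sketch below computes the composite score from four already-computed metric values, together with a Jaccard index over keyword sets. The function names and example keywords are hypothetical, and the official REG2025 evaluation code may differ in details such as keyword extraction and normalization.

```python
def jaccard(predicted_keywords, reference_keywords):
    """Jaccard index |A ∩ B| / |A ∪ B| between two keyword collections."""
    predicted, reference = set(predicted_keywords), set(reference_keywords)
    union = predicted | reference
    if not union:
        # Assumption: two empty keyword sets are treated as a perfect match.
        return 1.0
    return len(predicted & reference) / len(union)


def composite_score(rouge, bleu, key, emb):
    """Weighted sum quoted in the text; every input is assumed to lie in [0, 1]."""
    return 0.15 * (rouge + bleu) + 0.4 * key + 0.3 * emb


# Hypothetical keywords and metric values, for illustration only.
key = jaccard(
    {"adenocarcinoma", "colon", "biopsy"},
    {"adenocarcinoma", "colon", "polyp"},
)
score = composite_score(rouge=0.5, bleu=0.5, key=key, emb=0.5)
```

Because the weights sum to 1 (0.15 + 0.15 + 0.4 + 0.3), a report that is perfect on every metric scores 1.0, and the keyword term alone accounts for 40% of the total.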
The REG2025 dataset spans ~10,500 cases from six medical centers across five countries (Korea, Japan, India, Turkey, Germany) to test multi-country and multi-ethnic generalization (figure 1). Leveraging pre-trained models was allowed, but training on additional external datasets was prohibited. Staged test phases included site hold-outs to probe generalization and preserve fairness.
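The idea behind a site hold-out can be sketched as follows: every case from the held-out site is kept out of training, so the test score reflects performance on an institution the model has never seen. The record layout, site names, and function name are hypothetical. The exact REG2025 split protocol is not described here and may differ.

```python
def site_holdout_split(cases, held_out_sites):
    """Split cases so that all cases from held-out sites land in the test set."""
    held_out = set(held_out_sites)
    train = [case for case in cases if case["site"] not in held_out]
    test = [case for case in cases if case["site"] in held_out]
    return train, test


# Hypothetical case records; real data would reference slide files and reports.
cases = [
    {"case_id": "c1", "site": "site_a"},
    {"case_id": "c2", "site": "site_a"},
    {"case_id": "c3", "site": "site_b"},
    {"case_id": "c4", "site": "site_c"},
    {"case_id": "c5", "site": "site_c"},
]
train_cases, test_cases = site_holdout_split(cases, held_out_sites=["site_c"])
```

Splitting by site rather than by case avoids leaking site-specific staining or scanner characteristics from training into testing.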

Figure 1. The REG2025 slide-to-report challenge required teams to build an end-to-end system to generate standardized pathology reports directly from gigapixel WSIs. The candidates were evaluated for clinical alignment and linguistic fidelity across multi-center, multi-scanner data.
The context: Foundation models for slide understanding
A WSI is a high‑resolution digital scan of an entire pathology microscope slide, enabling pathologists to review tissue specimens on screen with pan‑and‑zoom instead of a traditional microscope. WSIs are gigapixel scale and heterogeneous, capturing subtle tissue patterns, but also staining variation and scanner artifacts across institutions.
Foundation models trained on very large and diverse slide corpora can learn a reusable "visual language of histology" that transfers across organs and study settings, which makes them well suited to analyzing these complex images. For WSI analysis, this can bring several concrete advantages:
- Reduced labeled-data requirements, since the backbone already encodes rich morphology and context.
- Better generalization across domains, such as institutions, scanners, and staining protocols.
- Faster iteration, because new tasks can be framed as lightweight adapters or heads on top of stable slide features rather than end-to-end model retraining.
- Streamlined evaluation and compliance, as consistent features support fair comparisons under challenge protocols and clearer paths to validation.
The solution: NARWHAL with an H-optimus-1 backbone
The NARWHAL system design established a robust workflow for slide-to-report generation built from three main components: the TRIDENT open-source toolkit for large-scale WSI preprocessing; a foundation model backbone that generates tile-level feature vectors; and a downstream reporting component that combines the tile-level feature vectors into a slide-level embedding and then translates these visual features into the terminology, formatting, and structure required by clinical standards.
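The text does not specify how NARWHAL combines tile-level vectors into a slide-level embedding. One common option for this step is attention-based pooling, sketched below in a deliberately simplified form with a single linear scoring vector. This is an assumed stand-in, not the team's confirmed method. All names and the toy values are hypothetical, and plain Python is used for readability where a real implementation would use a tensor library and learn the scoring weights during training.

```python
import math


def attention_pool(tile_features, score_weights):
    """Aggregate per-tile feature vectors into one slide-level vector.

    Each tile gets a scalar score (dot product with score_weights). A softmax
    turns the scores into weights that sum to 1, and the slide vector is the
    weighted average of the tile vectors.
    """
    scores = [
        sum(w * x for w, x in zip(score_weights, tile)) for tile in tile_features
    ]
    top = max(scores)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(tile_features[0])
    slide_vector = [
        sum(w * tile[d] for w, tile in zip(weights, tile_features))
        for d in range(dim)
    ]
    return slide_vector, weights


# Toy example: three tiles with 2-dimensional features (hypothetical values).
tiles = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
slide_vector, weights = attention_pool(tiles, score_weights=[0.5, -0.5])
```

With an all-zero scoring vector the weights become uniform and the result reduces to mean pooling, which is a useful sanity check for this kind of aggregator.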
To select the backbone, the team compared five foundation models. H-optimus-1 achieved the highest average ranking score on a hold-out split of the training data across multiple benchmark training runs, showing the most reliable behavior across diverse sites and scanners [2].
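The selection rule described here, averaging each candidate's hold-out score over repeated training runs and keeping the best, can be sketched as follows. The backbone names and scores are made-up placeholders, not results from the challenge or from the team's experiments.

```python
from statistics import mean


def select_backbone(scores_by_backbone):
    """Return the backbone with the highest mean hold-out score, plus all means."""
    averages = {name: mean(runs) for name, runs in scores_by_backbone.items()}
    best = max(averages, key=averages.get)
    return best, averages


# Placeholder scores for three repeated training runs per candidate backbone.
placeholder_scores = {
    "backbone_a": [0.70, 0.72, 0.71],
    "backbone_b": [0.75, 0.74, 0.76],
    "backbone_c": [0.73, 0.69, 0.74],
}
best_backbone, mean_scores = select_backbone(placeholder_scores)
```

Averaging over several runs reduces the chance that a single lucky or unlucky training run decides the comparison.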
H-optimus-1's strong performance is rooted in its combination of unprecedented data diversity and a large, modern vision transformer architecture, yielding proven cross-task performance:
- It is trained on over 1 million H&E slides from more than 800,000 patients, spanning over 4,000 clinical centers and 50 organs, which covers a wide range of diseases.
- Its 1.1 billion-parameter ViT-g/14 was trained using self-supervised learning on billions of tiles, resulting in rich, transferable tile features.
- This multi-center, multi-scanner pretraining specifically aims to mitigate domain shift and facilitate cross-site robustness.
These design choices have already allowed H-optimus-1 to achieve state-of-the-art results on multiple independent histopathology benchmarks, including HEST [3] and PathBench [4], underscoring its robustness beyond this specific use case.
Crucially, the REG2025 dataset spans seven tissue types, all of which are represented in H-optimus-1’s pre-training corpus. This significant overlap means the backbone already encodes the morphology and context specific to the tissues evaluated in the challenge. This intrinsic knowledge reduces domain shift and allows downstream heads to learn more efficiently from limited task-specific supervision.
Using H-optimus-1, the ICGI team was able to build a system emphasizing three key principles:
- Consistent inputs and quality controls, so slides are analyzed in a comparable way across institutions.
- Clinically aligned reporting that mirrors the standard terminology and formatting expected by pathologists and challenge evaluators.
- Generalization across scanners and sites, prioritizing approaches that maintain performance when data sources change.
The Impact
1. Data Diversity is Critical for Foundation Model Performance
Thanks to the team’s modular approach, which uses H-optimus-1 as a stable visual backbone, the NARWHAL system won REG2025, leading the final leaderboard with an overall ranking score of 0.839 (figure 2). H-optimus-1 demonstrated that it is a robust and generalizable foundation model for WSI analysis. Dataset diversity in model training is key to ensuring robustness to both technical and biological variation.

Figure 2. The REG2025 challenge final leaderboard.
2. Foundation Models Overcome Cross-Site Differences by Standardizing Reports
NARWHAL’s performance reflected strong clinical alignment under the composite metric and consistent performance across held-out sites and scanners. This success demonstrates that robust solutions based on foundation models can help standardize pathology analysis regardless of the originating institution or scanning equipment.
3. Foundation Models Enable Faster Diagnostic Workflows for Pathologists
If implemented in clinical contexts, this system could substantially improve histopathology and WSI analysis workflows. It could enable faster reads with higher confidence by summarizing gigapixel slides into clinically formatted reports, leaving the pathologist to provide the final validation. Time would be saved at the bench, and more attention freed for nuanced, high-value diagnostic decisions.
Freeing up pathologists’ time does more than improve efficiency—it can save lives. When routine steps are automated and drafts are pre-structured, reports move faster through the lab, cutting turnaround times so oncologists can start treatment earlier. Consistent, format-correct outputs reduce back-and-forth and prevent small errors that delay care or trigger unnecessary repeat tests. The time returned to experts is reinvested where human judgment matters most: resolving difficult differentials, confirming margins, grading dysplasia accurately, and participating in tumour boards to guide therapy choice. At a higher level, this shift reduces backlogs, smooths cross-site workflows, and accelerates the entire diagnostic pathway—meaning patients get answers sooner, begin the right therapy earlier, avoid disease progression while waiting, and ultimately have a better chance at survival.
The Conclusion
The excellent results achieved by the NARWHAL system illustrate how the H-optimus-1 foundation model can accelerate and simplify the development of novel AI tools for histopathology. The system’s success across held-out sites demonstrates how foundation models can be used to build robust solutions that help standardize pathology analysis and reporting. If implemented clinically, this technology could substantially improve diagnostic workflows by enabling faster reads and freeing up pathologists' time for high-value decisions, ultimately accelerating diagnosis and saving lives.
Your Next Breakthrough
Bioptimus provides researchers and data scientists in pharma and biotech with the validated, state-of-the-art tools needed to turn ambitious goals into reality. If you are looking to build your research on a powerful, trusted foundation, the Bioptimus team is ready to support your work.
Contact us to learn how our models can accelerate your next project
About H‑optimus‑1
H‑optimus‑1 is a state‑of‑the‑art foundation model for pathology developed by Bioptimus. It was trained using self‑supervised learning on one of the most extensive and diverse datasets of its kind, comprising over 1 million pathology slides from more than 800,000 patients across thousands of clinical centers. This unprecedented patient diversity enables the model to learn a rich, generalizable understanding of human biology, allowing it to recognize a vast array of tissue patterns and disease signals. As a result, H‑optimus‑1 has achieved state‑of‑the‑art performance, outperforming other leading models across a wide range of industry‑standard benchmarks, from predicting gene expression to identifying cancer metastasis.
About Bioptimus
Bioptimus is a global AI tech company that is pioneering the world's first universal foundation model for biology. By combining cutting-edge AI with massive multimodal, proprietary data generation, Bioptimus is building a unifying framework that connects all scales of biology, from molecules to patients, and delivers interpretable, dynamic, and actionable insights. The first foundation model released by Bioptimus, H-optimus, is an industry-leading model being adopted across research, drug discovery, and clinical pipelines.
References
1. REG2025 Organizers. 2025. REG2025 Results and Leaderboards. Grand Challenge.
2. Bioptimus. 2025. H-optimus-1. https://huggingface.co/bioptimus/H-optimus-1
3. Jaume G., Doucet P., Song A. H., Lu M. Y., Almagro-Perez C., et al. 2024. HEST-1k: A Dataset for Spatial Transcriptomics and Histology Image Analysis. Advances in Neural Information Processing Systems. Dataset card: HEST-1k on Hugging Face.
4. Ma J., Xu Y., Zhou F., Wang Y., Jin C., Guo Z., Wu J., et al. 2025. PathBench: A Comprehensive Comparison Benchmark for Pathology Foundation Models towards Precision Oncology. doi:10.48550/arXiv.2505.20202