Genomics is fundamentally transforming modern medicine. Sequencing technologies now profile biological systems at unprecedented scale and resolution, generating petabytes of diverse, complementary data. Our lab focuses on the challenge of turning data at this scale into actionable biological insight. To that end, we develop computational methods that merge biological domain knowledge with discrete algorithms and AI. These methods model the structure, composition, and variation captured in this data across genomes, transcriptomes, and other molecular layers, with the goal of understanding biological function and uncovering the mechanisms that drive disease.

Core focus: structural modeling & variation

Genomic structural variants (such as deletions, duplications, inversions, and complex combinations thereof) and transcriptomic variation (arising through alternative splicing or gene fusions) can fundamentally alter biological function. They play a critical role in the evolution of cancer and in a wide spectrum of other disorders, including autism and Parkinson’s, Huntington’s, and Alzheimer’s diseases. As such, a central focus of our group is structural modeling: understanding how structural variation shapes regulation, function, and disease by building robust models that accurately discover and interpret this variation.

Our approach

To build these models, we explore novel ways to represent large-scale genomic data and formulate new data-driven approaches:

  • Geometric and relational representations: We recast raw sequencing data into signal-preserving images, graphs, and other structured encodings. This allows us to turn classical genomic problems into computer-vision and graph-learning tasks that operate across multiple platforms and modalities.
  • Deep reinforcement learning: We combine discrete mathematical frameworks with deep reinforcement learning to solve computationally hard problems in genomics, such as haplotype assembly and phylogeny inference in cancer genomes.
  • In-depth data characterization: We believe that accurate biological insight requires understanding sequencing data from the ground up, including the transformations it undergoes at every step of analysis. We develop methods to uncover, model, and mitigate artifacts and biases that arise during library construction, sequencing, and downstream processing.
  • Robust validation: To ensure our methods translate into real-world discovery, we pair algorithm development with the generation of large-scale data resources (truthsets) and the development of new evaluation and visualization tools.

About the PI

Victoria Popic leads the Popic Lab at the Broad Institute, which she founded in 2020 as a Schmidt Fellow, and serves as the Director of Computational R&D at Broad Clinical Labs. She earned her Ph.D. in Computer Science from Stanford University, following B.S. degrees in Computer Science and Mathematics and an M.Eng. in Computer Science from MIT. Her background spans both academia and industry, including full-time roles at Oracle, Illumina, and SambaNova Systems, alongside engineering and research internships at Microsoft, NVIDIA, and Google.

Funding Support

We are incredibly grateful for our funding support:

Schmidt Fellows, Starr, NIH, BCL