Seminar: Laura Jarosz
During a recent department seminar on December 17, 2025, Laura Jarosz, BEng, MSc, presented her work in a talk titled “Technical heterogeneity in whole-exome sequencing data.” The presentation explored how non-biological factors, such as exome capture kits, population structure, and variant calling strategies, can drive systematic differences in downstream analyses of whole-exome sequencing (WES) data.
Drawing on large breast cancer cohorts, Laura showed how germline variants can be aggregated at the gene level using CADD-weighted features, and how technical bias can be addressed through cohort genotyping, haplotype-based genotype imputation, and cluster-aware feature imputation, leading to better-integrated datasets and more efficient computation.
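To make the idea of gene-level aggregation concrete, here is a minimal sketch of one way a CADD-weighted feature could be computed. It is an illustration only, not the method presented in the talk: the function name `gene_burden`, the input tuple layout, and the choice to sum CADD score times allele dosage are all assumptions, and the scores in the example are made up.

```python
from collections import defaultdict


def gene_burden(variants):
    """Sum CADD-weighted allele dosages per (sample, gene).

    `variants` is an iterable of (sample, gene, cadd_score, dosage) tuples,
    where dosage is the number of alternate alleles carried (0, 1, or 2).
    This layout and weighting scheme are hypothetical, chosen for illustration.
    """
    scores = defaultdict(float)
    for sample, gene, cadd_score, dosage in variants:
        scores[(sample, gene)] += cadd_score * dosage
    return dict(scores)


# Toy input with invented scores; not real data.
example = [
    ("S1", "BRCA1", 25.0, 1),
    ("S1", "BRCA1", 10.0, 2),
    ("S1", "TP53", 30.0, 1),
    ("S2", "BRCA1", 25.0, 0),
]

burden = gene_burden(example)
for key in sorted(burden):
    print(key, burden[key])
```

Each sample then has one number per gene rather than one per variant, which is the kind of compact representation that later harmonization and imputation steps could work on.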
The study lays important methodological groundwork for CanAge, a deep-learning framework designed to extract personalized insights into cancer risk from germline exome data, and points toward more robust and scalable cross-cohort analyses.