Seminar: Joanna Żyła
As part of our department seminar on November 12, 2025, we had the pleasure of attending a talk titled “dpGMM: Dynamic Programming in Gaussian Mixture Models and Other Solutions in the R Programming Language.”
The speaker, Joanna Żyła, BEng, PhD, presented the problem of suboptimal initialization in Gaussian mixture models (GMMs) and a solution that significantly improves the decomposition process.
The presentation showed that the dpGMM method offers:
- Clustering performance comparable or superior to existing R implementations (mclust, ClusterR, mixtools).
- Accurate parameter estimation and high scalability, even for large clustering problems.
- The ability to handle 1D and 2D binned data (histograms, images), a capability unique among GMM tools available in R.
- Built-in automatic cluster number selection, failure-prevention mechanisms, and intuitive visualization tools.
- Proven effectiveness on both synthetic and real biological data (e.g., histology, proteomics, cytometry).
Package and documentation: https://github.com/ZAEDPolSl/dpGMM