PathML: An open-source software toolkit for computational pathology research


Summary:

Imaging datasets in cancer research have grown exponentially in size and information density in recent years, driven chiefly by two trends:

  • Increasing adoption of digital pathology workflows at departmental and institutional scale (large-n datasets)

  • Emerging technologies in highly multiplexed imaging and spatial omics (high-dimensional datasets)

The unprecedented scale of today’s datasets may yield insights for cancer research and clinical care, but only if researchers are equipped with tools to leverage advanced computational approaches from machine learning and computer vision. PathML is a software toolkit designed to lower the barrier to entry for computational pathology. It enables researchers to develop streamlined, scalable, fully customized end-to-end image analysis pipelines, and it provides a unified framework for brightfield, multiplexed immunofluorescence, and spatial omics images with support for 160+ file formats. Developed at Dana-Farber Cancer Institute and Weill Cornell Medicine, PathML is currently used by 7+ research groups and 2 imaging core facilities across the two institutions. PathML is an open-source project freely available on GitHub, with complete documentation, tutorials, and example vignettes, and it has been downloaded more than 15,000 times worldwide. We welcome anyone interested in collaborating or learning more to contact us at PathML@dfci.harvard.edu or visit www.pathml.org for more information.

About the presenters:

Renato Umeton earned both his Bachelor’s and Master’s degrees in computer science, followed by a Ph.D. in Mathematics and Informatics with a thesis on Optimization and Ontology for Computational and Systems Biology. That work led him first to Microsoft and then to the Massachusetts Institute of Technology. His alma mater is the University of Calabria, and he recently completed his Executive Education at Harvard. Renato currently serves as Associate Director of Artificial Intelligence Operations and Data Science Services in the Informatics & Analytics department of Dana-Farber Cancer Institute, a teaching affiliate of Harvard Medical School. In this position, where he reports to the Chief Data and Analytics Officer, Renato created the departmental AI & data science group. He has accrued 15 years of experience across artificial intelligence, data science, and big data, working in other hospitals, in academia, in consulting, and in industry, in roles ranging from postdoc to director. In those contexts, he contributed to several scientific publications and patents, some of which were leveraged in clinical trials, while others were licensed.

Jacob Rosenthal is a data scientist in the Artificial Intelligence Operations and Data Science Services group at Dana-Farber Cancer Institute, where he leads development of data infrastructure and analytics to enable the Institute’s efforts in digital pathology research and operations. He is also an affiliate data scientist in the Department of Pathology and Laboratory Medicine at Weill Cornell Medicine. He received his M.Sc. in Health Data Science from the Harvard T.H. Chan School of Public Health.
