Our experience hosting New York data scientists and researchers from academia and industry
2018 has been an excellent year for machine learning breakthroughs and the larger data science community!
Our team at Comet.ml began hosting meetups through NYC Artificial Intelligence & Machine Learning with the mission of bringing together data scientists to discuss the latest developments in machine learning across academia and industry.
Our meetups spanned industries and research topics. In July, we invited Clare Gollnick, Terbium Labs CTO, to share her thoughts on the reproducibility crisis in data science. In October, we hosted the Precision Health AI team to present their machine learning pipeline for managing oncology data and deploying prediction models.
Keep reading for the full recap of the year and what our speakers covered.
⭐️ Our next meetup with Enigma Principal Data Scientist, Jarrod Parker, will be on January 29, 2019! ⭐️ RSVPs will open at the beginning of the month. Join the meetup group to stay updated!
Network Analysis, Machine Learning Reproducibility, Oncology Data, and more
Terbium Labs CTO, Clare Gollnick — Machine Learning Reproducibility and Data Philosophy
As a data scientist, scientist, and CTO, there’s no one closer to the reproducibility crisis in data science than Clare Gollnick.
During her presentation, Clare spoke about the misapplication of statistics and the philosophy behind data science, and shared a framework that explains (and even predicts) the likelihood of success of a data project.
See Clare’s slides for the meetup here.
Columbia PhD + FAIR Intern, Melanie Weber — Curvature-based Analysis of Complex Networks
Complex networks are a popular means for studying a wide variety of systems across the social and natural sciences and, more recently, for representing big data. As the complexity of networks has increased exponentially, efficient evaluation remains a data-analytic challenge with implications for machine learning.
Melanie Weber and her team developed geometric tools using a discrete Ricci curvature for efficiently analyzing the structure and evolution of complex networks. During her presentation, Melanie described the boost in performance when commonly used node-based approaches were extended to include edge-based information, such as edge weights and directionality, for a more comprehensive network characterization. Their results identify important structural features, including long-range connections of high curvature acting as bridges between major network components.
It was a truly engaging presentation on how curvature-based tools allow efficient computation of this core structure and, building on that core structure, more expensive analysis, hypothesis testing, and learning of complex models.
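To give a flavor of what an edge-based curvature looks like, here is a minimal sketch that is our own illustration and not code from the talk. It assumes the simplest combinatorial form of Forman-Ricci curvature for an unweighted, undirected graph, where the curvature of an edge (u, v) is 4 - deg(u) - deg(v). The function name `forman_curvature` and the toy graph are hypothetical, and the weighted and directed variants discussed in the presentation are more involved.

```python
from collections import defaultdict


def forman_curvature(edges):
    """Return {edge: curvature} using F(u, v) = 4 - deg(u) - deg(v).

    Assumes an unweighted, undirected simple graph given as a list of
    (u, v) pairs. This is the simplest combinatorial form only.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {(u, v): 4 - degree[u] - degree[v] for u, v in edges}


# Toy graph (made up for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
curvature = forman_curvature(edges)

# The edge with the most negative curvature is the one joining the triangles.
most_negative = min(curvature, key=curvature.get)
print(most_negative, curvature[most_negative])
```

Under this particular definition, the edge joining the two triangles stands out because its curvature is the largest in magnitude, which loosely mirrors the idea of curvature highlighting bridges between network components.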
See more of Melanie’s work on her website.
Precision Health AI, Victor Wang, Karl Rudeen, and Vito Colano — Machine learning pipelines for cancer detection with Precision Health AI
Precision Health AI (PHAI) brought business and technical team members to speak about the company’s data challenges, the importance of domain expertise in oncology analytics, and managing data science deliverables for pharmaceutical clients.
It was fascinating to hear Karl Rudeen, PHAI data scientist, explain the company’s data model and Vito Colano, PHAI principal software engineer, describe how the team validated and deployed their OncoStage Last Stage prediction model.
Precision Health AI is hiring. See their open ML Engineer position.
Columbia PhD, Yixin Wang — Causal inference in machine learning with Columbia University
Yixin Wang’s joint work with David Blei proposed the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. Essentially, the deconfounder infers a latent variable as a substitute for unobserved confounders and then uses that substitute to perform causal inference.
Yixin showed how the deconfounder provides a checkable approach to estimating close-to-truth causal effects for a real dataset about actors and movie revenue.
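The two-step idea can be sketched on simulated data. This is our own rough illustration under several assumptions and not the authors' implementation: PCA (via SVD) stands in for the probabilistic factor model, the predictive model check is skipped, all numbers are made up, and we estimate the effect of a single cause so that the substitute confounder is not perfectly collinear with the regressors. The helper `ols` and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5000, 20  # number of samples, number of causes

# Simulated data (all values are made up for illustration).
z = rng.normal(size=n)                         # unobserved confounder
causes = z[:, None] + rng.normal(size=(n, m))  # every cause depends on z
true_effect = 1.0                              # only cause 0 affects the outcome
outcome = true_effect * causes[:, 0] + 2.0 * z + rng.normal(size=n)


def ols(features, target):
    """Least-squares coefficients; the intercept is at index 0."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


# Naive estimate: regress the outcome on cause 0 and ignore the confounder.
naive = ols(causes[:, [0]], outcome)[1]

# Step 1: fit a one-factor model to the causes (PCA as a simple stand-in)
# and use the first component score as the substitute confounder.
centered = causes - causes.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
substitute = centered @ vt[0]

# Step 2: include the substitute confounder in the outcome model.
adjusted = ols(np.column_stack([causes[:, 0], substitute]), outcome)[1]

print(f"true={true_effect:.2f} naive={naive:.2f} adjusted={adjusted:.2f}")
```

In this toy setup the naive estimate absorbs part of the confounder's effect, while adjusting for the inferred substitute should bring the estimate much closer to the simulated value. Real applications need the model checking step described in the paper.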
See the paper at: https://arxiv.org/abs/1805.06826
Try the GitHub tutorial at: https://github.com/blei-lab/deconfounder_tutorial
Interested in speaking? Email me at email@example.com
Comet.ml helps data scientists and machine learning engineers automatically track their datasets, code, experiments, and results, creating efficiency, visibility, and reproducibility.
Learn more & see a demo at https://www.comet.ml/