My experience at ODSC West 2019

Updated: Nov 14, 2019

I attended the Open Data Science Conference (ODSC) West 2019 from October 29th to November 1st, although there was also a workshop held on the 28th. All of the events were hosted in a series of ballrooms and event spaces within the Hyatt Regency San Francisco Airport.

On the 29th and 30th I decided to do a deep dive by attending all four sessions of Andreas Müller’s scikit-learn workshop. The first session covered supervised learning, preprocessing, and missing values. The second covered cross-validation and grid search, linear models for regression, linear models for classification, trees and forests, and gradient boosting. The third covered the scikit-learn API, pipelines, and model evaluation. Finally, the fourth covered imbalanced data, feature selection, working with text data, and stochastic gradient descent. At the end of the final presentation Dr. Müller addressed how to contribute to the library. Around midday on the 30th I briefly attended the Azure ML talk led by Santhosh Pillai, which focused on NLP with Azure. I also had a chance to talk to Software Engineer Abhiram Eswaran about MLOps and Microsoft’s strategies for production-level data science pipelines, the point at which notebooks are no longer appropriate, and cloud-based clusters and resource allocation. Although I did not have time to attend Eric Ma’s nxviz workshop on the 30th, I was able to look through its material, and the library looks useful for network visualization.

Main Ballroom at Hyatt Regency San Francisco Airport

On the 31st the first keynote was given by Sepideh Seifzadeh, Ph.D., and focused on AI life cycle model management and bias. She touched on the Equivant (formerly Northpointe) COMPAS algorithm, which was biased against Black defendants relative to white defendants in the criminal justice system. She noted that the setting of weights during feature engineering in algorithm design is a moment of power and privilege within a developer’s workflow. Since it was an IBM talk, she also covered IBM’s open source tool AI Fairness 360 as well as the proprietary Watson OpenScale production platform.

The second keynote was given by Dawn Xiaodong Song, Ph.D., who is a member of Berkeley’s BAIR, BDD, CHAI, and RISELab. Her talk focused on AI security, her Chorus tool, and her company, Oasis Labs.

The third and final keynote was given by Rachel Thomas, Ph.D., and focused on algorithmic bias and the Gender Shades project. One quote from her presentation that stood out was that “The privileged are processed by people, the poor by algorithms”.

After the keynotes I attended five talks. The first, titled Chaos DevOps for Data Science, addressed the need for MLOps and covered both a currently used stack of Docker + Git + Kubernetes + Prometheus + Grafana and an alternative, the Dotscience product. The second talk covered Responsible AI and was given by Amy Hodler from Neo4j. The third talk was given by Waymo researcher Chen Wu, whose presentation I found opaque. The fourth talk was given by Shashank Prasanna and covered a few AWS machine learning workflows.

The fifth and final session of the day was a workshop titled Declarative Data Visualization with Vega-Lite and Altair, given by Apple researchers Kanit “Ham” Wongsuphasawat and Dominik Moritz.

On the final day, November 1st, I attended the deep learning building blocks workshop during the morning session, and in the afternoon I attended the workshop Integrating Elasticsearch with Analytics Workflows.

I paid $2,013.42 (including taxes) for the VIP pass for ODSC West 2019. This pass gave me access to all of the available workshops. The conference's speakers vary widely in expertise and in the polish of their presentations, and as a whole the conference leans more heavily on corporate presentations than on individual GitHub repositories and papers. Personally, attending was useful because it gave me exposure to a wide array of labs and open source software that I was not previously aware of.

