Friday, November 13 • 1:45pm - 2:15pm
Automate the boring ML stuff with pipelines

Production machine learning systems fail in unexpected ways. The data shifts and accuracy drops, or the preprocessing steps don’t match what the model expects. A model can look great on one metric and terrible on another. Tired of writing boring custom code to fix these problems? You need an automated way to stop these potential failures before they happen.

In this talk, we’ll describe and demo automated machine learning pipelines using TensorFlow Extended and Kubeflow Pipelines. These pipelines include steps to validate the data that flows into the pipeline, preprocess the data, kick off a training run, analyze the model in depth, and push the final model to its serving location. All of the steps are orchestrated using Kubeflow Pipelines, which lets you schedule a new pipeline run and makes sure the components are completed in the correct order.

We’ll show an example project that sets up a simple ML pipeline for producing consistent ML models. Using public data, we describe how these automated machine learning pipelines solve the problem of a mismatch between feature engineering and model training. We’ll also show how we can analyze our models in depth to ensure they provide a fair experience to all users.
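The ordering guarantee mentioned in the abstract can be illustrated with a minimal sketch in plain Python. This is not the TensorFlow Extended or Kubeflow Pipelines API; the step names, `PIPELINE`, `execution_order`, and `run_pipeline` are hypothetical and only mirror the five steps listed in the abstract. The sketch assumes each step declares which steps it depends on, and uses the standard library's `graphlib` (Python 3.9+) to derive a valid run order.

```python
from graphlib import TopologicalSorter

# Hypothetical step names mirroring the abstract.
# Each step maps to the set of steps that must finish before it starts.
PIPELINE = {
    "validate_data": set(),
    "preprocess_data": {"validate_data"},
    "train_model": {"preprocess_data"},
    "analyze_model": {"train_model"},
    "push_model": {"analyze_model"},
}


def execution_order(dependencies):
    """Return step names in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def run_pipeline(dependencies, steps):
    """Call each step in dependency order and return the names as they ran."""
    completed = []
    for name in execution_order(dependencies):
        steps[name]()  # a real step would validate, transform, train, etc.
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder callables standing in for real pipeline components.
    steps = {name: (lambda n=name: print(f"running {n}")) for name in PIPELINE}
    run_pipeline(PIPELINE, steps)
```

Because the steps form a single chain, only one order is valid; an orchestrator applies the same idea to larger graphs where some components can run in parallel.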

Speakers
Hannes Hapke

Senior Machine Learning Engineer, SAP
Hannes Hapke is a senior data scientist for Concur Labs at SAP Concur, where he explores innovative ways to use machine learning to improve the experience of a business traveler. Prior to joining SAP Concur, Hannes solved machine learning infrastructure problems in various industries...
Catherine Nelson

Data Scientist, Freelance
Catherine Nelson is a freelance data scientist and writer. She is currently working on the forthcoming O’Reilly book “Software Engineering for Data Scientists”. Previously, she was a Principal Data Scientist at SAP Concur, where she delivered production machine learning applications...


Friday November 13, 2020 1:45pm - 2:15pm PST
data