ML in Production - Capstone

This repo accompanies the Machine Learning in Production course. The code is written for educational purposes. As such, it is neither real production code nor a toy example that is easy to understand but useless. We tried to make it as similar as possible to real production systems, emphasizing some parts and omitting others to keep it readable.

2020's Edition

In the 2020 edition we train a model to tag Stack Overflow questions. The data is publicly available here. In short:

  • We build a pipeline in Airflow to preprocess the data in Google BigQuery.
  • We create Python packages, with their corresponding tests, to preprocess text, train a model, and make predictions with it.
  • We create Dockerfiles for images that run a Flask app serving the model.
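
To give a feel for the second step, here is a minimal sketch of a text-preprocessing function together with its test. It is not the repo's actual code: the names `preprocess` and `test_preprocess_keeps_language_names`, and the tokenization rules, are assumptions made for illustration only.

```python
import html
import re
from typing import List

# Hypothetical patterns, chosen for illustration (not taken from the repo).
_TAG_RE = re.compile(r"<[^>]+>")                     # HTML tags such as <p> or </code>
_TOKEN_RE = re.compile(r"[a-z0-9][a-z0-9+#.\-]*")    # keeps tokens like "c++" and "c#"


def preprocess(text: str) -> List[str]:
    """Strip HTML tags, decode entities, lowercase, and split into tokens."""
    no_tags = _TAG_RE.sub(" ", text)
    unescaped = html.unescape(no_tags)
    tokens = _TOKEN_RE.findall(unescaped.lower())
    # Drop trailing sentence punctuation that the token pattern lets through.
    return [t.rstrip(".-") for t in tokens]


def test_preprocess_keeps_language_names():
    # Language names with symbols are exactly what a tagger needs to see.
    question = "<p>How do I use C++ &amp; C#?</p>"
    assert preprocess(question) == ["how", "do", "i", "use", "c++", "c#"]
```

Assuming the function lives in an installable package, a test like this can be collected by a runner such as pytest. Keeping preprocessing in its own tested package lets training and serving share exactly the same transformation.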