How to Deploy Machine Learning on Google Cloud Platform
Editor’s Note: Because our bloggers have lots of useful tips, every now and then we update and bring forward a popular post from the past. Today’s post was originally published on August 15, 2019.
In this post, I'll describe a few takeaways for deploying or submitting machine learning (ML) tasks on Google Cloud Platform (GCP). If you're relatively new to ML engineering, or if you're a solution architect, you should find some useful tips here.
What exactly is an ML task? Before building an ML model, you first need to specify what you're planning to accomplish with the data. Having this in mind helps you to identify what the exact ML tasks are for your use case. Some broad ML tasks could be: data ingestion, feature transformation, supervised learning, semi-supervised learning, unsupervised learning, dimensionality reduction, active learning, reinforcement learning and model prediction. This is too much material to cover in one blog post, so expect another post to cover all those tasks. In the meantime, you can find a general overview on Wikipedia.
There are many options when deploying or submitting ML tasks on GCP:
1. Offline tasks that take a long time using a less flexible environment:
Kubernetes, Dataproc, AI Platform Default
In cases like these, there's less flexibility to change the environment to suit your specific needs. For instance, AI Platform Default (formerly ML Engine) supports only Python, and Dataproc requires you to deploy a Hadoop cluster environment.
2. Offline tasks that take a long time using a flexible environment:
AI Platform Custom Training and Cloud Build
In this approach, you can use the isolation provided by Docker containers to define the environment requirements. Both options involve building a container image and pushing it to a container registry in the cloud.
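The build, push and submit steps can be sketched as a small helper that assembles the commands without running them. This is only an illustration under stated assumptions: the project, image, job and region names are hypothetical, the image is assumed to live under a `gcr.io` Container Registry path, and the job is assumed to be submitted with `gcloud ai-platform jobs submit training` and its `--master-image-uri` flag.

```python
import shlex


def container_training_commands(project, image_name, tag, job_name, region):
    """Return the build, push and submit commands as argv lists (nothing is executed)."""
    # Assumption: the image is stored in Container Registry under gcr.io/<project>/.
    image_uri = f"gcr.io/{project}/{image_name}:{tag}"
    return [
        # Build the image from the Dockerfile in the current directory.
        ["docker", "build", "-t", image_uri, "."],
        # Push the image so the training service can pull it.
        ["docker", "push", image_uri],
        # Submit a custom-container training job that runs the pushed image.
        ["gcloud", "ai-platform", "jobs", "submit", "training", job_name,
         "--region", region, "--master-image-uri", image_uri],
    ]


# Hypothetical names, used only for illustration.
commands = container_training_commands(
    "my-project", "trainer", "v1", "train_job_001", "us-central1"
)
for argv in commands:
    print(shlex.join(argv))
```

Returning argv lists rather than running the commands keeps the sketch free of side effects; each list could be passed to `subprocess.run(argv, check=True)` once the names are replaced with real ones.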
3. Online tasks that take a few minutes to run using a flexible environment:
Cloud Run (Knative), App Engine
These two services are serverless, so you only need to worry about your task code. They support several programming languages and they are elastic, scaling from zero: while the endpoint is not receiving requests, you incur no costs. However, you may run into limitations in scalability, performance and security.
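To make "you only need to worry about your task code" more concrete, here is a minimal sketch of a prediction endpoint of the kind that could be packaged for Cloud Run. It assumes the container is expected to listen on the port given in the `PORT` environment variable. The `predict` function is a hypothetical stand-in for a real model, and the request format (`{"features": [...]}`) is invented for this example.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a real model: returns the mean of the inputs."""
    return sum(features) / len(features)


def handle_payload(raw):
    """Parse a JSON request body and return (status_code, response_dict)."""
    try:
        features = json.loads(raw)["features"]
        return 200, {"prediction": predict(features)}
    except (ValueError, KeyError, TypeError, ZeroDivisionError) as exc:
        # Malformed JSON, a missing key, or unusable feature values.
        return 400, {"error": str(exc)}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_payload(self.rfile.read(length))
        body = json.dumps(response).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main():
    # Assumption: the platform injects PORT; fall back to 8080 for local runs.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

The server only starts when `main()` is called, so the container's start command would need to invoke it. Keeping the request logic in `handle_payload` means it can be tested without opening a socket.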
4. Online tasks that take seconds to run using a less flexible environment:
This is a serverless approach, used to run small, lambda-style functions.
Cloud Composer (Airflow), Cloud Scheduler
In this case, we need a scheduling system to run our ML task. Cloud Composer uses Celery for that, and Cloud Scheduler uses cron jobs.
6. Online tasks triggered by events that take any amount of time:
Cloud Build (through Triggers), Cloud Tasks, Airflow (REST API call to trigger a DAG)
Cloud Build is not designed for ML tasks, but it can be used to run any kind of event-based automated task. Cloud Tasks implements an asynchronous task queue that dispatches tasks to HTTP or App Engine handlers, which can in turn publish messages to a Pub/Sub topic. After that, we can consume those messages and run different kinds of tasks.
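For the Airflow option, triggering a DAG through a REST call could look roughly like the sketch below. It assumes an Airflow 2.x deployment exposing the stable REST API (`POST /api/v1/dags/{dag_id}/dagRuns`) and bearer-token authentication. The host, DAG id, token and `conf` payload are hypothetical, and authentication in particular differs between deployments (Cloud Composer, for example, sits behind Google's identity layer). The request is only built here, not sent.

```python
import json
import urllib.request


def build_dag_trigger_request(base_url, dag_id, conf, token):
    """Build (but do not send) a POST request that asks Airflow to start a DAG run."""
    # Assumption: Airflow 2.x stable REST API endpoint for creating DAG runs.
    url = f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    body = json.dumps({"conf": conf}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; replace with whatever your deployment uses.
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical host, DAG id, payload and token, used only for illustration.
request = build_dag_trigger_request(
    "https://airflow.example.com", "train_model", {"dataset": "2019-08-15"}, "TOKEN"
)
print(request.get_method(), request.full_url)
# To actually trigger the run: urllib.request.urlopen(request)
```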
7. Online tasks triggered by events through a REST API call that take a short time:
Cloud Run (Knative), REST API Kubernetes app, AI Platform Custom "Model Serving", Cloud Functions, App Engine
In this situation, you have plenty of options to implement the ML task. Choosing between the available options will depend on other requirements such as security, performance and costs.
8. Manually triggered orchestrated tasks that take any amount of time:
Cloud Build (through Triggers), Cloud Composer (Airflow), Kubeflow
Here, you are implementing an orchestrated workflow that performs tasks in sequence, depending on conditions. Each task may take a different amount of time to finish. A modern ML workflow or pipeline usually involves running containers in a container orchestrator environment. On GCP it can be Kubernetes or AI Platform. Note that AI Platform (formerly ML Engine) now offers custom containers for training and serving, which is the only option where you don't have to worry about scalable cluster management or the runtime environment (e.g. Java, Python, Go).
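The idea of "tasks in sequence, depending on conditions" can be illustrated with a toy, in-process sketch. This is not how Cloud Composer or Kubeflow are implemented. It only shows the control flow an orchestrator provides. The step names, thresholds and values are made up for the example.

```python
def run_pipeline(steps, context):
    """Run steps in order, skipping any step whose condition returns False."""
    executed = []
    for name, condition, task in steps:
        if condition(context):
            # Each task receives the current context and returns an updated copy.
            context = task(context)
            executed.append(name)
    return context, executed


# Hypothetical steps: (name, condition, task). All values are placeholders.
steps = [
    ("ingest", lambda ctx: True, lambda ctx: {**ctx, "rows": 1000}),
    ("train", lambda ctx: ctx["rows"] >= 500, lambda ctx: {**ctx, "accuracy": 0.91}),
    ("deploy", lambda ctx: ctx.get("accuracy", 0.0) >= 0.95, lambda ctx: {**ctx, "deployed": True}),
]

final_context, executed = run_pipeline(steps, {})
print(executed)  # ['ingest', 'train'] because the deploy condition is not met
```

In a real orchestrator each task would typically be a container run rather than a Python lambda, and the context would be passed through artifacts or metadata instead of an in-memory dictionary.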