Set up a machine learning pipeline in this Kubeflow tutorial
For teams running machine learning workflows with Kubernetes, using Kubeflow can lead to faster, smoother deployments. Get started with this installation guide.
You don't have to use Kubernetes to power machine learning deployments. But if you do -- and there are many reasons why you might want to -- Kubeflow is the simplest and fastest way to get machine learning workloads up and running on Kubernetes.
Kubeflow is an open source tool that streamlines the deployment of machine learning workflows on top of Kubernetes. Kubeflow's main purpose is to simplify setting up environments for building, testing, training and operating machine learning models and applications for data science and MLOps teams.
It's possible to deploy machine learning tools such as TensorFlow and PyTorch on a Kubernetes cluster directly without using Kubeflow, but Kubeflow automates much of the process required to get these tools up and running. To decide whether it's the right choice for your machine learning projects, learn how Kubeflow works, when to use it and how to install it to deploy a machine learning pipeline.
The pros and cons of Kubernetes and Kubeflow for machine learning
Before deciding whether to use Kubeflow specifically, it's important to understand the pros and cons of running AI and machine learning workflows on Kubernetes in general.
Should you run machine learning models on Kubernetes?
As a platform for hosting machine learning workflows, Kubernetes offers several advantages.
The first is scalability. With Kubernetes, you can easily add or remove nodes from a cluster to modify the total resources available to that cluster. This is particularly beneficial for machine learning workloads, whose resource requirements can fluctuate significantly. For example, you might want to scale your cluster up during model training, which usually requires a lot of resources, then scale back down to reduce infrastructure costs after training is done.
Hosting machine learning workflows on Kubernetes also offers the advantage of providing containers access to bare-metal hardware. This is useful for accelerating the performance of your workloads using GPUs or other hardware that wouldn't be accessible on virtual infrastructure. Although you could access bare-metal infrastructure without using Kubernetes by running workloads in standalone containers, orchestrating containers with Kubernetes makes it easier to manage workloads at scale.
A major reason why you might not want to use Kubernetes to host machine learning workflows, however, is that it adds another layer of complexity to your software stack. For smaller workloads, a Kubernetes-based deployment might be overkill. In such situations, running workloads directly on VMs or bare-metal servers could make more sense.
When should you choose Kubeflow?
The chief advantage of using Kubeflow for machine learning is the tool's fast and simple deployment process. With just a few kubectl commands, you get a ready-to-use environment where you can start deploying machine learning workflows.
On the other hand, Kubeflow restricts you to the tools and frameworks it supports -- and might include some resources that you won't end up using. If you just need one or two specific machine learning tools, you might find it simpler to deploy them individually rather than with Kubeflow. But for anyone who needs a general-purpose machine learning environment on Kubernetes, it's hard to argue against using Kubeflow.
Kubeflow tutorial: Install and setup walkthrough
On most Kubernetes distributions, installing Kubeflow boils down to running just a few commands.
This tutorial demonstrates the process using K3s, a lightweight Kubernetes distribution that you can run on a laptop or PC, but you should be able to follow the same steps on any mainstream Kubernetes platform.
Step 1. Create a Kubernetes cluster
Start by creating a Kubernetes cluster if you don't already have one up and running.
To set up a cluster using K3s, first download and install K3s with the following command.
curl -sfL https://get.k3s.io | sh -
Next, run the command below to start a cluster.
sudo k3s server &
To check that everything's running as expected, run the following command.
sudo k3s kubectl get node
The output should resemble the following.
NAME            STATUS   ROLES                  AGE    VERSION
chris-gazelle   Ready    control-plane,master   2m7s   v1.25.7+k3s1
Step 2. Install Kubeflow
With your cluster up and running, the next step is to install Kubeflow.
Use the following commands to do this on a local machine using K3s.
sudo -s
export PIPELINE_VERSION=1.8.5
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/platform-agnostic-pns?ref=$PIPELINE_VERSION"
If you're installing Kubeflow on a nonlocal Kubernetes cluster, the commands below will work in most cases.
export PIPELINE_VERSION=<kfp-version-between-0.2.0-and-0.3.0>
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/base/crds?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION"
Step 3. Verify that containers are running
Even after you install Kubeflow, it isn't fully operational until all of its containers are running. Verify the status of your containers with the following command.
kubectl get pods -n kubeflow
If the containers aren't running successfully after several minutes, check their logs to determine the cause.
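For example, the following commands can help narrow down the problem. They use the kubeflow namespace from the install step above; `<pod-name>` is a placeholder you'd replace with the name of a failing pod from your own cluster.

```shell
# List any Kubeflow pods that are not yet in the Running state
kubectl get pods -n kubeflow --field-selector=status.phase!=Running

# Show recent events and container status for a problem pod
kubectl describe pod <pod-name> -n kubeflow

# Print the pod's logs to look for the underlying error
kubectl logs <pod-name> -n kubeflow
```

A common cause on small machines is pods stuck in Pending because the node lacks sufficient CPU or memory, which `kubectl describe pod` will report in its Events section.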
Step 4. Start using Kubeflow
Kubeflow provides a web-based dashboard for creating and deploying pipelines. To access the dashboard, first set up port forwarding to the dashboard service by running the command below.
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
If you're running Kubeflow locally, you can access the dashboard by opening a web browser to http://localhost:8080. If you installed Kubeflow on a remote machine, replace localhost with the IP address or hostname of the server where you're running Kubeflow.