Define deployment function

The purpose of the deployment function is to define the KServe InferenceService to which Kfops will deploy the ML model.

Make sure your config.yaml points to the file where the function is defined. Refer to the main config page for details.

The function uses the KServe SDK, which closely "mimics" the YAML-based manifests. At first, the SDK-defined function might feel more complex, but you will quickly notice that the YAML-to-Python "translation" is fast and simple.

Why not a YAML manifest? Kfops needs to inject parameters into the generated InferenceService. These parameters are not defined at the function level but are instead "managed" by Kfops.
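To illustrate the idea, the rough sketch below shows why the parameters live in the function signature: your function is imported, the managed values are filled in, and the resulting object is submitted to the cluster. This is not Kfops' actual implementation; the module path and argument values are made-up placeholders.

from kserve import KServeClient

# Hypothetical module path - use the file referenced in your config.yaml.
from my_project.deployment import inference_service_instance

# Placeholder values; in practice they are derived from the Kfops
# configuration and the deployment command.
isvc = inference_service_instance(
    name='my-model',
    storage_uri='s3://models/my-model/1',
    namespace='default',
    canary_traffic_percent=10,
)
KServeClient().create(isvc)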

Function signature

Kfops expects the function signature to have the following input parameters, content, and return value:

Input parameters

  • Inference service name - defines part of the URL under which your service will be available.

  • Storage URI - the exact location where your trained ML model is stored.

  • Namespace - the namespace where the model will be deployed.

  • Canary traffic percent - a parameter controlled by Kfops that allows it to perform canary/shadow deployments.

In other words, your function's name and input parameters should look like the following:

def inference_service_instance(name: str, storage_uri: str, namespace: str = 'default',
                               canary_traffic_percent: int = None):

Return value

The function has to return a V1beta1InferenceService object instance. Support for V1alpha has been disabled in Kubeflow v1.5, so Kfops enforces the "beta" version.
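If you want to verify this locally (not required by Kfops), one option is to call your function yourself, once it is defined (see the example function below), and assert the type of the returned object. The arguments here are placeholders:

from kserve import V1beta1InferenceService

# Placeholder arguments; any name and storage URI will do for a type check.
isvc = inference_service_instance('my-model', 's3://models/my-model/1')
assert isinstance(isvc, V1beta1InferenceService)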

Function content

Check the example below. Notice that all parameters (apart from those passed as function inputs) are "hardcoded" in the function. Specifically, these are service_account_name and resources.

The service_account_name has to match the ServiceAccount that KServe uses to access your model file in MinIO. For details, refer to the deployment page in the Administrator's guide.

Example function

The example below shows the "translation" of a simple Sklearn-based InferenceService manifest into its SDK version.

YAML manifest:

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: <NAME>
  namespace: <NAMESPACE>
spec:
  predictor:
    serviceAccountName: <SERVICE-ACCOUNT-NAME>
    canaryTrafficPercent: <CANARY-TRAFFIC-PERCENT>
    sklearn:
      storageUri: <STORAGE-URI>
      resources:
        requests:
          cpu: 0.02
          memory: 200Mi
        limits:
          cpu: 0.02
          memory: 200Mi     

KServe SDK equivalent:

from kubernetes import client
from kserve import constants
from kserve import V1beta1InferenceService
from kserve import V1beta1InferenceServiceSpec
from kserve import V1beta1PredictorSpec
from kserve import V1beta1SKLearnSpec


def inference_service_instance(name: str, storage_uri: str, namespace: str = 'default',
                               canary_traffic_percent: int = None):
    isvc = V1beta1InferenceService(
        api_version=constants.KSERVE_GROUP + '/' + 'v1beta1',
        kind=constants.KSERVE_KIND,
        metadata=client.V1ObjectMeta(
            name=name,
            namespace=namespace),
        spec=V1beta1InferenceServiceSpec(
            predictor=V1beta1PredictorSpec(
                # Has to match the ServiceAccount used to access the model storage.
                service_account_name='deployer',
                canary_traffic_percent=canary_traffic_percent,
                sklearn=V1beta1SKLearnSpec(
                    storage_uri=storage_uri,
                    resources=client.V1ResourceRequirements(
                        limits={'cpu': 0.02, 'memory': '200Mi'},
                        requests={'cpu': 0.02, 'memory': '200Mi'})))))

    return isvc
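As an optional local sanity check (outside of Kfops), you can call the function yourself and serialize the result back into a manifest-like dict to compare it with the YAML manifest above. The name and storage URI below are made-up placeholders:

from kubernetes import client

isvc = inference_service_instance(
    name='my-model',
    storage_uri='s3://models/my-model/1',
    namespace='default',
    canary_traffic_percent=10,
)

# Convert the SDK object back into a plain dict with manifest-style keys.
print(client.ApiClient().sanitize_for_serialization(isvc))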