Step-by-step instructions for operators to deploy the controller, enable agents, configure storage backends with credentials, and work with Datasets and DatasetClaims. Complete examples using S3-compatible object storage are also included.
Installation scripts and all CRDs are available in the deploy directory of the https://github.com/dataplatformsolutions/zero-copy-data-plane repository.
Either clone the repository or download one of the release bundles from https://github.com/dataplatformsolutions/zero-copy-data-plane/releases.
Make sure your KUBECONFIG points at the target cluster where you want to install ZCDP.
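Before running the installer, you can confirm which cluster kubectl is targeting with the standard context commands:

```shell
# Show the context kubectl will use for the install
kubectl config current-context

# Confirm the cluster is actually reachable
kubectl cluster-info
```

If the context is wrong, switch it with `kubectl config use-context <name>` before proceeding.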
Run the installer script to install the controller and webhook; it also installs cert-manager and the CRDs. Then run the health-check script to verify that everything came up:
deploy/install-all.sh
deploy/scripts/health-check.sh
Note: the Docker image versions can be configured at the top of the install-all.sh script:
ZCDP_CONTROLLER_IMAGE="ghcr.io/dataplatformsolutions/zcdp-controller:0.1.0"
ZCDP_AGENT_IMAGE="ghcr.io/dataplatformsolutions/zcdp-agent:0.1.0"
The namespaces in which ZCDP operates can be configured by editing the webhook.yaml manifest, specifically the namespaceSelector:
namespaceSelector:
  matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: In
      values:
        - test-workload
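For example, to let ZCDP handle pods in an additional namespace, extend the values list (ml-jobs below is a placeholder for your own workload namespace, not a name the project defines):

```yaml
namespaceSelector:
  matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: In
      values:
        - test-workload
        - ml-jobs   # placeholder: add your own workload namespaces here
```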
Tag nodes where you want the ZCDP agent to run with the label
zcdp.io/enable-agent=true. The agent is responsible for syncing datasets to the local cache
and managing the dataset lifecycle on each node.
This can be done using the following command:
kubectl label nodes <node-name> zcdp.io/enable-agent=true
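To confirm the label was applied and the agent is scheduling, you can list the matching nodes and the pods in the system namespace:

```shell
# Nodes the agent DaemonSet will schedule onto
kubectl get nodes -l zcdp.io/enable-agent=true

# Agent pods should appear on each labeled node shortly afterwards
kubectl -n zcdp-system get pods -o wide
```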
Once this is done, you are all set to use ZCDP. Either follow the next steps to configure a storage backend, Datasets, and DatasetClaims, or skip ahead and run the complete example that ships in the public repo.
This walkthrough demonstrates a minimal but functional deployment that syncs a public dataset from Amazon S3 and mounts it into a training job.
kubectl -n zcdp-system create secret generic s3-creds \
--from-literal=AWS_ACCESS_KEY_ID=AKIA... \
--from-literal=AWS_SECRET_ACCESS_KEY=... \
--from-literal=AWS_REGION=us-west-2
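Before applying the StorageBackend, it can be worth confirming the secret exists and carries the expected keys (describe shows key names and sizes without printing the values):

```shell
kubectl -n zcdp-system describe secret s3-creds
```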
cat <<'EOF' | kubectl apply -f -
apiVersion: zcdp.io/v1alpha1
kind: StorageBackend
metadata:
  name: public-s3-data
spec:
  type: s3
  auth:
    mode: accessKey
    accessKeyIdSecretRef:
      name: s3-creds
      key: AWS_ACCESS_KEY_ID
    secretAccessKeySecretRef:
      name: s3-creds
      key: AWS_SECRET_ACCESS_KEY
  s3:
    region: us-west-2
EOF
cat <<'EOF' | kubectl apply -f -
apiVersion: zcdp.io/v1alpha1
kind: Dataset
metadata:
  name: dataset-products-2017
spec:
  source:
    storageBackendRef: public-s3-data
    path: datasets/products/2017
  cache:
    maxSize: 80Gi
EOF
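After applying the manifests, you can watch the Dataset object while the agents sync it. The exact columns shown depend on the CRD's printer columns, so the precise output is an assumption; the full status is always available via the YAML dump:

```shell
# Watch the Dataset until the agents have synced it
kubectl get dataset dataset-products-2017 -w

# Inspect the full status for troubleshooting
kubectl get dataset dataset-products-2017 -o yaml
```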
cat <<'EOF' | kubectl apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: products-ls
  namespace: trainers
spec:
  template:
    metadata:
      annotations:
        zcdp.io/datasets: "dataset-products-2017:/mnt/datasets/products"
    spec:
      restartPolicy: Never
      containers:
        - name: ls
          image: busybox:1.36
          command:
            - sh
            - -c
            - |
              echo "Listing /mnt/datasets/products"
              ls -lah /mnt/datasets/products
EOF
Monitor the dataset sync and check the job output:
kubectl get nodedatasets -n zcdp-system
kubectl logs -n zcdp-system daemonset/zcdp-agent -f
kubectl logs -n trainers job/products-ls
Once the NodeDataset transitions to Ready, subsequent pods on the same node start instantly because the data
is already on the NVMe cache.
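If you want to block until the cache is warm (for example in CI), something like the following should work, assuming the NodeDataset objects expose a Ready condition as described above; the object names are generated by the controller, so adjust the selector to your environment:

```shell
# Wait up to 10 minutes for all NodeDatasets to report Ready
kubectl -n zcdp-system wait nodedatasets --all \
  --for=condition=Ready --timeout=10m
```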
The example/ folder in the dataplatformsolutions/zero-copy-data-plane repository spins up a local k3s
cluster, MinIO, and a sample workload that reads a dataset through ZCDP.
You will need kubectl installed locally to follow along.
The scripts assume KUBECONFIG points at
INSTALL_PATH/example/kubeconfig/kubeconfig.yaml after the cluster boots.
Ensure the scripts are executable:
chmod +x INSTALL_PATH/example/scripts/*
# 0. Change into the scripts directory
cd example/scripts
# 1. Start the infrastructure (k3s + MinIO)
./01_start_infra.sh
# 2. Install ZCDP using the /deploy/install-all.sh script.
./02_install_zcdp.sh
# 3. Tag two nodes with zcdp.io/enable-agent=true so that they can execute workloads
./03_enable_agents.sh
# 4. Upload the sample datasets to MinIO
./04_seed_minio.sh
# 5. Apply storage backend and dataset manifests
./05_apply_dataset.sh
# 6. Build the job container, push it into k3s, and apply the workload manifest
./06_build_and_deploy_job.sh
# 7. Check the output from the sample job
./07_verify_workload.sh
A good way to learn how to deploy your own workloads is to look at the 05_apply_dataset.sh script and the manifests it applies. Use these as a starting point and adapt them, manually or with an AI assistant, to generate the variations you need for your jobs.
Port-forward the status page while the example is running:
kubectl -n zcdp-system port-forward svc/zcdp-status 8080:8080
When you are done, tear down the example environment:
docker compose down -v