The diagram below depicts the target topology for GCP.
Requirements for GCP:
  • Two GCS storage buckets for gazette and druid data.
  • A GKE cluster with a minimum of two node pools: base pool and druid pool.
  • The base node pool should be labeled with arize=true and arize-base=true.
  • The druid node pool should be labeled with arize=true and druid-historical=true.
  • A GCP service account attached to a role with the following permissions:
    • bigquery.jobs.create
    • storage.objects.create
    • storage.objects.delete
    • storage.objects.get
    • storage.objects.list
    • artifactregistry.repositories.downloadArtifacts
    • aiplatform.endpoints.predict
  • If using Workload Identity (default):
    • The GCP service account must grant the Workload Identity User role (roles/iam.workloadIdentityUser) to these GKE namespace/service-account pairs:
      • arize/arize
      • arize-operator/arize-operator
      • arize-spark/spark
  • If not using Workload Identity:
    • A JSON key from the GCP service account is required
  • Storage classes premium-rwo and standard-rwo are preferred and used by default.
  • A GCR or Docker registry is optional, because Arize pulls images from Arize AX’s central image registry by default.
  • Namespaces arize, arize-operator, and arize-spark can be pre-existing or created later by the Helm chart.
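The Workload Identity requirement above can be sketched as a small helper that prints, without running, one gcloud binding command per namespace/service-account pair. This is a minimal sketch, not an official Arize script: the function name print_wi_bindings and the example project and service account values are hypothetical, and the commands should be reviewed before being executed.

```shell
# Sketch: print (do not execute) the gcloud commands that grant
# roles/iam.workloadIdentityUser to each GKE namespace/service-account pair.
# The project ID and service account email passed in are placeholders (assumptions).
print_wi_bindings() {
  wi_project_id="$1"   # GCP project that hosts the GKE cluster
  wi_gsa_email="$2"    # email of the GCP service account
  for wi_pair in "arize/arize" "arize-operator/arize-operator" "arize-spark/spark"; do
    # Workload Identity members use the form
    # serviceAccount:<project-id>.svc.id.goog[<namespace>/<k8s-service-account>]
    echo "gcloud iam service-accounts add-iam-policy-binding ${wi_gsa_email}" \
         "--project ${wi_project_id}" \
         "--role roles/iam.workloadIdentityUser" \
         "--member \"serviceAccount:${wi_project_id}.svc.id.goog[${wi_pair}]\""
  done
}

# Example with hypothetical values:
print_wi_bindings "my-project" "arize@my-project.iam.gserviceaccount.com"
```

Printing the commands first makes it easy to confirm the project and service account before any IAM policy is changed.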
Contact Arize for the value of the <sizing option> field (clusterSizing). This field controls the size of the deployment and must match the size of the cluster. Common values are small1b and medium2b.
values.yaml:
cloud: "gcp"
clusterName: "gke_<project-id>_<region>_<cluster-name>"
hubJwt: "<JWT>"  # base64 encoded
gazetteBucket: "<name of gazette bucket>"
druidBucket: "<name of druid bucket>"
postgresPassword: "<user selected postgres password>"  # base64 encoded
organizationName: "<name of the organization or company>"
cipherKey: "<encryption key>"  # base64 encoded
clusterSizing: "<sizing option>"
gcpProject: "<project-id>"
gcpServiceAccountName: "<name of the service account email>"  # base64 encoded
collectNodeMetrics: true

# The URL used to reach the Arize UI once ingress endpoints are created
appBaseUrl: "https://<arize-app.domain>"

# Only required if using a JSON key instead of GKE Workload Identity
gcpWorkloadIdentityEnabled: false
gcpServiceAccountJsonKey: "<json key file contents>"  # base64 encoded

# Only required if using a private docker registry
pushRegistry: "gcr.io/<project-id>"
pullRegistry: "gcr.io/<project-id>"

# Only required if using a single common node pool instead of a dedicated druid node pool
historicalNodePoolEnabled: false
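Several fields above are marked as base64 encoded. A minimal sketch of producing such values follows, assuming the chart expects standard base64 of the raw string with no trailing newline (an assumption, not confirmed by the text above). The helper name encode_b64 and the sample value are hypothetical.

```shell
# Sketch: base64-encode a value for the fields marked "base64 encoded".
# Assumption: standard base64 of the raw string, without a trailing newline.
encode_b64() {
  # printf '%s' avoids the trailing newline that echo would add to the input;
  # tr removes the line wrapping that some base64 implementations insert.
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example with a hypothetical password value:
encode_b64 'example-postgres-password'
echo
```

For the JSON key, the same pipeline can read the file contents instead, for example base64 < key.json | tr -d '\n', where key.json is a hypothetical file name.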
Choose an installation approach based on the deployment. For Helm:
$ helm upgrade --install -f values.yaml arize-op arize-operator-chart.tgz
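Before running the install, it can help to confirm that no <placeholder> token from the template is still present in values.yaml. The check below is a hypothetical helper, not part of the Arize tooling. It assumes that real values never contain text wrapped in angle brackets.

```shell
# Sketch: fail early if a values file still contains <placeholder> tokens.
# check_placeholders is a hypothetical helper name.
check_placeholders() {
  if [ ! -f "$1" ]; then
    echo "File not found: $1" >&2
    return 2
  fi
  # grep -n prints each matching line with its line number;
  # an exit status of 0 means at least one placeholder was found.
  if grep -n '<[^>]*>' "$1"; then
    echo "Unfilled placeholders found in $1" >&2
    return 1
  fi
  return 0
}
```

One way to use it is to chain the check in front of the install: check_placeholders values.yaml && helm upgrade --install -f values.yaml arize-op arize-operator-chart.tgz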