GCP GKE Prerequisites¶
This guide covers all GCP-specific prerequisites for deploying the e6data Kubernetes Operator on Google Kubernetes Engine (GKE).
Quick Reference¶
| Requirement | Status | Notes |
|---|---|---|
| GKE 1.24+ | Required | Kubernetes cluster |
| Workload Identity | Required | For IAM authentication |
| GCS Bucket | Required | Data lake storage |
| IAM Roles | Required | Least-privilege access |
| Karpenter 1.0+ | Recommended | Dynamic node provisioning |
| BigQuery | Optional | If using BigQuery metastore |
1. Workload Identity Setup¶
Workload Identity lets Kubernetes ServiceAccounts act as Google service accounts, and is the recommended method for GKE workloads to access Google Cloud APIs.
1.1 Enable Workload Identity on Cluster¶
# For new cluster
gcloud container clusters create YOUR_CLUSTER \
--workload-pool=PROJECT_ID.svc.id.goog \
--region=us-central1
# For existing cluster
gcloud container clusters update YOUR_CLUSTER \
--workload-pool=PROJECT_ID.svc.id.goog \
--region=us-central1
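To confirm the setting took effect, check the cluster's workload pool:
# Verify the workload pool is configured
gcloud container clusters describe YOUR_CLUSTER \
  --region=us-central1 \
  --format="value(workloadIdentityConfig.workloadPool)"
# Expected output: PROJECT_ID.svc.id.goog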
1.2 Enable Workload Identity on Node Pool¶
# For new node pool
gcloud container node-pools create e6data-pool \
--cluster=YOUR_CLUSTER \
--workload-metadata=GKE_METADATA \
--region=us-central1
# For existing node pool
gcloud container node-pools update e6data-pool \
--cluster=YOUR_CLUSTER \
--workload-metadata=GKE_METADATA \
--region=us-central1
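Similarly, you can confirm the node pool exposes the GKE metadata server (field path assumed from the standard node pool schema):
# Verify workload metadata mode on the node pool
gcloud container node-pools describe e6data-pool \
  --cluster=YOUR_CLUSTER \
  --region=us-central1 \
  --format="value(config.workloadMetadataConfig.mode)"
# Expected output: GKE_METADATA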
1.3 Create Google Service Account¶
# Create service account
gcloud iam service-accounts create e6data-workspace \
--display-name="e6data Workspace Service Account" \
--project=PROJECT_ID
# Get service account email
SA_EMAIL="e6data-workspace@PROJECT_ID.iam.gserviceaccount.com"
echo "Service Account: $SA_EMAIL"
1.4 Bind Kubernetes SA to Google SA¶
# Create IAM binding for workload identity
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:PROJECT_ID.svc.id.goog[workspace-prod/analytics-prod]" \
--project=PROJECT_ID
For multiple namespaces:
# Bind for each workspace namespace
for NS in workspace-prod workspace-staging workspace-dev; do
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:PROJECT_ID.svc.id.goog[$NS/analytics-$NS]" \
--project=PROJECT_ID
done
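To review the bindings that were created:
# List workload identity bindings on the Google SA
gcloud iam service-accounts get-iam-policy $SA_EMAIL --project=PROJECT_ID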
1.5 Annotate Kubernetes ServiceAccount¶
apiVersion: v1
kind: ServiceAccount
metadata:
  name: analytics-prod
  namespace: workspace-prod
  annotations:
    iam.gke.io/gcp-service-account: "e6data-workspace@PROJECT_ID.iam.gserviceaccount.com"
Apply with kubectl:
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: analytics-prod
  namespace: workspace-prod
  annotations:
    iam.gke.io/gcp-service-account: "e6data-workspace@PROJECT_ID.iam.gserviceaccount.com"
EOF
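Confirm the annotation is in place:
# Check the annotation on the Kubernetes SA
kubectl get serviceaccount analytics-prod -n workspace-prod \
  -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}'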
2. IAM Roles (Least Privilege)¶
2.1 GCS Access (Required)¶
Create a custom role for GCS access:
# Create custom role definition
cat > e6data-gcs-role.yaml <<EOF
title: "e6data GCS Access"
description: "Least privilege GCS access for e6data workloads"
stage: "GA"
includedPermissions:
- storage.buckets.get
- storage.objects.get
- storage.objects.list
- storage.objects.create
- storage.objects.delete
EOF
# Create the custom role
gcloud iam roles create e6dataGcsAccess \
--project=PROJECT_ID \
--file=e6data-gcs-role.yaml
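The custom role must then be granted on the bucket; note that project-level custom roles are referenced in the projects/.../roles/... form:
# Grant the custom role on the data bucket
gcloud storage buckets add-iam-policy-binding gs://YOUR-DATA-BUCKET \
  --member="serviceAccount:$SA_EMAIL" \
  --role="projects/PROJECT_ID/roles/e6dataGcsAccess"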
Or use predefined roles:
# Read access to data bucket
gcloud storage buckets add-iam-policy-binding gs://YOUR-DATA-BUCKET \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/storage.objectViewer"
# Write access to cache/metadata prefixes
gcloud storage buckets add-iam-policy-binding gs://YOUR-DATA-BUCKET \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/storage.objectUser" \
--condition='expression=resource.name.startsWith("projects/_/buckets/YOUR-DATA-BUCKET/objects/e6data-cache/") || resource.name.startsWith("projects/_/buckets/YOUR-DATA-BUCKET/objects/e6data-metadata/"),title=e6data-write-prefix'
2.2 GCS IAM Policy JSON (For Terraform/Infrastructure as Code)¶
Create e6data-gcs-policy.json:
{
  "version": 3,
  "bindings": [
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "serviceAccount:e6data-workspace@PROJECT_ID.iam.gserviceaccount.com"
      ],
      "condition": {
        "title": "e6data-read-access",
        "description": "Read access to data lake bucket",
        "expression": "resource.name.startsWith('projects/_/buckets/YOUR-DATA-BUCKET')"
      }
    },
    {
      "role": "roles/storage.objectAdmin",
      "members": [
        "serviceAccount:e6data-workspace@PROJECT_ID.iam.gserviceaccount.com"
      ],
      "condition": {
        "title": "e6data-write-access",
        "description": "Write access to e6data cache and metadata",
        "expression": "resource.name.startsWith('projects/_/buckets/YOUR-DATA-BUCKET/objects/e6data-cache/') || resource.name.startsWith('projects/_/buckets/YOUR-DATA-BUCKET/objects/e6data-metadata/')"
      }
    }
  ]
}
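Note that "version": 3 is required for policies containing conditions, and conditional bindings require uniform bucket-level access on the bucket. One way to apply the file (setting a full policy replaces existing bindings, so merge with the bucket's current policy first):
# Replace the bucket's IAM policy with the file contents
gcloud storage buckets set-iam-policy gs://YOUR-DATA-BUCKET e6data-gcs-policy.json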
2.3 BigQuery Access (If Using BigQuery Metastore)¶
# Grant BigQuery Data Viewer role
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/bigquery.dataViewer"
# Grant BigQuery Job User (for queries)
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/bigquery.jobUser"
Restrict to specific datasets (recommended):
# Grant access to specific dataset only
bq add-iam-policy-binding \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/bigquery.dataViewer" \
PROJECT_ID:analytics_dataset
bq add-iam-policy-binding \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/bigquery.dataViewer" \
PROJECT_ID:sales_dataset
2.4 GreptimeDB GCS Policy¶
GreptimeDB needs write access to its own bucket:
# Create separate service account for GreptimeDB
gcloud iam service-accounts create e6data-greptime \
--display-name="e6data GreptimeDB Service Account" \
--project=PROJECT_ID
GREPTIME_SA="e6data-greptime@PROJECT_ID.iam.gserviceaccount.com"
# Grant full access to GreptimeDB bucket
gcloud storage buckets add-iam-policy-binding gs://YOUR-GREPTIME-BUCKET \
--member="serviceAccount:$GREPTIME_SA" \
--role="roles/storage.objectAdmin"
# Bind to Kubernetes SA
gcloud iam service-accounts add-iam-policy-binding $GREPTIME_SA \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:PROJECT_ID.svc.id.goog[greptime-system/greptime-sa]" \
--project=PROJECT_ID
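As with the workspace SA, the GreptimeDB Kubernetes ServiceAccount needs the matching annotation; a minimal sketch, assuming the greptime-sa ServiceAccount in greptime-system from the binding above already exists:
# Annotate the GreptimeDB Kubernetes SA
kubectl annotate serviceaccount greptime-sa \
  --namespace greptime-system \
  iam.gke.io/gcp-service-account=e6data-greptime@PROJECT_ID.iam.gserviceaccount.com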
2.5 Complete IAM Setup Script¶
#!/bin/bash
set -e
PROJECT_ID="your-project-id"
CLUSTER_NAME="your-cluster"
DATA_BUCKET="your-data-bucket"
GREPTIME_BUCKET="your-greptime-bucket"
WORKSPACE_NS="workspace-prod"
WORKSPACE_SA="analytics-prod"
# Create workspace service account
gcloud iam service-accounts create e6data-workspace \
--display-name="e6data Workspace" \
--project=$PROJECT_ID
SA_EMAIL="e6data-workspace@$PROJECT_ID.iam.gserviceaccount.com"
# Grant GCS read access
gcloud storage buckets add-iam-policy-binding gs://$DATA_BUCKET \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/storage.objectViewer"
# Grant GCS write access (with prefix condition)
gcloud storage buckets add-iam-policy-binding gs://$DATA_BUCKET \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/storage.objectCreator" \
--condition="expression=resource.name.startsWith('projects/_/buckets/$DATA_BUCKET/objects/e6data-'),title=e6data-write"
# Bind to Kubernetes SA
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:$PROJECT_ID.svc.id.goog[$WORKSPACE_NS/$WORKSPACE_SA]" \
--project=$PROJECT_ID
echo "Setup complete. Annotate your Kubernetes SA with:"
echo " iam.gke.io/gcp-service-account: $SA_EMAIL"
3. Karpenter Setup (Recommended)¶
Karpenter can run on GKE to provide dynamic, workload-driven node provisioning.
3.1 Install Karpenter on GKE¶
export KARPENTER_VERSION="1.0.0"
export PROJECT_ID="your-project-id"
export CLUSTER_NAME="your-cluster"
export CLUSTER_REGION="us-central1"
# Add Karpenter Helm repo
helm repo add karpenter https://charts.karpenter.sh
helm repo update
# Install Karpenter
helm upgrade --install karpenter karpenter/karpenter \
--namespace karpenter --create-namespace \
--version ${KARPENTER_VERSION} \
--set settings.gcp.projectID=${PROJECT_ID} \
--set settings.gcp.clusterName=${CLUSTER_NAME} \
--set settings.gcp.clusterLocation=${CLUSTER_REGION} \
--wait
3.2 NodePool for e6data (T2A ARM64 - Recommended)¶
GCP offers Tau T2A instances (Ampere Altra ARM64 processors) for cost-effective compute:
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: e6data-compute
spec:
  template:
    metadata:
      labels:
        e6data.io/node-type: compute
    spec:
      requirements:
        # ARM64 (T2A) for cost savings
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        # Use machine families, not specific types
        - key: karpenter.gcp.sh/machine-family
          operator: In
          values:
            - t2a # ARM64 (Ampere Altra) - best price/performance; for an x86 fallback, see the AMD64 pool in 3.4
        - key: karpenter.gcp.sh/machine-size
          operator: In
          values:
            - standard-16
            - standard-32
            - standard-48
      # Taints for workload isolation
      taints:
        - key: e6data-workspace-name
          value: "prod"
          effect: NoSchedule
      # Node expiry
      expireAfter: 720h # 30 days
      nodeClassRef:
        group: karpenter.gcp.sh
        kind: GCPNodeClass
        name: e6data-arm64
  # Resource limits
  limits:
    cpu: 2000
    memory: 8000Gi
  # Disruption settings
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m
    budgets:
      - nodes: "10%"
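Because the pool is tainted, only pods that tolerate the taint will be scheduled there. A minimal pod-spec fragment matching the taint and label defined above:
# Pod spec fragment: target the pool's label and tolerate its taint
nodeSelector:
  e6data.io/node-type: compute
tolerations:
  - key: e6data-workspace-name
    operator: Equal
    value: "prod"
    effect: NoSchedule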
3.3 GCPNodeClass¶
apiVersion: karpenter.gcp.sh/v1
kind: GCPNodeClass
metadata:
  name: e6data-arm64
spec:
  # Machine image
  imageFamily: cos-arm64-stable
  # Service account for nodes
  serviceAccount: e6data-nodes@PROJECT_ID.iam.gserviceaccount.com
  # Network configuration
  subnetwork: projects/PROJECT_ID/regions/us-central1/subnetworks/default
  # Boot disk
  bootDisk:
    sizeGb: 100
    type: pd-balanced
    encryption:
      kmsKeyName: "" # Optional: specify KMS key for encryption
  # Local SSD for caching (optional)
  localSsds:
    count: 1
    interface: NVME
  # Metadata
  metadata:
    disable-legacy-endpoints: "true"
  # Labels
  labels:
    environment: production
    team: data-platform
    managed-by: karpenter
  # Tags for firewall rules
  tags:
    - e6data-nodes
    - allow-health-checks
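The node service account referenced in the GCPNodeClass is not created elsewhere in this guide; a sketch of creating it with the minimal roles GKE nodes typically need for logging, metrics, and image pulls (the SA name is taken from the manifest above):
# Create the node service account
gcloud iam service-accounts create e6data-nodes \
  --display-name="e6data Karpenter Nodes" \
  --project=PROJECT_ID
NODE_SA="e6data-nodes@PROJECT_ID.iam.gserviceaccount.com"
# Minimal roles for node logging, monitoring, and pulling images
for ROLE in roles/logging.logWriter roles/monitoring.metricWriter roles/monitoring.viewer roles/artifactregistry.reader; do
  gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:$NODE_SA" \
    --role="$ROLE"
done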
3.4 NodePool for AMD64/Intel (If ARM64 Not Suitable)¶
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: e6data-compute-amd64
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.gcp.sh/machine-family
          operator: In
          values:
            - n2 # Intel Cascade Lake
            - n2d # AMD EPYC Rome
            - c2 # Intel Cascade Lake (compute-optimized)
            - c2d # AMD EPYC Milan (compute-optimized)
        - key: karpenter.gcp.sh/machine-size
          operator: In
          values:
            - standard-16
            - standard-32
            - highmem-16
            - highmem-32
      taints:
        - key: e6data-workspace-name
          value: "prod"
          effect: NoSchedule
      nodeClassRef:
        group: karpenter.gcp.sh
        kind: GCPNodeClass
        name: e6data-amd64
  limits:
    cpu: 1000
    memory: 4000Gi
4. Alternative: GKE Autopilot¶
GKE Autopilot is a fully managed Kubernetes mode that automatically manages nodes.
4.1 Create Autopilot Cluster¶
gcloud container clusters create-auto YOUR_CLUSTER \
--region=us-central1 \
--workload-pool=PROJECT_ID.svc.id.goog
4.2 Autopilot Considerations¶
| Feature | Autopilot | Standard + Karpenter |
|---|---|---|
| Node management | Automatic | Self-managed |
| ARM64 support | Yes (T2A) | Yes (T2A) |
| Local SSD | Limited | Full control |
| Spot/Preemptible | Yes | Yes |
| Cost | Pay per pod | Pay per node |
| Customization | Limited | Full |
Recommendation: Use Autopilot for simpler deployments, and Standard + Karpenter when you need full control over node configuration.
5. Verification¶
5.1 Test Workload Identity¶
# Create test pod (kubectl 1.24+ removed the --serviceaccount flag; use an override)
kubectl run test-wi --rm -it --restart=Never \
  --namespace=workspace-prod \
  --image=google/cloud-sdk:slim \
  --overrides='{"spec":{"serviceAccountName":"analytics-prod"}}' \
  -- gcloud auth list
# Expected: Shows the bound Google service account identity
5.2 Test GCS Access¶
kubectl run test-gcs --rm -it --restart=Never \
  --namespace=workspace-prod \
  --image=google/cloud-sdk:slim \
  --overrides='{"spec":{"serviceAccountName":"analytics-prod"}}' \
  -- gsutil ls gs://YOUR-DATA-BUCKET/ | head -5
5.3 Test BigQuery Access (If Applicable)¶
kubectl run test-bq --rm -it --restart=Never \
  --namespace=workspace-prod \
  --image=google/cloud-sdk:slim \
  --overrides='{"spec":{"serviceAccountName":"analytics-prod"}}' \
  -- bq ls
5.4 Verify Karpenter¶
# Check Karpenter pods
kubectl get pods -n karpenter
# Check NodePools
kubectl get nodepools
# Check GCPNodeClasses
kubectl get gcpnodeclasses
# Watch provisioned nodes
kubectl get nodes -l karpenter.sh/nodepool=e6data-compute -w
6. Best Practices¶
6.1 Security¶
- Workload Identity: Always use Workload Identity (never use node SA)
- Least privilege: Only grant required buckets and BigQuery datasets
- Service account separation: Use different SAs for workspace vs GreptimeDB
- VPC Service Controls: Consider using VPC-SC for sensitive data
6.2 Cost Optimization¶
- ARM64 (T2A): 20-40% cheaper than comparable x86 instances
- Spot VMs: Use for fault-tolerant workloads (see the sketch after this list)
- Machine families: Let Karpenter choose optimal size within family
- Sustained use discounts: Automatic for long-running workloads
- Committed use discounts: For predictable workloads
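Spot usage can be enabled per NodePool through Karpenter's core capacity-type requirement; a minimal sketch, assuming the GCP provider honors the core karpenter.sh/capacity-type label:
# Extra entry for the NodePool requirements list (section 3.2)
- key: karpenter.sh/capacity-type
  operator: In
  values: ["spot", "on-demand"] # allows Spot capacity, falling back to on-demand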
6.3 Performance¶
- Machine size: Use standard-16 or larger for query executors
- Memory-optimized: Use highmem machines for large datasets
- Local SSD: Use for caching (n2-standard-* with local SSD)
- Regional clusters: Deploy in same region as data