CRD Catalog

This page provides a quick reference to all Custom Resource Definitions (CRDs) managed by the e6data Kubernetes Operator.

CRD Overview

CRD	Short Names	Purpose	Dependencies
NamespaceConfig	`nsconfig`, `nsc`	Namespace infrastructure settings	None
MetadataServices	`mds`, `metadata`	Storage and Schema services	NamespaceConfig
QueryService	`qs`, `cluster`	Query execution cluster	NamespaceConfig, MetadataServices
TrafficInfra	`ti`	xDS + Envoy traffic routing	NamespaceConfig, QueryService
AuthGateway	`ag`	Pomerium authentication gateway	NamespaceConfig
E6Catalog	`e6cat`	External catalog registration	MetadataServices
CatalogRefresh	`cr`, `catalogref`	One-time catalog refresh	E6Catalog
CatalogRefreshSchedule	`crs`, `refreshschedule`	Scheduled catalog refresh	E6Catalog
Pool	`pool`	Shared compute for burst capacity	None
Governance	`gov`, `governance`	Data access policies	E6Catalog
GreptimeDBCluster	`gdb`, `greptime`	Time-series database	None
MonitoringServices	`ms`, `monitoring`	Logs and metrics collection	NamespaceConfig
ReleaseManager	`releases`, `rm`	Release version management	None

Dependency Graph

                    ┌─────────────────┐
                    │ NamespaceConfig │  (Foundation - deploy first)
                    │   (nsconfig)    │
                    └────────┬────────┘
                             │
         ┌───────────────────┼───────────────────┬─────────────────┐
         │                   │                   │                 │
         ▼                   ▼                   ▼                 ▼
  ┌─────────────────┐ ┌─────────────────┐ ┌───────────────┐ ┌─────────────┐
  │ MetadataServices│ │MonitoringServices│ │     Pool      │ │ AuthGateway │
  │     (mds)       │ │      (ms)        │ │    (pool)     │ │    (ag)     │
  └────────┬────────┘ └──────────────────┘ └───────────────┘ └─────────────┘
           │                                                        │
           ├──────────────┬──────────────┐                          │
           │              │              │                          │
           ▼              ▼              ▼                          │
    ┌────────────┐ ┌────────────┐ ┌────────────┐                    │
    │QueryService│ │  E6Catalog │ │ Governance │                    │
    │    (qs)    │ │  (e6cat)   │ │   (gov)    │                    │
    └─────┬──────┘ └──────┬─────┘ └────────────┘                    │
          │               │                                         │
          ▼               ├──────────────┬──────────────┐           │
    ┌────────────┐        │              │              │           │
    │TrafficInfra│◄───────┼──────────────┼──────────────┼───────────┘
    │    (ti)    │        ▼              ▼              ▼   (routes to Envoy)
    └────────────┘  ┌────────────┐ ┌────────────────┐   │
                    │CatalogRef- │ │CatalogRefresh- │   │
                    │   resh     │ │   Schedule     │   │
                    │   (cr)     │ │    (crs)       │   │
                    └────────────┘ └────────────────┘   │

       ┌─────────────────┐
       │GreptimeDBCluster│ (Independent)
       │     (gdb)       │
       └─────────────────┘

NamespaceConfig

API Version: e6data.io/v1alpha1 | Short Names: nsconfig, nsc

Provides shared infrastructure settings for a namespace (cloud, storage, scheduling).

Key Fields:

spec:
  cloud: string              # AWS, GCP, AZURE (auto-detected if not set)
  storageBackend: string     # s3a://, gs://, or abfs://
  s3Endpoint: string         # For S3-compatible storage
  imageRepository: string    # Container registry path
  imagePullSecrets: []string # Private registry secrets
  tolerations: [...]         # Pod tolerations
  nodeSelector: {...}        # Node labels
  serviceAccounts:
    data: string             # For MDS/QS pods
    monitoring: string       # For MonitoringServices pods
  defaultPoolRef: {...}      # Default pool for QueryServices

Status Phases: Pending → Ready | Error
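
As an illustration, the fields above could be combined into a manifest like the following. This is a minimal sketch, not a verified configuration: the resource name, bucket, registry path, secret name, node label, and service account names are placeholders, the `kind` is assumed to match the CRD name, and `tolerations` is assumed to follow the standard Kubernetes toleration shape.

```yaml
# Hypothetical NamespaceConfig; every value below is a placeholder.
apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig              # assumed to match the CRD name
metadata:
  name: workspace-prod-config      # placeholder name
  namespace: workspace-prod
spec:
  cloud: AWS                       # omit to rely on auto-detection
  storageBackend: s3a://example-bucket
  imageRepository: registry.example.com/e6data
  imagePullSecrets:
    - regcred                      # placeholder secret name
  nodeSelector:
    workload: e6data               # placeholder node label
  tolerations:                     # assumed: standard Kubernetes tolerations
    - key: e6data
      operator: Exists
      effect: NoSchedule
  serviceAccounts:
    data: e6-data-sa               # used by MDS/QS pods
    monitoring: e6-monitoring-sa   # used by MonitoringServices pods
```

Saved to a file, a manifest like this would typically be applied with `kubectl apply -f <file>`.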

Full Documentation →

MetadataServices

API Version: e6data.io/v1alpha1 | Short Names: mds, metadata

Manages the storage and schema services used for metadata caching and schema inference. Inherits shared infrastructure settings from the namespace's NamespaceConfig.

Key Fields:

spec:
  workspace: string      # Required: Workspace name
  tenant: string         # Required: Tenant identifier
  storageBackend: string # Required: s3a://, gs://, or abfs://
  storage:
    imageTag: string     # Required: Storage service version
    resources: {...}     # Required: memory, cpu
  schema:
    imageTag: string     # Required: Schema service version
    resources: {...}     # Required: memory, cpu

Status Phases: Pending → Creating → Running | Updating | Failed | Degraded
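
A minimal MetadataServices manifest might look like the sketch below. The names `my-metadata` and `workspace-prod` are taken from the command examples on this page; the workspace, tenant, bucket, image tags, and resource sizes are placeholders, and the resource values are assumed to use standard Kubernetes quantity strings.

```yaml
# Hypothetical MetadataServices; versions and sizes are placeholders.
apiVersion: e6data.io/v1alpha1
kind: MetadataServices             # assumed to match the CRD name
metadata:
  name: my-metadata
  namespace: workspace-prod
spec:
  workspace: prod                  # placeholder workspace name
  tenant: example-tenant           # placeholder tenant identifier
  storageBackend: s3a://example-bucket
  storage:
    imageTag: "1.0.0"              # placeholder storage service version
    resources:
      memory: 8Gi                  # assumed quantity format
      cpu: "2"
  schema:
    imageTag: "1.0.0"              # placeholder schema service version
    resources:
      memory: 8Gi
      cpu: "2"
```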

Full Documentation →

QueryService

API Version: e6data.io/v1alpha1 | Short Names: qs, cluster

Deploys query execution components: Planner, Queue, and Executor. Traffic routing is handled by TrafficInfra (Envoy + xDS).

Key Fields:

spec:
  alias: string          # Required: Cluster alias
  workspace: string      # Required: Must match MetadataServices
  planner: {...}         # Required: Query planner config
  queue: {...}           # Required: Queue/coordinator config
  executor:              # Required: Worker config
    replicas: int
    autoscaling: {...}   # Optional: Enable autoscaling
    poolRef: {...}       # Optional: Reference to Pool

Status Phases: Waiting → Deploying → Ready | Updating | Failed | Degraded
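
The overall shape of a QueryService could be sketched as follows. The sub-fields of `planner` and `queue` are not listed on this page, so they are left as empty placeholders here and would need to be filled in from the full documentation; the alias, workspace, and replica count are illustrative values only.

```yaml
# Hypothetical QueryService skeleton; not a complete, deployable spec.
apiVersion: e6data.io/v1alpha1
kind: QueryService                 # assumed to match the CRD name
metadata:
  name: my-cluster
  namespace: workspace-prod
spec:
  alias: my-cluster                # placeholder cluster alias
  workspace: prod                  # must match the MetadataServices workspace
  planner: {}                      # required; sub-fields not shown on this page
  queue: {}                        # required; sub-fields not shown on this page
  executor:
    replicas: 2                    # placeholder worker count
```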

Full Documentation →

TrafficInfra

API Version: e6data.io/v1alpha2 | Short Names: ti

Manages the traffic infrastructure: an xDS control plane and Envoy proxies that route gRPC traffic.

Key Fields:

spec:
  xds:
    replicas: int              # xDS control plane replicas (default: 2)
    image: {...}               # Optional: Custom image
    discovery:
      services: []string       # Optional: Auto-registered by operator
  envoy:
    replicas: int              # Envoy proxy replicas (default: 2)
    maxReplicas: int           # Max replicas for HPA
    hpa:
      enabled: bool            # Enable autoscaling
    service:
      type: string             # LoadBalancer, ClusterIP, NodePort

Status Phases: Pending → Deploying → Ready | Degraded | Failed
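
For illustration, a TrafficInfra resource with Envoy autoscaling enabled might be written as below. The resource name and the `maxReplicas` value are placeholders, and `discovery.services` is omitted on the assumption that the operator registers services automatically, as the field comment suggests.

```yaml
# Hypothetical TrafficInfra; name and replica counts are placeholders.
apiVersion: e6data.io/v1alpha2
kind: TrafficInfra                 # assumed to match the CRD name
metadata:
  name: traffic                    # placeholder name
  namespace: workspace-prod
spec:
  xds:
    replicas: 2                    # same as the documented default
  envoy:
    replicas: 2                    # same as the documented default
    maxReplicas: 6                 # placeholder upper bound for the HPA
    hpa:
      enabled: true
    service:
      type: LoadBalancer
```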

Full Documentation →

AuthGateway

API Version: e6data.io/v1alpha1 | Short Names: ag

Manages a Pomerium-based authentication gateway that protects services.

Key Fields:

spec:
  domain: string                 # Required: Base domain (e.g., e6data.io)
  replicas: int                  # Pomerium replicas (default: 2)
  authentication:
    enabled: bool                # false = passthrough mode
    idp:
      provider: string           # google, okta, azure, github, oidc
      credentialsSecretRef: {...}
    policy:
      allowedDomains: []string   # Allowed email domains
  services:
    - name: string               # Service route name
      subdomain: string          # e.g., "query" → query.<domain>
      backend:
        serviceName: string      # Backend K8s service
        servicePort: int

Status Phases: Pending → Deploying → Ready | Degraded | Failed
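
The sketch below shows how a single protected route might be declared. The domain, secret name, and backend service are placeholders, and the `name` key under `credentialsSecretRef` is an assumption about that reference's shape, which this page does not spell out.

```yaml
# Hypothetical AuthGateway exposing one service at query.example.com.
apiVersion: e6data.io/v1alpha1
kind: AuthGateway                  # assumed to match the CRD name
metadata:
  name: auth-gateway               # placeholder name
  namespace: workspace-prod
spec:
  domain: example.com              # placeholder base domain
  replicas: 2
  authentication:
    enabled: true                  # false would mean passthrough mode
    idp:
      provider: google
      credentialsSecretRef:
        name: idp-credentials      # assumed key; placeholder secret name
    policy:
      allowedDomains:
        - example.com
  services:
    - name: query
      subdomain: query             # combined with the domain above
      backend:
        serviceName: envoy         # placeholder backend Service name
        servicePort: 80            # placeholder port
```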

Full Documentation →

E6Catalog

API Version: e6data.io/v1alpha1 | Short Names: e6cat

Registers external data catalogs (Hive, Glue, Unity, Iceberg, Delta) with the storage service.

Key Fields:

spec:
  catalogType: string           # Required: HIVE|GLUE|UNITY|ICEBERG|DELTA
  metadataServicesRef: string   # Required: MetadataServices name
  connectionMetadata:
    catalogConnection:
      hiveConnection: {...}     # For HIVE
      glueConnection: {...}     # For GLUE
      unityConnection: {...}    # For UNITY
      icebergConnection: {...}  # For ICEBERG
      deltaConnection: {...}    # For DELTA

Status Phases: Waiting → Creating → Ready | Updating | Failed
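
As a rough outline, registering a Glue catalog might look like the following. Only the connection block matching `catalogType` is set; its sub-fields are not listed on this page, so it is left empty here as a placeholder. The catalog name is hypothetical.

```yaml
# Hypothetical E6Catalog skeleton; connection details intentionally omitted.
apiVersion: e6data.io/v1alpha1
kind: E6Catalog                    # assumed to match the CRD name
metadata:
  name: sales-catalog              # placeholder name
  namespace: workspace-prod
spec:
  catalogType: GLUE
  metadataServicesRef: my-metadata # name of an existing MetadataServices
  connectionMetadata:
    catalogConnection:
      glueConnection: {}           # sub-fields not shown on this page
```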

Full Documentation →

CatalogRefresh

API Version: e6data.io/v1alpha1 | Short Names: cr, catalogref

Triggers a one-time metadata refresh on an E6Catalog (like a Kubernetes Job).

Key Fields:

spec:
  e6CatalogRef:
    name: string         # Required: E6Catalog name
  refreshType: string    # Required: full|delta
  databases: []string    # Optional: Specific databases
  timeout: string        # Optional: Default 30m

Status Phases: Pending → Running → Succeeded | PartialSuccess | Failed | TimedOut
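
A one-off delta refresh limited to a single database could be requested with a manifest along these lines. The catalog name `sales-catalog`, the database name, and the resource name are placeholders; the timeout uses the same duration format as the documented `30m` default.

```yaml
# Hypothetical CatalogRefresh; names are placeholders.
apiVersion: e6data.io/v1alpha1
kind: CatalogRefresh               # assumed to match the CRD name
metadata:
  name: sales-refresh-once         # placeholder name
  namespace: workspace-prod
spec:
  e6CatalogRef:
    name: sales-catalog            # placeholder E6Catalog name
  refreshType: delta
  databases:
    - sales                        # placeholder database name
  timeout: 45m                     # overrides the 30m default
```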

Full Documentation →

CatalogRefreshSchedule

API Version: e6data.io/v1alpha1 | Short Names: crs, refreshschedule

Creates recurring catalog refreshes using cron syntax (like a Kubernetes CronJob).

Key Fields:

spec:
  e6CatalogRef:
    name: string              # Required: E6Catalog name
  schedule: string            # Required: Cron expression
  refreshType: string         # Optional: full|delta (default: delta)
  concurrencyPolicy: string   # Optional: Forbid|Allow|Replace
  successfulRefreshHistoryLimit: int  # Optional: Default 3
  failedRefreshHistoryLimit: int      # Optional: Default 1

Common Schedules:

- "0 2 * * *" - Daily at 2 AM
- "*/30 * * * *" - Every 30 minutes
- "0 0 * * 0" - Weekly on Sunday
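
Using the daily 2 AM cron expression, a schedule might be declared as in this sketch. The resource and catalog names are placeholders, and `Forbid` is chosen only as an example value for `concurrencyPolicy`; the history limits repeat the documented defaults.

```yaml
# Hypothetical CatalogRefreshSchedule running a delta refresh every night.
apiVersion: e6data.io/v1alpha1
kind: CatalogRefreshSchedule       # assumed to match the CRD name
metadata:
  name: sales-refresh-nightly      # placeholder name
  namespace: workspace-prod
spec:
  e6CatalogRef:
    name: sales-catalog            # placeholder E6Catalog name
  schedule: "0 2 * * *"            # daily at 2 AM
  refreshType: delta
  concurrencyPolicy: Forbid        # example choice among Forbid|Allow|Replace
  successfulRefreshHistoryLimit: 3 # documented default
  failedRefreshHistoryLimit: 1     # documented default
```

Quoting the cron expression keeps YAML from misreading the leading `*` or `0` characters.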

Full Documentation →

Pool

API Version: e6data.io/v1alpha1 | Short Names: pool

Provides shared compute resources for burst capacity across multiple QueryServices.

Key Fields:

spec:
  minExecutors: int           # Baseline capacity
  maxExecutors: int           # Required: Maximum capacity
  executorsPerNode: int       # Optional: Default 1
  instanceConfig:
    instanceType: string      # Explicit instance type
    spotEnabled: bool         # Use spot instances
  queryServiceSelector: {...} # Label selector for allowed QS
  # OR
  allowedQueryServices: []    # Explicit allow list

Status Phases: Pending → Creating → Active | Suspended | Failed
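
A Pool that admits QueryServices by label might be sketched as follows. The executor counts, instance type, and label are placeholders, and `queryServiceSelector` is assumed to follow the standard Kubernetes label selector shape (`matchLabels`), which this page does not confirm.

```yaml
# Hypothetical Pool; sizing and labels are placeholders.
apiVersion: e6data.io/v1alpha1
kind: Pool                         # assumed to match the CRD name
metadata:
  name: shared-burst               # placeholder name
  namespace: workspace-prod
spec:
  minExecutors: 2                  # baseline capacity
  maxExecutors: 20                 # required upper bound
  executorsPerNode: 1              # documented default
  instanceConfig:
    instanceType: r6i.4xlarge      # placeholder instance type
    spotEnabled: true
  queryServiceSelector:            # assumed: standard label selector
    matchLabels:
      pool: shared-burst           # placeholder label
```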

Full Documentation →

Governance

API Version: e6data.io/v1alpha1 | Short Names: gov, governance

Defines data access control policies (access grants, column masking, row filtering).

Key Fields:

spec:
  catalogName: string         # Required: Target catalog
  policies:
    - name: string            # Policy name
      type: string            # GRANT_ACCESS|COLUMN_MASKING|ROW_FILTERING
      resources: [...]        # Target resources
      users: []string         # Target users
      groups: []string        # Target groups
      allow: bool             # For GRANT_ACCESS
      maskType: string        # For COLUMN_MASKING
      rowFilter: string       # For ROW_FILTERING

Status Phases: Pending → Syncing → Synced | Failed
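
A single access grant could be expressed roughly as below. The format of the `resources` entries is not shown on this page, so the `database.table` string used here is purely hypothetical, as are the policy, catalog, and group names.

```yaml
# Hypothetical Governance resource with one GRANT_ACCESS policy.
apiVersion: e6data.io/v1alpha1
kind: Governance                   # assumed to match the CRD name
metadata:
  name: sales-policies             # placeholder name
  namespace: workspace-prod
spec:
  catalogName: sales-catalog       # placeholder target catalog
  policies:
    - name: analysts-read-orders   # placeholder policy name
      type: GRANT_ACCESS
      resources:
        - sales.orders             # hypothetical resource format
      groups:
        - analysts                 # placeholder group
      allow: true
```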

Full Documentation →

GreptimeDBCluster

API Version: e6data.io/v1alpha1 | Short Names: gdb, greptime

Deploys a GreptimeDB time-series database cluster for query history and metrics.

Key Fields:

spec:
  storage:
    backend: string      # Required: s3|gcs|azure
    bucket: string       # Required: Bucket name
    region: string       # Required: Cloud region
  frontend:
    replicas: int        # Query endpoints
  datanode:
    replicas: int        # Data storage
  meta:
    replicas: int        # Coordination (1 or 3)
  etcd:
    replicas: int        # Metadata (1 or 3)
  grafana:
    enabled: bool        # Optional visualization

Status Phases: Pending → Initializing → Running | Degraded | Failed
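
For a three-node layout backed by S3, a manifest might look like this sketch. The bucket, region, resource name, and the frontend and datanode replica counts are placeholders; `meta` and `etcd` use 3, one of the two values the field comments allow.

```yaml
# Hypothetical GreptimeDBCluster; bucket, region, and counts are placeholders.
apiVersion: e6data.io/v1alpha1
kind: GreptimeDBCluster            # assumed to match the CRD name
metadata:
  name: greptime                   # placeholder name
  namespace: workspace-prod
spec:
  storage:
    backend: s3
    bucket: example-greptime-bucket
    region: us-east-1
  frontend:
    replicas: 2                    # query endpoints
  datanode:
    replicas: 3                    # data storage
  meta:
    replicas: 3                    # coordination (1 or 3)
  etcd:
    replicas: 3                    # metadata (1 or 3)
  grafana:
    enabled: false
```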

Full Documentation →

MonitoringServices

API Version: e6data.io/v1alpha2 | Short Names: ms, monitoring

Deploys Vector-based log and metrics collection infrastructure.

Key Fields:

spec:
  workspace: string           # Required: Workspace name
  storageBackend: string      # Required: Object storage path
  logs:
    enabled: bool             # Enable pod log collection
    selector: {...}           # Pod selector for log collection
  metrics:
    enabled: bool             # Enable metrics scraping
    selector: {...}           # Pod selector for metrics
  vector:
    image: {...}              # Vector container image
    resources: {...}          # CPU/Memory limits
  sinks:
    s3: {...}                 # S3 sink configuration
    greptimeRef: {...}        # GreptimeDB sink reference

Status Phases: Pending → Creating → Running | Failed | Degraded
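
A minimal MonitoringServices resource that turns on both log and metrics collection might be sketched as follows. Selectors, the Vector image, and sinks are omitted because their sub-fields are not listed here; whether the operator accepts a spec without them is an assumption. The workspace and storage path are placeholders.

```yaml
# Hypothetical MonitoringServices with only the fields shown on this page.
apiVersion: e6data.io/v1alpha2
kind: MonitoringServices           # assumed to match the CRD name
metadata:
  name: monitoring                 # placeholder name
  namespace: workspace-prod
spec:
  workspace: prod                  # placeholder workspace name
  storageBackend: s3a://example-bucket/monitoring
  logs:
    enabled: true                  # collect pod logs
  metrics:
    enabled: true                  # scrape metrics
```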

Full Documentation →

ReleaseManager

API Version: e6data.io/v1alpha1 | Short Names: releases, rm

Stores release version information for different products. It is used for GitOps-based release management and for showing available upgrades.

Key Fields:

spec:
  version: string            # Required: Semantic version (e.g., "1.2.0")
  product: string            # Required: engine|laminar|copilot|platform
  subType: string            # For engine: metadata|cluster
  description: string        # Human-readable description
  releaseNotes: string       # Changelog
  releaseDate: time          # When released
  deprecated: bool           # Mark as deprecated
  minUpgradeVersion: string  # Minimum version that can upgrade to this
  engine:                    # For engine product
    cluster:                 # QueryService components
      planner: {...}         # Image, resources, config
      queue: {...}
      executor: {...}
    metadata:                # MetadataServices components
      storage: {...}
      schema: {...}
  platform:                  # For platform product
    console: {...}
    apiServer: {...}

Component Release Spec:

image:
  repository: string         # Container registry path
  name: string               # Required: Image name
  tag: string                # Required: Image tag
  digest: string             # Optional: Immutable reference
resources:
  minimum: {cpu, memory}     # Minimum required
  recommended: {cpu, memory} # Optimal performance
  maximum: {cpu, memory}     # Limits
configVariables: {...}       # Default config.properties
environmentVariables: {...}  # Default env vars
affinity: {...}              # Recommended affinity rules
featureFlags: {...}          # Enabled/disabled features
breaking: bool               # Has breaking changes
breakingChanges: []string    # Description of breaking changes

Status Phases: Active | Deprecated | Superseded | Testing
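
Putting the two specs together, a release entry for the engine's metadata components might look like the sketch below. It assumes the Component Release Spec is nested directly under each component (`storage`, `schema`); the registry path, image names, tags, and resource figures are placeholders.

```yaml
# Hypothetical ReleaseManager entry for an engine metadata release.
apiVersion: e6data.io/v1alpha1
kind: ReleaseManager               # assumed to match the CRD name
metadata:
  name: engine-metadata-1-2-0      # placeholder name
spec:
  version: "1.2.0"
  product: engine
  subType: metadata
  description: Example metadata services release
  deprecated: false
  engine:
    metadata:
      storage:                     # assumed: Component Release Spec goes here
        image:
          repository: registry.example.com/e6data
          name: storage            # placeholder image name
          tag: "1.2.0"
        resources:
          minimum:
            cpu: "1"
            memory: 4Gi
      schema:
        image:
          repository: registry.example.com/e6data
          name: schema             # placeholder image name
          tag: "1.2.0"
```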

Full Documentation →

Quick Reference Commands

# List all CRDs
kubectl get crd | grep e6data

# Get all resources of a type
kubectl get nsconfig -A               # NamespaceConfig
kubectl get mds -A                    # MetadataServices
kubectl get qs -A                     # QueryService
kubectl get ti -A                     # TrafficInfra
kubectl get ag -A                     # AuthGateway
kubectl get e6cat -A                  # E6Catalog
kubectl get catalogrefresh -A         # CatalogRefresh
kubectl get crs -A                    # CatalogRefreshSchedule
kubectl get pool -A                   # Pool
kubectl get gov -A                    # Governance
kubectl get gdb -A                    # GreptimeDBCluster
kubectl get ms -A                     # MonitoringServices
kubectl get releases -A               # ReleaseManager

# Watch status
kubectl get mds -w
kubectl get qs -w

# Describe with events
kubectl describe mds my-metadata -n workspace-prod
kubectl describe qs my-cluster -n workspace-prod

# Get YAML output
kubectl get mds my-metadata -n workspace-prod -o yaml
kubectl get qs my-cluster -n workspace-prod -o yaml

# Check operator logs
kubectl logs -n e6-operator-system -l app=e6-operator --tail=100

Version Compatibility

Operator Version	CRD API Version	Kubernetes	Karpenter
1.0.x	v1alpha1	1.24+	0.32+
1.1.x	v1alpha1	1.25+	0.37+

Migration Notes

From Helm Charts

If migrating from standalone Helm charts to the operator:

1. Export the existing values from your Helm releases.
2. Create equivalent CRs with the same configuration.
3. Apply the CRs so that the operator takes ownership.
4. Delete the Helm releases; the operator now manages the resources.

API Version Upgrades

When upgrading between API versions:

1. Check the release notes for breaking changes.
2. Back up existing CRs, for example: kubectl get mds -A -o yaml > backup.yaml
3. Apply any required migrations.
4. Upgrade the operator Helm chart.