CRD Catalog

This page provides a quick reference to all Custom Resource Definitions (CRDs) managed by the e6data Kubernetes Operator.


CRD Overview

| CRD | Short Names | Purpose | Dependencies |
|-----|-------------|---------|--------------|
| NamespaceConfig | nsconfig, nsc | Namespace infrastructure settings | None |
| MetadataServices | mds, metadata | Storage and Schema services | NamespaceConfig |
| QueryService | qs, cluster | Query execution cluster | NamespaceConfig, MetadataServices |
| TrafficInfra | ti | xDS + Envoy traffic routing | NamespaceConfig, QueryService |
| AuthGateway | ag | Pomerium authentication gateway | NamespaceConfig |
| E6Catalog | e6cat | External catalog registration | MetadataServices |
| CatalogRefresh | cr, catalogref | One-time catalog refresh | E6Catalog |
| CatalogRefreshSchedule | crs, refreshschedule | Scheduled catalog refresh | E6Catalog |
| Pool | pool | Shared compute for burst | None |
| Governance | gov, governance | Data access policies | E6Catalog |
| GreptimeDBCluster | gdb, greptime | Time-series database | None |
| MonitoringServices | ms, monitoring | Logs and metrics collection | NamespaceConfig |
| ReleaseManager | releases, rm | Release version management | None |

Dependency Graph

                    ┌─────────────────┐
                    │ NamespaceConfig │  (Foundation - deploy first)
                    │   (nsconfig)    │
                    └────────┬────────┘
         ┌───────────────────┼───────────────────┬─────────────────┐
         │                   │                   │                 │
         ▼                   ▼                   ▼                 ▼
  ┌─────────────────┐ ┌─────────────────┐ ┌───────────────┐ ┌─────────────┐
  │ MetadataServices│ │MonitoringServices│ │     Pool      │ │ AuthGateway │
  │     (mds)       │ │      (ms)        │ │    (pool)     │ │    (ag)     │
  └────────┬────────┘ └──────────────────┘ └───────────────┘ └─────────────┘
           │                                                        │
           ├──────────────┬──────────────┐                          │
           │              │              │                          │
           ▼              ▼              ▼                          │
    ┌────────────┐ ┌────────────┐ ┌────────────┐                    │
    │QueryService│ │  E6Catalog │ │ Governance │                    │
    │    (qs)    │ │  (e6cat)   │ │   (gov)    │                    │
    └─────┬──────┘ └──────┬─────┘ └────────────┘                    │
          │               │                                         │
          ▼               ├──────────────┬──────────────┐           │
    ┌────────────┐        │              │              │           │
    │TrafficInfra│◄───────┼──────────────┼──────────────┼───────────┘
    │    (ti)    │        ▼              ▼              ▼   (routes to Envoy)
    └────────────┘  ┌────────────┐ ┌────────────────┐   │
                    │CatalogRef- │ │CatalogRefresh- │   │
                    │   resh     │ │   Schedule     │   │
                    │   (cr)     │ │    (crs)       │   │
                    └────────────┘ └────────────────┘   │

       ┌─────────────────┐
       │GreptimeDBCluster│ (Independent)
       │     (gdb)       │
       └─────────────────┘

NamespaceConfig

API Version: e6data.io/v1alpha1 | Short Names: nsconfig, nsc

Provides shared infrastructure settings for a namespace (cloud, storage, scheduling).

Key Fields:

spec:
  cloud: string              # AWS, GCP, AZURE (auto-detected if not set)
  storageBackend: string     # s3a://, gs://, or abfs://
  s3Endpoint: string         # For S3-compatible storage
  imageRepository: string    # Container registry path
  imagePullSecrets: []string # Private registry secrets
  tolerations: [...]         # Pod tolerations
  nodeSelector: {...}        # Node labels
  serviceAccounts:
    data: string             # For MDS/QS pods
    monitoring: string       # For MonitoringServices pods
  defaultPoolRef: {...}      # Default pool for QueryServices

Status Phases: Pending → Ready | Error
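
As an illustration of the fields above, a NamespaceConfig manifest might look like the following sketch. It assumes the `kind` matches the CRD name; the resource name, bucket, registry path, secret, label, and service account names are hypothetical placeholders, and the `workspace-prod` namespace is borrowed from the commands later on this page.

```yaml
apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig                  # assumed to match the CRD name
metadata:
  name: workspace-config               # hypothetical name
  namespace: workspace-prod
spec:
  cloud: AWS                           # omit to rely on auto-detection
  storageBackend: s3a://example-bucket # hypothetical bucket
  imageRepository: registry.example.com/e6data   # hypothetical registry path
  imagePullSecrets:
    - regcred                          # hypothetical secret name
  nodeSelector:
    workload: e6data                   # hypothetical node label
  serviceAccounts:
    data: e6-data-sa                   # hypothetical; used by MDS/QS pods
    monitoring: e6-monitoring-sa       # hypothetical; used by MonitoringServices pods
```

Because this resource is the foundation in the dependency graph, it would normally be applied before any other CR in the namespace.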

Full Documentation →


MetadataServices

API Version: e6data.io/v1alpha1 | Short Names: mds, metadata

Manages storage and schema services for metadata caching and schema inference. Inherits from NamespaceConfig.

Key Fields:

spec:
  workspace: string      # Required: Workspace name
  tenant: string         # Required: Tenant identifier
  storageBackend: string # Required: s3a://, gs://, or abfs://
  storage:
    imageTag: string     # Required: Storage service version
    resources: {...}     # Required: memory, cpu
  schema:
    imageTag: string     # Required: Schema service version
    resources: {...}     # Required: memory, cpu

Status Phases: Pending → Creating → Running | Updating | Failed | Degraded
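
A minimal MetadataServices manifest built from the required fields above could look like this sketch. The flat `memory`/`cpu` layout under `resources` is an assumption based on the field comments, and the workspace, tenant, bucket, image tags, and sizes are placeholder values rather than recommendations.

```yaml
apiVersion: e6data.io/v1alpha1
kind: MetadataServices                 # assumed to match the CRD name
metadata:
  name: my-metadata
  namespace: workspace-prod
spec:
  workspace: analytics                 # hypothetical workspace name
  tenant: example-tenant               # hypothetical tenant identifier
  storageBackend: s3a://example-bucket # hypothetical bucket
  storage:
    imageTag: "1.0.0"                  # placeholder version
    resources:                         # assumed shape: flat memory/cpu keys
      memory: 8Gi
      cpu: "2"
  schema:
    imageTag: "1.0.0"                  # placeholder version
    resources:
      memory: 8Gi
      cpu: "2"
```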

Full Documentation →


QueryService

API Version: e6data.io/v1alpha1 | Short Names: qs, cluster

Deploys query execution components: Planner, Queue, and Executor. Traffic routing is handled by TrafficInfra (Envoy + xDS).

Key Fields:

spec:
  alias: string          # Required: Cluster alias
  workspace: string      # Required: Must match MetadataServices
  planner: {...}         # Required: Query planner config
  queue: {...}           # Required: Queue/coordinator config
  executor:              # Required: Worker config
    replicas: int
    autoscaling: {...}   # Optional: Enable autoscaling
    poolRef: {...}       # Optional: Reference to Pool

Status Phases: Waiting → Deploying → Ready | Updating | Failed | Degraded
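
The sketch below shows how the QueryService fields fit together. The `planner` and `queue` blocks are left empty because their sub-fields are not listed in this catalog, so the manifest is illustrative and would need those sections filled in from the full documentation before it could be applied. The `analytics` workspace is a hypothetical value that would have to match the MetadataServices resource in the same namespace.

```yaml
apiVersion: e6data.io/v1alpha1
kind: QueryService                     # assumed to match the CRD name
metadata:
  name: my-cluster
  namespace: workspace-prod
spec:
  alias: my-cluster
  workspace: analytics                 # hypothetical; must match MetadataServices spec.workspace
  planner: {}                          # required; sub-fields not shown in this catalog
  queue: {}                            # required; sub-fields not shown in this catalog
  executor:
    replicas: 2                        # placeholder worker count
    # autoscaling and poolRef are optional; their shapes are not shown here
```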

Full Documentation →


TrafficInfra

API Version: e6data.io/v1alpha2 | Short Names: ti

Manages traffic infrastructure: xDS control plane and Envoy proxy for gRPC routing.

Key Fields:

spec:
  xds:
    replicas: int              # xDS control plane replicas (default: 2)
    image: {...}               # Optional: Custom image
    discovery:
      services: []string       # Optional: Auto-registered by operator
  envoy:
    replicas: int              # Envoy proxy replicas (default: 2)
    maxReplicas: int           # Max replicas for HPA
    hpa:
      enabled: bool            # Enable autoscaling
    service:
      type: string             # LoadBalancer, ClusterIP, NodePort

Status Phases: Pending → Deploying → Ready | Degraded | Failed
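
For illustration, a TrafficInfra manifest that enables Envoy autoscaling might be written as in the sketch below. The resource name and the `maxReplicas` value are hypothetical, and the `kind` is assumed to match the CRD name; note the `v1alpha2` API version stated for this CRD.

```yaml
apiVersion: e6data.io/v1alpha2
kind: TrafficInfra                     # assumed to match the CRD name
metadata:
  name: traffic                        # hypothetical name
  namespace: workspace-prod
spec:
  xds:
    replicas: 2                        # same as the documented default
  envoy:
    replicas: 2                        # same as the documented default
    maxReplicas: 6                     # placeholder upper bound for the HPA
    hpa:
      enabled: true
    service:
      type: LoadBalancer               # or ClusterIP / NodePort
```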

Full Documentation →


AuthGateway

API Version: e6data.io/v1alpha1 | Short Names: ag

Manages Pomerium-based authentication gateway for protecting services.

Key Fields:

spec:
  domain: string                 # Required: Base domain (e.g., e6data.io)
  replicas: int                  # Pomerium replicas (default: 2)
  authentication:
    enabled: bool                # false = passthrough mode
    idp:
      provider: string           # google, okta, azure, github, oidc
      credentialsSecretRef: {...}
    policy:
      allowedDomains: []string   # Allowed email domains
  services:
    - name: string               # Service route name
      subdomain: string          # e.g., "query" → query.domain.com
      backend:
        serviceName: string      # Backend K8s service
        servicePort: int

Status Phases: Pending → Deploying → Ready | Degraded | Failed
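
An AuthGateway that protects a single route could be sketched as follows. The domain, secret name, backend service, and port are hypothetical, and the `name` key under `credentialsSecretRef` is an assumed shape, since that object is not expanded in this catalog.

```yaml
apiVersion: e6data.io/v1alpha1
kind: AuthGateway                      # assumed to match the CRD name
metadata:
  name: auth-gateway                   # hypothetical name
  namespace: workspace-prod
spec:
  domain: example.com                  # hypothetical base domain
  replicas: 2
  authentication:
    enabled: true                      # false would switch to passthrough mode
    idp:
      provider: google
      credentialsSecretRef:
        name: idp-credentials          # assumed shape and hypothetical secret
    policy:
      allowedDomains:
        - example.com
  services:
    - name: query
      subdomain: query                 # served as query.example.com
      backend:
        serviceName: envoy             # hypothetical backend Service
        servicePort: 8080              # hypothetical port
```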

Full Documentation →


E6Catalog

API Version: e6data.io/v1alpha1 | Short Names: e6cat

Registers external data catalogs (Hive, Glue, Unity, Iceberg, Delta) with the storage service.

Key Fields:

spec:
  catalogType: string           # Required: HIVE|GLUE|UNITY|ICEBERG|DELTA
  metadataServicesRef: string   # Required: MetadataServices name
  connectionMetadata:
    catalogConnection:
      hiveConnection: {...}     # For HIVE
      glueConnection: {...}     # For GLUE
      unityConnection: {...}    # For UNITY
      icebergConnection: {...}  # For ICEBERG
      deltaConnection: {...}    # For DELTA

Status Phases: Waiting → Creating → Ready | Updating | Failed
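
The sketch below registers a Glue catalog to show where each field sits. Only the connection block that matches `catalogType` is included, which is an assumption about how the operator reads the spec. The connection block is left empty because its sub-fields are not listed in this catalog, and the catalog name is hypothetical.

```yaml
apiVersion: e6data.io/v1alpha1
kind: E6Catalog                        # assumed to match the CRD name
metadata:
  name: glue-catalog                   # hypothetical name
  namespace: workspace-prod
spec:
  catalogType: GLUE
  metadataServicesRef: my-metadata     # name of the MetadataServices resource
  connectionMetadata:
    catalogConnection:
      glueConnection: {}               # connection fields not shown in this catalog
```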

Full Documentation →


CatalogRefresh

API Version: e6data.io/v1alpha1 | Short Names: cr, catalogref

Triggers a one-time metadata refresh on an E6Catalog (like a Kubernetes Job).

Key Fields:

spec:
  e6CatalogRef:
    name: string         # Required: E6Catalog name
  refreshType: string    # Required: full|delta
  databases: []string    # Optional: Specific databases
  timeout: string        # Optional: Default 30m

Status Phases: Pending → Running → Succeeded | PartialSuccess | Failed | TimedOut
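
A one-off full refresh limited to a single database might look like this sketch. The resource name, the referenced catalog `glue-catalog`, and the database name are hypothetical, and the `kind` is assumed to match the CRD name.

```yaml
apiVersion: e6data.io/v1alpha1
kind: CatalogRefresh                   # assumed to match the CRD name
metadata:
  name: glue-catalog-refresh-001       # hypothetical name
  namespace: workspace-prod
spec:
  e6CatalogRef:
    name: glue-catalog                 # hypothetical E6Catalog name
  refreshType: full                    # or delta
  databases:
    - sales                            # hypothetical database; omit the list to refresh everything
  timeout: 45m                         # overrides the 30m default
```

Since the resource behaves like a Job, each new refresh would presumably be a new resource with a distinct name.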

Full Documentation →


CatalogRefreshSchedule

API Version: e6data.io/v1alpha1 | Short Names: crs, refreshschedule

Creates recurring catalog refreshes using cron syntax (like a Kubernetes CronJob).

Key Fields:

spec:
  e6CatalogRef:
    name: string              # Required: E6Catalog name
  schedule: string            # Required: Cron expression
  refreshType: string         # Optional: full|delta (default: delta)
  concurrencyPolicy: string   # Optional: Forbid|Allow|Replace
  successfulRefreshHistoryLimit: int  # Optional: Default 3
  failedRefreshHistoryLimit: int      # Optional: Default 1

Common Schedules:

- "0 2 * * *" - Daily at 2 AM
- "*/30 * * * *" - Every 30 minutes
- "0 0 * * 0" - Weekly on Sunday
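
Combining the fields and one of the schedules above, a nightly delta refresh could be sketched as follows. The resource and catalog names are hypothetical, and the history limits simply restate the documented defaults.

```yaml
apiVersion: e6data.io/v1alpha1
kind: CatalogRefreshSchedule           # assumed to match the CRD name
metadata:
  name: glue-catalog-nightly           # hypothetical name
  namespace: workspace-prod
spec:
  e6CatalogRef:
    name: glue-catalog                 # hypothetical E6Catalog name
  schedule: "0 2 * * *"                # daily at 2 AM
  refreshType: delta
  concurrencyPolicy: Forbid            # do not start a run while one is in progress
  successfulRefreshHistoryLimit: 3
  failedRefreshHistoryLimit: 1
```

Quoting the cron expression keeps YAML from misreading the leading `*` or `0` characters.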

Full Documentation →


Pool

API Version: e6data.io/v1alpha1 | Short Names: pool

Provides shared compute resources for burst capacity across multiple QueryServices.

Key Fields:

spec:
  minExecutors: int           # Baseline capacity
  maxExecutors: int           # Required: Maximum capacity
  executorsPerNode: int       # Optional: Default 1
  instanceConfig:
    instanceType: string      # Explicit instance type
    spotEnabled: bool         # Use spot instances
  queryServiceSelector: {...} # Label selector for allowed QS
  # OR
  allowedQueryServices: []    # Explicit allow list

Status Phases: Pending → Creating → Active | Suspended | Failed
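
A spot-backed Pool might be declared as in the sketch below. The pool name, instance type, and executor counts are hypothetical. The access-control fields are left out because the shapes of `queryServiceSelector` and `allowedQueryServices` are not shown in this catalog.

```yaml
apiVersion: e6data.io/v1alpha1
kind: Pool                             # assumed to match the CRD name
metadata:
  name: burst-pool                     # hypothetical name
  namespace: workspace-prod
spec:
  minExecutors: 0                      # placeholder baseline capacity
  maxExecutors: 10                     # placeholder maximum capacity
  executorsPerNode: 1
  instanceConfig:
    instanceType: r6i.4xlarge          # hypothetical AWS instance type
    spotEnabled: true
  # Add either queryServiceSelector or allowedQueryServices here to restrict
  # which QueryServices may use the pool (see the full documentation).
```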

Full Documentation →


Governance

API Version: e6data.io/v1alpha1 | Short Names: gov, governance

Defines data access control policies (access grants, column masking, row filtering).

Key Fields:

spec:
  catalogName: string         # Required: Target catalog
  policies:
    - name: string            # Policy name
      type: string            # GRANT_ACCESS|COLUMN_MASKING|ROW_FILTERING
      resources: [...]        # Target resources
      users: []string         # Target users
      groups: []string        # Target groups
      allow: bool             # For GRANT_ACCESS
      maskType: string        # For COLUMN_MASKING
      rowFilter: string       # For ROW_FILTERING

Status Phases: Pending → Syncing → Synced | Failed

Full Documentation →


GreptimeDBCluster

API Version: e6data.io/v1alpha1 | Short Names: gdb, greptime

Deploys a GreptimeDB time-series database cluster for query history and metrics.

Key Fields:

spec:
  storage:
    backend: string      # Required: s3|gcs|azure
    bucket: string       # Required: Bucket name
    region: string       # Required: Cloud region
  frontend:
    replicas: int        # Query endpoints
  datanode:
    replicas: int        # Data storage
  meta:
    replicas: int        # Coordination (1 or 3)
  etcd:
    replicas: int        # Metadata (1 or 3)
  grafana:
    enabled: bool        # Optional visualization

Status Phases: Pending → Initializing → Running | Degraded | Failed
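
As a sketch of a small highly available layout, the manifest below uses three replicas for `meta` and `etcd`, in line with the "1 or 3" note above. The resource name, bucket, region, and the frontend and datanode counts are hypothetical.

```yaml
apiVersion: e6data.io/v1alpha1
kind: GreptimeDBCluster                # assumed to match the CRD name
metadata:
  name: greptime                       # hypothetical name
  namespace: workspace-prod
spec:
  storage:
    backend: s3
    bucket: example-greptime-bucket    # hypothetical bucket
    region: us-east-1                  # hypothetical region
  frontend:
    replicas: 2                        # placeholder
  datanode:
    replicas: 3                        # placeholder
  meta:
    replicas: 3
  etcd:
    replicas: 3
  grafana:
    enabled: false
```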

Full Documentation →


MonitoringServices

API Version: e6data.io/v1alpha2 | Short Names: ms, monitoring

Deploys Vector-based log and metrics collection infrastructure.

Key Fields:

spec:
  workspace: string           # Required: Workspace name
  storageBackend: string      # Required: Object storage path
  logs:
    enabled: bool             # Enable pod log collection
    selector: {...}           # Pod selector for log collection
  metrics:
    enabled: bool             # Enable metrics scraping
    selector: {...}           # Pod selector for metrics
  vector:
    image: {...}              # Vector container image
    resources: {...}          # CPU/Memory limits
  sinks:
    s3: {...}                 # S3 sink configuration
    greptimeRef: {...}        # GreptimeDB sink reference

Status Phases: Pending → Creating → Running | Failed | Degraded

Full Documentation →


ReleaseManager

API Version: e6data.io/v1alpha1 | Short Names: releases, rm

Stores release version information for different products. Used for GitOps-based release management and showing available upgrades.

Key Fields:

spec:
  version: string            # Required: Semantic version (e.g., "1.2.0")
  product: string            # Required: engine|laminar|copilot|platform
  subType: string            # For engine: metadata|cluster
  description: string        # Human-readable description
  releaseNotes: string       # Changelog
  releaseDate: time          # When released
  deprecated: bool           # Mark as deprecated
  minUpgradeVersion: string  # Minimum version that can upgrade to this
  engine:                    # For engine product
    cluster:                 # QueryService components
      planner: {...}         # Image, resources, config
      queue: {...}
      executor: {...}
    metadata:                # MetadataServices components
      storage: {...}
      schema: {...}
  platform:                  # For platform product
    console: {...}
    apiServer: {...}

Component Release Spec:

image:
  repository: string         # Container registry path
  name: string               # Required: Image name
  tag: string                # Required: Image tag
  digest: string             # Optional: Immutable reference
resources:
  minimum: {cpu, memory}     # Minimum required
  recommended: {cpu, memory} # Optimal performance
  maximum: {cpu, memory}     # Limits
configVariables: {...}       # Default config.properties
environmentVariables: {...}  # Default env vars
affinity: {...}              # Recommended affinity rules
featureFlags: {...}          # Enabled/disabled features
breaking: bool               # Has breaking changes
breakingChanges: []string    # Description of breaking changes

Status Phases: Active | Deprecated | Superseded | Testing
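
To show how the top-level fields and the component release spec nest, here is a sketch of an engine cluster release that describes only the executor. The resource name, image name, and resource figures are hypothetical, and a real release would presumably describe the planner and queue as well.

```yaml
apiVersion: e6data.io/v1alpha1
kind: ReleaseManager                   # assumed to match the CRD name
metadata:
  name: engine-cluster-1-2-0           # hypothetical name
spec:
  version: "1.2.0"
  product: engine
  subType: cluster
  description: Example engine cluster release
  deprecated: false
  engine:
    cluster:
      executor:
        image:
          name: executor               # hypothetical image name
          tag: "1.2.0"
        resources:
          minimum: {cpu: "2", memory: 8Gi}       # placeholder figures
          recommended: {cpu: "4", memory: 16Gi}  # placeholder figures
        breaking: false
```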

Full Documentation →


Quick Reference Commands

# List all CRDs
kubectl get crd | grep e6data

# Get all resources of a type
kubectl get nsconfig -A               # NamespaceConfig
kubectl get mds -A                    # MetadataServices
kubectl get qs -A                     # QueryService
kubectl get ti -A                     # TrafficInfra
kubectl get ag -A                     # AuthGateway
kubectl get e6cat -A                  # E6Catalog
kubectl get catalogrefresh -A         # CatalogRefresh
kubectl get crs -A                    # CatalogRefreshSchedule
kubectl get pool -A                   # Pool
kubectl get gov -A                    # Governance
kubectl get gdb -A                    # GreptimeDBCluster
kubectl get ms -A                     # MonitoringServices
kubectl get releases -A               # ReleaseManager

# Watch status
kubectl get mds -w
kubectl get qs -w

# Describe with events
kubectl describe mds my-metadata -n workspace-prod
kubectl describe qs my-cluster -n workspace-prod

# Get YAML output
kubectl get mds my-metadata -o yaml
kubectl get qs my-cluster -o yaml

# Check operator logs
kubectl logs -n e6-operator-system -l app=e6-operator --tail=100

Version Compatibility

| Operator Version | CRD API Version | Kubernetes | Karpenter |
|------------------|-----------------|------------|-----------|
| 1.0.x | v1alpha1 | 1.24+ | 0.32+ |
| 1.1.x | v1alpha1 | 1.25+ | 0.37+ |

Migration Notes

From Helm Charts

If migrating from standalone Helm charts to the operator:

  1. Export the existing values from your Helm releases
  2. Create equivalent CRs with the same configuration
  3. Apply the CRs so that the operator takes ownership of the resources
  4. Delete the Helm releases (the operator now manages the resources)

API Version Upgrades

When upgrading between API versions:

  1. Check release notes for breaking changes
  2. Back up existing CRs: kubectl get mds -o yaml > backup.yaml
  3. Apply any required migrations
  4. Upgrade operator Helm chart