# CRD Catalog
This page provides a quick reference to all Custom Resource Definitions (CRDs) managed by the e6data Kubernetes Operator.
## CRD Overview
| CRD | Short Names | Purpose | Dependencies |
|---|---|---|---|
| NamespaceConfig | nsconfig, nsc | Namespace infrastructure settings | None |
| MetadataServices | mds, metadata | Storage and Schema services | NamespaceConfig |
| QueryService | qs, cluster | Query execution cluster | NamespaceConfig, MetadataServices |
| TrafficInfra | ti | xDS + Envoy traffic routing | NamespaceConfig, QueryService |
| AuthGateway | ag | Pomerium authentication gateway | NamespaceConfig |
| E6Catalog | e6cat | External catalog registration | MetadataServices |
| CatalogRefresh | cr, catalogref | One-time catalog refresh | E6Catalog |
| CatalogRefreshSchedule | crs, refreshschedule | Scheduled catalog refresh | E6Catalog |
| Pool | pool | Shared compute for burst | None |
| Governance | gov, governance | Data access policies | E6Catalog |
| GreptimeDBCluster | gdb, greptime | Time-series database | None |
| MonitoringServices | ms, monitoring | Logs and metrics collection | NamespaceConfig |
| ReleaseManager | releases, rm | Release version management | None |
## Dependency Graph

```
┌─────────────────┐
│ NamespaceConfig │  (Foundation - deploy first)
│   (nsconfig)    │
└────────┬────────┘
         │
         ├───────────────────────┬────────────────┬──────────────┐
         ▼                       ▼                ▼              ▼
┌──────────────────┐  ┌────────────────────┐ ┌────────┐   ┌─────────────┐
│ MetadataServices │  │ MonitoringServices │ │  Pool  │   │ AuthGateway │
│      (mds)       │  │        (ms)        │ │ (pool) │   │    (ag)     │
└────────┬─────────┘  └────────────────────┘ └────────┘   └──────┬──────┘
         │                                                       │
         ├─────────────────────┬──────────────────┐              │
         ▼                     ▼                  ▼              │
┌──────────────┐         ┌───────────┐      ┌────────────┐       │
│ QueryService │         │ E6Catalog │      │ Governance │       │
│     (qs)     │         │  (e6cat)  │      │    (gov)   │       │
└──────┬───────┘         └─────┬─────┘      └────────────┘       │
       │                       │                                 │
       ▼                       ├──────────────────────┐          │
┌──────────────┐               ▼                      ▼          │
│ TrafficInfra │      ┌────────────────┐    ┌─────────────────┐ │
│     (ti)     │      │ CatalogRefresh │    │ CatalogRefresh- │ │
└──────────────┘      │      (cr)      │    │ Schedule (crs)  │ │
        ▲             └────────────────┘    └─────────────────┘ │
        └────────────────────────────────────────────────────────┘
                         (AuthGateway routes to Envoy)

┌───────────────────┐
│ GreptimeDBCluster │  (Independent)
│       (gdb)       │
└───────────────────┘
```
## NamespaceConfig
API Version: e6data.io/v1alpha1 | Short Names: nsconfig, nsc
Provides shared infrastructure settings for a namespace (cloud, storage, scheduling).
Key Fields:
```yaml
spec:
  cloud: string               # AWS, GCP, AZURE (auto-detected if not set)
  storageBackend: string      # s3a://, gs://, or abfs://
  s3Endpoint: string          # For S3-compatible storage
  imageRepository: string     # Container registry path
  imagePullSecrets: []string  # Private registry secrets
  tolerations: [...]          # Pod tolerations
  nodeSelector: {...}         # Node labels
  serviceAccounts:
    data: string              # For MDS/QS pods
    monitoring: string        # For MonitoringServices pods
  defaultPoolRef: {...}       # Default pool for QueryServices
```
Status Phases: Pending → Ready | Error
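For orientation, here is a minimal example manifest; the resource name, namespace, bucket, registry path, and service account names are all illustrative, not defaults:

```yaml
apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig
metadata:
  name: workspace-config              # illustrative name
  namespace: workspace-prod           # illustrative namespace
spec:
  cloud: AWS
  storageBackend: s3a://my-e6data-bucket/       # illustrative bucket
  imageRepository: registry.example.com/e6data  # illustrative registry
  serviceAccounts:
    data: e6-data-sa                  # illustrative service account names
    monitoring: e6-monitoring-sa
```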
## MetadataServices
API Version: e6data.io/v1alpha1 | Short Names: mds, metadata
Manages storage and schema services for metadata caching and schema inference. Inherits namespace-wide settings from NamespaceConfig.
Key Fields:
```yaml
spec:
  workspace: string           # Required: Workspace name
  tenant: string              # Required: Tenant identifier
  storageBackend: string      # Required: s3a://, gs://, or abfs://
  storage:
    imageTag: string          # Required: Storage service version
    resources: {...}          # Required: memory, cpu
  schema:
    imageTag: string          # Required: Schema service version
    resources: {...}          # Required: memory, cpu
```
Status Phases: Pending → Creating → Running | Updating | Failed | Degraded
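A minimal example manifest; names, tags, and resource values are illustrative, and the cpu/memory keys directly under `resources` follow the shape noted in the field comments above:

```yaml
apiVersion: e6data.io/v1alpha1
kind: MetadataServices
metadata:
  name: my-metadata
  namespace: workspace-prod
spec:
  workspace: prod
  tenant: acme
  storageBackend: s3a://my-e6data-bucket/
  storage:
    imageTag: "1.2.0"
    resources:
      cpu: "2"            # illustrative sizing
      memory: 8Gi
  schema:
    imageTag: "1.2.0"
    resources:
      cpu: "1"
      memory: 4Gi
```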
## QueryService
API Version: e6data.io/v1alpha1 | Short Names: qs, cluster
Deploys query execution components: Planner, Queue, and Executor. Traffic routing is handled by TrafficInfra (Envoy + xDS).
Key Fields:
```yaml
spec:
  alias: string               # Required: Cluster alias
  workspace: string           # Required: Must match MetadataServices
  planner: {...}              # Required: Query planner config
  queue: {...}                # Required: Queue/coordinator config
  executor:                   # Required: Worker config
    replicas: int
    autoscaling: {...}        # Optional: Enable autoscaling
  poolRef: {...}              # Optional: Reference to Pool
```
Status Phases: Waiting → Deploying → Ready | Updating | Failed | Degraded
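A minimal example manifest; the planner/queue sub-fields and the poolRef shape are not spelled out above, so everything inside them below is a placeholder, not a documented default:

```yaml
apiVersion: e6data.io/v1alpha1
kind: QueryService
metadata:
  name: my-cluster
  namespace: workspace-prod
spec:
  alias: analytics
  workspace: prod                         # must match the MetadataServices workspace
  planner:
    resources: {cpu: "2", memory: 8Gi}    # placeholder: exact planner fields depend on your release
  queue:
    resources: {cpu: "1", memory: 4Gi}    # placeholder
  executor:
    replicas: 4
  poolRef:
    name: burst-pool                      # placeholder: assuming a reference by name
```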
## TrafficInfra
API Version: e6data.io/v1alpha2 | Short Names: ti
Manages traffic infrastructure: xDS control plane and Envoy proxy for gRPC routing.
Key Fields:
```yaml
spec:
  xds:
    replicas: int             # xDS control plane replicas (default: 2)
    image: {...}              # Optional: Custom image
  discovery:
    services: []string        # Optional: Auto-registered by operator
  envoy:
    replicas: int             # Envoy proxy replicas (default: 2)
    maxReplicas: int          # Max replicas for HPA
    hpa:
      enabled: bool           # Enable autoscaling
    service:
      type: string            # LoadBalancer, ClusterIP, NodePort
```
Status Phases: Pending → Deploying → Ready | Degraded | Failed
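A minimal example manifest using only the fields documented above (names and replica counts are illustrative):

```yaml
apiVersion: e6data.io/v1alpha2
kind: TrafficInfra
metadata:
  name: traffic
  namespace: workspace-prod
spec:
  xds:
    replicas: 2
  envoy:
    replicas: 2
    maxReplicas: 6
    hpa:
      enabled: true
    service:
      type: LoadBalancer
```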
## AuthGateway
API Version: e6data.io/v1alpha1 | Short Names: ag
Manages a Pomerium-based authentication gateway for protecting services.
Key Fields:
```yaml
spec:
  domain: string                 # Required: Base domain (e.g., e6data.io)
  replicas: int                  # Pomerium replicas (default: 2)
  authentication:
    enabled: bool                # false = passthrough mode
    idp:
      provider: string           # google, okta, azure, github, oidc
      credentialsSecretRef: {...}
    policy:
      allowedDomains: []string   # Allowed email domains
  services:
    - name: string               # Service route name
      subdomain: string          # e.g., "query" → query.domain.com
      backend:
        serviceName: string      # Backend K8s service
        servicePort: int
```
Status Phases: Pending → Deploying → Ready | Degraded | Failed
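A minimal example manifest; the credentialsSecretRef shape is assumed to be a secret reference by name, and the backend service name and port are illustrative:

```yaml
apiVersion: e6data.io/v1alpha1
kind: AuthGateway
metadata:
  name: auth
  namespace: workspace-prod
spec:
  domain: e6data.io
  replicas: 2
  authentication:
    enabled: true
    idp:
      provider: google
      credentialsSecretRef:
        name: idp-credentials          # assumption: secret reference by name
    policy:
      allowedDomains: ["example.com"]
  services:
    - name: query
      subdomain: query                 # → query.e6data.io
      backend:
        serviceName: my-cluster-envoy  # illustrative backend Service
        servicePort: 8080
```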
## E6Catalog
API Version: e6data.io/v1alpha1 | Short Names: e6cat
Registers external data catalogs (Hive, Glue, Unity, Iceberg, Delta) with the storage service.
Key Fields:
```yaml
spec:
  catalogType: string           # Required: HIVE|GLUE|UNITY|ICEBERG|DELTA
  metadataServicesRef: string   # Required: MetadataServices name
  connectionMetadata:
    catalogConnection:
      hiveConnection: {...}     # For HIVE
      glueConnection: {...}     # For GLUE
      unityConnection: {...}    # For UNITY
      icebergConnection: {...}  # For ICEBERG
      deltaConnection: {...}    # For DELTA
```
Status Phases: Waiting → Creating → Ready | Updating | Failed
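A minimal example registering a Glue catalog; the fields inside glueConnection are placeholders, since their exact shape is not listed above:

```yaml
apiVersion: e6data.io/v1alpha1
kind: E6Catalog
metadata:
  name: sales-glue
  namespace: workspace-prod
spec:
  catalogType: GLUE
  metadataServicesRef: my-metadata
  connectionMetadata:
    catalogConnection:
      glueConnection:
        region: us-east-1       # placeholder field
```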
## CatalogRefresh
API Version: e6data.io/v1alpha1 | Short Names: cr, catalogref
Triggers a one-time metadata refresh on an E6Catalog (like a Kubernetes Job).
Key Fields:
```yaml
spec:
  e6CatalogRef:
    name: string               # Required: E6Catalog name
  refreshType: string          # Required: full|delta
  databases: []string          # Optional: Specific databases
  timeout: string              # Optional: Default 30m
```
Status Phases: Pending → Running → Succeeded | PartialSuccess | Failed | TimedOut
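A minimal example using only the fields documented above (catalog and database names are illustrative):

```yaml
apiVersion: e6data.io/v1alpha1
kind: CatalogRefresh
metadata:
  name: sales-glue-refresh
  namespace: workspace-prod
spec:
  e6CatalogRef:
    name: sales-glue
  refreshType: delta
  databases: ["sales"]
  timeout: 30m
```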
## CatalogRefreshSchedule
API Version: e6data.io/v1alpha1 | Short Names: crs, refreshschedule
Creates recurring catalog refreshes using cron syntax (like a Kubernetes CronJob).
Key Fields:
```yaml
spec:
  e6CatalogRef:
    name: string                       # Required: E6Catalog name
  schedule: string                     # Required: Cron expression
  refreshType: string                  # Optional: full|delta (default: delta)
  concurrencyPolicy: string            # Optional: Forbid|Allow|Replace
  successfulRefreshHistoryLimit: int   # Optional: Default 3
  failedRefreshHistoryLimit: int       # Optional: Default 1
```
Common Schedules:

- `"0 2 * * *"` - Daily at 2 AM
- `"*/30 * * * *"` - Every 30 minutes
- `"0 0 * * 0"` - Weekly on Sunday
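A minimal example that refreshes nightly at 2 AM (resource and catalog names are illustrative):

```yaml
apiVersion: e6data.io/v1alpha1
kind: CatalogRefreshSchedule
metadata:
  name: sales-glue-nightly
  namespace: workspace-prod
spec:
  e6CatalogRef:
    name: sales-glue
  schedule: "0 2 * * *"        # daily at 2 AM
  refreshType: delta
  concurrencyPolicy: Forbid
```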
## Pool
API Version: e6data.io/v1alpha1 | Short Names: pool
Provides shared compute resources for burst capacity across multiple QueryServices.
Key Fields:
```yaml
spec:
  minExecutors: int             # Baseline capacity
  maxExecutors: int             # Required: Maximum capacity
  executorsPerNode: int         # Optional: Default 1
  instanceConfig:
    instanceType: string        # Explicit instance type
    spotEnabled: bool           # Use spot instances
  queryServiceSelector: {...}   # Label selector for allowed QS
  # OR
  allowedQueryServices: []      # Explicit allow list
```
Status Phases: Pending → Creating → Active | Suspended | Failed
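A minimal example; the queryServiceSelector is assumed to follow the standard Kubernetes label-selector shape, and the instance type and labels are illustrative:

```yaml
apiVersion: e6data.io/v1alpha1
kind: Pool
metadata:
  name: burst-pool
  namespace: workspace-prod
spec:
  minExecutors: 2
  maxExecutors: 20
  executorsPerNode: 1
  instanceConfig:
    instanceType: r6i.2xlarge    # illustrative instance type
    spotEnabled: true
  queryServiceSelector:
    matchLabels:                 # assumption: standard label selector
      team: analytics
```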
## Governance
API Version: e6data.io/v1alpha1 | Short Names: gov, governance
Defines data access control policies (access grants, column masking, row filtering).
Key Fields:
```yaml
spec:
  catalogName: string       # Required: Target catalog
  policies:
    - name: string          # Policy name
      type: string          # GRANT_ACCESS|COLUMN_MASKING|ROW_FILTERING
      resources: [...]      # Target resources
      users: []string       # Target users
      groups: []string      # Target groups
      allow: bool           # For GRANT_ACCESS
      maskType: string      # For COLUMN_MASKING
      rowFilter: string     # For ROW_FILTERING
```
Status Phases: Pending → Syncing → Synced | Failed
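A sketch of a column-masking policy; the resources entry format and the maskType value are placeholders, since their exact formats are not listed above:

```yaml
apiVersion: e6data.io/v1alpha1
kind: Governance
metadata:
  name: sales-policies
  namespace: workspace-prod
spec:
  catalogName: sales-glue
  policies:
    - name: mask-pii
      type: COLUMN_MASKING
      resources: ["sales.customers.email"]   # placeholder resource format
      groups: ["analysts"]
      maskType: HASH                         # placeholder mask type
```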
## GreptimeDBCluster
API Version: e6data.io/v1alpha1 | Short Names: gdb, greptime
Deploys a GreptimeDB time-series database cluster for query history and metrics.
Key Fields:
```yaml
spec:
  storage:
    backend: string         # Required: s3|gcs|azure
    bucket: string          # Required: Bucket name
    region: string          # Required: Cloud region
  frontend:
    replicas: int           # Query endpoints
  datanode:
    replicas: int           # Data storage
  meta:
    replicas: int           # Coordination (1 or 3)
  etcd:
    replicas: int           # Metadata (1 or 3)
  grafana:
    enabled: bool           # Optional visualization
```
Status Phases: Pending → Initializing → Running | Degraded | Failed
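A minimal example using only the fields documented above (bucket, region, and replica counts are illustrative):

```yaml
apiVersion: e6data.io/v1alpha1
kind: GreptimeDBCluster
metadata:
  name: querystats
  namespace: workspace-prod
spec:
  storage:
    backend: s3
    bucket: my-greptime-bucket   # illustrative bucket
    region: us-east-1
  frontend:
    replicas: 2
  datanode:
    replicas: 3
  meta:
    replicas: 3
  etcd:
    replicas: 3
  grafana:
    enabled: true
```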
## MonitoringServices
API Version: e6data.io/v1alpha2 | Short Names: ms, monitoring
Deploys Vector-based log and metrics collection infrastructure.
Key Fields:
```yaml
spec:
  workspace: string         # Required: Workspace name
  storageBackend: string    # Required: Object storage path
  logs:
    enabled: bool           # Enable pod log collection
    selector: {...}         # Pod selector for log collection
  metrics:
    enabled: bool           # Enable metrics scraping
    selector: {...}         # Pod selector for metrics
  vector:
    image: {...}            # Vector container image
    resources: {...}        # CPU/Memory limits
  sinks:
    s3: {...}               # S3 sink configuration
    greptimeRef: {...}      # GreptimeDB sink reference
```
Status Phases: Pending → Creating → Running | Failed | Degraded
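A minimal example; the selector and greptimeRef shapes are assumptions (a standard label selector and a reference by name, respectively), and all names are illustrative:

```yaml
apiVersion: e6data.io/v1alpha2
kind: MonitoringServices
metadata:
  name: monitoring
  namespace: workspace-prod
spec:
  workspace: prod
  storageBackend: s3a://my-e6data-bucket/observability/
  logs:
    enabled: true
    selector:
      matchLabels:           # assumption: standard label selector
        app: e6data
  metrics:
    enabled: true
  sinks:
    greptimeRef:
      name: querystats       # assumption: reference by name to a GreptimeDBCluster
```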
## ReleaseManager
API Version: e6data.io/v1alpha1 | Short Names: releases, rm
Stores release version information for different products. Used for GitOps-based release management and showing available upgrades.
Key Fields:
```yaml
spec:
  version: string             # Required: Semantic version (e.g., "1.2.0")
  product: string             # Required: engine|laminar|copilot|platform
  subType: string             # For engine: metadata|cluster
  description: string         # Human-readable description
  releaseNotes: string        # Changelog
  releaseDate: time           # When released
  deprecated: bool            # Mark as deprecated
  minUpgradeVersion: string   # Minimum version that can upgrade to this
  engine:                     # For engine product
    cluster:                  # QueryService components
      planner: {...}          # Image, resources, config
      queue: {...}
      executor: {...}
    metadata:                 # MetadataServices components
      storage: {...}
      schema: {...}
  platform:                   # For platform product
    console: {...}
    apiServer: {...}
```
Component Release Spec:
```yaml
image:
  repository: string            # Container registry path
  name: string                  # Required: Image name
  tag: string                   # Required: Image tag
  digest: string                # Optional: Immutable reference
resources:
  minimum: {cpu, memory}        # Minimum required
  recommended: {cpu, memory}    # Optimal performance
  maximum: {cpu, memory}        # Limits
configVariables: {...}          # Default config.properties
environmentVariables: {...}     # Default env vars
affinity: {...}                 # Recommended affinity rules
featureFlags: {...}             # Enabled/disabled features
breaking: bool                  # Has breaking changes
breakingChanges: []string       # Description of breaking changes
```
Status Phases: Active | Deprecated | Superseded | Testing
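A sketch of an engine release entry; the version, image coordinates, and description are illustrative, and the planner block follows the component release spec above:

```yaml
apiVersion: e6data.io/v1alpha1
kind: ReleaseManager
metadata:
  name: engine-1-2-0
spec:
  version: "1.2.0"
  product: engine
  subType: cluster
  description: "Engine 1.2.0 cluster release"
  engine:
    cluster:
      planner:
        image:
          name: planner         # illustrative image name
          tag: "1.2.0"
        resources:
          recommended: {cpu: "2", memory: 8Gi}   # illustrative sizing
```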
## Quick Reference Commands

```bash
# List all CRDs
kubectl get crd | grep e6data

# Get all resources of a type
kubectl get nsconfig -A        # NamespaceConfig
kubectl get mds -A             # MetadataServices
kubectl get qs -A              # QueryService
kubectl get ti -A              # TrafficInfra
kubectl get e6cat -A           # E6Catalog
kubectl get catalogrefresh -A  # CatalogRefresh
kubectl get crs -A             # CatalogRefreshSchedule
kubectl get pool -A            # Pool
kubectl get gov -A             # Governance
kubectl get gdb -A             # GreptimeDBCluster
kubectl get ms -A              # MonitoringServices
kubectl get releases -A        # ReleaseManager

# Watch status
kubectl get mds -w
kubectl get qs -w

# Describe with events
kubectl describe mds my-metadata -n workspace-prod
kubectl describe qs my-cluster -n workspace-prod

# Get YAML output
kubectl get mds my-metadata -o yaml
kubectl get qs my-cluster -o yaml

# Check operator logs
kubectl logs -n e6-operator-system -l app=e6-operator --tail=100
```
## Version Compatibility
| Operator Version | CRD API Version | Kubernetes | Karpenter |
|---|---|---|---|
| 1.0.x | v1alpha1 | 1.24+ | 0.32+ |
| 1.1.x | v1alpha1 | 1.25+ | 0.37+ |
## Migration Notes
### From Helm Charts
If migrating from standalone Helm charts to the operator:
1. Export existing values from the Helm releases
2. Create equivalent CRs with the same configuration
3. Apply the CRs to let the operator take ownership
4. Delete the Helm releases (the operator manages the resources now)
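A sketch of these steps; the release, namespace, and file names are illustrative:

```bash
# 1. Export existing values from the Helm release
helm get values my-metadata-release -n workspace-prod > old-values.yaml

# 2. Write an equivalent CR (e.g., a MetadataServices manifest) from those
#    values, then apply it so the operator takes ownership
kubectl apply -f metadata-services.yaml

# 3. Once the CR reports a healthy phase, remove the Helm release
helm uninstall my-metadata-release -n workspace-prod
```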
### API Version Upgrades

When upgrading between API versions:

1. Check release notes for breaking changes
2. Backup existing CRs:
   ```bash
   kubectl get mds -o yaml > backup.yaml
   ```
3. Apply any required migrations
4. Upgrade the operator Helm chart