NamespaceConfig

API Version: e6data.io/v1alpha1
Kind: NamespaceConfig
Short Names: nsconfig, nsc


1. Purpose

NamespaceConfig provides shared infrastructure settings for a Kubernetes namespace that are inherited by MetadataServices, QueryService, and MonitoringServices. This CRD centralizes configuration that was previously duplicated across multiple resources:

  • Cloud provider (AWS, GCP, Azure) - auto-detected or explicitly specified
  • Storage backend (S3, GCS, Azure Blob paths)
  • Container registry and image pull secrets
  • Pod scheduling (tolerations, node selectors, affinity)
  • Service accounts for different workload types
  • Default pool reference for QueryServices

Create a NamespaceConfig before creating MetadataServices or QueryService in a namespace. Resources in the namespace will automatically inherit these settings.


2. High-level Behavior

When you create a NamespaceConfig CR, the operator:

  1. Auto-detects the cloud provider from cluster nodes (if spec.cloud is not specified)
  2. Validates configuration consistency (the storage backend must match the cloud provider)
  3. Tracks how many resources inherit from this config
  4. Makes its settings available for inheritance by MetadataServices, QueryService, and MonitoringServices

Configuration Inheritance

Resources in the same namespace automatically inherit these fields from NamespaceConfig:

| Field | Inherited By |
| --- | --- |
| cloud | MetadataServices, QueryService, MonitoringServices |
| storageBackend | MetadataServices, QueryService, MonitoringServices |
| s3Endpoint | MetadataServices, QueryService |
| imageRepository | MetadataServices, QueryService |
| imagePullSecrets | MetadataServices, QueryService, MonitoringServices |
| tolerations | MetadataServices, QueryService, MonitoringServices |
| nodeSelector | MetadataServices, QueryService, MonitoringServices |
| affinity | MetadataServices, QueryService, MonitoringServices |
| karpenterNodePool | MetadataServices, QueryService |
| serviceAccounts.data | MetadataServices, QueryService |
| serviceAccounts.monitoring | MonitoringServices |
| defaultPoolRef | QueryService (if no explicit poolRef) |

Override Behavior: Individual resources can override inherited values by specifying them explicitly in their own spec.
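
The precedence described above can be sketched in a few lines of Python. This is an illustration only, not the operator's code: the function name `effective_settings` and the `BUILTIN_DEFAULTS` dict are hypothetical, and the sketch assumes that an explicit value replaces the inherited one as a whole (for example, a resource's tolerations list is not appended to the inherited list), which the text above does not spell out.

```python
from typing import Any, Mapping

# Hypothetical stand-in for the operator's built-in defaults; the single entry
# is the documented default for imageRepository.
BUILTIN_DEFAULTS = {
    "imageRepository": "us-docker.pkg.dev/e6data-analytics/e6-engine",
}


def effective_settings(
    namespace_config: Mapping[str, Any],
    resource_spec: Mapping[str, Any],
) -> dict:
    """Resolve settings with precedence: resource spec > NamespaceConfig > defaults."""
    resolved = dict(BUILTIN_DEFAULTS)
    # Later layers win; unset (None) values never mask an earlier layer.
    for layer in (namespace_config, resource_spec):
        for key, value in layer.items():
            if value is not None:
                resolved[key] = value
    return resolved
```

With a NamespaceConfig that sets `cloud` and `nodeSelector`, and a resource that sets only `nodeSelector`, the result keeps the inherited `cloud` and uses the resource's own `nodeSelector`.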

Child Resources

NamespaceConfig does not directly create any child resources. It serves as a configuration provider for other CRDs.


3. Spec Reference

3.1 All Fields

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| cloud | string | No | Auto-detected | Cloud provider: AWS, GCP, or AZURE |
| storageBackend | string | No | - | Object storage path (e.g., s3a://bucket, gs://bucket, abfs://container@account) |
| s3Endpoint | string | No | - | Custom S3 endpoint for S3-compatible storage (Linode, Wasabi, MinIO) |
| imageRepository | string | No | us-docker.pkg.dev/e6data-analytics/e6-engine | Container registry path |
| imagePullSecrets | []string | No | [] | Secret names for private container registries |
| tolerations | []Toleration | No | [] | Pod tolerations for scheduling |
| nodeSelector | map[string]string | No | {} | Node labels for pod placement |
| affinity | Affinity | No | nil | Advanced scheduling rules |
| karpenterNodePool | string | No | - | Karpenter NodePool name for node affinity |
| serviceAccounts | ServiceAccountsSpec | No | See 3.2 | ServiceAccount configuration |
| defaultPoolRef | PoolReference | No | - | Default pool for QueryServices |

3.2 ServiceAccounts

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| data | string | No | e6data-data | ServiceAccount for MetadataServices and QueryService (requires cloud IAM permissions) |
| monitoring | string | No | e6data-monitoring | ServiceAccount for MonitoringServices (requires Kubernetes RBAC) |

3.3 PoolReference

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| name | string | Yes | - | Pool resource name |
| namespace | string | No | Same namespace | Pool namespace |
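
As a rough illustration of how a pool reference might be resolved for a QueryService, the sketch below combines two documented rules: defaultPoolRef applies only when the resource has no explicit poolRef, and an omitted namespace means the same namespace. The function name `resolve_pool` and the dict-based inputs are assumptions made for the example, not the operator's API.

```python
from typing import Mapping, Optional, Tuple


def resolve_pool(
    resource_namespace: str,
    pool_ref: Optional[Mapping[str, str]],
    default_pool_ref: Optional[Mapping[str, str]],
) -> Optional[Tuple[str, str]]:
    """Return (namespace, name) of the pool to use, or None if no pool is set."""
    # An explicit poolRef on the resource wins over the NamespaceConfig default.
    ref = pool_ref or default_pool_ref
    if not ref:
        return None
    # An omitted namespace falls back to the resource's own namespace.
    return (ref.get("namespace") or resource_namespace, ref["name"])
```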

4. Example Manifests

4.1 Minimal Example

apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig
metadata:
  name: config
  namespace: workspace-analytics-prod
spec:
  storageBackend: s3a://acme-data-lake

4.2 AWS Production Example

apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig
metadata:
  name: config
  namespace: workspace-analytics-prod
  labels:
    e6data.io/environment: production
spec:
  # Cloud configuration
  cloud: AWS
  storageBackend: s3a://acme-data-lake-prod

  # Container registry
  imageRepository: us-docker.pkg.dev/e6data-analytics/e6-engine
  imagePullSecrets:
    - e6data-registry-secret

  # ServiceAccounts with IAM roles
  serviceAccounts:
    data: analytics-prod-data-sa       # Has S3 read access via IRSA
    monitoring: analytics-prod-mon-sa  # Has K8s RBAC for metrics

  # Pod scheduling
  tolerations:
    - key: "e6data-workspace-name"
      operator: "Equal"
      value: "analytics-prod"
      effect: "NoSchedule"

  nodeSelector:
    e6data-workspace-name: analytics-prod

  karpenterNodePool: metadata-services

  # Default pool for burst capacity
  defaultPoolRef:
    name: burst-pool
    namespace: e6-pools

4.3 GCP Example

apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig
metadata:
  name: config
  namespace: workspace-analytics
spec:
  cloud: GCP
  storageBackend: gs://acme-data-lake

  serviceAccounts:
    data: analytics-gcp-sa      # Has GCS access via Workload Identity
    monitoring: monitoring-sa

  nodeSelector:
    cloud.google.com/gke-nodepool: e6data-pool

4.4 Azure Example

apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig
metadata:
  name: config
  namespace: workspace-analytics
spec:
  cloud: AZURE
  storageBackend: abfs://datalake@acmestorage.dfs.core.windows.net

  serviceAccounts:
    data: analytics-azure-sa      # Has Azure Storage access via Workload Identity
    monitoring: monitoring-sa

  tolerations:
    - key: "kubernetes.azure.com/scalesetpriority"
      operator: "Equal"
      value: "spot"
      effect: "NoSchedule"

4.5 S3-Compatible Storage (Linode/Wasabi/MinIO)

apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig
metadata:
  name: config
  namespace: workspace-analytics
spec:
  cloud: AWS  # Use AWS for S3-compatible storage
  storageBackend: s3a://my-bucket
  s3Endpoint: https://us-east-1.linodeobjects.com

  serviceAccounts:
    data: linode-data-sa
    monitoring: monitoring-sa

5. Status & Lifecycle

5.1 Status Fields

| Field | Type | Description |
| --- | --- | --- |
| phase | string | Current lifecycle phase |
| message | string | Human-readable status message |
| conditions | []Condition | Detailed status conditions |
| observedGeneration | int64 | Last observed spec generation |
| detectedCloud | string | Auto-detected cloud provider (if spec.cloud empty) |
| resourceCounts | ResourceCountsStatus | Resources using this config |

5.2 ResourceCounts

| Field | Type | Description |
| --- | --- | --- |
| metadataServices | int | Number of MetadataServices in namespace |
| queryServices | int | Number of QueryServices in namespace |
| monitoringServices | int | Number of MonitoringServices in namespace |

5.3 Phase Values

| Phase | Description |
| --- | --- |
| Pending | Resource created, validating configuration |
| Ready | Configuration validated and active |
| Error | Configuration validation failed |

CRDs that Inherit from NamespaceConfig

| CRD | Inherited Fields |
| --- | --- |
| MetadataServices | cloud, storageBackend, s3Endpoint, imageRepository, imagePullSecrets, tolerations, nodeSelector, affinity, karpenterNodePool, serviceAccounts.data |
| QueryService | cloud, storageBackend, s3Endpoint, imageRepository, imagePullSecrets, tolerations, nodeSelector, affinity, karpenterNodePool, serviceAccounts.data, defaultPoolRef |
| MonitoringServices | cloud, storageBackend, imagePullSecrets, tolerations, nodeSelector, affinity, serviceAccounts.monitoring |

Lookup Order

Resources look up NamespaceConfig using this priority:

  1. NamespaceConfig named config in same namespace
  2. Any NamespaceConfig in same namespace (if only one exists)
  3. Fall back to defaults if no NamespaceConfig found
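
The lookup priority can be expressed as a small selection function. This is a sketch under stated assumptions: `select_namespace_config` is a hypothetical name, the input is the list of NamespaceConfig names found in the namespace, and the case of several configs with none named `config` (not covered by the text) is treated here as "use defaults".

```python
from typing import Optional, Sequence


def select_namespace_config(names: Sequence[str]) -> Optional[str]:
    """Pick which NamespaceConfig in a namespace to use, or None for defaults."""
    # 1. A NamespaceConfig named "config" always wins.
    if "config" in names:
        return "config"
    # 2. Otherwise a lone NamespaceConfig is used, whatever its name.
    if len(names) == 1:
        return names[0]
    # 3. Nothing usable found (assumption: this includes the ambiguous case
    #    of several configs, none named "config"); fall back to defaults.
    return None
```

Naming the resource `config`, as all the example manifests do, avoids relying on the second and third rules.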


7. Troubleshooting

7.1 Common Issues

NamespaceConfig Not Found

Symptoms:

$ kubectl get mds
NAME             PHASE     MESSAGE
analytics-prod   Pending   Waiting for NamespaceConfig

Cause: No NamespaceConfig exists in the namespace.

Fix:

# Create a NamespaceConfig
kubectl apply -f - <<EOF
apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig
metadata:
  name: config
  namespace: workspace-analytics-prod
spec:
  storageBackend: s3a://your-bucket
EOF

Cloud Detection Failed

Symptoms:

$ kubectl get nsconfig config -o jsonpath='{.status.detectedCloud}'
UNKNOWN

Cause: Operator couldn't detect cloud from node labels/providerID.

Fix: Explicitly specify cloud in spec:

spec:
  cloud: AWS  # or GCP, AZURE
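
To see what the operator has to work with, `kubectl get nodes -o jsonpath='{.items[*].spec.providerID}'` prints each node's providerID. The sketch below shows one plausible way to map those values to a cloud, using the URI schemes the standard cloud integrations set (`aws://`, `gce://`, `azure://`). The operator's actual detection logic is not shown in this document, so the function `detect_cloud` and its rules are assumptions for illustration.

```python
from typing import Iterable

# providerID URI schemes set by the common cloud provider integrations.
PROVIDER_ID_PREFIXES = {
    "aws://": "AWS",
    "gce://": "GCP",
    "azure://": "AZURE",
}


def detect_cloud(provider_ids: Iterable[str]) -> str:
    """Return the cloud implied by the first recognizable providerID."""
    for provider_id in provider_ids:
        for prefix, cloud in PROVIDER_ID_PREFIXES.items():
            if provider_id.startswith(prefix):
                return cloud
    # Mirrors the UNKNOWN value reported in status.detectedCloud.
    return "UNKNOWN"
```

Nodes from local or bare-metal clusters typically carry providerIDs with other schemes or none at all, which is when setting spec.cloud explicitly becomes necessary.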

Storage Backend Mismatch

Symptoms:

Error: spec.storageBackend (s3a://) requires cloud=AWS, got GCP

Cause: Storage backend prefix doesn't match cloud provider.

Fix: Ensure consistency:

  • s3a:// → cloud: AWS
  • gs:// → cloud: GCP
  • abfs:// → cloud: AZURE
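
The same consistency rule can be checked locally before applying a manifest. The following is a minimal sketch, not the operator's validator: `check_storage_backend` is a hypothetical helper, the error text imitates the message shown under Symptoms, and rejecting schemes other than s3a, gs, and abfs is an assumption.

```python
# Storage URI scheme -> cloud value it requires (from the list above).
SCHEME_TO_CLOUD = {"s3a": "AWS", "gs": "GCP", "abfs": "AZURE"}


def check_storage_backend(cloud: str, storage_backend: str) -> None:
    """Raise ValueError if the storage backend scheme does not match the cloud."""
    scheme, separator, _ = storage_backend.partition("://")
    if not separator or scheme not in SCHEME_TO_CLOUD:
        # Assumption: only the three documented schemes are accepted.
        raise ValueError(f"unrecognized storageBackend scheme: {storage_backend!r}")
    required = SCHEME_TO_CLOUD[scheme]
    if cloud != required:
        raise ValueError(
            f"spec.storageBackend ({scheme}://) requires cloud={required}, got {cloud}"
        )
```

Calling `check_storage_backend("GCP", "s3a://acme-data-lake")` raises an error, while `check_storage_backend("AWS", "s3a://acme-data-lake")` returns without one.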

7.2 Useful Commands

# Get NamespaceConfig status
kubectl get nsconfig -A
kubectl get nsconfig config -n workspace-prod -o yaml

# Check detected cloud
kubectl get nsconfig config -n workspace-prod -o jsonpath='{.status.detectedCloud}'

# Check resource counts
kubectl get nsconfig config -n workspace-prod -o jsonpath='{.status.resourceCounts}'

# List all resources inheriting from NamespaceConfig
kubectl get mds,qs,ms -n workspace-prod

# View operator logs
kubectl logs -n e6-operator-system -l app=e6-operator | grep NamespaceConfig

8. Migration Guide

From Per-Resource Configuration

If existing MetadataServices or QueryService resources carry inline cloud, storage, and scheduling config, move the shared settings into a NamespaceConfig:

Before (per-resource):

apiVersion: e6data.io/v1alpha1
kind: MetadataServices
metadata:
  name: analytics-prod
  namespace: workspace-prod
spec:
  workspace: analytics-prod
  tenant: acme
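  cloud: AWS                           # Moves to NamespaceConfig
  storageBackend: s3a://bucket         # Moves to NamespaceConfig
  tolerations: [...]                   # Moves to NamespaceConfig
  nodeSelector: {...}                  # Moves to NamespaceConfig
  serviceAccount: my-sa                # Moves to NamespaceConfig (serviceAccounts.data)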
  # ...

After (NamespaceConfig):

# Step 1: Create NamespaceConfig
apiVersion: e6data.io/v1alpha1
kind: NamespaceConfig
metadata:
  name: config
  namespace: workspace-prod
spec:
  cloud: AWS
  storageBackend: s3a://bucket
  tolerations: [...]
  nodeSelector: {...}
  serviceAccounts:
    data: my-sa
---
# Step 2: Simplified MetadataServices (inherits from NamespaceConfig)
apiVersion: e6data.io/v1alpha1
kind: MetadataServices
metadata:
  name: analytics-prod
  namespace: workspace-prod
spec:
  workspace: analytics-prod
  tenant: acme
  storage:
    imageTag: "3.0.217"
    resources:
      memory: "8Gi"
      cpu: "4"
  schema:
    imageTag: "3.0.217"
    resources:
      memory: "16Gi"
      cpu: "8"

Migration Steps

  1. Create NamespaceConfig with shared settings
  2. Update MetadataServices to remove now-inherited fields
  3. Update QueryService to remove now-inherited fields
  4. Update MonitoringServices to remove now-inherited fields
  5. Verify all resources are running correctly

Note: The operator supports gradual migration: resources with explicit values keep using those values instead of inheriting from NamespaceConfig.
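
When working through steps 2 to 4, it can help to see which inline fields merely repeat the NamespaceConfig and are therefore safe to drop. The sketch below is a hypothetical helper operating on specs already loaded as dicts. The field mapping is taken from the before/after manifests above, so it covers the MetadataServices case only (`serviceAccount` maps to `serviceAccounts.data`); it is not part of the operator or any e6data tooling.

```python
from typing import Any, Mapping, Tuple

# Per-resource field -> path of the matching NamespaceConfig field.
INHERITED_FIELDS = {
    "cloud": ("cloud",),
    "storageBackend": ("storageBackend",),
    "tolerations": ("tolerations",),
    "nodeSelector": ("nodeSelector",),
    "serviceAccount": ("serviceAccounts", "data"),
}


def _lookup(spec: Mapping[str, Any], path: Tuple[str, ...]) -> Any:
    """Follow a key path through nested mappings; None if any key is missing."""
    current: Any = spec
    for key in path:
        if not isinstance(current, Mapping) or key not in current:
            return None
        current = current[key]
    return current


def strip_inherited(resource_spec: Mapping[str, Any], ns_spec: Mapping[str, Any]) -> dict:
    """Copy resource_spec, dropping fields whose value equals the NamespaceConfig's."""
    slim = {}
    for key, value in resource_spec.items():
        path = INHERITED_FIELDS.get(key)
        if path is not None and _lookup(ns_spec, path) == value:
            continue  # identical to the inherited value, so it is redundant
        slim[key] = value
    return slim
```

Fields that differ from the NamespaceConfig are kept, so they continue to act as explicit overrides after migration.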