AWS EKS Prerequisites

This guide covers all AWS-specific prerequisites for deploying the e6data Kubernetes Operator on Amazon EKS.


Quick Reference

Requirement          Status        Notes
EKS 1.24+            Required      Kubernetes cluster
OIDC Provider        Required      For IRSA or Pod Identity
Pod Identity Agent   Recommended   Simpler than IRSA (except GreptimeDB)
S3 Bucket            Required      Data lake storage
IAM Policies         Required      Least-privilege access
Karpenter 1.0+       Recommended   ARM64 Graviton instances
AWS Glue             Optional      If using Glue catalog

1. Authentication Options

e6data supports two authentication methods on AWS:

Method              Recommended For                    Notes
EKS Pod Identity    All workloads except GreptimeDB    Simpler setup, no OIDC annotation needed
IRSA                GreptimeDB, legacy clusters        Required for GreptimeDB (uses AWS SDK directly)

EKS Pod Identity is the recommended approach for e6data workloads. It's simpler to configure and doesn't require ServiceAccount annotations.

Note: GreptimeDB requires IRSA because it uses the AWS SDK directly for S3 access.

1.1 EKS Pod Identity (Recommended)

Step 1: Install Pod Identity Agent

# Check if Pod Identity Agent is installed
aws eks describe-addon --cluster-name YOUR_CLUSTER --addon-name eks-pod-identity-agent

# If not installed, add it
aws eks create-addon \
  --cluster-name YOUR_CLUSTER \
  --addon-name eks-pod-identity-agent \
  --addon-version v1.3.2-eksbuild.2

# Wait for addon to be active
aws eks wait addon-active \
  --cluster-name YOUR_CLUSTER \
  --addon-name eks-pod-identity-agent
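
Optionally, verify from inside the cluster that the agent is running; the add-on deploys a DaemonSet named eks-pod-identity-agent in kube-system:

# All agent pods should be Running (one per node)
kubectl get daemonset eks-pod-identity-agent -n kube-system
kubectl get pods -n kube-system | grep eks-pod-identity-agent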

Step 2: Create IAM Role for Pod Identity

Create pod-identity-trust-policy.json:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "pods.eks.amazonaws.com"
      },
      "Action": [
        "sts:AssumeRole",
        "sts:TagSession"
      ]
    }
  ]
}

Create the role:

aws iam create-role \
  --role-name e6data-workspace-role \
  --assume-role-policy-document file://pod-identity-trust-policy.json \
  --description "IAM role for e6data workspace pods"

Step 3: Create Pod Identity Association

# Create association for your workspace namespace
aws eks create-pod-identity-association \
  --cluster-name YOUR_CLUSTER \
  --namespace workspace-prod \
  --service-account analytics-prod \
  --role-arn arn:aws:iam::ACCOUNT_ID:role/e6data-workspace-role

# Verify
aws eks list-pod-identity-associations --cluster-name YOUR_CLUSTER

1.2 IRSA (Required for GreptimeDB)

IRSA is required for GreptimeDB and can be used for other workloads if preferred.

Step 1: Verify OIDC Provider

# Get OIDC issuer URL
OIDC_URL=$(aws eks describe-cluster --name YOUR_CLUSTER \
  --query "cluster.identity.oidc.issuer" --output text)

echo "OIDC URL: $OIDC_URL"

# Extract OIDC ID
OIDC_ID=$(echo $OIDC_URL | cut -d'/' -f5)
echo "OIDC ID: $OIDC_ID"

# Check if provider exists
aws iam list-open-id-connect-providers | grep $OIDC_ID

If the OIDC provider doesn't exist, create it:

eksctl utils associate-iam-oidc-provider \
  --cluster YOUR_CLUSTER \
  --approve

Step 2: Create IRSA Trust Policy

Create irsa-trust-policy.json (replace placeholders):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.REGION.amazonaws.com/id/OIDC_ID"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.REGION.amazonaws.com/id/OIDC_ID:aud": "sts.amazonaws.com",
          "oidc.eks.REGION.amazonaws.com/id/OIDC_ID:sub": "system:serviceaccount:NAMESPACE:SERVICE_ACCOUNT"
        }
      }
    }
  ]
}

For multiple namespaces, use StringLike with wildcards:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.REGION.amazonaws.com/id/OIDC_ID"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "oidc.eks.REGION.amazonaws.com/id/OIDC_ID:sub": "system:serviceaccount:workspace-*:*"
        },
        "StringEquals": {
          "oidc.eks.REGION.amazonaws.com/id/OIDC_ID:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
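
Then create the role with this trust policy, or update the trust policy if the role already exists. The role name below reuses the example from the Pod Identity section; a dedicated role for GreptimeDB works the same way:

# Create a role assumable through the OIDC provider
aws iam create-role \
  --role-name e6data-workspace-role \
  --assume-role-policy-document file://irsa-trust-policy.json

# Or, if the role already exists, replace its trust policy
aws iam update-assume-role-policy \
  --role-name e6data-workspace-role \
  --policy-document file://irsa-trust-policy.json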

Step 3: Annotate ServiceAccount

apiVersion: v1
kind: ServiceAccount
metadata:
  name: analytics-prod
  namespace: workspace-prod
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT_ID:role/e6data-workspace-role"
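
Apply the manifest and confirm the annotation is present (the file name is an example):

kubectl apply -f serviceaccount.yaml

# The role ARN should appear in the output
kubectl get serviceaccount analytics-prod -n workspace-prod -o yaml | grep role-arn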

2. IAM Policies (Least Privilege)

2.1 Base S3 Policy (Required)

Create e6data-s3-policy.json:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3ReadAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:GetObjectTagging",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::YOUR-DATA-BUCKET",
        "arn:aws:s3:::YOUR-DATA-BUCKET/*"
      ]
    },
    {
      "Sid": "S3WriteAccess",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::YOUR-DATA-BUCKET/e6data-cache/*",
        "arn:aws:s3:::YOUR-DATA-BUCKET/e6data-metadata/*"
      ]
    }
  ]
}
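
Before creating the policy, you can optionally lint it with IAM Access Analyzer; this is a sanity check, not a requirement:

# Flags syntax errors and common anti-patterns in the policy document
aws accessanalyzer validate-policy \
  --policy-document file://e6data-s3-policy.json \
  --policy-type IDENTITY_POLICY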

2.2 AWS Glue Policy (If Using Glue Catalog)

Create e6data-glue-policy.json:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GlueCatalogReadAccess",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetTableVersion",
        "glue:GetTableVersions",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchGetPartition",
        "glue:GetCatalogImportStatus"
      ],
      "Resource": [
        "arn:aws:glue:REGION:ACCOUNT_ID:catalog",
        "arn:aws:glue:REGION:ACCOUNT_ID:database/*",
        "arn:aws:glue:REGION:ACCOUNT_ID:table/*/*"
      ]
    }
  ]
}

Restrict to specific databases (recommended for production):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GlueCatalogReadAccess",
      "Effect": "Allow",
      "Action": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetTableVersion",
        "glue:GetTableVersions",
        "glue:GetPartition",
        "glue:GetPartitions",
        "glue:BatchGetPartition"
      ],
      "Resource": [
        "arn:aws:glue:REGION:ACCOUNT_ID:catalog",
        "arn:aws:glue:REGION:ACCOUNT_ID:database/analytics_db",
        "arn:aws:glue:REGION:ACCOUNT_ID:database/sales_db",
        "arn:aws:glue:REGION:ACCOUNT_ID:table/analytics_db/*",
        "arn:aws:glue:REGION:ACCOUNT_ID:table/sales_db/*"
      ]
    }
  ]
}

2.3 GreptimeDB S3 Policy (For MonitoringServices)

GreptimeDB needs write access to its own bucket:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GreptimeS3Access",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::YOUR-GREPTIME-BUCKET",
        "arn:aws:s3:::YOUR-GREPTIME-BUCKET/*"
      ]
    }
  ]
}

2.4 Create and Attach Policies

# Create S3 policy
aws iam create-policy \
  --policy-name e6data-s3-policy \
  --policy-document file://e6data-s3-policy.json

# Create Glue policy (if using Glue)
aws iam create-policy \
  --policy-name e6data-glue-policy \
  --policy-document file://e6data-glue-policy.json

# Attach policies to role
aws iam attach-role-policy \
  --role-name e6data-workspace-role \
  --policy-arn arn:aws:iam::ACCOUNT_ID:policy/e6data-s3-policy

aws iam attach-role-policy \
  --role-name e6data-workspace-role \
  --policy-arn arn:aws:iam::ACCOUNT_ID:policy/e6data-glue-policy
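
If you deploy MonitoringServices with GreptimeDB, create and attach its S3 policy (section 2.3) the same way. A dedicated role for the GreptimeDB ServiceAccount is a reasonable pattern; the file and role names below are examples:

# Create the GreptimeDB S3 policy
aws iam create-policy \
  --policy-name e6data-greptime-s3-policy \
  --policy-document file://e6data-greptime-s3-policy.json

# Attach it to the role assumed by the GreptimeDB ServiceAccount
aws iam attach-role-policy \
  --role-name e6data-greptime-role \
  --policy-arn arn:aws:iam::ACCOUNT_ID:policy/e6data-greptime-s3-policy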

3. Karpenter Setup (Recommended)

Karpenter provides dynamic node provisioning. We recommend ARM64 (Graviton) instances for their better price/performance.

3.1 Install Karpenter

export KARPENTER_VERSION="1.0.0"
export CLUSTER_NAME="YOUR_CLUSTER"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.endpoint" --output text)"

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version ${KARPENTER_VERSION} \
  --namespace karpenter --create-namespace \
  --set settings.clusterName=${CLUSTER_NAME} \
  --set settings.clusterEndpoint=${CLUSTER_ENDPOINT} \
  --set settings.interruptionQueue=${CLUSTER_NAME} \
  --wait

3.2 NodePool for ARM64 (Graviton)

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: e6data-compute
spec:
  template:
    metadata:
      labels:
        e6data.io/node-type: compute
    spec:
      requirements:
        # Prefer ARM64 (Graviton) for cost savings
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        # Use instance families, not specific types
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values:
            - r7g    # Graviton3 memory-optimized (best price/performance)
            - r6g    # Graviton2 memory-optimized
            - m7g    # Graviton3 general purpose
            - m6g    # Graviton2 general purpose
        # Instance size range
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values:
            - 4xlarge
            - 8xlarge
            - 12xlarge
            - 16xlarge

      # Taints for workload isolation
      taints:
        - key: e6data-workspace-name
          value: "prod"
          effect: NoSchedule

      # Node expiry
      expireAfter: 720h  # 30 days

      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: e6data-graviton

  # Resource limits
  limits:
    cpu: 2000
    memory: 8000Gi

  # Disruption settings
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m
    budgets:
      - nodes: "10%"

3.3 EC2NodeClass for Graviton

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: e6data-graviton
spec:
  amiSelectorTerms:
    - alias: al2023@latest

  role: KarpenterNodeRole-YOUR_CLUSTER

  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: YOUR_CLUSTER

  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: YOUR_CLUSTER

  # Block devices
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        throughput: 125
        deleteOnTermination: true
        encrypted: true

  # Instance store for NVMe cache (optional, for instances with local storage)
  instanceStorePolicy: RAID0

  # Metadata options
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required  # IMDSv2

  tags:
    Environment: production
    Team: data-platform
    ManagedBy: karpenter
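
Apply the NodePool and EC2NodeClass and check that both report Ready (file names are examples):

kubectl apply -f e6data-graviton-nodeclass.yaml
kubectl apply -f e6data-compute-nodepool.yaml

kubectl get ec2nodeclass e6data-graviton
kubectl get nodepool e6data-compute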

3.4 NodePool for AMD64 (If ARM64 Not Suitable)

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: e6data-compute-amd64
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values:
            - r7i    # Intel memory-optimized
            - r6i    # Intel memory-optimized
            - r5     # Intel memory-optimized (older)
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values:
            - 4xlarge
            - 8xlarge
            - 12xlarge
            - 16xlarge

      taints:
        - key: e6data-workspace-name
          value: "prod"
          effect: NoSchedule

      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: e6data-amd64

  limits:
    cpu: 1000
    memory: 4000Gi
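
This NodePool references an EC2NodeClass named e6data-amd64. Copying the e6data-graviton EC2NodeClass above under that name is sufficient; no architecture-specific changes are needed, since the al2023@latest AMI alias resolves to the correct image for each instance architecture.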

4. Verification

4.1 Test Pod Identity

# Create a test pod (recent kubectl versions removed --serviceaccount from kubectl run, so set it via --overrides)
kubectl run test-pod-identity --rm -it --restart=Never \
  --namespace=workspace-prod \
  --overrides='{"apiVersion": "v1", "spec": {"serviceAccountName": "analytics-prod"}}' \
  --image=amazon/aws-cli \
  -- sts get-caller-identity

# Expected: Shows assumed role ARN

4.2 Test S3 Access

kubectl run test-s3 --rm -it --restart=Never \
  --namespace=workspace-prod \
  --overrides='{"apiVersion": "v1", "spec": {"serviceAccountName": "analytics-prod"}}' \
  --image=amazon/aws-cli \
  -- s3api list-objects-v2 --bucket YOUR-DATA-BUCKET --max-items 5

4.3 Test Glue Access

kubectl run test-glue --rm -it --restart=Never \
  --namespace=workspace-prod \
  --overrides='{"apiVersion": "v1", "spec": {"serviceAccountName": "analytics-prod"}}' \
  --image=amazon/aws-cli \
  -- glue get-databases

4.4 Verify Karpenter

# Check Karpenter pods
kubectl get pods -n karpenter

# Check NodePools
kubectl get nodepools

# Check EC2NodeClasses
kubectl get ec2nodeclasses

# Watch provisioned nodes
kubectl get nodes -l karpenter.sh/nodepool=e6data-compute -w
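
To confirm end to end that Karpenter can launch nodes for the e6data NodePool, schedule a throwaway pod that targets the pool's label and tolerates its taint (the values below match the example NodePool; adjust them to yours):

# Pending pod that only fits on a node from the e6data-compute NodePool
kubectl run karpenter-test --image=registry.k8s.io/pause:3.9 --restart=Never \
  --overrides='{"apiVersion": "v1", "spec": {"nodeSelector": {"e6data.io/node-type": "compute"}, "tolerations": [{"key": "e6data-workspace-name", "operator": "Equal", "value": "prod", "effect": "NoSchedule"}]}}'

# Watch Karpenter create a NodeClaim and register the node, then clean up
kubectl get nodeclaims -w
kubectl delete pod karpenter-test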

5. Best Practices

5.1 Security

  • Use Pod Identity for most workloads (simpler, more secure)
  • Use IRSA only for GreptimeDB
  • Least privilege: Only grant required S3 buckets and Glue databases
  • IMDSv2: Always use httpTokens: required in EC2NodeClass
  • Encryption: Enable EBS encryption

5.2 Cost Optimization

  • ARM64 (Graviton): typically 20-40% better price/performance than comparable x86 instances
  • Spot instances: Use for fault-tolerant workloads
  • Instance families: Let Karpenter choose optimal size within family
  • Consolidation: Enable WhenEmptyOrUnderutilized for auto-rightsizing

5.3 Performance

  • Instance size: Use 4xlarge or larger for query executors
  • Memory-optimized: Use r-family instances for data workloads
  • NVMe: Use instances with local NVMe for caching (r5d, r6gd)

6. S3 Bucket Setup

e6data requires an S3 bucket for metadata storage. Follow these security best practices when creating the bucket.

6.1 Create Bucket

# Set variables
BUCKET_NAME="e6-workspace-metadata"
REGION="us-east-1"

# Create bucket (us-east-1 doesn't need LocationConstraint)
aws s3api create-bucket \
  --bucket ${BUCKET_NAME} \
  --region ${REGION}

# For other regions, use:
# aws s3api create-bucket \
#   --bucket ${BUCKET_NAME} \
#   --region ${REGION} \
#   --create-bucket-configuration LocationConstraint=${REGION}

6.2 Enable Security Settings

# Block all public access
aws s3api put-public-access-block \
  --bucket ${BUCKET_NAME} \
  --public-access-block-configuration '{
    "BlockPublicAcls": true,
    "IgnorePublicAcls": true,
    "BlockPublicPolicy": true,
    "RestrictPublicBuckets": true
  }'

# Enable server-side encryption (AES256)
aws s3api put-bucket-encryption \
  --bucket ${BUCKET_NAME} \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "AES256"
      }
    }]
  }'
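
Optionally, enable versioning so accidental overwrites or deletions of metadata objects can be recovered:

# Optional: keep previous object versions
aws s3api put-bucket-versioning \
  --bucket ${BUCKET_NAME} \
  --versioning-configuration Status=Enabled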

6.3 Add Bucket Policy

Deny any requests that don't use HTTPS:

aws s3api put-bucket-policy \
  --bucket ${BUCKET_NAME} \
  --policy '{
    "Version": "2012-10-17",
    "Statement": [{
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::'${BUCKET_NAME}'",
        "arn:aws:s3:::'${BUCKET_NAME}'/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }]
  }'

6.4 Optional: VPC Endpoint Restriction

For enhanced security, restrict bucket access to your VPC endpoint. Note that the Deny statement below blocks every request that does not arrive through that endpoint, including console and CLI access from outside the VPC, so verify the endpoint ID before applying:

aws s3api put-bucket-policy \
  --bucket ${BUCKET_NAME} \
  --policy '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "AllowAccessFromVPCEOnly",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
          "arn:aws:s3:::'${BUCKET_NAME}'",
          "arn:aws:s3:::'${BUCKET_NAME}'/*"
        ],
        "Condition": {
          "StringEquals": {
            "aws:SourceVpce": "YOUR_VPC_ENDPOINT_ID"
          }
        }
      },
      {
        "Sid": "DenyOutsideVPCE",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
          "arn:aws:s3:::'${BUCKET_NAME}'",
          "arn:aws:s3:::'${BUCKET_NAME}'/*"
        ],
        "Condition": {
          "StringNotEquals": {
            "aws:SourceVpce": "YOUR_VPC_ENDPOINT_ID"
          }
        }
      }
    ]
  }'

7. EC2NodeClass with NVMe Instance Store

For instances with NVMe instance store (like c7gd, r7gd, i8g), configure the userData script to automatically set up the local storage.

7.1 NVMe RAID0 Configuration

This userData script automatically detects NVMe instance store drives and configures them as RAID0 for optimal performance:

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: e6data-nvme
spec:
  amiSelectorTerms:
    - alias: al2023@latest

  role: KarpenterNodeRole-YOUR_CLUSTER

  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: YOUR_CLUSTER

  securityGroupSelectorTerms:
    - tags:
        aws:eks:cluster-name: YOUR_CLUSTER

  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3

  kubelet:
    maxPods: 18

  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 1
    httpTokens: required

  userData: |
    #!/bin/bash
    mount_location="/app/tmp"
    mkdir -p $mount_location
    yum install nvme-cli -y

    # Check if NVMe instance store drives are present
    if nvme list | grep -q "Amazon EC2 NVMe Instance Storage"; then
        nvme_drives=$(nvme list | grep "Amazon EC2 NVMe Instance Storage" | cut -d " " -f 1 || true)
        readarray -t nvme_drives <<< "$nvme_drives"
        num_drives=${#nvme_drives[@]}

        if [ $num_drives -gt 1 ]; then
            # Multiple NVMe drives - create RAID0 array for maximum performance
            yum install mdadm -y
            mdadm --create /dev/md0 --level=0 --name=md0 --raid-devices=$num_drives "${nvme_drives[@]}"
            mkfs.ext4 /dev/md0
            mount /dev/md0 $mount_location
            mdadm --detail --scan >> /etc/mdadm.conf
            echo /dev/md0 $mount_location ext4 defaults,noatime 0 2 >> /etc/fstab
        else
            # Single NVMe drive - format and mount directly
            for disk in "${nvme_drives[@]}"; do
                mkfs.ext4 -F $disk
                mount $disk $mount_location
                echo $disk $mount_location ext4 defaults,noatime 0 2 >> /etc/fstab
            done
        fi
    else
        echo "No NVMe drives detected. Skipping NVMe configuration."
    fi

    chmod 777 $mount_location

  tags:
    Environment: production
    ManagedBy: karpenter

7.2 NVMe Instance Families

Instance Family   Architecture   NVMe Storage   Use Case
c7gd              ARM64          Yes            Compute-intensive workloads
r7gd              ARM64          Yes            Memory-intensive workloads
i8g               ARM64          Yes            Storage-intensive workloads
c6gd              ARM64          Yes            General compute
r6gd              ARM64          Yes            General memory
m7gd              ARM64          Yes            Balanced workloads

7.3 NodePool for NVMe Instances

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: e6data-nvme-pool
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values:
            - c7gd
            - r7gd
            - i8g
            - m7gd
        - key: karpenter.k8s.aws/instance-size
          operator: NotIn
          values:
            - metal
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - spot
            - on-demand

      taints:
        - key: workspace-name
          value: "YOUR_WORKSPACE"
          effect: NoSchedule

      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: e6data-nvme

  limits:
    cpu: 5000

  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 30s
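
Apply the EC2NodeClass and NodePool, then, once a node has been provisioned, confirm the instance store is mounted at /app/tmp (file names and NODE_NAME are placeholders):

kubectl apply -f e6data-nvme-nodeclass.yaml
kubectl apply -f e6data-nvme-nodepool.yaml

# From a node debug shell, the RAID0 (or single-drive) filesystem should be mounted at /app/tmp
kubectl debug node/NODE_NAME -it --image=amazonlinux:2023 -- chroot /host df -h /app/tmp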