[AWS] Applying Karpenter to EKS

  • What is Karpenter?
    • An open-source tool that manages automatic node scaling for Kubernetes clusters, similar to Cluster Autoscaler
  • How Karpenter works
    • A new Pod is created, either scaled out by the HPA or through a redeployment.
    • Kube-scheduler tries to assign the new Pod to an existing worker node.
    • If the existing worker nodes lack the resources for the new Pod, it goes into the Pending state (a quick way to list such Pods is shown after this list).
    • Karpenter detects the Pending Pod and provisions a new worker node.
    • Kube-scheduler places the Pending Pod onto the newly created worker node.
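For reference, a quick way to list the Pending Pods that Karpenter reacts to (an optional check, not part of the original steps):

# Pods stuck in Pending are the trigger for Karpenter to provision a node
kubectl get pods --all-namespaces --field-selector=status.phase=Pending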

 

Hands-on

  • Create the Node IAM Role
    • Create Karpenter's node IAM Role and attach the required policies
cat << EOF > node-trust-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ec2.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
EOF

CLUSTER_NAME=ydy-eks

aws iam create-role --role-name "KarpenterNodeRole-${CLUSTER_NAME}" \
    --assume-role-policy-document file://node-trust-policy.json
	
aws iam attach-role-policy --role-name "KarpenterNodeRole-${CLUSTER_NAME}" \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy

aws iam attach-role-policy --role-name "KarpenterNodeRole-${CLUSTER_NAME}" \
    --policy-arn arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy

aws iam attach-role-policy --role-name "KarpenterNodeRole-${CLUSTER_NAME}" \
    --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly

aws iam attach-role-policy --role-name "KarpenterNodeRole-${CLUSTER_NAME}" \
    --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
  • Attach the created IAM Role to an EC2 instance profile
aws iam create-instance-profile \
    --instance-profile-name "KarpenterNodeInstanceProfile-${CLUSTER_NAME}"

aws iam add-role-to-instance-profile \
    --instance-profile-name "KarpenterNodeInstanceProfile-${CLUSTER_NAME}" \
    --role-name "KarpenterNodeRole-${CLUSTER_NAME}"
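Optionally, verify that the role is now attached to the instance profile (a verification step added here as a sketch):

# Confirm the instance profile contains KarpenterNodeRole
aws iam get-instance-profile \
    --instance-profile-name "KarpenterNodeInstanceProfile-${CLUSTER_NAME}"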
  • Create the Controller IAM Role
    • Create the trust policy and the IAM Role
AWS_PARTITION=aws
AWS_REGION=ap-northeast-2        # referenced later in controller-policy.json
AWS_ACCOUNT_ID=759320821027
OIDC_ENDPOINT=https://oidc.eks.ap-northeast-2.amazonaws.com/id/B6A0F899A0158143FA9FCC8A85D3CFA5

cat << EOF > controller-trust-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_ENDPOINT#*//}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "${OIDC_ENDPOINT#*//}:aud": "sts.amazonaws.com",
                    "${OIDC_ENDPOINT#*//}:sub": "system:serviceaccount:karpenter:karpenter"
                }
            }
        }
    ]
}
EOF

aws iam create-role --role-name KarpenterControllerRole-${CLUSTER_NAME} \
    --assume-role-policy-document file://controller-trust-policy.json
  • How to find your cluster's OIDC issuer
aws eks describe-cluster --name ydy-eks --query "cluster.identity.oidc.issuer" --output text
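If you prefer, the issuer can be captured directly into the OIDC_ENDPOINT variable used above instead of hard-coding it (an equivalent alternative):

# Set OIDC_ENDPOINT straight from the cluster description
OIDC_ENDPOINT=$(aws eks describe-cluster --name ${CLUSTER_NAME} \
    --query "cluster.identity.oidc.issuer" --output text)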
  • Create the IAM policy for the Karpenter controller and attach it to the role
cat << EOF > controller-policy.json
{
    "Statement": [
        {
            "Action": [
                "ssm:GetParameter",
                "ec2:DescribeImages",
                "ec2:RunInstances",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeInstances",
                "ec2:DescribeInstanceTypes",
                "ec2:DescribeInstanceTypeOfferings",
                "ec2:DescribeAvailabilityZones",
                "ec2:DeleteLaunchTemplate",
                "ec2:CreateTags",
                "ec2:CreateLaunchTemplate",
                "ec2:CreateFleet",
                "ec2:DescribeSpotPriceHistory",
                "pricing:GetProducts"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "Karpenter"
        },
        {
            "Action": "ec2:TerminateInstances",
            "Condition": {
                "StringLike": {
                    "ec2:ResourceTag/karpenter.sh/provisioner-name": "*"
                }
            },
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "ConditionalEC2Termination"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}",
            "Sid": "PassNodeIAMRole"
        },
        {
            "Effect": "Allow",
            "Action": "eks:DescribeCluster",
            "Resource": "arn:${AWS_PARTITION}:eks:${AWS_REGION}:${AWS_ACCOUNT_ID}:cluster/${CLUSTER_NAME}",
            "Sid": "EKSClusterEndpointLookup"
        }
    ],
    "Version": "2012-10-17"
}
EOF

aws iam put-role-policy --role-name KarpenterControllerRole-${CLUSTER_NAME} \
    --policy-name KarpenterControllerPolicy-${CLUSTER_NAME} \
    --policy-document file://controller-policy.json
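To double-check that the inline policy was attached (optional verification):

# List inline policies on the controller role; the KarpenterControllerPolicy should appear
aws iam list-role-policies --role-name KarpenterControllerRole-${CLUSTER_NAME}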
  • Tag the subnets Karpenter will use
    • Use a for loop to add the tag to every subnet used by the cluster's node groups
for NODEGROUP in $(aws eks list-nodegroups --cluster-name ${CLUSTER_NAME} \
    --query 'nodegroups' --output text); do aws ec2 create-tags \
    --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}" \
    --resources $(aws eks describe-nodegroup --cluster-name ${CLUSTER_NAME} \
    --nodegroup-name $NODEGROUP --query 'nodegroup.subnets' --output text )
done
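You can confirm the tag landed on the subnets (optional check):

# Show subnets carrying the karpenter.sh/discovery tag
aws ec2 describe-subnets \
    --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER_NAME}" \
    --query "Subnets[].SubnetId" --output text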

  • Tag the security groups Karpenter will use
    • These commands only tag the security groups of the cluster's first node group
NODEGROUP=$(aws eks list-nodegroups --cluster-name ${CLUSTER_NAME} \
    --query 'nodegroups[0]' --output text)

LAUNCH_TEMPLATE=$(aws eks describe-nodegroup --cluster-name ${CLUSTER_NAME} \
    --nodegroup-name ${NODEGROUP} --query 'nodegroup.launchTemplate.{id:id,version:version}' \
    --output text | tr -s "\t" ",")

# If your EKS setup is configured to use only the cluster security group, execute:
SECURITY_GROUPS=$(aws eks describe-cluster \
    --name ${CLUSTER_NAME} --query "cluster.resourcesVpcConfig.clusterSecurityGroupId" --output text)

# If your setup uses the security groups from the managed node group's launch template, execute:
SECURITY_GROUPS=$(aws ec2 describe-launch-template-versions \
    --launch-template-id ${LAUNCH_TEMPLATE%,*} --versions ${LAUNCH_TEMPLATE#*,} \
    --query 'LaunchTemplateVersions[0].LaunchTemplateData.[NetworkInterfaces[0].Groups||SecurityGroupIds]' \
    --output text)

aws ec2 create-tags \
    --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}" \
    --resources ${SECURITY_GROUPS}
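And likewise confirm the tag on the security groups (optional check):

# Show security groups carrying the karpenter.sh/discovery tag
aws ec2 describe-security-groups \
    --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER_NAME}" \
    --query "SecurityGroups[].GroupId" --output text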

  • Update the aws-auth ConfigMap
    • Allow nodes that use the IAM Role we just created to join the cluster
kubectl edit configmap aws-auth -n kube-system
Add the second mapRoles entry (the KarpenterNodeRole block) as shown in the example below:

apiVersion: v1
data:
  mapRoles: |
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::759320821027:role/eksctl-ydy-eks-nodegroup-ydy-eks-n-NodeInstanceRole-nZTyJ6uDTSaB
      username: system:node:{{EC2PrivateDNSName}}
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:aws:iam::759320821027:role/KarpenterNodeRole-ydy-eks
      username: system:node:{{EC2PrivateDNSName}}
kind: ConfigMap
metadata:
  creationTimestamp: "2024-07-29T05:01:36Z"
  name: aws-auth
  namespace: kube-system
  resourceVersion: "919026"
  uid: 650287a9-dc63-4565-8e9c-abb933b9ca49
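If eksctl is installed, the identity mappings can be confirmed as follows (optional check):

# Both the node group role and KarpenterNodeRole should be listed
eksctl get iamidentitymapping --cluster ${CLUSTER_NAME}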

  • Deploy Karpenter
    • Karpenter version
KARPENTER_VERSION=v0.31.1
  • Render the Helm chart into karpenter.yaml
helm template karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} --namespace karpenter \
    --set settings.aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
    --set settings.aws.clusterName=${CLUSTER_NAME} \
    --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterControllerRole-${CLUSTER_NAME}" \
    --set controller.resources.requests.cpu=1 \
    --set controller.resources.requests.memory=1Gi \
    --set controller.resources.limits.cpu=1 \
    --set controller.resources.limits.memory=1Gi > karpenter.yaml
  • Create the karpenter namespace
kubectl create ns karpenter
  • Create the Karpenter CRDs (provisioners, awsnodetemplates, machines) and apply the rendered manifest
kubectl create -f \
    https://raw.githubusercontent.com/aws/karpenter/${KARPENTER_VERSION}/pkg/apis/crds/karpenter.sh_provisioners.yaml
kubectl create -f \
    https://raw.githubusercontent.com/aws/karpenter/${KARPENTER_VERSION}/pkg/apis/crds/karpenter.k8s.aws_awsnodetemplates.yaml
kubectl create -f \
    https://raw.githubusercontent.com/aws/karpenter/${KARPENTER_VERSION}/pkg/apis/crds/karpenter.sh_machines.yaml
kubectl apply -f karpenter.yaml
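After applying the manifest, the controller should come up in the karpenter namespace (quick check):

# Wait until the Karpenter controller Pods are Running
kubectl get pods -n karpenter
kubectl rollout status deployment/karpenter -n karpenter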
  • Configure the Karpenter node settings
    • Create Provisioner.yaml and apply it (see the apply command after the manifest below)
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: ydytest-provisioner
spec:
  labels:
    purpose: test
    nodegroup: test

  requirements:
      # Define the instance types; multiple types can be listed at once.
    - key: node.kubernetes.io/instance-type
      operator: In
      values: [ t4g.medium ]
      # Availability zones where worker nodes will be created
    - key: topology.kubernetes.io/zone
      operator: In
      values: [ ap-northeast-2a, ap-northeast-2c ]
      # Choose on-demand and/or spot capacity; both can be listed.
    - key: karpenter.sh/capacity-type
      operator: In
      values: [ on-demand ]
    - key: kubernetes.io/os
      operator: In
      values:
        - linux
    - key: kubernetes.io/arch
      operator: In
      values:
        - arm64
  consolidation:
    enabled: true
  providerRef:
    name: ydytest-node-template
  ttlSecondsUntilExpired: 2592000
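Assuming the manifest above is saved as Provisioner.yaml, apply it:

kubectl apply -f Provisioner.yaml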
  • Create NodeTemplate.yaml and apply it (see the apply command after the manifest below)
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: ydytest-node-template
spec:
  # Select subnets and security groups by the karpenter.sh/discovery tag
  subnetSelector:
    karpenter.sh/discovery: "ydy-eks"
  securityGroupSelector:
    karpenter.sh/discovery: "ydy-eks"
  amiFamily: AL2
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 50G
        volumeType: gp3
        iops: 3000
        throughput: 125
        deleteOnTermination: true
  tags:
    Name: ydy-karpenter-server
    nodegroup-role: worker
    Team: ydy
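Assuming the manifest above is saved as NodeTemplate.yaml, apply it:

kubectl apply -f NodeTemplate.yaml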

  • Test Karpenter
    • Create nginx-scaleout.yaml for the test and apply it (see the apply command after the manifest below)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-to-scaleout
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        service: nginx
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx-to-scaleout
        resources:
          limits:
            cpu: 500m
            memory: 512Mi
          requests:
            cpu: 500m
            memory: 512Mi
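Save the Deployment above as nginx-scaleout.yaml and apply it:

kubectl apply -f nginx-scaleout.yaml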
  • Scale out
kubectl scale --replicas=10 deployment/nginx-to-scaleout

  • Check the Karpenter logs
kubectl logs -n karpenter deployment/karpenter -f

  • Check the Pod and node status (see the commands below)
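A minimal way to watch the result (the node label comes from the Provisioner defined above):

# Pending Pods should be scheduled once Karpenter brings up a new node
kubectl get pods -o wide
# Karpenter-provisioned nodes carry the labels defined in the Provisioner
kubectl get nodes -l purpose=test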