
AWS EKS Version Update 1.29 to 1.30 via Terraform

  1. Make changes for AWS EKS version 1.30

    Check the Terraform configuration for the following changes:

    • All managed node groups will default to Amazon Linux 2023 as the node OS.
    • The default EBS class changed to gp3, so use gp3 by default to avoid issues.
    • The minimum required IAM policy for the Amazon EKS cluster IAM role now also requires "ec2:DescribeAvailabilityZones".
    • Check for deprecated Kubernetes API versions and replace them wherever they are used (see the sketch below).
    • Check that the versions of the Terraform AWS EKS module and other modules are compatible with the new EKS version.
    • Check the versions of the managed add-ons for the EKS cluster and upgrade them (not applicable in our case, since we use a Helm-chart-based deployment).

    Kubernetes Deprecation API guide: https://kubernetes.io/docs/reference/using-api/deprecation-guide/
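
    A quick pre-upgrade scan for deprecated API usage can help here. This is an illustrative sketch only: it assumes kubectl access to the cluster and, optionally, the third-party Pluto tool; the manifest path is a placeholder.

    # Deprecated API groups that clients have actually called show up in this apiserver metric
    kubectl get --raw /metrics | grep apiserver_requested_deprecated_apis

    # Optionally scan local manifests and installed Helm releases with Pluto
    pluto detect-files -d <path_to_manifests>
    pluto detect-helm -o wide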

  2. The minimum required IAM policy for the Amazon EKS cluster IAM role also requires "ec2:DescribeAvailabilityZones"

    Edit aws_iam_policy_eks in eks/user.tf to include "ec2:DescribeAvailabilityZones" as well.

    The baseline policy requirement was expanded to include this action.

    resource "aws_iam_policy" "aws_iam_policy_eks" { name = "eks-policy-${var.environment}-${data.terraform_remote_state.vpc.outputs.random_id}" path = "/" description = "eks-policy-${var.environment}-${data.terraform_remote_state.vpc.outputs.random_id}" policy = jsonencode({ Version = "2012-10-17" Statement = [ { Action = [ "eks:*", "ec2:DescribeAvailabilityZones", ] Effect = "Allow" Resource = "*" }, ] }) }
    copied
    2
  3. Edit eks/provider.tf to replace a deprecated API version

    In the kubernetes provider's exec block, change api_version from client.authentication.k8s.io/v1beta1 to client.authentication.k8s.io/v1.

    provider "kubernetes" { host = module.eks.cluster_endpoint cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data) exec { api_version = "client.authentication.k8s.io/v1" command = "aws" args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name] } }
    copied
    3
  4. Back up the state files for the EKS cluster before the upgrade

    aws s3api create-bucket --bucket <backup_bucket_name> --region <region_name> \
      --create-bucket-configuration LocationConstraint=<region_name>

    aws s3api put-bucket-versioning --bucket <backup_bucket_name> --region <region_name> \
      --versioning-configuration Status=Enabled

    aws s3 sync <backend_s3_bucket_s3_uri> <backup_bucket_name_s3_uri>

  5. Ensure the OIDC provider URL and Service Account issuer URL are different before upgrading to EKS v1.30

    Before upgrading an EKS cluster to v1.30, verify that the OIDC provider URL used for IAM authentication is different from the Service Account issuer URL. If they are the same, disassociate the identity provider to avoid API server startup failures caused by new validation in Kubernetes v1.30.

    By default both have the same value, an AWS-managed OIDC provider, which leads to version update issues (the Kube API server fails to start).
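
    A rough way to see whether they currently match (a sketch only; it assumes the cluster issuer would be registered as an IAM OIDC provider in the same account, and the detailed commands follow in the sub-steps):

    ISSUER=$(aws eks describe-cluster --region <region_name> --name <cluster_name> \
      --query "cluster.identity.oidc.issuer" --output text)
    # IAM OIDC provider ARNs end with the issuer host/path, so a match means both URLs are the same
    aws iam list-open-id-connect-providers --output text | grep -F "${ISSUER#https://}" \
      && echo "Same issuer: disassociate before upgrading" \
      || echo "OIDC provider differs from the Service Account issuer"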

    5.1. Get the current Service Account issuer URL of the EKS cluster

      This is generated by default when the EKS cluster is created.

      aws eks describe-cluster --region <region_name> --name <cluster_name> \
        --query "cluster.identity.oidc.issuer" --output text

    5.2. List the IAM OIDC provider ARNs for the cluster

      aws iam list-open-id-connect-providers --region <region_name>

    5.3. Back up the IAM OIDC provider configuration (optional)

      aws iam get-open-id-connect-provider \
        --open-id-connect-provider-arn <oidc_provider_arn> \
        --region <region_name> \
        --output json > <filename>

    5.4. List IAM roles that use the OIDC provider

      For the prod cluster, the roles with OIDC usage are:

      • eks-prod-341-alb-ingress
      • eks-prod-341-efs-csi-driver

      aws iam list-roles \
        --query 'Roles[*].{RoleName:RoleName,OIDC_Provider:AssumeRolePolicyDocument.Statement[].Principal.Federated}' \
        --output json | jq .

    5.5. List the identity provider configs for the cluster

      This should come back empty for now.

      aws eks list-identity-provider-configs \
        --region <region_name> \
        --cluster-name <cluster_name>

    5.6. Delete the old IAM OIDC identity provider

      aws iam delete-open-id-connect-provider \
        --open-id-connect-provider-arn <oidc_provider_arn>

    5.7. Create a new OIDC provider using AWS Cognito

      5.7.1. Create a user pool in AWS Cognito

        aws cognito-idp create-user-pool \
          --pool-name <oidc_pool_name> \
          --region <region_name>

      5.7.2. Create an app client in AWS Cognito using the user pool ID from the previously created user pool

        aws cognito-idp create-user-pool-client \
          --user-pool-id <user_pool_id> \
          --client-name eks-client \
          --no-generate-secret \
          --region <region_name>

      5.7.3. Create an IAM OIDC provider using AWS Cognito

        aws iam create-open-id-connect-provider \
          --url <provider_url_cognito> \
          --client-id-list "sts.amazonaws.com" \
          --thumbprint-list $(openssl s_client -servername cognito-idp.us-east-2.amazonaws.com \
            -connect cognito-idp.us-east-2.amazonaws.com:443 </dev/null 2>/dev/null \
            | openssl x509 -fingerprint -sha1 -noout | cut -d"=" -f2 | sed 's/://g') \
          --region <region_name>

      5.7.4. Associate the Cognito OIDC provider with the EKS cluster

        aws eks associate-identity-provider-config \
          --region <region_name> \
          --cluster-name <cluster_name> \
          --oidc identityProviderConfigName="eks-oidc-cognito",issuerUrl=<provider_url_cognito>,clientId=<client_id>

    5.8. Remove the old OIDC provider from the relevant EKS state file

      Use terraform state list | grep "oidc_provider" to find the required state items.

      terraform state rm 'module.eks.aws_iam_openid_connect_provider.oidc_provider[0]'

    5.9. Run terraform import to sync the manually created Cognito user pool into Terraform state

      If you face issues, import by ARN instead; correlate the resource name with the main.tf file.

      terraform import \
        -var environment=<env_name> \
        -var aws_region=<region_name> \
        -var remote_state_bucket=<s3_backend_bucket> \
        -var remote_state_region=<backend_region> \
        aws_cognito_user_pool.eks_user_pool <user_pool_name>

    5.10. Add the following blocks to eks/main.tf, outside the eks module

      data "tls_certificate" "cognito_oidc_thumbprint" { url = "https://cognito-idp.us-east-2.amazonaws.com/${data.aws_cognito_user_pool.eks_user_pool.id}" } data "aws_cognito_user_pool" "eks_user_pool" { user_pool_id = <user_pool_id> } resource "aws_iam_openid_connect_provider" "eks_oidc" { client_id_list = ["sts.amazonaws.com"] thumbprint_list = [data.tls_certificate.cognito_oidc_thumbprint.certificates[0].sha1_fingerprint] url = "https://cognito-idp.us-east-2.amazonaws.com/${data.aws_cognito_user_pool.eks_user_pool.user_pool_id}" } resource "aws_eks_identity_provider_config" "eks_oidc" { cluster_name = module.eks.cluster_name oidc { identity_provider_config_name = "eks-oidc-cognito" issuer_url = "https://${aws_iam_openid_connect_provider.eks_oidc.url}" client_id = <client_id> } depends_on = [aws_iam_openid_connect_provider.eks_oidc] }
      copied
      5.10
    5.11. Import the existing Cognito-EKS association into Terraform

      terraform import \
        -var environment=<env_name> \
        -var aws_region=<region_name> \
        -var remote_state_bucket=<s3_backend_bucket> \
        -var remote_state_region=<backend_region> \
        aws_eks_identity_provider_config.eks_oidc <cluster_name>:<oidc_name_cognito>

    5.12. Add the lines below inside the eks module in eks/main.tf so Terraform does not create the IRSA roles and provider by default

      enable_irsa                = false
      cluster_identity_providers = {}

    5.13. Run a terraform init, plan, and apply cycle for the eks module so the new eks module outputs are propagated

      The two outputs should now differ:

      • cluster_oidc_issuer_url: AWS-based OIDC issuer
      • oidc_provider_arn: Cognito-based provider

      terraform init \
        -backend-config=<backend_s3_bucket> \
        -backend-config=<dynamo_db_lock_name> \
        -backend-config=<statefile_key> \
        -backend-config="encrypt=true" \
        -backend-config="region=us-east-2"

      terraform apply \
        -var environment=<env_name> \
        -var aws_region=<region_name> \
        -var remote_state_bucket=<backend_s3_bucket> \
        -var remote_state_region=<backend_region>
      # Same variable usage for terraform plan as well

    5.14. Run a terraform init, plan, and apply cycle for the eks-services module so the new eks module outputs are used for IAM role creation

      terraform init \
        -backend-config=<backend_s3_bucket> \
        -backend-config=<dynamo_db_lock_name> \
        -backend-config=<statefile_key> \
        -backend-config="encrypt=true" \
        -backend-config="region=us-east-2"

      terraform apply \
        -var environment=<env_name> \
        -var aws_region=<region_name> \
        -var remote_state_bucket=<backend_s3_bucket> \
        -var remote_state_region=<backend_region>
      # Same variable usage for terraform plan as well

  6. Edit eks/variable.tf to set cluster_version for the EKS update from 1.29 to 1.30

    Ensure the version is in double quotes.

    variable "cluster_version" { type = string default = "1.30" }
    copied
    6
  7. Upgrade the AWS EKS cluster to 1.30

    The control plane upgrade takes about 8 minutes, and the worker node update to EKS 1.30 takes another 10-20 minutes.

    terraform apply \
      -var environment=<env_name> \
      -var aws_region=<region_name> \
      -var cluster_version="1.30" \
      -var remote_state_bucket=<backend_s3_bucket> \
      -var remote_state_region=<backend_region>

    7.1. Nodes can be drained manually to pick up the new version immediately

      Easy way: double the desired node count in the node group, then bring it back to the original value.
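
      For example, a rough sketch of both approaches (cluster, node group, and node names are placeholders; the scaling values assume the node group normally runs 2 nodes with a maximum of 4):

      # Option A: temporarily double the desired size so replacement 1.30 nodes come up, then scale back
      aws eks update-nodegroup-config --cluster-name <cluster_name> --nodegroup-name <nodegroup_name> \
        --region <region_name> --scaling-config minSize=2,maxSize=4,desiredSize=4
      # ...wait for the new nodes to join, then revert
      aws eks update-nodegroup-config --cluster-name <cluster_name> --nodegroup-name <nodegroup_name> \
        --region <region_name> --scaling-config minSize=2,maxSize=4,desiredSize=2

      # Option B: cordon and drain an old node so its workloads reschedule onto new nodes
      kubectl cordon <old_node_name>
      kubectl drain <old_node_name> --ignore-daemonsets --delete-emptydir-data
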
    7.2. Check current updates to the AWS EKS cluster

      aws eks list-updates --name <cluster_name> --region <region_name>

    7.3. Check the cluster update status for each update ID

      aws eks describe-update --name <cluster_name> --region <region_name> \
        --update-id <update_id> --query 'update.status'

  8. Revert to the old AWS-managed OIDC provider

    This step reverts to the old AWS-managed OIDC provider; otherwise we face authentication issues for OIDC-related roles such as efs-csi, alb-ingress, and cluster-autoscaler.
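
    Presumably this revert means undoing the overrides added in step 5.12 so the eks module recreates the AWS-managed OIDC provider before the apply below; the exact edits depend on your configuration, but a minimal sketch of eks/main.tf under that assumption is:

    # Hypothetical revert of the step-5.12 overrides inside the eks module (adjust to your configuration)
    enable_irsa = true
    # ...and remove the cluster_identity_providers = {} override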

    terraform init \
      -backend-config=<backend_s3_bucket> \
      -backend-config=<dynamo_db_lock_name> \
      -backend-config=<statefile_key> \
      -backend-config="encrypt=true" \
      -backend-config="region=us-east-2"

    terraform apply \
      -var environment=<env_name> \
      -var aws_region=<region_name> \
      -var remote_state_bucket=<backend_s3_bucket> \
      -var remote_state_region=<backend_region>
    # Same variable usage for terraform plan as well

  9. Verify that the old AWS-managed OIDC provider has been re-added as an OpenID Connect provider

    After the above changes, the old AWS-managed OIDC provider should show up here for the relevant cluster.

    aws iam list-open-id-connect-providers