Disk Space Monitoring on EC2 Instances

• Use Case: Monitor disk usage on EC2 instances and trigger alerts when usage exceeds thresholds.

• Integrate with CloudWatch to monitor disk space usage.

• Automatically trigger cleanup scripts or scale up disk space when thresholds are exceeded.

  1. Health check for a host

    This task performs basic health checks on a host. Three important things to check for any host are its CPU, memory, and disk space. While there are basic commands to check each of these, the following runbook processes the outputs of those commands and extracts the relevant information to see if this host needs attention.

    1.1 CPU check

      This command checks the processes running on the Linux host.

    1.2 Get idle CPU percentage

      The top output above includes the idle CPU percentage. Split the output, find the part that contains 'id', and extract the number from it.
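
The split-and-extract step described above can be sketched in Python as follows. This is only an illustration, not the runbook's actual code: the function name `parse_idle_cpu` and the sample summary line are assumptions, and the layout of the `top` summary line can vary between versions.

```python
import re


def parse_idle_cpu(top_output: str) -> float:
    """Return the idle CPU percentage found in the text output of top.

    Hypothetical helper: looks at the CPU summary line, splits it on
    commas, picks the part that contains 'id', and extracts the number.
    """
    for line in top_output.splitlines():
        if "Cpu" not in line:
            continue  # only the CPU summary line is of interest
        for part in line.split(","):
            if "id" in part:
                match = re.search(r"[\d.]+", part)
                if match:
                    return float(match.group())
    raise ValueError("no idle CPU field found in top output")


# Assumed sample of a top summary line (batch mode, one iteration).
sample = "%Cpu(s):  3.1 us,  1.0 sy,  0.0 ni, 95.7 id,  0.1 wa,  0.0 hi,  0.1 si,  0.0 st"
print(parse_idle_cpu(sample))
```

CPU utilization can then be taken as 100 minus the idle value, if that is the figure the health check should report.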

    1.3 Get CPU utilization of an instance

      This task fetches the CPU utilization data points from AWS CloudWatch and plots them.
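
A minimal sketch of the fetch step, assuming boto3 is installed and AWS credentials are configured. The helper names `build_cpu_query` and `fetch_cpu_datapoints`, the one-hour window, and the 300-second period are illustrative assumptions, not the runbook's own settings.

```python
from datetime import datetime, timedelta, timezone


def build_cpu_query(instance_id, hours=1, period_seconds=300, now=None):
    """Build the arguments for CloudWatch get_metric_statistics.

    Hypothetical helper; the window and period defaults are assumptions.
    """
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": now - timedelta(hours=hours),
        "EndTime": now,
        "Period": period_seconds,
        "Statistics": ["Average"],
    }


def fetch_cpu_datapoints(instance_id, region, **kwargs):
    """Fetch average CPU utilization data points, oldest first."""
    import boto3  # imported lazily so the query builder has no AWS dependency

    client = boto3.client("cloudwatch", region_name=region)
    response = client.get_metric_statistics(**build_cpu_query(instance_id, **kwargs))
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

Each returned data point carries a `Timestamp` and an `Average` value, which can be passed to any plotting library to draw the utilization curve.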

      1.3.1 Get instance ID from instance label
      1.3.2 Parse the period string
      1.3.3 Plot CPU utilization
    1.4 Check memory

      Check how much free memory is available on this host.

    1.5 Get free memory percentage

      Calculate the percentage of free memory and, if it is below a threshold, create an alert.
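
One way to sketch this calculation in Python, assuming the memory check produced `free -m` style output. The function names, the sample output, and the 10 percent threshold are assumptions made for illustration.

```python
def free_memory_percent(free_output: str, column: str = "free") -> float:
    """Return the given column of the Mem: row as a percentage of total.

    Hypothetical helper that assumes the column layout printed by `free`.
    """
    lines = free_output.strip().splitlines()
    header = lines[0].split()
    mem = next(l.split() for l in lines if l.lstrip().startswith("Mem:"))
    # The Mem: row has a leading label, so data columns are shifted by one.
    total = float(mem[header.index("total") + 1])
    value = float(mem[header.index(column) + 1])
    return 100.0 * value / total


def memory_alert_needed(free_pct: float, threshold_pct: float = 10.0) -> bool:
    """True when free memory is below the (assumed) threshold."""
    return free_pct < threshold_pct


# Assumed sample output of `free -m`.
sample = """
              total        used        free      shared  buff/cache   available
Mem:           7821        3120         512         210        4189        4200
Swap:          2047           0        2047
"""
pct = free_memory_percent(sample)
print(round(pct, 1), memory_alert_needed(pct))
```

Passing `column="available"` instead of the default may give a more realistic picture on Linux, since memory used for caches can be reclaimed.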

    1.6 Get the top memory consumers

      Identify the processes that consume the most memory.

      1.6.1 Plot top memory consumers

        Visualize memory consumption by plotting the usage of each consumer.
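
As a rough sketch, the data behind such a plot could be prepared as below. It assumes the consumers were listed with something like `ps -eo pid,pmem,comm --sort=-pmem`; the function name, the sample output, and the limit of five entries are illustrative assumptions.

```python
def top_memory_consumers(ps_output: str, limit: int = 5):
    """Return (command, memory percent) pairs, largest consumer first.

    Hypothetical helper that expects three columns: PID, %MEM, COMMAND.
    """
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        _pid, pmem, command = line.split(None, 2)
        rows.append((command.strip(), float(pmem)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


# Assumed sample output of the ps command above.
sample = """
    PID %MEM COMMAND
   1423 12.4 java
    987  6.1 postgres
   2210  3.0 python3
      1  0.1 systemd
"""
for name, pct in top_memory_consumers(sample, limit=3):
    print(f"{name:10s} {pct:5.1f}%")
```

The resulting pairs map directly onto a bar chart, with the command names on one axis and the memory percentages on the other.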

    1.7 Check disk space

      This command checks how much disk space is consumed on the root filesystem.

      1.7.1 Get available disk space

        Process the output of df and extract the available disk space percentage.
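
A minimal sketch of this extraction, assuming the available percentage is taken as 100 minus the use percentage that df reports for the root mount. The function name and the sample output are assumptions.

```python
def available_disk_percent(df_output: str, mount: str = "/") -> float:
    """Return the available disk space percentage for a mount point.

    Hypothetical helper: df reports the used percentage in the column
    just before the mount point, so available is derived as 100 - used.
    """
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields and fields[-1] == mount:
            used_pct = float(fields[-2].rstrip("%"))
            return 100.0 - used_pct
    raise ValueError(f"mount point {mount!r} not found in df output")


# Assumed sample output of `df -P /`.
sample = """
Filesystem     1024-blocks     Used Available Capacity Mounted on
/dev/xvda1        20509264 16407412   4085468      81% /
"""
print(available_disk_percent(sample))
```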

      1.7.2 Perform disk cleanup if needed

        Compare the available disk space percentage against a threshold and, if it drops below the threshold, trigger disk cleanup.
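
The threshold check and the notify, clean up, notify sequence of the sub-steps could be wired together roughly as follows. The function name `cleanup_if_needed`, the callback parameters, and the message wording are assumptions for illustration only.

```python
def cleanup_if_needed(available_pct, threshold_pct, cleanup, notify=print):
    """Run cleanup when available disk space is below the threshold.

    Hypothetical orchestration: `cleanup` and `notify` are callables
    supplied by the caller (for example, a prune step and a Slack post).
    Returns True if cleanup ran, False otherwise.
    """
    if available_pct >= threshold_pct:
        return False
    notify(f"Disk space low: {available_pct:.1f}% available "
           f"(threshold {threshold_pct:.1f}%). Starting cleanup.")
    cleanup()
    notify("Disk cleanup finished.")
    return True


# Usage with stand-in callbacks that only record what happened.
events = []
ran = cleanup_if_needed(
    available_pct=8.0,
    threshold_pct=15.0,
    cleanup=lambda: events.append("cleanup"),
    notify=events.append,
)
print(ran, len(events))
```

Keeping the notification and cleanup actions as injected callables makes the decision logic easy to test without touching the host or Slack.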

        1.7.2.1 Notify about disk space before cleaning up
          1.7.2.1.1 Post a message to a Slack channel

            Post the formatted message to a given Slack channel. Use the cred_label to get the right credentials stored in the backend.
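
A hedged sketch of the posting step using Slack's `chat.postMessage` Web API method and only the Python standard library. How `cred_label` resolves to a bot token is not described here, so the sketch assumes the token has already been looked up; the function names are hypothetical.

```python
import json
import urllib.request

SLACK_POST_URL = "https://slack.com/api/chat.postMessage"


def build_slack_request(token: str, channel: str, text: str) -> urllib.request.Request:
    """Build the HTTP request for chat.postMessage (no network access).

    `token` is assumed to be the bot token that cred_label resolves to.
    """
    body = json.dumps({"channel": channel, "text": text}).encode("utf-8")
    return urllib.request.Request(
        SLACK_POST_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def post_to_slack(token: str, channel: str, text: str) -> dict:
    """Send the message and raise if Slack reports a failure."""
    request = build_slack_request(token, channel, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        reply = json.load(response)
    if not reply.get("ok"):
        raise RuntimeError(f"Slack API error: {reply.get('error')}")
    return reply
```

Slack returns HTTP 200 even for rejected calls, so checking the `ok` field of the reply is what detects errors such as an invalid channel or token.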

        1.7.2.2 Clean up disk

          This command prunes all the unused images that bloat our storage. It does not touch images that are in use. It also deletes stopped containers.
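
The exact command used by this step is not shown, so the following is only one plausible equivalent, assuming the images and containers are managed by Docker: `docker container prune` for stopped containers and `docker image prune --all` for images not used by any container. The function names and the dry-run default are assumptions.

```python
import subprocess


def docker_cleanup_commands():
    """Return the (assumed) Docker commands for the cleanup step.

    Stopped containers are removed first so that images referenced only
    by them become unused and can be pruned as well.
    """
    return [
        ["docker", "container", "prune", "--force"],
        ["docker", "image", "prune", "--all", "--force"],
    ]


def run_cleanup(dry_run: bool = True):
    """Run the cleanup commands, or only list them when dry_run is True."""
    executed = []
    for command in docker_cleanup_commands():
        if not dry_run:
            subprocess.run(command, check=True)  # raises if docker fails
        executed.append(" ".join(command))
    return executed


# Dry run: shows what would be executed without touching the host.
for line in run_cleanup(dry_run=True):
    print(line)
```

Defaulting to a dry run is a deliberate choice for a destructive operation: the caller has to opt in with `dry_run=False` before anything is deleted.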

        1.7.2.3 Notify again after cleaning up the disk
          1.7.2.3.1 Post a message to a Slack channel

            Post the formatted message to a given Slack channel. Use the cred_label to get the right credentials stored in the backend.
