App health check

This runbook performs basic health checks on the application running under docker-compose and, wherever possible, fixes the issues it finds.

  1. Get the instance ID from hostname

    Get the AWS instance ID for the hostname we specify. Working with the instance ID makes all subsequent AWS CLI commands much easier.

    # Look up the instance whose Name tag matches the hostname.
    cmd = f'aws ec2 describe-instances --filters "Name=tag:Name,Values={hostname}" --query "Reservations[].Instances[].InstanceId" --output text'
    op = _exe(None, cmd)
    instance_id = op.strip()
    print(instance_id)
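The lookup above assumes that exactly one instance carries the given Name tag. A minimal sketch of a stricter parse follows, assuming the CLI's text output separates multiple values with whitespace. The helper name `parse_instance_id` and the sample ID are hypothetical and not part of the runbook.

```python
def parse_instance_id(output: str) -> str:
    """Return the single instance ID found in AWS CLI text output.

    Raises ValueError when the output holds no ID or more than one.
    """
    # Splitting on any whitespace covers tab- and newline-separated values.
    ids = output.split()
    if not ids:
        raise ValueError("no instance matches the given Name tag")
    if len(ids) > 1:
        raise ValueError(f"expected one instance, found {len(ids)}: {ids}")
    return ids[0]


# Example with a made-up ID:
print(parse_instance_id("i-0123456789abcdef0\n"))
```

In the runbook this could replace `op.strip()`, so that an empty or ambiguous result fails early instead of being passed to later steps.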
  2. Check if an instance is running

    Given an instance ID, this command checks whether the instance is in the running state.

    # Query the current state of the instance.
    cmd = f'aws ec2 describe-instances --instance-ids {instance_id} --query "Reservations[].Instances[].State.Name" --output text'
    op = _exe(None, cmd)
    # Treat anything other than "running" as a problem.
    _problem = True
    if "running" in op:
        _problem = False
    _proceed = not _problem
    host_is_up = not _problem
    print(host_is_up)
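The check above only distinguishes "running" from everything else. As a rough sketch, the state string could be interpreted more precisely, since EC2 reports one of six states and only a stopped instance is a sensible candidate for a start. The helper name `interpret_state` and its result keys are hypothetical.

```python
# The six instance states reported by EC2.
EC2_STATES = {
    "pending", "running", "shutting-down",
    "terminated", "stopping", "stopped",
}


def interpret_state(output: str) -> dict:
    """Turn the text output of the state query into a small status record."""
    state = output.strip()
    if state not in EC2_STATES:
        # Covers empty output and output that lists several instances.
        raise ValueError(f"unexpected state output: {output!r}")
    return {
        "state": state,
        "host_is_up": state == "running",
        # Starting only makes sense from the stopped state;
        # a terminated instance cannot be started again.
        "can_start": state == "stopped",
    }


print(interpret_state("stopped\n"))
```

Comparing the stripped output for equality, rather than testing for a substring, also avoids treating unexpected multi-instance output as healthy.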
    2.1. Start the host if it is down

      if not host_is_up:
          cmd = f'aws ec2 start-instances --instance-ids {instance_id}'
          op = _exe(None, cmd)
          print(op)
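`start-instances` returns as soon as the start request is accepted, so the host may not be reachable yet when the next step runs. One possible follow-up, sketched below, is to chain the AWS CLI waiter `aws ec2 wait instance-running`. The function `start_and_wait` and its `run` parameter are hypothetical; `run` stands in for whatever executes a shell command and returns its output.

```python
from typing import Callable, List


def start_and_wait(instance_id: str, run: Callable[[str], str]) -> List[str]:
    """Start the instance, then block until EC2 reports it as running.

    Returns the commands that were executed, in order.
    """
    commands = [
        f"aws ec2 start-instances --instance-ids {instance_id}",
        # The waiter polls the instance state until it becomes "running".
        f"aws ec2 wait instance-running --instance-ids {instance_id}",
    ]
    for cmd in commands:
        run(cmd)
    return commands


# Dry run with a stand-in executor that only prints the commands.
start_and_wait("i-0123456789abcdef0", lambda cmd: print(cmd) or "")
```

Inside the runbook, `run` could be something like `lambda cmd: _exe(None, cmd)`, assuming `_exe` blocks until the command finishes.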
  3. Check if the app is running

    Check that all the services that make up the app are running.

    import json

    docker_compose_file = "dagknows_src/app_docker_compose_build_deploy/localdev-saas-docker-compose.yml"
    cmd = f'sudo docker-compose -f {docker_compose_file} ps'
    op = _exe(instance_id, cmd)
    lines = op.split('\n')
    services = [
        "postgres", "adminer", "elasticsearch", "documentation",
        "req_router", "conv-mgr", "apigateway", "ansi_processing",
        "settings", "conv_sse", "proxy_sse", "nlp", "dag", "nginx"
    ]
    broken_services = []
    _problem = False
    # Query each service on its own and flag those not reported as "Up".
    for service in services:
        cmd1 = cmd + f' {service}'
        op1 = _exe(instance_id, cmd1)
        if 'Up' not in op1:
            broken_services.append(service)
            _problem = True
    print(json.dumps(broken_services, indent=4))
    app_is_up = not _problem
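The per-service loop above can be pulled into a small function so that it can be exercised without a live host. This is only a sketch: `find_broken_services` and `ps_output_for` are hypothetical names, and the canned strings in the example are simplified stand-ins rather than real docker-compose output.

```python
from typing import Callable, Iterable, List


def find_broken_services(
    services: Iterable[str],
    ps_output_for: Callable[[str], str],
) -> List[str]:
    """Return the services whose `ps` output does not contain "Up"."""
    broken = []
    for service in services:
        if "Up" not in ps_output_for(service):
            broken.append(service)
    return broken


# Simplified stand-in output, keyed by service name.
canned = {"nginx": "Up", "dag": "Exit 1", "nlp": ""}
print(find_broken_services(canned, lambda name: canned[name]))
```

In the runbook, `ps_output_for` could wrap the existing call, for example `lambda s: _exe(instance_id, f"{cmd} {s}")`.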
    3.1. Restart the app if services are down

      See whether restarting the broken services helps. This is only a remediation, not a solution; we may still need to root-cause why the services went down.

      import time

      docker_compose_file = "dagknows_src/app_docker_compose_build_deploy/localdev-saas-docker-compose.yml"
      _proceed = False
      if not app_is_up:
          # Bring up any services that are not running.
          cmd = f'sudo docker-compose -f {docker_compose_file} up -d'
          op = _exei(instance_id, cmd)
          # Give the services time to start before checking again.
          time.sleep(30)
          _proceed = True
          msg = "Restarting the application"
          print(msg)
          cmd = f'sudo docker-compose -f {docker_compose_file} ps'
          op1 = _exei(instance_id, cmd)
          print(op1)
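A fixed 30-second sleep can be too short for slow services and needlessly long for fast ones. A minimal sketch of a polling alternative follows. The name `wait_until`, the default timeout, and the default interval are illustrative assumptions, and `check` stands in for any function that returns True once the app is healthy.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout: float = 120.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` elapses.

    Returns True on success and False on timeout. `sleep` and `clock`
    are injectable so the loop can be tested without real delays.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Example: a check that succeeds immediately needs no waiting.
print(wait_until(lambda: True))
```

Here `check` could rerun the per-service status query from the previous step and return True once no service is reported as broken.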
  4. Run a Linux command

    Try running a Linux command. If you don't specify a hostname or IP address, the command runs in the Docker container provisioned for you.

    <command>