agent: |
App health check
This runbook performs basic health checks on the application running under docker-compose and, wherever possible, fixes the issues it finds.
- 1. Get the instance ID from hostname
Get the AWS instance ID for the hostname we specify. Most AWS CLI operations are much easier to run against an instance ID.
    # Look up the instance whose Name tag matches the hostname.
    cmd = f'aws ec2 describe-instances --filters "Name=tag:Name,Values={hostname}" --query "Reservations[].Instances[].InstanceId" --output text'
    op = _exe(None, cmd)
    instance_id = op.strip()
    print(instance_id)
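The lookup in step 1 takes whatever the CLI prints as the instance ID. If no instance carries that Name tag the result is empty, and if several do the output holds more than one ID, so later steps would receive an unusable value. A minimal sketch of a stricter parse follows; `parse_instance_id` is a hypothetical helper, not part of the runbook or the platform.

```python
def parse_instance_id(output: str) -> str:
    """Return the single instance ID found in `output`.

    Hypothetical helper: `output` is assumed to be the text printed by the
    describe-instances command from step 1.
    """
    # Splitting on any whitespace handles IDs separated by tabs or newlines.
    ids = output.split()
    if not ids:
        raise ValueError("no instance matches the given Name tag")
    if len(ids) > 1:
        raise ValueError(f"Name tag is ambiguous, matched: {', '.join(ids)}")
    return ids[0]
```

In the step itself this could replace `op.strip()`, for example `instance_id = parse_instance_id(op)`, so that the runbook stops early instead of passing an empty or multi-valued ID to the next command.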
- 2. Check if an instance is running
Given an instance ID, this command checks whether the instance is in the running state.
    cmd = f'aws ec2 describe-instances --instance-ids {instance_id} --query "Reservations[].Instances[].State.Name" --output text'
    op = _exe(None, cmd)
    # The host counts as up only if the reported state contains "running".
    _problem = "running" not in op
    _proceed = not _problem
    host_is_up = not _problem
    print(host_is_up)
- 2.1. Start the host if it is down
    if not host_is_up:
        cmd = f'aws ec2 start-instances --instance-ids {instance_id}'
        op = _exe(None, cmd)
        print(op)
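Steps 2 and 2.1 treat every state other than running as "down" and respond by starting the instance. EC2 reports six instance states (pending, running, shutting-down, terminated, stopping, stopped), and only a stopped instance can be started. The sketch below shows one way to map the reported state to an action; `next_action` and its return labels are hypothetical names chosen for illustration.

```python
def next_action(state_output: str) -> str:
    """Map an EC2 instance state to a suggested runbook action.

    Hypothetical helper: `state_output` is assumed to be the text printed by
    the State.Name query in step 2.
    """
    state = state_output.strip()
    if state == "running":
        return "none"      # host is up, nothing to do
    if state == "stopped":
        return "start"     # start-instances applies here
    if state in ("pending", "stopping"):
        return "wait"      # in transition, check again later
    if state in ("shutting-down", "terminated"):
        return "replace"   # the instance cannot be started again
    raise ValueError(f"unexpected instance state: {state!r}")
```

Note that an instance is typically still pending right after `start-instances` returns. The AWS CLI provides `aws ec2 wait instance-running --instance-ids <id>`, which blocks until the instance reaches the running state and may be useful before moving on to step 3.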
- 3. Check if the app is running
Ensure that all the services that make up the app are running.
    import json

    docker_compose_file = "dagknows_src/app_docker_compose_build_deploy/localdev-saas-docker-compose.yml"
    cmd = f'sudo docker-compose -f {docker_compose_file} ps'
    services = [
        "postgres", "adminer", "elasticsearch", "documentation",
        "req_router", "conv-mgr", "apigateway", "ansi_processing",
        "settings", "conv_sse", "proxy_sse", "nlp", "dag", "nginx"
    ]
    broken_services = []
    _problem = False
    # Query each service on its own; a service whose status lacks "Up" is broken.
    for service in services:
        cmd1 = cmd + f' {service}'
        op1 = _exe(instance_id, cmd1)
        if 'Up' not in op1:
            broken_services.append(service)
            _problem = True
    print(json.dumps(broken_services, indent=4))
    app_is_up = not _problem
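The check in step 3 mixes remote execution with the decision logic, which makes it hard to try out without a live host. A minimal sketch of the same loop with the executor passed in as a parameter is shown below. `find_broken_services` and the `run` callable are hypothetical; in the runbook, `run` would wrap the platform's `_exe` call. The sketch also matches "Up" as a whole word, which is an assumption about what the status column contains.

```python
import re
from typing import Callable, Iterable, List


def find_broken_services(
    run: Callable[[str], str],
    base_cmd: str,
    services: Iterable[str],
) -> List[str]:
    """Return the services whose `ps` output does not report "Up".

    `run` is a hypothetical stand-in for the remote executor: it takes a
    command string and returns the command's output.
    """
    broken = []
    for service in services:
        output = run(f"{base_cmd} {service}")
        # Whole-word match, so text such as "Upload" is not counted as "Up".
        if not re.search(r"\bUp\b", output):
            broken.append(service)
    return broken
```

With this shape, the step could call `find_broken_services(lambda c: _exe(instance_id, c), cmd, services)`, and the logic can be exercised locally by passing a function that returns canned output.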
- 3.1. Restart the app if services are down
Check whether restarting the broken services helps. This is only a remediation, not a fix: we may still need to root-cause why the services went down.
    import time

    docker_compose_file = "dagknows_src/app_docker_compose_build_deploy/localdev-saas-docker-compose.yml"
    _proceed = False
    if not app_is_up:
        # Bring all services back up in detached mode.
        cmd = f'sudo docker-compose -f {docker_compose_file} up -d'
        op = _exei(instance_id, cmd)
        # Give the containers time to start before checking their status.
        time.sleep(30)
        _proceed = True
        msg = "Restarting the application"
        print(msg)
        cmd = f'sudo docker-compose -f {docker_compose_file} ps'
        op1 = _exei(instance_id, cmd)
        print(op1)
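Step 3.1 waits a fixed 30 seconds after the restart, which can be too long on a healthy host and too short on a slow one. One alternative is to poll until the services report as up or a deadline passes. The sketch below is an assumption about how that could look, not part of the runbook; `wait_until` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout: float = 120.0,
    interval: float = 5.0,
) -> bool:
    """Call `check` every `interval` seconds until it returns True.

    Returns True as soon as `check` succeeds, or False once `timeout`
    seconds have elapsed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Here `check` could be a small function that reruns the service check from step 3 and returns whether the list of broken services is empty. The return value then tells the runbook whether the restart actually helped.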
- 4. Run a Linux command
Try running a Linux command. If you don't specify a hostname or IP address, the command is executed in the Docker container provisioned for you.
    <command>