Monitoring and Troubleshooting Your Installation
This topic broadly outlines techniques for troubleshooting your Concourse OLM installation.
Concourse OLM Log Collection
Find the logs for a specific job in the VM on which that job was running. Those logs are stored at
Troubleshooting With Fly Commands
Concourse OLM environment troubleshooting
containers: Lists active containers
Confirms which container or task was placed on which worker.
workers: Lists registered workers
This helps you verify that the number of containers on a worker does not exceed the maximum allowed.
prune-worker: Reaps a non-running worker
Stops Concourse OLM from tracking an out-of-commission worker.
volumes: Lists active volumes
Checks disk usage across workers.
pipelines: Lists configured pipelines
builds: Shows build history
This is useful for getting build IDs of one-off tasks you’ve run using
validate-pipeline: Validates a pipeline’s configuration
Checks pipeline for validity without calling
check-resource: Checks for new versions
This is useful when developing a new resource.
watch: Views logs of in-progress builds
intercept: Accesses a running or recent build’s steps
execute: Submits local tasks
This is useful for spinning up a task quickly to test before putting it in a job.
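The commands above can be strung together into a quick inspection pass. The following is a minimal sketch, not a definitive procedure: the target name `my-target` and the pipeline, job, resource, worker, and file names are placeholders, and the `run` helper is a hypothetical wrapper that prints each command instead of executing it unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# TARGET is a placeholder for a fly target you have already logged in to.
TARGET="${FLY_TARGET:-my-target}"

# Hypothetical helper: print each command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Environment checks
run fly -t "$TARGET" workers      # registered workers
run fly -t "$TARGET" containers   # active containers and the worker each one is on
run fly -t "$TARGET" volumes      # active volumes

# Pipeline and build checks (all names below are placeholders)
run fly -t "$TARGET" pipelines
run fly -t "$TARGET" validate-pipeline -c pipeline.yml
run fly -t "$TARGET" check-resource -r my-pipeline/my-resource
run fly -t "$TARGET" builds
run fly -t "$TARGET" watch -j my-pipeline/my-job
run fly -t "$TARGET" intercept -j my-pipeline/my-job
run fly -t "$TARGET" execute -c task.yml

# Stop tracking a worker that is out of commission
run fly -t "$TARGET" prune-worker -w my-worker
```

Running the script as written only prints the commands, which makes it safe to review before pointing it at a real environment.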
Common Concourse OLM Issues
|The worker is out of disk space||An error reports that a volume cannot be created. It may say that permission is denied.||Increase the persistent disk size for the worker or increase the number of worker VMs.||N/A|
|Container limit reached||Cannot create container: limit of 250 containers reached||Check fly containers. Increase the number of worker VMs.||This error state is unlikely to appear.|
|Job doesn’t start||The build may get stuck in the Pending state.||Restart the ATC job.
To restart the ATC, log in as a root user on the Concourse OLM web VMs (the ones on which the ATC job is located), then run
|Updating Concourse OLM in Concourse OLM job fails||When BOSH deploys a Concourse OLM update from a job running on that same Concourse OLM instance, the build typically fails with “worker for container not found.” This is expected behavior; the BOSH Director recreates the worker VM.||Run the job again.||N/A|
|BOSH Can’t Finish Worker Upgrade While Tasks Are Running||If you have a long-running task, BOSH cannot finalize the upgrade by restarting the worker job until all of the work has stopped.||Wait for the work to complete.
If you need to finish the upgrade quickly, cancel running tasks and jobs.
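For the container limit row above, a rough headroom check might look like the following sketch. It assumes that fly containers prints one line per container and that each line includes the name of the worker it runs on; the helper names `count_for_worker` and `headroom`, the target `my-target`, and the worker `worker-0` are hypothetical. The limit of 250 is taken from the error message quoted in the table.

```shell
#!/bin/sh
# Per-worker container limit, as quoted in the "Container limit reached" error.
LIMIT=250

# Count the lines on stdin that mention the given worker name as a whole word.
# Assumes each container is listed on its own line together with its worker name.
count_for_worker() {
  grep -c -F -w -- "$1" || true
}

# Print how many more containers a worker can accept, given its current count.
headroom() {
  echo $(( LIMIT - $1 ))
}

# Example usage (requires a logged-in fly target; names are placeholders):
#   used=$(fly -t my-target containers | count_for_worker worker-0)
#   echo "worker-0: $used in use, $(headroom "$used") remaining"
```

If the remaining count is close to zero on every worker, adding worker VMs, as the table suggests, is likely the more durable fix.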
Other Troubleshooting Resources
For information about metrics for monitoring Concourse, see Metrics. For information about enabling syslog forwarding and about getting logs for other Concourse components, see VM Logs. For information about common BOSH issues, see the BOSH tips.