Monitoring
Anchore Enterprise exposes Prometheus metrics in the API of each service if the config.yaml used by that service has the metrics.enabled key set to true.
Each service exports its own metrics and is typically scraped by a Prometheus installation. Anchore does not aggregate or distribute metrics between services, so you should configure your Prometheus deployment or integration to scrape the /metrics route on each Anchore service's API port.
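For reference, the relevant portion of a service's config.yaml is a minimal block like the following; in the quickstart images this value is driven by the ANCHORE_ENABLE_METRICS environment variable, as the sed command below shows.
metrics:
  enabled: true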
Example with Quickstart docker-compose.yaml
The following example prometheus.yaml file configures Prometheus jobs that correspond to the docker-compose.yaml bundled in the Enterprise container. To use this example, do the following.
- Uncomment the anchore-prometheus service in the docker-compose.yaml, as shown.
anchore-prometheus:
  image: docker.io/prom/prometheus:latest
  depends_on:
    - engine-api
  volumes:
    - ./anchore-prometheus.yml:/etc/prometheus/prometheus.yml:z
  logging:
    driver: "json-file"
    options:
      max-size: 100m
  ports:
    - "9090:9090"
- Enable metrics for each Enterprise service, as shown.
sed -i 's/ANCHORE_ENABLE_METRICS=false/ANCHORE_ENABLE_METRICS=true/g' docker-compose.yaml
- Create an anchore-prometheus.yml file (matching the filename mounted in the volumes entry above) in the same directory as the docker-compose.yaml with the following content:
# This Prometheus configuration corresponds to the service names and ports in
# the docker-compose.yaml provided in the Enterprise docker image. Adjust names
# and ports accordingly for other environments.
global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets: []
      scheme: http
      timeout: 10s
scrape_configs:
  - job_name: anchore-api
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          - api:8228
    basic_auth:
      username: admin
      password: foobar
  - job_name: anchore-catalog
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          - catalog:8228
    basic_auth:
      username: admin
      password: foobar
  - job_name: anchore-simplequeue
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          - queue:8228
    basic_auth:
      username: admin
      password: foobar
  - job_name: anchore-analyzer
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          - analyzer:8228
    basic_auth:
      username: admin
      password: foobar
  - job_name: enterprise-rbac-manager
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          - rbac-manager:8228
    basic_auth:
      username: admin
      password: foobar
  - job_name: anchore-policy-engine
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          - policy-engine:8228
    basic_auth:
      username: admin
      password: foobar
  - job_name: enterprise-feeds
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          - feeds:8228
    basic_auth:
      username: admin
      password: foobar
  - job_name: enterprise-ui
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets:
          - ui:3000
- Type: docker-compose up -d
- Open a browser to http://localhost:9090 to see metrics from the default Prometheus console.
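To confirm that a service is exporting metrics before (or instead of) checking the Prometheus console, you can query the /metrics route directly. The command below is a sketch that assumes the quickstart defaults: the API service published on localhost port 8228 and the admin/foobar credentials used in the scrape jobs above.
curl -u admin:foobar http://localhost:8228/metrics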
Monitoring in Kubernetes and the Helm Chart
Prometheus is commonly used to monitor Kubernetes clusters and is well supported by core Kubernetes components. There are many guides on using Prometheus to monitor a cluster and the services deployed within it, and many other monitoring systems can also consume Prometheus metrics.
The Anchore Helm Chart includes a quick way to enable Prometheus metrics on each service container.
Set:
helm install --name myanchore anchore/anchore-engine --set anchoreGlobal.enableMetrics=true
Or, set it directly in your customized values.yaml.
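For reference, the same setting expressed in a customized values.yaml is a minimal block like the following; only the anchoreGlobal.enableMetrics key shown in the --set flag above is assumed here.
anchoreGlobal:
  enableMetrics: true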
The specific strategy for monitoring services with Prometheus is outside the scope of this document. However, because Anchore exposes metrics on the /metrics route of all service ports, it should be compatible with most monitoring approaches (DaemonSets, sidecars, and so on).
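As one illustrative sketch of such an approach (not something the chart configures for you), a conventional annotation-based Prometheus scrape job could discover the Anchore pods. The prometheus.io/* annotations and job name below are assumptions following common Prometheus-on-Kubernetes practice, and the job would still need basic_auth credentials as in the docker-compose example above.
# Hypothetical scrape job using Kubernetes pod discovery. Pods must carry the
# prometheus.io/scrape: "true" and prometheus.io/port annotations for this to match.
- job_name: anchore-kubernetes-pods
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: "true"
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__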
Metrics of Note
Anchore services export a range of metrics. The following list describes some metrics that can help you determine the health and load of an Anchore deployment; example Prometheus queries for these metrics follow the list.
- anchore_queue_length, specifically for queuename: "images_to_analyze"
- This is the number of images pending analysis, in the not_analyzed state.
- As this number grows you can expect longer analysis times.
- Adding more analyzers to a system can help drain the queue faster and keep wait times to a minimum.
- Example: anchore_queue_length{instance="engine-simpleq:8228",job="anchore-simplequeue",queuename="images_to_analyze"}.
- This metric is exported from all simplequeue service instances, but is based on the database state so they should all present a consistent view of the length of the queue.
- anchore_monitor_runtime_seconds_count
- These metrics, one for each monitor, record the duration of the async processes as they execute on a duty cycle.
- As the system grows, these durations will become longer to account for more tags to check for updates, more repos to scan for new tags, and more user notifications to process.
- anchore_tmpspace_available_bytes
- This metric tracks the available space in the tmp_dir location for each container. It is most important for analyzer instances, where it can indicate how much disk is being used for analysis and how much headroom remains for analyzing large images.
- This space is expected to be consumed in cycles, with usage growing during analysis and then flushing upon completion. A consistent growth pattern here may indicate leftover artifacts from analysis failures or a large layer_cache setting that is not yet full. The layer cache (see Layer Caching) is located in this space and thus will affect the metric.
- process_resident_memory_bytes
- This is the memory actually consumed by the instance, where each instance is a service process of Anchore. Anchore is fairly memory intensive for large images and in deployments with many analyzed images, due to heavy JSON parsing and marshalling, so monitoring this metric will help inform capacity requirements for different components based on your specific workloads. Many variables affect memory usage, so while we give recommendations in the Capacity Planning document, there is no substitute for profiling and monitoring your usage carefully.
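The following PromQL sketches show one way to watch these metrics. The label values follow the docker-compose example above and should be adjusted for your deployment; the anchore_monitor_runtime_seconds_sum series used in the average-duration query is assumed from standard Prometheus summary naming conventions rather than confirmed here.
# Number of images waiting for analysis; alert if this stays high for long.
anchore_queue_length{queuename="images_to_analyze"}

# Assumed: average monitor run duration over the last hour, using the
# conventional _sum/_count pair for a Prometheus summary metric.
rate(anchore_monitor_runtime_seconds_sum[1h]) / rate(anchore_monitor_runtime_seconds_count[1h])

# Available scratch space on each analyzer; a sustained decline may indicate
# leftover analysis artifacts or a growing layer cache.
anchore_tmpspace_available_bytes{job="anchore-analyzer"}

# Resident memory per service process, across the jobs defined above.
process_resident_memory_bytes{job=~"anchore-.*|enterprise-.*"}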