Overview
Anchore Enterprise Reports aggregates data to provide insightful analytics and metrics for account-wide artifacts. The service employs GraphQL to expose a rich API for querying the aggregated data and metrics.
NOTE: This service captures a snapshot of artifacts in Anchore Enterprise at a given point in time. Therefore, analytics and metrics computed by the service are not real time and may not reflect the most up-to-date state in Anchore Enterprise.
Installation
Anchore Enterprise Reports is included with Anchore Enterprise and is installed by default when deploying a trial quickstart with Docker Compose or a production deployment on Kubernetes.
How it works
One of the main functions of Anchore Enterprise Reports is aggregating data. The service keeps a summary of all current and historical images and tags for every account known to Anchore Enterprise. It also maintains vulnerability reports and policy evaluations generated using the active bundle for all the images and tags respectively.
WARNING: Anchore Enterprise Reports shares a persistence layer with Anchore Enterprise. Ensure sufficient storage is provisioned.
Configuration
Anchore Enterprise Reports is split into two services:
- The reports_worker service, which is responsible for the ingress and egress of data into our reports.
- The reports service, which is responsible for report generation.
Each service has a configuration section in the values file. Below are sample configurations and the default values.
...
services:
  reports_worker:
    # Set enable_data_ingress to true for periodically syncing data from anchore enterprise into the reports service
    enable_data_ingress: true
    # Set enable_data_egress to true to periodically remove reporting data that has been removed in other parts of system
    enable_data_egress: false
    # data_egress_window defines a number of days to keep reporting data following its deletion in the rest of system.
    # Default value of 0 will remove it on next task run
    data_egress_window: 0
    # data_refresh_max_workers is the maximum number of concurrent threads to refresh existing results (etl vulnerabilities and evaluations) in reports service. Set non-negative values greater than 0, otherwise defaults to 10
    data_refresh_max_workers: 10
    # data_load_max_workers is the maximum number of concurrent threads to load new results (etl vulnerabilities and evaluations) to reports service. Set non-negative values greater than 0, otherwise defaults to 10
    data_load_max_workers: 10
    cycle_timers:
      # Timers that describe how often each operation should run
      reports_image_load: 600 # MIN 300 MAX 100000 Default 600
      reports_tag_load: 600 # MIN 300 MAX 100000 Default 600
      reports_runtime_inventory_load: 600 # MIN 300 MAX 100000 Default 600
      reports_extended_runtime_vuln_load: 1800 # MIN 300 MAX 100000 Default 1800
      reports_image_refresh: 7200 # MIN 3600 MAX 100000 Default 7200
      reports_tag_refresh: 7200 # MIN 3600 MAX 100000 Default 7200
      reports_metrics: 3600 # MIN 1800 MAX 100000 Default 3600
      reports_image_egress: 600 # MIN 300 MAX 100000 Default 600
      reports_tag_egress: 600 # MIN 300 MAX 100000 Default 600
    runtime_report_generation:
      # Provides the ability to enable/disable individual runtime report loading.
      inventory_images_by_vulnerability: true
      vulnerabilities_by_k8s_namespace: true
      vulnerabilities_by_k8s_container: true
      vulnerabilities_by_ecs_container: true
  reports:
    # GraphiQL is a GUI for editing and testing GraphQL queries and mutations.
    # Set enable_graphiql to true and open http://<host>:<port>/v2/reports/graphql in a browser for reports API
    enable_graphiql: true
    # This is the number of execution threads which will be used during report generation.
    max_async_execution_threads: 1
    # Configure async_execution_timeout to adjust how long a scheduled query must be running for before it is considered timed out
    # This may need to be adjusted if the system has large amounts of data and reports are being prematurely timed out.
    # The value should be a number followed by "w", "d", or "h" to represent weeks, days or hours
    async_execution_timeout: "48h"
    # Set use_volume to `true` to have the reports worker buffer report generation to disk instead of in memory. This should be configured
    # in production systems with large amounts of data (10s of thousands of images or more). Scratch volumes should be configured for the reports pods
    # when this option is enabled.
    use_volume: false
NOTE: Any change to the configuration requires a restart of the service for the update to take effect.
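Because a configuration change only takes effect after a restart, it can help to sanity-check values before deploying. The following is a minimal sketch, not part of Anchore Enterprise: the bounds are copied from the MIN/MAX comments in the sample configuration, and the helper names check_cycle_timers and parse_execution_timeout are hypothetical. How the service itself treats out-of-range or malformed values is not covered here.

```python
import re
from datetime import timedelta

# Bounds in seconds, copied from the MIN/MAX comments in the sample configuration.
CYCLE_TIMER_BOUNDS = {
    "reports_image_load": (300, 100000),
    "reports_tag_load": (300, 100000),
    "reports_runtime_inventory_load": (300, 100000),
    "reports_extended_runtime_vuln_load": (300, 100000),
    "reports_image_refresh": (3600, 100000),
    "reports_tag_refresh": (3600, 100000),
    "reports_metrics": (1800, 100000),
    "reports_image_egress": (300, 100000),
    "reports_tag_egress": (300, 100000),
}


def check_cycle_timers(cycle_timers):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for name, value in cycle_timers.items():
        if name not in CYCLE_TIMER_BOUNDS:
            problems.append(f"{name}: unknown timer")
            continue
        low, high = CYCLE_TIMER_BOUNDS[name]
        if not low <= value <= high:
            problems.append(f"{name}: {value} is outside [{low}, {high}]")
    return problems


def parse_execution_timeout(text):
    """Parse a value such as "48h" (number followed by w, d, or h) into a timedelta.

    Assumption: whole numbers only; the service's own parser may accept more.
    """
    match = re.fullmatch(r"(\d+)([wdh])", text)
    if match is None:
        raise ValueError(f"expected a number followed by w, d, or h, got {text!r}")
    unit = {"w": "weeks", "d": "days", "h": "hours"}[match.group(2)]
    return timedelta(**{unit: int(match.group(1))})
```

For example, check_cycle_timers({"reports_metrics": 60}) reports that 60 is below the documented minimum of 1800, and parse_execution_timeout("48h") yields a two-day timedelta.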
In an Anchore Enterprise deployment, any non-admin account user must have at least the listImages permission to execute queries against the Reports API. There is an RBAC role called report-admin that provides permissions to administer reports and schedules. See Role-Based Access Control for more information.
Data ingress
The reports_worker service handles data ingress from Anchore Enterprise through the following asynchronous processes, which are triggered periodically:
Loader: Compares the working set of images and tags in Anchore Enterprise with its own records. Based on the difference, images and tags along with the vulnerability report and policy evaluations are loaded into the service. Artifacts deleted from Anchore Enterprise are marked inactive in the service.
This process is triggered periodically as described by the cycle timers listed above.
Refresher: Refreshes the vulnerability report and policy evaluations of all the images and tags actively maintained by the service.
This process is triggered periodically as described by the cycle timers listed above.
WARNING: The Reports service may miss updates to artifacts that are added and deleted between consecutive ingress runs.
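The Loader's comparison step can be pictured as a set difference between two snapshots. The sketch below is an illustration only, not the service's implementation; the function name plan_load and the record shape (a mapping from image digest to an active flag) are assumptions made for the example.

```python
def plan_load(enterprise_digests, report_records):
    """Compare the current Anchore Enterprise working set with the service's records.

    enterprise_digests: set of image digests currently known to Anchore Enterprise.
    report_records: dict mapping digest -> True if the record is active in the service.
    Returns (to_load, to_deactivate) as sorted lists.
    """
    known = set(report_records)
    # Artifacts present upstream but not yet recorded need to be loaded.
    to_load = sorted(enterprise_digests - known)
    # Active records whose artifact no longer exists upstream are marked inactive.
    to_deactivate = sorted(
        digest
        for digest, active in report_records.items()
        if active and digest not in enterprise_digests
    )
    return to_load, to_deactivate
```

An image that is added and then deleted between two runs appears in neither snapshot, so a comparison of this kind never sees it, which is the situation the warning above describes.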
Data ingress is enabled by default. It can be turned off with enable_data_ingress: false in the config.yaml snippet shown previously. In a quickstart deployment, add ANCHORE_ENTERPRISE_REPORTS_ENABLE_DATA_INGRESS=false to the environment variables section of the reports service in docker-compose.yaml. When ingress is turned off, the Reports service no longer aggregates data from Anchore Enterprise, and metric computations also stop. However, the service continues to serve API requests and queries using the existing data.
Data egress
Data egress removes data that is no longer active in Anchore Enterprise from the stored report data. This process is disabled by default and is controlled by the enable_data_egress setting. The data_egress_window setting determines how many days such data is kept before it is removed.
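To make the effect of the egress window concrete, the following sketch selects the records that would be old enough to remove. It is an illustration under assumptions: the function name eligible_for_egress and the input shape (record id mapped to the time the artifact was deleted upstream, or None if it is still active) are hypothetical and do not describe the service's internal schema.

```python
from datetime import datetime, timedelta, timezone


def eligible_for_egress(deleted_at_by_id, window_days, now):
    """Return the ids whose upstream deletion is at least window_days old.

    deleted_at_by_id: dict mapping record id -> deletion datetime, or None if still active.
    window_days: value of data_egress_window (days to keep data after deletion).
    now: the time at which the egress task runs.
    """
    cutoff = now - timedelta(days=window_days)
    return sorted(
        record_id
        for record_id, deleted_at in deleted_at_by_id.items()
        if deleted_at is not None and deleted_at <= cutoff
    )
```

With window_days set to 0, the cutoff equals the current time, so every record already deleted upstream qualifies on the next run, which matches the documented default behavior.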
Metrics
The Reports service comes with a few predefined (canned) metric definitions. A metric definition consists of an identifier, a readable name, a description, and the type of the metric. The type is loosely based on statsd metric types. Currently, all the predefined metrics are of type ‘counter’, a measure of the number of items that match certain criteria. A value for each of these metric definitions is computed using the data aggregated by the service.
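As a rough picture of what a counter-type metric definition involves, the sketch below models the four fields named above and computes a count over a list of items. The class name, the example identifier, and the item shape are all hypothetical; they are not the service's actual metric definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    identifier: str   # stable key for the metric
    name: str         # readable name
    description: str  # what the metric measures
    type: str         # loosely based on statsd types, e.g. "counter"


def compute_counter(items, matches):
    """Count the items for which the predicate `matches` returns True."""
    return sum(1 for item in items if matches(item))


# Hypothetical definition, for illustration only.
EXAMPLE_METRIC = MetricDefinition(
    identifier="example.images.with_critical_vulnerabilities",
    name="Images with critical vulnerabilities",
    description="Number of images that have at least one critical vulnerability",
    type="counter",
)
```

Given items such as [{"critical": 2}, {"critical": 0}, {"critical": 1}], calling compute_counter with a predicate that checks item["critical"] > 0 counts the two matching images.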
By default, all metric values are computed every hour (3600 seconds). To modify the interval, update cycle_timers -> reports_metrics in the config.yaml snippet above. In a quickstart deployment, add ANCHORE_ENTERPRISE_REPORTS_METRICS_INTERVAL_SEC=<interval-in-seconds> to the environment variables section of the reports service in docker-compose.yaml.
See it in action
To see the Reports service at work in the Enterprise UI, open the Dashboard or Reports view. The Dashboard view uses metrics generated by the service to render customizable widgets. The Reports view runs GraphQL queries and aggregates the results into multiple formats (CSV, JSON, and so on).
To use the API directly, see API Access.
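As a starting point for direct API use, the sketch below builds (but does not send) an HTTP POST to the /v2/reports/graphql path mentioned in the configuration comments. Several things here are assumptions rather than documented behavior: that the endpoint accepts a JSON body with a "query" key (the common GraphQL-over-HTTP convention), that HTTP basic authentication is used, and that schema introspection is enabled. The function name build_graphql_request is hypothetical. The query is the standard GraphQL introspection query, so no Reports-specific field names are invented.

```python
import base64
import json
import urllib.request

# Standard GraphQL introspection query: asks for the name of the root query type.
INTROSPECTION_QUERY = "{ __schema { queryType { name } } }"


def build_graphql_request(base_url, username, password, query=INTROSPECTION_QUERY):
    """Build a POST request for the Reports GraphQL endpoint without sending it.

    Assumes basic authentication and a JSON body of the form {"query": "..."}.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v2/reports/graphql",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

To send the request, pass the result to urllib.request.urlopen and decode the JSON response. Replace the introspection query with a query taken from the API Access documentation once the available fields are known.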