Reports

Overview

Anchore Enterprise Reports aggregates data to provide insightful analytics and metrics for account-wide artifacts. The service employs GraphQL to expose a rich API for querying the aggregated data and metrics.

NOTE: This service captures a snapshot of artifacts in Anchore Enterprise at a given point in time. Therefore, analytics and metrics computed by the service are not real-time and may not reflect the most up-to-date state in Anchore Enterprise.

Installation

Anchore Enterprise Reports is included with Anchore Enterprise and is installed by default when deploying a trial quickstart with Docker Compose or a production deployment with Kubernetes.

How it works

One of the main functions of Anchore Enterprise Reports is aggregating data. The service keeps a summary of all current and historical images and tags for every account known to Anchore Enterprise. It also maintains vulnerability reports and policy evaluations generated using the active bundle for all the images and tags respectively.

WARNING: Anchore Enterprise Reports shares a persistence layer with Anchore Enterprise. Ensure sufficient storage is provisioned.

Configuration

Anchore Enterprise Reports is split into two services:

  • The reports_worker service, which is responsible for the ingress of data into, and the egress of data from, the stored report data.
  • The reports service, which is responsible for report generation.

Each service has a configuration section in the values file. Below are sample configurations and the default values.

...
services:
  reports_worker:
    # Set enable_data_ingress to true to periodically sync data from Anchore Enterprise into the reports service
    enable_data_ingress: true
    
    # Set enable_data_egress to true to periodically remove reporting data that has been removed in other parts of the system
    enable_data_egress: false
    
    # data_egress_window defines the number of days to keep reporting data following its deletion in the rest of the system.
    # The default value of 0 removes it on the next task run
    data_egress_window: 0
    
    # data_refresh_max_workers is the maximum number of concurrent threads used to refresh existing results (etl vulnerabilities and evaluations) in the reports service. Set it to a value greater than 0; otherwise it defaults to 10
    data_refresh_max_workers: 10
    
    # data_load_max_workers is the maximum number of concurrent threads used to load new results (etl vulnerabilities and evaluations) into the reports service. Set it to a value greater than 0; otherwise it defaults to 10
    data_load_max_workers: 10
    
    cycle_timers:
      # Timers that describe how often each operation should run
      reports_image_load: 600  # MIN 300 MAX 100000 Default 600
      reports_tag_load: 600  # MIN 300 MAX 100000 Default 600
      reports_runtime_inventory_load: 600  # MIN 300 MAX 100000 Default 600
      reports_extended_runtime_vuln_load: 1800 # MIN 300 MAX 100000 Default 1800
      reports_image_refresh: 7200  # MIN 3600 MAX 100000 Default 7200
      reports_tag_refresh: 7200  # MIN 3600 MAX 100000 Default 7200
      reports_metrics: 3600  # MIN 1800 MAX 100000 Default 3600
      reports_image_egress: 600  # MIN 300 MAX 100000 Default 600
      reports_tag_egress: 600  # MIN 300 MAX 100000 Default 600
      
    runtime_report_generation:
      # Provides the ability to enable/disable individual runtime report loading.
      inventory_images_by_vulnerability: true
      vulnerabilities_by_k8s_namespace: true
      vulnerabilities_by_k8s_container: true
      vulnerabilities_by_ecs_container: true

  reports:
    # GraphiQL is a GUI for editing and testing GraphQL queries and mutations.
    # Set enable_graphiql to true and open http://<host>:<port>/v2/reports/graphql in a browser for reports API
    enable_graphiql: true
    
    # This is the number of execution threads which will be used during report generation.
    max_async_execution_threads: 1
    
    # Configure async_execution_timeout to adjust how long a scheduled query must be running for before it is considered timed out
    # This may need to be adjusted if the system has large amounts of data and reports are being prematurely timed out.
    # The value should be a number followed by "w", "d", or "h" to represent weeks, days or hours
    async_execution_timeout: "48h"

    # Set use_volume to `true` to have the reports worker buffer report generation to disk instead of in memory. This should be configured
    # in production systems with large amounts of data (10s of thousands of images or more). Scratch volumes should be configured for the reports pods
    # when this option is enabled.
    use_volume: false

NOTE: Any change to the configuration requires a restart of the service to take effect.
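
Because a restart is needed for each change, it can help to check values before deploying. The following is a minimal sketch of such a pre-deploy check, not part of Anchore Enterprise: the function names are hypothetical, the bounds are copied from the MIN/MAX comments in the sample configuration above, and the timeout format follows the "number followed by w, d, or h" rule described there. How the service itself treats out-of-range or malformed values is not covered here.

```python
import re
from datetime import timedelta

# (min, max) bounds in seconds, copied from the comments in the sample configuration.
CYCLE_TIMER_BOUNDS = {
    "reports_image_load": (300, 100000),
    "reports_tag_load": (300, 100000),
    "reports_runtime_inventory_load": (300, 100000),
    "reports_extended_runtime_vuln_load": (300, 100000),
    "reports_image_refresh": (3600, 100000),
    "reports_tag_refresh": (3600, 100000),
    "reports_metrics": (1800, 100000),
    "reports_image_egress": (300, 100000),
    "reports_tag_egress": (300, 100000),
}

# Maps the documented unit suffixes to timedelta keyword arguments.
_TIMEOUT_UNITS = {"w": "weeks", "d": "days", "h": "hours"}


def check_cycle_timers(cycle_timers):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for name, value in cycle_timers.items():
        if name not in CYCLE_TIMER_BOUNDS:
            problems.append(f"{name}: unknown timer")
            continue
        low, high = CYCLE_TIMER_BOUNDS[name]
        if not low <= value <= high:
            problems.append(f"{name}: {value} is outside {low}..{high}")
    return problems


def parse_execution_timeout(text):
    """Parse a value such as "48h" into a timedelta (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([wdh])", text.strip())
    if match is None:
        raise ValueError(f"expected a number followed by w, d, or h, got {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_TIMEOUT_UNITS[unit]: int(amount)})
```

For example, `check_cycle_timers({"reports_image_refresh": 600})` reports one problem because 600 is below the documented minimum of 3600 for that timer.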

In an Anchore Enterprise deployment, any non-admin account user must have at least the listImages permission to execute queries against the Reports API. There is an RBAC role called report-admin that provides permissions to administer reports and schedules. Please see Role-Based Access Control for more information.

Data ingress

The reports_worker service handles data ingress from Anchore Enterprise through the following asynchronous processes, which are triggered periodically:

  • Loader: Compares the working set of images and tags in Anchore Enterprise with its own records. Based on the difference, images and tags along with the vulnerability report and policy evaluations are loaded into the service. Artifacts deleted from Anchore Enterprise are marked inactive in the service.

    This process is triggered periodically as described by the cycle timers listed above.

  • Refresher: Refreshes the vulnerability report and policy evaluations of all the images and tags actively maintained by the service.

    This process is triggered periodically as described by the cycle timers listed above.

WARNING: The reports service may miss artifacts that are added and then deleted between two consecutive ingress runs.
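
The Loader's comparison step, and the reason for the warning above, can be illustrated with a small sketch. This is a simplified model rather than the service's actual implementation: the function name and the image identifiers are hypothetical, and real records carry far more than an ID.

```python
def plan_load(upstream_ids, known_active_ids):
    """Compare the upstream working set with the locally known active records.

    Returns (to_load, to_mark_inactive):
      - to_load: present upstream but not yet known locally
      - to_mark_inactive: known locally but no longer present upstream
    """
    upstream = set(upstream_ids)
    known = set(known_active_ids)
    return upstream - known, known - upstream


# Hypothetical identifiers, for illustration only.
known = {"img-a", "img-b"}

# First run: img-c is new upstream and img-a has been deleted upstream.
to_load, to_mark_inactive = plan_load({"img-b", "img-c"}, known)

# Suppose img-d is added and deleted before the next run. Neither snapshot
# contains it, so the next comparison finds nothing to do and img-d is missed.
missed_load, missed_inactive = plan_load({"img-b", "img-c"}, {"img-b", "img-c"})
```

Since each run only sees the state at the moment it executes, shorter load timers narrow the window in which an artifact can come and go unnoticed, but they do not close it.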

Data ingress is enabled by default. To turn it off, set enable_data_ingress: false in the config.yaml snippet shown previously. In a quickstart deployment, add ANCHORE_ENTERPRISE_REPORTS_ENABLE_DATA_INGRESS=false to the environment variables section of the reports service in docker-compose.yaml. When ingress is turned off, the reports service no longer aggregates data from Anchore Enterprise, and metric computations also stop. However, the service continues to serve API requests and queries using the existing data.

Data egress

Data egress removes data that is no longer active in Anchore Enterprise from the stored report data. This process is disabled by default and is controlled by the enable_data_egress setting. A second setting, data_egress_window, determines how many days such data is kept before it is removed.
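
As a rough illustration of how the egress window behaves, the sketch below decides whether a record that was deleted upstream is old enough to be removed. The function name and the inclusive comparison are assumptions made for the example. The service's exact cutoff logic is not documented here.

```python
from datetime import datetime, timedelta, timezone


def is_due_for_egress(deleted_at, now, data_egress_window_days=0):
    """Return True once the record has been deleted for at least the window.

    With the default window of 0 days, every deleted record is due
    on the next egress task run.
    """
    return now - deleted_at >= timedelta(days=data_egress_window_days)


# Hypothetical timestamps, for illustration only.
now = datetime(2024, 9, 30, tzinfo=timezone.utc)
deleted_three_days_ago = now - timedelta(days=3)

due_with_default = is_due_for_egress(deleted_three_days_ago, now)
due_with_week_window = is_due_for_egress(deleted_three_days_ago, now, 7)
```

Under these assumptions, a record deleted three days ago is removed on the next run with the default window, but is retained when data_egress_window is set to 7.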

Metrics

The reports service comes with a few predefined metric definitions. A metric definition consists of an identifier, a readable name, a description, and the type of the metric. The type is loosely based on statsd metric types. Currently, all the predefined metrics are of type ‘counter’, a measure of the number of items that match certain criteria. A value for each of these metric definitions is computed using the data aggregated by the service.
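
To make the shape of a metric definition concrete, here is a minimal sketch that models the four fields named above and computes a counter value. The class, the example identifier, and the sample records are all hypothetical. The service's real metric identifiers and data model are not listed in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    identifier: str
    name: str
    description: str
    type: str  # loosely based on statsd metric types, e.g. "counter"


def compute_counter(items, predicate):
    """Count the items that match the given criteria."""
    return sum(1 for item in items if predicate(item))


# Hypothetical definition and data, for illustration only.
example_metric = MetricDefinition(
    identifier="example.critical.vulnerabilities",
    name="Critical vulnerabilities",
    description="Number of vulnerability records with critical severity",
    type="counter",
)

records = [
    {"id": "vuln-1", "severity": "critical"},
    {"id": "vuln-2", "severity": "low"},
    {"id": "vuln-3", "severity": "critical"},
]

value = compute_counter(records, lambda r: r["severity"] == "critical")
```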

By default, all metric values are computed every hour (3600 seconds). To modify the interval, update cycle_timers -> reports_metrics in the config.yaml snippet above. In a quickstart deployment, add ANCHORE_ENTERPRISE_REPORTS_METRICS_INTERVAL_SEC=<interval-in-seconds> to the environment variables section of the reports service in docker-compose.yaml.

See it in action

To see the reports service in action in the Enterprise UI, see the Dashboard or Reports view. The dashboard view uses metrics generated by the service and renders customizable widgets. The reports view employs GraphQL queries and aggregates the results into multiple formats (CSV, JSON, and so on).

To use the API directly, see API Access.
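
As a starting point for direct access, the sketch below builds (but does not send) a GraphQL request using only the Python standard library. Several things here are assumptions: that the /v2/reports/graphql path shown in the configuration comments accepts POST requests with a JSON body, that HTTP basic authentication is used, and that introspection is permitted. The host, port, and credentials are placeholders. The query is the standard GraphQL introspection query, chosen because it does not depend on any Reports-specific schema.

```python
import base64
import json
import urllib.request


def build_graphql_request(base_url, username, password, query):
    """Build a POST request carrying a GraphQL query as a JSON body."""
    url = base_url.rstrip("/") + "/v2/reports/graphql"
    body = json.dumps({"query": query}).encode("utf-8")
    # Assumption: the API accepts HTTP basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Standard GraphQL introspection: asks only for the name of the root query type.
INTROSPECTION_QUERY = "{ __schema { queryType { name } } }"

# Placeholder host, port, and credentials; replace with real values.
request = build_graphql_request(
    "http://localhost:8228", "example-user", "example-password", INTROSPECTION_QUERY
)

# To send it against a running deployment:
#     with urllib.request.urlopen(request) as response:
#         print(json.load(response))
```

The user whose credentials are supplied needs the permissions described under Configuration for queries to succeed.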

Last modified September 30, 2024