This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Storage Overview

Storage During Analysis

Scratch Space

Anchore uses a local directory for image analysis operations including downloading layers and unpacking the image content for the analysis process. This space is necessary on each analyzer worker service and should not be shared. The scratch space is ephemeral and can have its lifecycle bound to that of the service container.

Layer Cache

The layer cache is an extension of the analyzer’s scratch space that is used to cache layer downloads to reduce analysis time and network usage during the analysis process itself. For more informaiton, see, Layer Caching.

Storing Analysis Results

Anchore Enterprise is a data intensive system and uses external storage systems for all data persistence. None of the services are stateful in themselves.

For structured data that must be quickly queried and indexed, Anchore relies on PostgreSQL as its primary data store. Any database that is compatible with PostgresSQL 13 or higher should work, such as Amazon Aurora and Google Cloud SQL.

For more information, see, Database

For less structured data, Anchore implements an internal object store that can be overlayed on different backend providers, but defaults to also using the main postgres db to reduce the out-of-the-box dependencies. However, S3 is supported for leveraging external systems.

For more information on configuration and requirements for the core database and object stores see, Object Storage.

Analysis Archive

To aid in capacity management, Anchore provides a separate storage location where completed image analysis can be moved to. This reduces consumption of database capacity and primary object storage. It also removes the analysis from most API actions but makes it available to restore into the primary storage systems as needed. The analysis archive is configured as an alternate object store. For more information, see: Configuring Analysis Archive.

1 - Analysis Archive Storage Configuration

For information on what the analysis archive is and how it works, see Concepts: Analysis Archive

The Analysis Archive is an object store with specific semantics and thus is configured as an object store using the same configuration options, just with a different config key: analysis_archive

Example configuration snippet for using the db for working set object store and S3 for the analysis archive:

...
services:
  ...
  catalog:
  ...
  object_store:
    compression:
      enabled: false
      min_size_kbytes: 100
    storage_driver:
      name: db
      config: {}      
  analysis_archive:
      compression:
        enabled: False
        min_size_kbytes: 100
      storage_driver:
        name: 's3'
        config:
          access_key: 'MY_ACCESS_KEY'
          secret_key: 'MY_SECRET_KEY'
          #iamauto: True
          url: 'https://S3-end-point.example.com'
          region: False
          bucket: 'anchorearchive'
          create_bucket: True

Default Configuration

By default, if no analysis_archive config is found or the property is not present in the config.yaml, the analysis archive will use the object_store or archive (for backwards compatibility) config sections and those defaults (e.g. db if found).

Anchore stores all of the analysis archive objects in an internal logical bucket: analysis_archive that is distinct in the configured backends (e.g a key prefix in the s3 bucket)

Changing Configuration

Unless there are image analyses actually in the archive, there is no data to move if you need to update the configuration to use a different backend, but once an image analysis has been archived to update the configuration you must follow the object storage data migration process found here. As noted in that guide, if you need to migrate to/from an analysis_archive config you’ll need to use the –from-analysis-archive/–to-analysis-archive options as needed to tell the migration process which configuration to use in the source and destination config files used for the migration.

Common Configurations

  1. Single shared object store backend: omit the analysis_archive config, or set it to null or {}

  2. Different bucket/container: the object_store and analysis_archive configurations are both specified and identical with the exception of the bucket or container values for the analysis_archive so that its data is split into a different backend bucket to allow for lifecycle controls or cost optimization since its access is much less frequent (if ever).

  3. Primary object store in DB, analysis_archive in external S3: this keeps latency low as no external service is needed for the object store and active data but lets you use more scalable external object storage for archive data. This approach is most beneficial if you can keep the working set of images small and quickly transition old analysis to the archive to ensure the db is kept small and the analysis archive handles the data scaling over time.

2 - Database Storage

Anchore stores all metadata in a structured format in a PostgreSQL database to support API operations and searches.

Examples of data persisted in the database:

  • Image metadata (distro, version, layer counts, …)
  • Image digests to tag mapping (docker.io/nginx:latest is hash sha256:abcd at time t)
  • Image analysis content indexed for policy evaluation (files, packages, ..)
  • Feed data
    • vulnerability info
    • package info from upstream (gem/npm)
  • Accounts, users…

If the object store is not explicitly set to an external provider, then that data is also persisted in the database but can be migrated

Reducing Database Storage Usage

Beyond enabling a non-DB object store there are some configuration options to reduce database storage and IO used by Anchore.

Configuration of Indexed DB Storage for Package DB File Entries

There is a configuration option for the policy engine service to disable the usage of the database for storing indexed package database entries from each analyzed image. This data represents the files in each distro package and their metadata (digests and permissions) from each scanned image in the image_package_db_entries table. That table is only used by the policy engine to deliver the policy trigger [‘packages.verify’], but if you do not use that trigger then the use of the storage can be disabled thereby reducing database load and resource usage. The data can be quite large, often in the thousands of rows per analyzed image, so for some customers that do not use this data for policy, disabling the loading of this data can reduce database consumption significantly.

Disabling Indexed DB Storage for Package DB File Entries

In each policy engine’s config.yaml file, change:

enable_package_db_load: true

to

enable_package_db_load: false

Note that disabling the table usage will also disable support for the packages.verify trigger and any policies that have the trigger in a rule will be considered invalid and return errors on evaluation. Any new policies that attempt to use the trigger will be rejected on upload as invalid if the trigger is included.

Once this configuration is set, you may delete data in that db table to reclaim some database storage capacity. If you’re interested in this option please contact support for guidance on this process.

Enabling Indexed DB Storage for Package DB File Entries

If you find that you do need the trigger, you can change the configuration to use the table then support will be restored. However, any images analyzed while the setting was ‘false’ will need to be re-analyzed in order to populate their data in that table correctly.

3 - Layer Caching

Once an image is submitted to Anchore Enterprise for analysis the system will attempt to retrieve metadata about the image from the Docker registry and if successful will download the image and queue the image for analysis.

Anchore Enterprise can run one or more analyzer services to scale out processing of images. The next available analyzer worker will process the image.

Docker Images are made up of one or more layers, which are described in the manifest. The manifest lists the layers which are typically stored as gzipped compressed TAR files.

As part of image analysis Anchore Enterprise will:

  • Download all layers that comprise an image
  • Extract the layers to a temporary file system location
  • Perform analysis on the contents of the image including:
    • Digest of every file (SHA1, SHA256 and MD5)
    • File attributes (size, owner, permissions, etc)
    • Operating System package manifest
    • Software library package manifest (NPM, GEM, Java, Python, NuGet)
    • Scan for secret materials (api keys, private keys, etc

Following the analysis the extracted layers and downloaded layer tar files are deleted.

In many cases the images will share a number of common layers, especially if images are built form a consistent set of base images. To speed up Anchore Enterprise can be configure to cache image layers to eliminate the need to download the same layer for many different images. The layer cache is displayed in the default Anchore Enterprise configuration. To enable the cache the following changes should be made:

  1. Define temporary directory for cache data

It is recommended that the cache data is stored in an external volume to ensure that the cache does not use up the ephemeral storage space allocated to the container host.

By default Anchore Enterprise uses the /tmp directory within the container to download and extract images. Configure a volume to be mounted into the container at a specified path and configure this path in config.yaml

tmp_dir: '/scratch'

In this example a volume has been mounted as /scratch within the container and config.yaml updated to use /scratch as the temporary directory for image analysis.

With the cache disabled the temporary directory should be sized to at least 3 times the uncompressed image size to be analyzed. To enable layer caching the layer_cache_enable parameter and layer_cache_max_gigabytes parameter should be added to the analyzer section of the Anchore Enterprise configuration file config.yaml.

analyzer:
    enabled: True
    require_auth: True
    cycle_timer_seconds: 1
    analyzer_driver: 'nodocker'
    endpoint_hostname: '${ANCHORE_HOST_ID}'
    listen: '0.0.0.0'
    port: 8084
    layer_cache_enable: True
    layer_cache_max_gigabytes: 4

In this example the cache is set to 4 gigabytes. The temporary volume should be sized to at least 3 times the uncompressed image size + 4 gigabytes.

  • The minimum size for the cache is 1 gigabyte.
  • The cache users a least recently used (LRU) policy.
  • The cache files will be stored in the anchore_layercache directory of the /tmp_dir volume.

4 - Object Storage

Anchore Enterprise uses a PostgreSQL database to store structured data for images, tags, policies, subscriptions and metdata about images, but other types of data in the system are less structured and tend to be larger pieces of data. Because of that, there are benefits to supporting key-value access patterns for things like image manifests, analysis reports, and policy evaluations. For such data, Anchore has an internal object storage interface that, while defaulted to use the same Postgres database for storage, can be configured to use external object storage providers to support simpler capacity management and lower costs. The options are:

  • PostgreSQL database (default)
  • S3 Object Store

The configuration for the object store is set in the catalog’s service configuration in the config.yaml.

4.1 - Migrating Data to New Drivers

Overview

To cleanly migrate data from one archive driver to another, Anchore Enterprise includes some tooling that automates the process in the ‘anchore-manager’ tool packaged with the system.

The migration process is an offline process; Anchore Enterprise is not designed to handle an online migration.

For the migration process you will need:

  1. The original config.yaml used by the services already, if services are split out or using different config.yaml for different services, you need the config.yaml used by the catalog services
  2. An updated config.yaml (named dest-config.yaml in this example), with the archive driver section of the catalog service config set to the config you want to migrate to
  3. The db connection string from config.yaml, this is needed by the anchore-manager script directly
  4. Credentials and resources (bucket etc) for the destination of the migration

At a high-level the process is:

  1. Shutdown all anchore enterprise services and components. The system should be fully offline, but the database must be online and available. For a docker compose install, this is achieved by simply stopping the engine container, but not deleting it.
  2. Prepare a new config.yaml that includes the new driver configuration for the destination of the migration (dest-config.yaml) in the same location as the existing config.yaml
  3. Test the new dest-config.yaml to ensure correct configuration
  4. Run the migration
  5. Get coffee… this could take a while if you have a lot of analysis data
  6. When complete, view the results
  7. Ensure the dest-config.yaml is in place for all the components as config.yaml
  8. Start anchore-engine

Migration Example Using Docker Compose Deployed Anchore Engine

The following is an example migration for an anchore-engine deployed via docker compose on a single host with a local postgresql container–basically the example used in ‘Installing Anchore Engine’ documents. At the end of this section, we’ll cover the caveats and things to watch for a multi-node install of anchore engine.

This process requires that you run the command in a location that has access to both the source archive driver configuration and the new archive driver configuration.

Step 1: Shutdown all services

All services should be stopped, but the postgresql db must still be available and running.

docker compose stop anchore-engine

Step 2: Prepare a new config.yaml

Both the original and new configurations are needed, so create a copy and update the archive driver section to the configuration you want to migrate to

cd config
cp config.yaml dest-config.yaml
<edit dest-config.yaml>

Step 3: Test the destination config

Assuming that config is dest-config.yaml:

[user@host aevolume]$ docker compose run anchore-engine /bin/bash
[root@3209ad44d7bb ~]# anchore-manager objectstorage --db-connect ${db} check /config/dest-config.yaml 
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB params: {"db_pool_size": 30, "db_connect": "postgresql+pg8000://postgres:postgres.dev@postgres-dev:5432/postgres", "db_connect_args": {"ssl": false, "connect_timeout": 120, "timeout": 30}, "db_pool_max_overflow": 100}
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connection configured: True
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB attempting to connect...
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connected: True
[MainThread] [anchore_manager.cli.objectstorage/check()] [INFO] Using config file /config/dest-config.yaml
[MainThread] [anchore_engine.subsys.object_store.operations/initialize()] [INFO] Archive initialization complete
[MainThread] [anchore_manager.cli.objectstorage/check()] [INFO] Checking existence of test document with user_id = test, bucket = anchorecliconfigtest and archive_id = cliconfigtest
[MainThread] [anchore_manager.cli.objectstorage/check()] [INFO] Creating test document with user_id = test, bucket = anchorecliconfigtest and archive_id = cliconfigtest
[MainThread] [anchore_manager.cli.objectstorage/check()] [INFO] Checking document fetch
[MainThread] [anchore_manager.cli.objectstorage/check()] [INFO] Removing test object
[MainThread] [anchore_manager.cli.objectstorage/check()] [INFO] Archive config check completed successfully

Step 3a: Test the current config.yaml

If you are running the migration for a different location than one of the anchore engine containers

Same as above but using /config/config.yaml as the input to check (skipped in this instance since we’re running the migration from the same container)

Step 4: Run the Migration

By default, the migration process will remove data from the source once it has confirmed it has been copied to the destination and the metadata has been updated in the anchore db. To skip the deletion on the source, use the ‘–nodelete’ option. it is the safest option, but if you use it, you are responsible for removing the data later.

[root@3209ad44d7bb ~]# anchore-manager objectstorage --db-connect ${db} migrate /config/config.yaml /config/dest-config.yaml 
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB params: {"db_pool_size": 30, "db_connect": "postgresql+pg8000://postgres:postgres.dev@postgres-dev:5432/postgres", "db_connect_args": {"ssl": false, "connect_timeout": 120, "timeout": 30}, "db_pool_max_overflow": 100}
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connection configured: True
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB attempting to connect...
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connected: True
[MainThread] [anchore_manager.cli.objectstorage/migrate()] [INFO] Loading configs
[MainThread] [anchore_manager.cli.objectstorage/migrate()] [INFO] Migration from config: {
  "storage_driver": {
    "config": {}, 
    "name": "db"
  }, 
  "compression": {
    "enabled": false, 
    "min_size_kbytes": 100
  }
}
[MainThread] [anchore_manager.cli.objectstorage/migrate()] [INFO] Migration to config: {
  "storage_driver": {
    "config": {
      "access_key": "9EB92C7W61YPFQ6QLDOU", 
      "create_bucket": true, 
      "url": "http://minio-ephemeral-test:9000/", 
      "region": false, 
      "bucket": "anchore-engine-testing", 
      "prefix": "internaltest", 
      "secret_key": "TuHo2UbBx+amD3YiCeidy+R3q82MPTPiyd+dlW+s"
    }, 
    "name": "s3"
  }, 
  "compression": {
    "enabled": true, 
    "min_size_kbytes": 100
  }
}
Performing this operation requires *all* anchore-engine services to be stopped - proceed? (y/N)y
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Initializing migration from {'storage_driver': {'config': {}, 'name': 'db'}, 'compression': {'enabled': False, 'min_size_kbytes': 100}} to {'storage_driver': {'config': {'access_key': '9EB92C7W61YPFQ6QLDOU', 'create_bucket': True, 'url': 'http://minio-ephemeral-test:9000/', 'region': False, 'bucket': 'anchore-engine-testing', 'prefix': 'internaltest', 'secret_key': 'TuHo2UbBx+amD3YiCeidy+R3q82MPTPiyd+dlW+s'}, 'name': 's3'}, 'compression': {'enabled': True, 'min_size_kbytes': 100}}
[MainThread] [anchore_engine.subsys.object_store.migration/migration_context()] [INFO] Initializing source object_store: {'storage_driver': {'config': {}, 'name': 'db'}, 'compression': {'enabled': False, 'min_size_kbytes': 100}}
[MainThread] [anchore_engine.subsys.object_store.migration/migration_context()] [INFO] Initializing dest object_store: {'storage_driver': {'config': {'access_key': '9EB92C7W61YPFQ6QLDOU', 'create_bucket': True, 'url': 'http://minio-ephemeral-test:9000/', 'region': False, 'bucket': 'anchore-engine-testing', 'prefix': 'internaltest', 'secret_key': 'TuHo2UbBx+amD3YiCeidy+R3q82MPTPiyd+dlW+s'}, 'name': 's3'}, 'compression': {'enabled': True, 'min_size_kbytes': 100}}
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Migration Task Id: 1
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Entering main migration loop
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Migrating 7 documents
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Deleting document on source after successful migration to destination. Src = db://admin/policy_bundles/2c53a13c-1765-11e8-82ef-23527761d060
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Deleting document on source after successful migration to destination. Src = db://admin/manifest_data/sha256:0873c923e00e0fd2ba78041bfb64a105e1ecb7678916d1f7776311e45bf5634b
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Deleting document on source after successful migration to destination. Src = db://admin/analysis_data/sha256:0873c923e00e0fd2ba78041bfb64a105e1ecb7678916d1f7776311e45bf5634b
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Deleting document on source after successful migration to destination. Src = db://admin/image_content_data/sha256:0873c923e00e0fd2ba78041bfb64a105e1ecb7678916d1f7776311e45bf5634b
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Deleting document on source after successful migration to destination. Src = db://admin/manifest_data/sha256:a0cd2c88c5cc65499e959ac33c8ebab45f24e6348b48d8c34fd2308fcb0cc138
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Deleting document on source after successful migration to destination. Src = db://admin/analysis_data/sha256:a0cd2c88c5cc65499e959ac33c8ebab45f24e6348b48d8c34fd2308fcb0cc138
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Deleting document on source after successful migration to destination. Src = db://admin/image_content_data/sha256:a0cd2c88c5cc65499e959ac33c8ebab45f24e6348b48d8c34fd2308fcb0cc138
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Migration result summary: {"last_state": "running", "executor_id": "3209ad44d7bb:37:139731996518208:", "archive_documents_migrated": 7, "last_updated": "2018-08-15T18:03:52.951364", "online_migration": null, "created_at": "2018-08-15T18:03:52.951354", "migrate_from_driver": "db", "archive_documents_to_migrate": 7, "state": "complete", "migrate_to_driver": "s3", "ended_at": "2018-08-15T18:03:53.720554", "started_at": "2018-08-15T18:03:52.949956", "type": "archivemigrationtask", "id": 1}
[MainThread] [anchore_manager.cli.objectstorage/migrate()] [INFO] After this migration, your anchore-engine config.yaml MUST have the following configuration options added before starting up again:
compression:
  enabled: true
  min_size_kbytes: 100
storage_driver:
  config:
    access_key: 9EB92C7W61YPFQ6QLDOU
    bucket: anchore-engine-testing
    create_bucket: true
    prefix: internaltest
    region: false
    secret_key: TuHo2UbBx+amD3YiCeidy+R3q82MPTPiyd+dlW+s
    url: http://minio-ephemeral-test:9000/
  name: s3

Note: If something goes wrong you can reverse the parameters of the migrate command to migrate back to the original configuration (e.g. … migrate /config/dest-config.yaml /config/config.yaml)

Step 5: Get coffee!

The migration time will depend on the amount of data and the source and destination systems performance.

Step 6: View results summary

[root@3209ad44d7bb ~]# anchore-manager objectstorage --db-connect ${db} list-migrations
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB params: {"db_pool_size": 30, "db_connect": "postgresql+pg8000://postgres:postgres.dev@postgres-dev:5432/postgres", "db_connect_args": {"ssl": false, "connect_timeout": 120, "timeout": 30}, "db_pool_max_overflow": 100}
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connection configured: True
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB attempting to connect...
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connected: True
id         state                  start time                         end time                 from        to        migrated count        total to migrate               last updated               
1         complete        2018-08-15T18:03:52.949956        2018-08-15T18:03:53.720554         db         s3              7                      7                2018-08-15T18:03:53.724628   

This lists all migrations for the service and the number of objects migrated. If you’ve run multiple migrations you’ll see multiple rows in this response.

Step 7: Replace old config.yaml with updated dest-config.yaml

[root@3209ad44d7bb ~]# cp /config/config.yaml /config/config.old.yaml
[root@3209ad44d7bb ~]# cp /config/dest-config.yaml /config/config.yaml

Step 8: Restart anchore-engine services

[user@host aevolume]$ docker compose start anchore-engine

The system should now be up and running using the new configuration! You can verify with the anchorectl command by fetching a policy, which will have been migrated:

[root@d8d3f49d9328 /]# anchorectl policy list
 ✔ Fetched policies
┌─────────────────────────┬──────────────────────────────────────┬────────┬──────────────────────┐
│ NAME                    │ POLICY ID                            │ ACTIVE │ UPDATED              │
├─────────────────────────┼──────────────────────────────────────┼────────┼──────────────────────┤
│ Default bundle          │ 2c53a13c-1765-11e8-82ef-23527761d060 │ true   │ 2022-07-14T22:52:27Z │
│ anchore_security_only   │ anchore_security_only                │ false  │ 2022-07-14T22:52:27Z │
│ anchore_cis_1.13.0_base │ anchore_cis_1.13.0_base              │ false  │ 2022-07-14T22:52:27Z │
└─────────────────────────┴──────────────────────────────────────┴────────┴──────────────────────┘

[root@d8d3f49d9328 /]# anchorectl -o json-raw policy get 2c53a13c-1765-11e8-82ef-23527761d060 
[ 
  {
    "blacklisted_images": [], 
    "comment": "Default bundle", 
    "id": "2c53a13c-1765-11e8-82ef-23527761d060", 
... <lots of json>

If that returns the content properly, then you’re all done!

Things to Watch for in a Multi-Node Anchore Engine Installation

  • Before migration: Ensure all services are down before starting migration
  • At migration: Ensure the place you’re running the migration from has the same db access and network access to the archive locations
  • After migration: Ensure that all components get the update config.yaml. Strictly speaking, only containers that run the catalog service need the update configuration, but its best to ensure that any config.yaml in the system which has a services.catalog definition also has the proper and up-to-date configuration to avoid confusion or accidental reverting of the config.

Example Process with docker compose

# ls docker-compose.yaml 
docker-compose.yaml

# docker compose ps
              Name                            Command               State                       Ports                     
--------------------------------------------------------------------------------------------------------------------------
aevolumepy3_anchore-db_1           docker-entrypoint.sh postgres    Up      5432/tcp                                      
aevolumepy3_anchore-engine_1       /bin/sh -c anchore-engine        Up      0.0.0.0:8228->8228/tcp, 0.0.0.0:8338->8338/tcp
aevolumepy3_anchore-minio_1        /usr/bin/docker-entrypoint ...   Up      0.0.0.0:9000->9000/tcp                        
aevolumepy3_anchore-prometheus_1   /bin/prometheus --config.f ...   Up      0.0.0.0:9090->9090/tcp                        
aevolumepy3_anchore-redis_1        docker-entrypoint.sh redis ...   Up      6379/tcp                                      
aevolumepy3_anchore-ui_1           /bin/sh -c node /home/node ...   Up      0.0.0.0:3000->3000/tcp                        

# docker compose stop anchore-engine
Stopping aevolume_anchore-engine_1 ... done

# docker compose run anchore-engine anchore-manager objectstorage --db-connect postgresql+pg8000://postgres:mysecretpassword@anchore-db:5432/postgres check /config/config.yaml.new
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB params: {"db_connect": "postgresql+pg8000://postgres:mysecretpassword@anchore-db:5432/postgres", "db_connect_args": {"timeout": 30, "ssl": false}, "db_pool_size": 30, "db_pool_max_overflow": 100}
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connection configured: True
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB attempting to connect...
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connected: True
[MainThread] [anchore_manager.cli.objectstorage/check()] [INFO] Using config file /config/config.yaml.new
...
...

# docker compose run anchore-engine anchore-manager objectstorage --db-connect postgresql+pg8000://postgres:mysecretpassword@anchore-db:5432/postgres migrate /config/config.yaml /config/config.yaml.new
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB params: {"db_connect": "postgresql+pg8000://postgres:mysecretpassword@anchore-db:5432/postgres", "db_connect_args": {"timeout": 30, "ssl": false}, "db_pool_size": 30, "db_pool_max_overflow": 100}
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connection configured: True
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB attempting to connect...
[MainThread] [anchore_manager.cli.utils/connect_database()] [INFO] DB connected: True
[MainThread] [anchore_manager.cli.objectstorage/migrate()] [INFO] Loading configs
[MainThread] [anchore_engine.configuration.localconfig/validate_config()] [WARN] no webhooks defined in configuration file - notifications will be disabled
[MainThread] [anchore_engine.configuration.localconfig/validate_config()] [WARN] no webhooks defined in configuration file - notifications will be disabled
[MainThread] [anchore_manager.cli.objectstorage/migrate()] [INFO] Migration from config: {
  "compression": {
    "enabled": false,
    "min_size_kbytes": 100
  },
  "storage_driver": {
    "name": "db",
    "config": {}
  }
}
[MainThread] [anchore_manager.cli.objectstorage/migrate()] [INFO] Migration to config: {
  "compression": {
    "enabled": true,
    "min_size_kbytes": 100
  },
  "storage_driver": {
    "name": "s3",
    "config": {
      "access_key": "Z54LPSMFKXSP2E2L4TGX",
      "secret_key": "EMaLAWLVhUmV/f6hnEqjJo5+/WeZ7ukyHaBKlscB",
      "url": "http://anchore-minio:9000",
      "region": false,
      "bucket": "anchorearchive",
      "create_bucket": true
    }
  }
}
Performing this operation requires *all* anchore-engine services to be stopped - proceed? (y/N) y
...
...
...
[MainThread] [anchore_engine.subsys.object_store.migration/initiate_migration()] [INFO] Migration result summary: {"last_updated": "2018-08-14T22:19:39.985250", "started_at": "2018-08-14T22:19:39.984603", "last_state": "running", "online_migration": null, "archive_documents_migrated": 500, "migrate_to_driver": "s3", "id": 9, "executor_id": "e9fc8f77714d:1:140375539468096:", "ended_at": "2018-08-14T22:20:03.957291", "created_at": "2018-08-14T22:19:39.985246", "state": "complete", "archive_documents_to_migrate": 500, "migrate_from_driver": "db", "type": "archivemigrationtask"}
[MainThread] [anchore_manager.cli.objectstorage/migrate()] [INFO] After this migration, your anchore-engine config.yaml MUST have the following configuration options added before starting up again:
...
...

# cp config/config.yaml config/config.yaml.original

# cp config/config.yaml.new config/config.yaml

# docker compose start anchore-engine
Starting anchore-engine ... done

Migrating Analysis Archive Data

The object storage migration process migrates any data stored in the source config to the destination configuration, if the analysis archive is configured to use the same storage backend as the primary object store then that data is migrated along with all other data, but if the source or destination configurations define different storage backends for the analysis archive than that which is used by the primary object store, then additional paramters are necesary in the migration commands to indicate which configurations to migrate to/from.

The most common migration patterns are:

  1. Migrate from a single backend configuration to a split configuration to move analysis archive data to an external system (db -> db + s3)

  2. Migrate from a dual-backend configuration to a single-backend configuration with a different config (e.g. db + s3 -> s3)

Migrating a single backend to split backend

For example, moving from unified db backend (default config) to a db + s3 configuration with s3 for the analysis archive .

source-config.yaml snippet:

...
services:
...
  catalog:
    ...
    object_store:
      compression:
        enabled: false
        min_size_kbytes: 100
      storage_driver:
        name: db
        config: {}        
...

dest-config.yaml snippet:

...
services:
...
  catalog:
    ...
    object_store:
      compression:
        enabled: false
        min_size_kbytes: 100
      storage_driver:
        name: db
        config: {}
    analysis_archive:
      enabled: true
      compression:
        enabled: false
        min_size_kbytes: 100
      storage_driver:
        name: s3
        config: 
          access_key: 9EB92C7W61YPFQ6QLDOU
          secret_key: TuHo2UbBx+amD3YiCeidy+R3q82MPTPiyd+dlW+s
          url: 'http://minio-ephemeral-test:9000'
          region: null
          bucket: analysisarchive
      ...   

Anchore stores its internal data in logical ‘buckets’ that are overlayed onto the storage backed in a driver-specific way, so to migrate specific internal buckets (effectively these are classes of data), use the –bucket option in the manager cli. This should generally not be necessary, but for specific kinds of migrations it may be needed.

The following command will execute the migration. Note that the –bucket option is for an internal Anchore logical-bucket, not and actual bucket in S3:

anchore-manager objectstorage --db-connect migrate --to-analysis-archive --bucket analysis_archive source-config.yaml dest-config.yaml

Migrating from dual object storage backends to a single backend

For example, migrating from a db + s3 backend to a single s3 backend in a different bucket:

Example source-config.yaml snippet:

...
services:
...
  catalog:
    ...
    object_store:
      compression:
        enabled: false
        min_size_kbytes: 100
      storage_driver:
        name: db
        config: {}
    analysis_archive:
      enabled: true
      compression:
        enabled: false
        min_size_kbytes: 100
      storage_driver:
        name: s3
        config: 
          access_key: 9EB92C7W61YPFQ6QLDOU
          secret_key: TuHo2UbBx+amD3YiCeidy+R3q82MPTPiyd+dlW+s
          url: 'http://minio-ephemeral-test:9000'
          region: null
          bucket: analysisarchive        
...

The dest config is a single backend. In this case, note the S3 bucket has changed so all data must be migrated.

Example dest-config.yaml snippet:

...
services:
...
  catalog:
    ...
    object_store:
      enabled: true
      compression:
        enabled: false
        min_size_kbytes: 100
      storage_driver:
        name: s3
        config: 
          access_key: 9EB92C7W61YPFQ6QLDOU
          secret_key: TuHo2UbBx+amD3YiCeidy+R3q82MPTPiyd+dlW+s
          url: 'http://minio-ephemeral-test:9000'
          region: null
          bucket: newanchorebucket
      ...

First, migrate the object data in the db on the source:

anchore-manager objectstorage --db-connect migrate source-config.yaml dest-config.yaml

Next, migrate the object data in the analysis archive from the old config (s3 bucket ‘analysisarchive’ to the new config (s3 bucket ’newanchorebucket’):

anchore-manager objectstorage --db-connect migrate --from-analysis-archive source-config.yaml dest-config.yaml  

4.2 - Database Driver

The default object store driver is the PostgreSQL database driver which stores all object store documents within the PostgreSQL database.

A component of the object store driver is the archive_document. When the default object store driver is used, as opposed to a user configuring a S3 bucket, this is the location where image SBOMs, vulnerability scans, policy evaluations, and reports are stored.

Compression is not supported for this driver since the underlying database will handle compression.

There are no configuration options required for the Database driver.

The embedded configuration for anchore enterprise includes the default configuration for the db driver.

object_store:
  compression:
    enabled: False
    min_size_kbytes: 100
  storage_driver:
    name: db
    config: {}

4.3 - S3 Object Store Driver

Using the S3 driver, data can be stored using Amazon’s S3 storage or any Amazon S3 API compatible system.

object_store:
  compression:
    enabled: False
    min_size_kbytes: 100
  storage_driver:
    name: 's3'
    config:
      access_key: 'MY_ACCESS_KEY'
      secret_key: 'MY_SECRET_KEY'
      #iamauto: True
      url: 'https://S3-end-point.example.com'
      region: False
      bucket: 'anchorearchive'
      create_bucket: True

Example for AWS S3 in us-west-2:

object_store:
  compression:
    enabled: True
    min_size_kbytes: 100
  storage_driver:
    name: 's3'
    config:
#      access_key: 'MY_ACCESS_KEY'
#      secret_key: 'MY_SECRET_KEY'
      iamauto: True
      #url: 'https://S3-end-point.example.com'
      region: us-west-2
      bucket: anchoredata
      create_bucket: False

Example for Minio running in a Docker Compose setup on the same host network as Anchore (container named ‘minio’):

object_store:
  compression:
    enabled: True
    min_size_kbytes: 100
  storage_driver:
    name: 's3'
    config:
      access_key: 'MY_ACCESS_KEY_FOR_MINIO'
      secret_key: 'MY_SECRET_KEY_FOR_MINIO'
      #iamauto: True
      url: 'https://minio:5000'
      #region: us-west-2
      bucket: anchoredata
      create_bucket: False

Compression

The S3 driver supports compression of documents. The documents are JSON formatted and will see significant reduction in size through compression there is an overhead incurred by running compression and decompression on every access of these documents. Anchore Enterprise can be configured to only compress documents above a certain size to reduce unnecessary overhead. In the example below any document over 100kb in size will be compressed.

Authentication

Anchore Enterprise can authenticate against the S3 service using one of two methods:

  • Amazon Access Keys Using this method an Access Key and Secret Access key that have access to read and write to the bucket. Parameters: access_key and secret_key

  • Inherit IAM Role Anchore Enterprise can be configured to inherit the IAM role from the EC2 or ECS instance that Anchore Enterprise is running on or is provided via Kubernetes service account. When launching the EC2 instance that will run Anchore Enterprise you need to specify a role that includes the ability to read and write from the archive bucket. To use IAM roles to authenticate the access_key and secret_access configurations should be replaced by iamauto: True Parameters: iamauto

S3 Endpoint and Bucket

  • url: (required if region not set) A URL to set to reach an S3-API compatible service if you are not using actual Amazon S3. If the URL is configured, the region config value is ignored.
  • region: (required if URL not set) The AWS region that is the primary bucket host (). If you are not using actual S3, this is probably not necessary unless your S3-compatible service requires it. If the ‘URL’ configured, this field is ignored.
  • bucket: (required) The name of the S3 bucket that Anchore will use for storing data.
  • create_bucket: (default: false) Try to create the bucket if it doesn’t already exist. This should be used very sparingly. For most cases, you should pre-create the bucket so that it has the permissions you desire, then set this to false.