Malware & Cataloger Scans

Malware & Cataloger Scanning Overview

When an image is analyzed (scanned), you can configure the process to suit your use case and the security controls you need. The data discovered during analysis can later be used by rules and gates in Anchore’s policy engine, so remember to review that configuration as well.

Malware scanning and catalogers each add capabilities, described below:

Malware

For an overview of the feature and how it works, see Malware Scanning.

Catalogers

During analysis of your images, Anchore can run extra catalogers or searches. These are as follows:

  • retrieve_files - retrieve and index files matching a configured file list
  • secret_search and content_search - perform a search across file contents for a configured regexp match. Findings are then cataloged accordingly.

Limitations and Resource Usage

Both malware scanning and catalogers increase analysis time, and the increase depends on the size and number of files the image contains. Anchore supports sources, but sources currently need to be analyzed with Syft rather than AnchoreCTL, and Syft does not currently support catalogers or malware checks. Where possible, and depending on your use case, offload work to Distributed Analysis to reduce analyzer compute load on your central Anchore deployment.

Malware

:warning: Malware scanning currently only supports image sizes up to 4GB.

:warning: Malware scanning can ONLY operate when using Centralized analysis and NOT Distributed Analysis.

Anchore uses ClamAV to deliver this capability. ClamAV can scan files up to a maximum size of 4GB, so it can only scan a container image whose squashed filesystem is no larger than 4GB. If you analyze a larger image with malware scanning enabled, you’ll see an error in the logs and the analyzer will not register a valid malware scan for the image. This condition can be caught using policy rules that require a scan to have been run for an image.
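
A pre-flight size check can flag images that will not receive a valid malware scan before they reach the analyzer. The sketch below is hypothetical: the function name is illustrative and not part of Anchore or ClamAV, and it assumes "4GB" means 4 GiB (4 × 1024³ bytes), which this page does not specify.

```python
# Hypothetical pre-flight check; not part of Anchore or ClamAV.
# Assumption: the 4GB limit is interpreted as 4 GiB. Adjust if needed.
CLAMAV_MAX_BYTES = 4 * 1024**3


def malware_scan_feasible(squashed_size_bytes: int) -> bool:
    """Return True if the squashed image filesystem fits the assumed limit."""
    if squashed_size_bytes < 0:
        raise ValueError("size must be non-negative")
    return squashed_size_bytes <= CLAMAV_MAX_BYTES


if __name__ == "__main__":
    for size in (512 * 1024**2, 5 * 1024**3):
        status = "ok" if malware_scan_feasible(size) else "too large for a malware scan"
        print(f"{size} bytes: {status}")
```

How you obtain the squashed size depends on your tooling; the check only compares a byte count against the assumed limit.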

Finally, the Malware feature will check for malware definition updates on each analysis/scan. Your analyzer service will need to be able to reach https://database.clamav.net/ to be able to retrieve the latest definitions.

Catalogers

Running extra catalogers requires more resources and time to analyze images. Take this into consideration when enabling them and defining your regexp values. You can control the cost by setting MAXFILESIZE in match_params, which restricts the search to files no larger than the given size.
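
The effect of a size limit on a regexp search can be sketched as follows. This is not Anchore's implementation: the function names are hypothetical, and the sketch assumes that MAXFILESIZE is a byte count above which files are skipped and that each entry has the NAME=regexp form used in the examples on this page.

```python
import re
import tempfile
from pathlib import Path

# Illustrative only; not Anchore's implementation.
# Assumption: max_file_size is in bytes and larger files are skipped.


def parse_named_regexp(entry: str) -> tuple[str, re.Pattern]:
    """Split a 'NAME=regexp' entry at the first '=' and compile the pattern."""
    name, _, pattern = entry.partition("=")
    return name, re.compile(pattern)


def search_tree(root: Path, entries: list[str], max_file_size: int) -> dict[str, list[str]]:
    """Return {relative file path: [names of matching patterns]} under root."""
    patterns = [parse_named_regexp(entry) for entry in entries]
    findings: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.stat().st_size > max_file_size:
            continue  # skip directories and files over the size limit
        text = path.read_text(errors="ignore")
        hits = [name for name, rx in patterns if rx.search(text)]
        if hits:
            findings[str(path.relative_to(root))] = hits
    return findings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        header = "-----BEGIN RSA PRIVATE KEY-----\n"
        (root / "small.pem").write_text(header)
        (root / "large.pem").write_text(header + "x" * 20000)
        entries = ["PRIV_KEY=(?i)-+BEGIN(.*)PRIVATE KEY-+"]
        print(search_tree(root, entries, max_file_size=10000))
```

In the demo, both files contain a matching header, but only small.pem is reported because large.pem exceeds the limit. A lower limit therefore trades coverage for analysis time.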

Enabling & Disabling Malware Scans & Catalogers

The process for enabling and configuring malware scanning and the other catalogers differs between Helm and Compose deployments. Additionally, there are two modes in which you can analyze images, and therefore two places where this capability can be configured:

  • Distributed mode - the catalogers are configured in the AnchoreCTL configuration.
  • Centralized mode - the catalogers are configured in the centralized Anchore deployment via the analyzer config documented on this page.

Helm

Update the Helm values.yaml file. Below is an example configuration with malware scanning, retrieve_files, and secret_search enabled. Helm uses these values to define a ConfigMap in your Anchore Kubernetes deployment.

:warning: A malformed file can cause the Anchore analyzer to fail on all image analysis.

anchoreConfig:
  analyzer:
    configFile:
      retrieve_files:
        file_list:
          - '/etc/passwd'
      secret_search:
        match_params:
          - MAXFILESIZE=10000
        regexp_match:
          - "AWS_ACCESS_KEY=(?i).*aws_access_key_id( *=+ *).*(?<![A-Z0-9])[A-Z0-9]{20}(?![A-Z0-9]).*"
          - "AWS_SECRET_KEY=(?i).*aws_secret_access_key( *=+ *).*(?<![A-Za-z0-9/+=])[A-Za-z0-9/+=]{40}(?![A-Za-z0-9/+=]).*"
          - "PRIV_KEY=(?i)-+BEGIN(.*)PRIVATE KEY-+"
          - "DOCKER_AUTH=(?i).*\"auth\": *\".+\""
          - "API_KEY=(?i).*api(-|_)key( *=+ *).*(?<![A-Z0-9])[A-Z0-9]{20,60}(?![A-Z0-9]).*"
          # - "ALPINE_NULL_ROOT=^root:::0:::::$"
      #
      ## Uncomment content_search: {} to configure file content searching
      # Very expensive operation - recommend you carefully test and review
      # content_search:
      #   match_params:
      #     - MAXFILESIZE=10000
      #   regexp_match:
      #     - "EXAMPLE_MATCH="
      #
      ## Malware scanning occurs only at analysis time when the image content itself is available
      malware:
        clamav:
          enabled: true
          db_update_enabled: true

Please review the example values.yaml file in the Helm chart for further detail.

Docker Compose

Malware scanning and the catalogers are configured and enabled in the analyzer_config.yaml file. Mount this file as a file volume in your Anchore Docker Compose file under the analyzer: service, as shown below:

analyzer:
    volumes:
      - ./analyzer_config.yaml:/anchore_service/analyzer_config.yaml:ro   #mounted analyzer_config

This file should contain the required configuration parameters. Please see the following example and adjust as required.

malware:
  clamav:
    # Set this to true to enable the malware scan
    enabled: true
    # Set this to false to turn off the db refresh on each scan
    db_update_enabled: true

retrieve_files:
  max_file_size_kb: 1000
  file_list:
    - '/etc/passwd'
    - '/etc/services'
    - '/etc/sudoers'

secret_search:
  match_params:
    - MAXFILESIZE=10000
  regexp_match:
    - "AWS_ACCESS_KEY=(?i).*aws_access_key_id( *=+ *).*(?<![A-Z0-9])[A-Z0-9]{20}(?![A-Z0-9]).*"
    - "AWS_SECRET_KEY=(?i).*aws_secret_access_key( *=+ *).*(?<![A-Za-z0-9/+=])[A-Za-z0-9/+=]{40}(?![A-Za-z0-9/+=]).*"
    - "PRIV_KEY=(?i)-+BEGIN(.*)PRIVATE KEY-+"
    - "DOCKER_AUTH=(?i).*\"auth\": *\".+\""
    - "API_KEY=(?i).*api(-|_)key( *=+ *).*(?<![A-Z0-9])[A-Z0-9]{20,60}(?![A-Z0-9]).*"

## Uncomment content_search: {} to configure file content searching
# Very expensive operation - recommend you carefully test and review
# content_search:
#   match_params:
#     - MAXFILESIZE=10000
#   regexp_match:
#     - "EXAMPLE_MATCH="
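
Because a malformed analyzer configuration can break analysis, it may be worth linting the regexp_match entries before deploying. The sketch below is a hypothetical helper, not an Anchore tool. It assumes every entry uses the NAME=regexp form shown in the examples, which may be stricter than what the analyzer itself accepts, and it assumes Python's re syntax is a reasonable stand-in for the analyzer's regexp engine.

```python
import re

# Hypothetical pre-deployment lint; not an Anchore tool.
# Assumption: entries must look like NAME=regexp with a non-empty pattern.


def lint_regexp_entries(entries: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: list[str] = []
    seen: set[str] = set()
    for entry in entries:
        name, sep, pattern = entry.partition("=")
        if not sep or not name or not pattern:
            problems.append(f"{entry!r}: expected NAME=regexp")
            continue
        if name in seen:
            problems.append(f"{name}: duplicate name")
        seen.add(name)
        try:
            re.compile(pattern)
        except re.error as exc:
            problems.append(f"{name}: invalid regexp ({exc})")
    return problems


if __name__ == "__main__":
    sample = [
        "PRIV_KEY=(?i)-+BEGIN(.*)PRIVATE KEY-+",
        "BROKEN=([A-Z",     # unterminated character set
        "EXAMPLE_MATCH=",   # placeholder with no pattern
    ]
    for problem in lint_regexp_entries(sample):
        print(problem)
```

The placeholder entry EXAMPLE_MATCH= from the commented content_search block is reported on purpose, since it carries no pattern and needs to be replaced before use.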

Malware - Disabling DB Updates

The db_update_enabled property of the malware.clamav object shown above in the analyzer_config.yaml controls whether the analyzer invokes a freshclam call prior to each analysis execution. By default, it is enabled and should be left on for up-to-date scan results. The db version is returned in the metadata section of the scan results available from the Anchore Enterprise API.

You can disable the update if you want to mount an external volume that provides the db data in /home/anchore/clamav/db inside the container (the volume must be read-write for the Anchore user). This can be used to cache or share a db across multiple analyzers (e.g. using AWS EFS) or to support air-gapped deployments where the db cannot be automatically updated from the deployment itself.

Malware - Advanced Configuration

The paths for the db and the db update configuration are also available as environment variables inside the analyzer containers. These should not be needed in most cases, but they are available for customization in air-gapped or other installations where the default configuration is not sufficient.

| Name | Description | Default |
| --- | --- | --- |
| ANCHORE_FRESHCLAM_CONFIG_FILE | Location of freshclam.conf to use | /home/anchore/clamav/freshclam.conf |
| ANCHORE_CLAMAV_DB_DIR | Location of the db dir to read/write | /home/anchore/clamav/db |
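
When the db directory comes from a mounted volume, a quick check inside the container can confirm which paths are in effect and whether the directory is readable and writable. The sketch below uses the two environment variable names and defaults listed above; the helper functions themselves are hypothetical and not part of Anchore.

```python
import os
from pathlib import Path

# Defaults are the ones listed above; the helpers are hypothetical.
DEFAULT_FRESHCLAM_CONF = "/home/anchore/clamav/freshclam.conf"
DEFAULT_DB_DIR = "/home/anchore/clamav/db"


def resolve_clamav_paths(env=None) -> dict[str, Path]:
    """Resolve both paths from the environment, falling back to the defaults."""
    env = os.environ if env is None else env
    return {
        "freshclam_conf": Path(env.get("ANCHORE_FRESHCLAM_CONFIG_FILE", DEFAULT_FRESHCLAM_CONF)),
        "db_dir": Path(env.get("ANCHORE_CLAMAV_DB_DIR", DEFAULT_DB_DIR)),
    }


def db_dir_is_usable(db_dir: Path) -> bool:
    """True if db_dir exists and the current user can read, write, and enter it."""
    return db_dir.is_dir() and os.access(db_dir, os.R_OK | os.W_OK | os.X_OK)


if __name__ == "__main__":
    paths = resolve_clamav_paths()
    print("freshclam config:", paths["freshclam_conf"])
    print("db dir:", paths["db_dir"], "usable:", db_dir_is_usable(paths["db_dir"]))
```

Run it as the same user the analyzer runs as, since the access check reflects the permissions of the calling process.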

For most cases, Anchore uses the default values for the clamscan and freshclam invocations. If you would like to override any of the default values of those commands or replace existing ones, you can add the following to the analyzer_config.yaml:

malware:
  clamav:
    clamscan_args:
      - max-filesize=1000m
      - max-scansize=1000m
    freshclam_args:
      - datadir=/tmp/different/datadir

Please note that the values above will be passed directly to the corresponding commands, e.g.:

clamscan --suppress-ok-results --infected --recursive --allmatch --archive-verbose --tempdir={tempdir} --database={database} --max-filesize=1000m --max-scansize=1000m <path_to_tar>
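
The mapping from clamscan_args entries to command-line flags can be sketched as below. This is an illustrative reconstruction, not Anchore's code: the function name is hypothetical, the base flags are copied from the example command above, and the sketch only appends the configured values (it does not model replacing an existing default).

```python
# Illustrative reconstruction; not Anchore's implementation.
# Base flags are copied from the example command above.
BASE_FLAGS = [
    "--suppress-ok-results",
    "--infected",
    "--recursive",
    "--allmatch",
    "--archive-verbose",
]


def build_clamscan_command(tempdir, database, clamscan_args, tar_path):
    """Return the argument list, prefixing each configured entry with '--'."""
    cmd = ["clamscan", *BASE_FLAGS, f"--tempdir={tempdir}", f"--database={database}"]
    cmd += [f"--{arg}" for arg in clamscan_args]
    cmd.append(tar_path)
    return cmd


if __name__ == "__main__":
    args = ["max-filesize=1000m", "max-scansize=1000m"]
    command = build_clamscan_command("/tmp/scan", "/home/anchore/clamav/db", args, "image.tar")
    print(" ".join(command))
```

Because each entry is passed through unchanged apart from the leading dashes, a typo in clamscan_args becomes an invalid clamscan option, so check new values against the clamscan documentation before deploying them.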
Last modified September 13, 2024