Image Analysis Process
There are two types of image analysis:
- Centralized Analysis
- Distributed Analysis
Image analysis is performed as a distinct, asynchronous, and scheduled
task driven by queues that analyzer workers periodically poll.
Image analysis_status states:
stateDiagram
[*] --> not_analyzed: analysis queued
not_analyzed --> analyzing: analyzer starts processing
analyzing --> analyzed: analysis completed successfully
analyzing --> analysis_failed: analysis fails
analyzing --> not_analyzed: re-queue by timeout or analyzer shutdown
analysis_failed --> not_analyzed: re-queued by user request
analyzed --> not_analyzed: re-queued for re-processing by user request
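The transitions in the diagram above can be written down as a small lookup table. The following is a minimal Python sketch for illustration only: the state names come from the diagram, while `ALLOWED_TRANSITIONS`, `can_transition`, and `apply_transitions` are hypothetical names and are not part of any Anchore API.

```python
# Allowed analysis_status transitions, copied from the state diagram above.
# All names in this sketch are illustrative, not Anchore identifiers.
ALLOWED_TRANSITIONS = {
    "not_analyzed": {"analyzing"},
    "analyzing": {"analyzed", "analysis_failed", "not_analyzed"},
    "analysis_failed": {"not_analyzed"},
    "analyzed": {"not_analyzed"},
}

# A newly queued image starts in this state.
INITIAL_STATE = "not_analyzed"


def can_transition(current, target):
    """Return True if the diagram has an edge from current to target."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def apply_transitions(targets):
    """Walk a sequence of target states starting from the initial state.

    Raises ValueError on the first step that is not an edge in the diagram.
    """
    state = INITIAL_STATE
    for target in targets:
        if not can_transition(state, target):
            raise ValueError(f"illegal transition: {state} -> {target}")
        state = target
    return state
```

For example, `apply_transitions(["analyzing", "not_analyzed", "analyzing", "analyzed"])` models an analysis that is re-queued once (timeout or analyzer shutdown) and then completes.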
Centralized Analysis
The analysis process is composed of several steps and uses several
system components. The basic flow of the task is shown in the following example:
Centralized analysis high level summary:
sequenceDiagram
participant A as AnchoreCTL
participant R as Registry
participant E as Anchore Deployment
A->>E: Request Image Analysis
E->>R: Get Image content
R-->>E: Image Content
E->>E: Analyze Image Content (Generate SBOM and secret scans etc) and store results
E->>E: Scan sbom for vulns and evaluate compliance
The analyzers operate in a task loop for analysis tasks as shown below:
Adding more detail, the API call trace between services looks similar to the following example flow:
Distributed Analysis
In distributed analysis, the analysis of image content takes place outside the Anchore deployment and the result is imported
into the deployment. The image goes through the same state machine transitions, but for an imported analysis the ‘analyzing’ state
covers only the processing of the imported data (vulnerability scanning, policy checks, etc.) to prepare it for internal use. The deployment does not download or touch any image content.
High level example with AnchoreCTL:
sequenceDiagram
participant A as AnchoreCTL
participant R as Registry/Docker Daemon
participant E as Anchore Deployment
A->>R: Get Image content
R-->>A: Image Content
A->>A: Analyze Image Content (Generate SBOM and secret scans etc)
A->>E: Import SBOM, secret search, fs metadata
E->>E: Scan sbom for vulns and evaluate compliance
Next Steps
Now let’s get familiar with Watching Images and Tags with Anchore.
1 - Malware Scanning
Overview
Anchore Enterprise now supports the use of the open-source ClamAV malware scanner to detect malicious code embedded in container images.
This scan occurs only at analysis time, when the image content itself is available. The scan results are available via the API and can be consumed
by new policy gates to allow gating of images with malware findings.
Signature DB Updates
Each analyzer service will run a malware signature update before analyzing each image. This does add some latency to the overall analysis time but ensures the signatures
are as up-to-date as possible for each image analyzed. The update behavior can be disabled if you prefer to manage the freshness of the db via another route, such as a shared filesystem
mounted to all analyzer nodes that is updated on a schedule. See the configuration section for details on disabling the db update.
The status of the db update is present in each scan output for each image.
Scan Results
The malware content type is a list of scan results. Each result is the run of a malware scanner, by default clamav.
The list of files found to contain malware signature matches is in the findings property of each scan result. An empty array value indicates no matches found.
The metadata property provides generic metadata specific to the scanner. For the ClamAV implementation, this includes the version data for the signature db used and
whether the db update was enabled during the scan. If the db update is disabled, the db_version property of the metadata will not have values, since the only way to get
the version metadata is during a db update.
{
  "content": [
    {
      "findings": [
        {
          "path": "/somebadfile",
          "signature": "Unix.Trojan.MSShellcode-40"
        },
        {
          "path": "/somedir/somepath/otherbadfile",
          "signature": "Unix.Trojan.MSShellcode-40"
        }
      ],
      "metadata": {
        "db_update_enabled": true,
        "db_version": {
          "bytecode": "331",
          "daily": "25890",
          "main": "59"
        }
      },
      "scanner": "clamav"
    }
  ],
  "content_type": "malware",
  "imageDigest": "sha256:0eb874fcad5414762a2ca5b2496db5291aad7d3b737700d05e45af43bad3ce4d"
}
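As a rough illustration of how a client might consume this document, the Python sketch below flattens the findings into rows. It assumes only the field names shown in the example above; `summarize_malware` and `EXAMPLE` are hypothetical names, and how the JSON is retrieved is left out.

```python
import json

# Trimmed copy of the example document above (second finding and digest omitted).
EXAMPLE = """
{
  "content": [
    {
      "findings": [
        {"path": "/somebadfile", "signature": "Unix.Trojan.MSShellcode-40"}
      ],
      "metadata": {
        "db_update_enabled": true,
        "db_version": {"bytecode": "331", "daily": "25890", "main": "59"}
      },
      "scanner": "clamav"
    }
  ],
  "content_type": "malware"
}
"""


def summarize_malware(content_doc):
    """Return (scanner, path, signature) tuples for every finding.

    An empty list means no scan result reported a signature match.
    """
    rows = []
    for scan in content_doc.get("content", []):
        scanner = scan.get("scanner", "unknown")
        for finding in scan.get("findings", []):
            rows.append((scanner, finding["path"], finding["signature"]))
    return rows


if __name__ == "__main__":
    for scanner, path, signature in summarize_malware(json.loads(EXAMPLE)):
        print(f"{scanner}: {path} matched {signature}")
```

Note that an empty result from this helper does not distinguish between "scanned, nothing found" and "no scan available"; a caller that cares about the difference should also check whether the content list itself is empty.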
Policy Rules
A policy gate called malware is available with two new triggers:
- The scans trigger will fire for each file and signature combination found in the image, so that you can fail an evaluation of an image if malware was detected during the analysis scans
- The scan_not_run trigger will fire if there are no malware scans (even empty) available for the image
See policy checks for more details
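To make the difference between the two triggers concrete, the Python sketch below emulates the behavior described above on the "content" list of the malware content type. It is a simplified model for illustration, not Anchore's policy engine; `malware_gate_triggers` and the tuple layout are assumptions.

```python
def malware_gate_triggers(scans):
    """Emulate which malware gate triggers would fire for a list of scan results.

    `scans` is the "content" list of the malware content type, or an empty
    list / None when no scan is available for the image.
    Returns (trigger_name, path, signature) tuples.
    """
    # No scan results at all: only scan_not_run fires.
    if not scans:
        return [("scan_not_run", None, None)]

    fired = []
    for scan in scans:
        # One "scans" match per file and signature combination.
        for finding in scan.get("findings", []):
            fired.append(("scans", finding["path"], finding["signature"]))
    # A scan that ran but found nothing fires neither trigger.
    return fired
```

In this model, a scan result with an empty findings array yields no trigger matches, while a missing scan yields a single scan_not_run match.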
2 - Content Hints
Anchore Enterprise includes the ability to read a user-supplied ‘hints’ file to allow users to add software artifacts to Anchore’s
analysis report. The hints file, if present, contains records that describe a software package’s characteristics explicitly,
and are then added to the software bill of materials (SBOM). For example, if the owner of a CI/CD container build process
knows that there are some software packages installed explicitly in a container image, but Anchore’s regular analyzers fail to identify them, this mechanism
can be used to include that information in the image’s SBOM, exactly as if the packages were discovered normally.
Hints cannot be used to modify the findings of Anchore’s analyzer beyond adding new packages to the report. If a user specifies
a package in the hints file that is found by Anchore’s image analyzers, the hint is ignored and a warning message is logged
to notify the user of the conflict.
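The add-only behavior described in this paragraph can be sketched in a few lines of Python. This is an illustration of the rule, not Anchore's implementation: `merge_hints` is a hypothetical name, and matching packages on the (name, type) pair is an assumption made for the sketch.

```python
import logging

logger = logging.getLogger("hints_sketch")


def merge_hints(discovered, hints):
    """Add hinted packages to the discovered ones without modifying any finding.

    Assumption: two records refer to the same package when their
    (name, type) pair matches.
    """
    merged = list(discovered)
    known = {(pkg["name"], pkg["type"]) for pkg in discovered}
    for hint in hints:
        key = (hint["name"], hint["type"])
        if key in known:
            # The analyzer's record wins; the hint is dropped with a warning.
            logger.warning("ignoring hint for %s (%s): already found by analyzers", *key)
            continue
        merged.append(hint)
        known.add(key)
    return merged
```

With a discovered musl record and hints for both musl and wicked, the result contains the analyzer's musl record unchanged plus the hinted wicked record.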
Configuration
See Configuring Content Hints
Once enabled, the analyzer services will look for a file with a specific name, location, and format inside the container image: /anchore_hints.json.
The format of the file is illustrated using some examples, below.
OS Package Records
OS package records represent packages installed using OS/distro-style package managers. Currently supported package types are rpm, dpkg, apkg
for RedHat, Debian, and Alpine flavored package managers, respectively. Note that, for OS packages, the name of the package is unique per SBOM, meaning
that only one package named ‘somepackage’ can exist in an image’s SBOM, and specifying a name in the hints file that conflicts with one with the same
name discovered by the Anchore analyzers will result in the record from the hints file taking precedence (override).
- Minimum required values for a package record in anchore_hints.json
{
  "name": "musl",
  "version": "1.1.20-r8",
  "type": "apkg"
}
- Complete record demonstrating all of the available characteristics of a software package that can be specified
{
  "name": "musl",
  "version": "1.1.20",
  "release": "r8",
  "origin": "Timo Ter\u00e4s <[email protected]>",
  "license": "MIT",
  "size": "61440",
  "source": "musl",
  "files": ["/lib/ld-musl-x86_64.so.1", "/lib/libc.musl-x86_64.so.1", "/lib"],
  "type": "apkg"
}
Non-OS/Language Package Records
Non-OS / language package records are similar in form to the OS package records, but with some extra or different characteristics supplied, namely
the location field. Since multiple non-OS packages with the same name can be installed, the location field is particularly important: it
is used to distinguish between package records that might otherwise be identical. Valid types for non-OS packages are currently java, python, gem, npm, nuget, go, binary.
For the latest types that are available, see the anchorectl image content <someimage> output, which lists available types for any given deployment of Anchore Enterprise.
- Minimum required values for a package record in anchore_hints.json
{
  "name": "wicked",
  "version": "0.6.1",
  "type": "gem"
}
- Complete record demonstrating all of the available characteristics of a software package that can be specified
{
  "name": "wicked",
  "version": "0.6.1",
  "location": "/app/gems/specifications/wicked-0.9.0.gemspec",
  "origin": "schneems",
  "license": "MIT",
  "source": "http://github.com/schneems/wicked",
  "files": ["README.md"],
  "type": "gem"
}
Putting it all together
Using the above examples, a complete anchore_hints.json file, which Anchore Enterprise discovers when it is located at /anchore_hints.json
inside a container image, looks like this:
{
  "packages": [
    {
      "name": "musl",
      "version": "1.1.20-r8",
      "type": "apkg"
    },
    {
      "name": "wicked",
      "version": "0.6.1",
      "type": "gem"
    }
  ]
}
With such a hints file in an image based, for example, on alpine:latest, the resulting image content would report these two package/version records
as part of the SBOM for the analyzed image. Use anchorectl image content <image> -t os and anchorectl image content <image> -t gem
to view the musl and wicked package records, respectively.
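Because the hints file is read from inside the image, it can be useful to check its basic shape before the image is built. The Python sketch below checks only what this page states: a top-level packages list, the three minimum fields, and the package types listed above. `validate_hints` is a hypothetical helper, not an Anchore tool, and the type lists may be out of date for a given deployment.

```python
import json

# Minimum fields and package types as listed on this page (assumed current).
REQUIRED_FIELDS = ("name", "version", "type")
OS_TYPES = {"rpm", "dpkg", "apkg"}
NON_OS_TYPES = {"java", "python", "gem", "npm", "nuget", "go", "binary"}


def validate_hints(doc):
    """Return a list of problems found; an empty list means the basic checks pass."""
    packages = doc.get("packages") if isinstance(doc, dict) else None
    if not isinstance(packages, list):
        return ["top-level 'packages' must be a list"]

    problems = []
    for i, pkg in enumerate(packages):
        if not isinstance(pkg, dict):
            problems.append(f"packages[{i}]: record must be an object")
            continue
        for field in REQUIRED_FIELDS:
            if not pkg.get(field):
                problems.append(f"packages[{i}]: missing '{field}'")
        pkg_type = pkg.get("type")
        if pkg_type and pkg_type not in OS_TYPES | NON_OS_TYPES:
            problems.append(f"packages[{i}]: unknown type '{pkg_type}'")
    return problems


if __name__ == "__main__":
    hints = {
        "packages": [
            {"name": "musl", "version": "1.1.20-r8", "type": "apkg"},
            {"name": "wicked", "version": "0.6.1", "type": "gem"},
        ]
    }
    issues = validate_hints(hints)
    if issues:
        raise SystemExit("\n".join(issues))
    # Write the file that the image build would copy to /anchore_hints.json.
    with open("anchore_hints.json", "w", encoding="utf-8") as fh:
        json.dump(hints, fh, indent=2)
```

Running a check like this in the build pipeline keeps malformed records out of the image; it does not replace the analyzer's own handling of the file.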
Note about using the hints file feature
The hints file feature is disabled by default, and is meant to be used in very specific circumstances where a trusted entity is responsible for creating
and installing, or removing, an anchore_hints.json file in all containers being built. It is not meant to be enabled when the container image builds
are not explicitly controlled, as the entity that is building container images could override any SBOM entry that Anchore would normally discover, which
affects the vulnerability/policy status of an image. For this reason, the feature is disabled by default and must be explicitly enabled in configuration
only if appropriate for your use case.