Skip to content

Relativize CycloneDX-JSON file component paths against --base-path#4966

Open
AgentGymLeader wants to merge 2 commits into
anchore:mainfrom
AgentGymLeader:fix/cyclonedx-json-base-path
Open

Relativize CycloneDX-JSON file component paths against --base-path#4966
AgentGymLeader wants to merge 2 commits into
anchore:mainfrom
AgentGymLeader:fix/cyclonedx-json-base-path

Conversation

@AgentGymLeader

Copy link
Copy Markdown

Description

CycloneDX-JSON file components emitted absolute host paths even when --base-path was set, while SPDX-JSON correctly emitted base-relative paths. The CycloneDX encoder set the file component Name from metadata.Path (the raw, non-chrooted indexer path), so the host prefix was never stripped.

This makes CycloneDX match SPDX. SPDX's toFiles relativizes each file via convertAbsoluteToRelative(coordinate.RealPath). I lifted that helper into syft/format/internal.ConvertAbsoluteToRelative so both encoders share one implementation (SPDX now delegates to it, with no behavior change) and used it for the CycloneDX file component name with the same error → fall-back-to-RealPath behavior.

@kzantow — thanks for the pointer on the issue. I went with the existing SPDX convertAbsoluteToRelative(RealPath) path rather than AccessPath, to keep the two encoders consistent and reuse the logic that already ships. Happy to switch the shared helper to AccessPath if you'd prefer that for the symlink-dedup cases you mentioned — since it's now shared, that would move both formats together.

Before / After (a file's CycloneDX components[].name, scanned with --base-path)

  • Before: /abs/scan/root/usr/bin/foo
  • After: usr/bin/foo (matches SPDX-JSON)

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist

  • I have added unit tests that cover changed behavior
  • I have tested my code in common scenarios and confirmed there are no regressions
  • I have added comments to my code, particularly in hard-to-understand sections

Issue references

Fixes #4592

The CycloneDX-JSON encoder set the file component name from metadata.Path
(the raw, non-chrooted indexer path), so --base-path did not strip the host
prefix the way it does for SPDX-JSON (which uses the base-relative
Coordinates.RealPath). Use the base-relative RealPath for the CycloneDX file
component name so both formats stay consistent and the SBOM no longer leaks
absolute host paths.

Fixes anchore#4592

Signed-off-by: AgentGymLeader <AgentGymLeader@users.noreply.github.com>
@AgentGymLeader

Copy link
Copy Markdown
Author

@kzantow the one thing still open is whether the shared helper should use AccessPath or stay on RealPath. Both encoders run through it now, so I can move them together either way. Let me know which you'd prefer and I'll push the update.

Comment thread syft/format/internal/relativepath.go Outdated

relPath, found := strings.CutPrefix(absPath, "/")
if !found {
return "", fmt.Errorf("error calculating relative path: %s", absPath)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function returns an error converting a relative path to a relative path?

}

cdxHashes := digestsToHashes(digests)
relativePath, err := formatinternal.ConvertAbsoluteToRelative(coordinate.RealPath)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As was noted on the issue, converting the RealPath to a relative path in this manner seems like the wrong thing to do: if I scanned /subdir and there's a symlink to ../foo, this would be resolved to the absolute path of /foo, which is a RealPath, which then would be converted to the relative path foo, but there is no /subdir/foo. As noted in another comment, this should probably be ../foo

"github.com/anchore/syft/syft/artifact"
"github.com/anchore/syft/syft/file"
formatInternal "github.com/anchore/syft/syft/format/internal"
formatinternal "github.com/anchore/syft/syft/format/internal"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we avoid these import renames?

Comment thread syft/format/internal/relativepath.go Outdated
"strings"
)

func ConvertAbsoluteToRelative(absPath string) (string, error) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be something like what filepath.Rel is doing, though I don't think we can use that function because the paths here are normalized to forwards slashes

File paths were produced by stripping a prefix, so a file whose real path
escapes the scan --base-path (e.g. via a symlink) collapsed to 'foo' instead of
the correct '../foo'. Add a forward-slash filepath.Rel-style helper and use it
when the source is a directory scan with a base set; other source types keep the
existing behavior. Also drop the unnecessary error return from the relative-path
helpers.

Signed-off-by: AgentGymLeader <AgentGymLeader@users.noreply.github.com>
@AgentGymLeader

Copy link
Copy Markdown
Author

@kzantow thanks, addressed in 6f8c004:

  • Relative path (../foo): you were right. Prefix-stripping collapsed a symlink whose real path escapes the base down to foo. I added a forward-slash equivalent of filepath.Rel that preserves .., and take the base from DirectoryMetadata.Base (the --base-path). Non-directory sources keep the existing behavior.
  • Error return: agreed it's odd to return an error converting a relative path to a relative path, so the helper now returns a plain string with no error.
  • Import rename: lined the alias up as formatinternal to match cyclonedxhelpers. Happy to drop the alias entirely and reference it directly if you'd prefer.
  • Added tests for the escape case (/root/subdir/root/foo = ../foo) and the other path shapes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CycloneDX JSON output keeps absolute paths even if --base-path is set

2 participants