added electron cataloger & bin classifier#4679
Conversation
|
@wagoodman this hasnt had any review since Mar, i'm fixing t static analysis failure, looks like its related to t new capabilities.yaml requirement, once ci is green, could someone take a first pass? electron apps are a pretty common blind spot in sboms, so i think this adds meaningful coverage |
Signed-off-by: Rez Moss <hi@rezmoss.com>
Signed-off-by: Christopher Phillips <32073428+spiffcs@users.noreply.github.com>
c6bb6a8 to
3c0fe24
Compare
|
Updated this with the new @anchore/tools for a second review to merge |
Signed-off-by: Christopher Phillips <32073428+spiffcs@users.noreply.github.com>
kzantow
left a comment
There was a problem hiding this comment.
I left some feedback / questions inline, but should note that we're going to be adding nested archive cataloging as one of the top priorities. Given that, it might be good to wait on this PR. Otherwise, just to reiterate the main things I see are:
- I assume we're missing lots of electron binaries because of only checking a couple of file names, does that matter? should the electron cataloger do a more thorough scan of all executables instead of using the binary classifier?
- we should avoid
io.ReadAllunless necessary (all/most of the readers should already be seekable) - there seems to be a lot of functionality overlap with the javascript cataloger that we should somehow consolidate, unless I've misunderstood (e.g. inconsistencies like license lookup won't happen here, but might in the javascript cataloger version)
| func parsePackageJSON(_ context.Context, _ file.Resolver, _ *generic.Environment, reader file.LocationReadCloser) ([]pkg.Package, []artifact.Relationship, error) { | ||
| defer internal.CloseAndLogError(reader, reader.Path()) | ||
|
|
||
| contents, err := io.ReadAll(reader) //nolint:gocritic // json.Unmarshal requires the full document buffered |
There was a problem hiding this comment.
The parsePackageJSONFromContents call later is just unmarshaling, can it pass the reader instead?
| newSimplePackageTaskFactory(dotnet.NewDotnetPortableExecutableCataloger, pkgcataloging.DeprecatedTag), //nolint:staticcheck // TODO: remove in syft v2.0 | ||
| newSimplePackageTaskFactory(php.NewPeclCataloger, pkgcataloging.DeprecatedTag), //nolint:staticcheck // TODO: remove in syft v2.0 | ||
| newSimplePackageTaskFactory(nix.NewStoreCataloger, pkgcataloging.DeprecatedTag), //nolint:staticcheck // TODO: remove in syft v2.0 | ||
| //nolint:staticcheck // TODO: remove in syft v2.0 |
There was a problem hiding this comment.
nit: is this change necessary? there's an unfortunate inconsistency here now
| func parseAsarArchive(_ context.Context, _ file.Resolver, _ *generic.Environment, reader file.LocationReadCloser) ([]pkg.Package, []artifact.Relationship, error) { | ||
| defer internal.CloseAndLogError(reader, reader.Path()) | ||
|
|
||
| asarData, err := io.ReadAll(reader) //nolint:gocritic // ASAR is random-access by byte offset; the full archive must be buffered |
There was a problem hiding this comment.
We could get a seekable reader (most of the stereoscope/syft readers are seekable already, so this is free), does this need to ReadAll?
| }, | ||
| { | ||
| Class: "electron-binary-macos", | ||
| FileGlob: "**/Electron Framework", |
There was a problem hiding this comment.
Are electron apps frequently named electron/Electron Framework? is this because people don't change the defaults? I wonder if this would be better served with a specialized cataloger that checks all executables in the electron cataloger package, below. This might also cover VS code if it scanned all binaries
| }, | ||
| { | ||
| // VS Code and other Electron apps on Linux that rename the binary to "code" | ||
| Class: "electron-binary-renamed-linux", |
There was a problem hiding this comment.
this should probably be called something like visual-studio-code-binary and + -windows for the windows version. These could be merged together with the glob: **/{code,Code.exe}, I think and just be called visual-studio-code-binary
| "**/resources/app/node_modules.asar", // Linux/Win VS Code style | ||
| ). | ||
| WithParserByGlobs(parsePackageJSON, | ||
| "**/Contents/Resources/app/node_modules/*/package.json", // macOS |
There was a problem hiding this comment.
I think these paths would be covered by the package json cataloger, which is not enabled by default for directory scans (I think there is a desire to identify "full OS" scans and use the same catalogers as images, so that could change at some point). But if the package json cataloger is enabled, does this result in duplicate results and we're just deduplicating later?
| continue | ||
| } | ||
|
|
||
| virtualPath := fmt.Sprintf("%s:%s", reader.Path(), pkgPath) |
There was a problem hiding this comment.
I don't think this is the right thing to do: we concat some Java nested JARs like this but these paths are only put in the Java Package metadata, not used for locations. Additionally, using the colon separator results in potential ambiguities. I'm fairly certain that Locations today only contain the path within the top-level source that was scanned, even for things we find in nested archives. From what I can tell NewVirtualLocationFromCoordinates is used to represent logical paths within the source, e.g. "/".
I am very soon planning on working on properly representing nested paths to support scanning nested archives more generically, so I hope I'm not missing some behavior we have that is different than my understanding. I wonder if we should actually wait to merge this PR until the nested archive work lands.
| Private bool `json:"private"` | ||
| } | ||
|
|
||
| func parsePackageJSONFromContents(contents []byte, location file.Location) (pkg.Package, error) { |
There was a problem hiding this comment.
It looks like this is a duplicate of what the javascript cataloger is doing and should be using some kind of shared functionality. When processing nested archives lands, assuming ASAR files can be identified and processed as archives, I think this would automatically include those records.
Description
electron cataloger & classifier
Type of change
Checklist
Issue references