Skip to content

added electron cataloger & bin classifier#4679

Open
rezmoss wants to merge 3 commits into
anchore:mainfrom
rezmoss:electron-cataloger-class
Open

added electron cataloger & bin classifier#4679
rezmoss wants to merge 3 commits into
anchore:mainfrom
rezmoss:electron-cataloger-class

Conversation

@rezmoss

@rezmoss rezmoss commented Mar 19, 2026

Copy link
Copy Markdown
Contributor

Description

electron cataloger & classifier

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (please discuss with the team first; Syft is 1.0 software and we won't accept breaking changes without going to 2.0)
  • Documentation (updates the documentation)
  • Chore (improve the developer experience, fix a test flake, etc, without changing the visible behavior of Syft)
  • Performance (make Syft run faster or use less memory, without changing visible behavior much)

Checklist

  • I have added unit tests that cover changed behavior
  • I have tested my code in common scenarios and confirmed there are no regressions
  • I have added comments to my code, particularly in hard-to-understand sections

Issue references

@rezmoss

rezmoss commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

@wagoodman this hasnt had any review since Mar, i'm fixing t static analysis failure, looks like its related to t new capabilities.yaml requirement, once ci is green, could someone take a first pass? electron apps are a pretty common blind spot in sboms, so i think this adds meaningful coverage

rezmoss and others added 2 commits June 22, 2026 11:57
Signed-off-by: Rez Moss <hi@rezmoss.com>
Signed-off-by: Christopher Phillips <32073428+spiffcs@users.noreply.github.com>
@spiffcs spiffcs force-pushed the electron-cataloger-class branch from c6bb6a8 to 3c0fe24 Compare June 22, 2026 16:12
@spiffcs

spiffcs commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Updated this with the new testdata patterns and rebased. Ty @rezmoss as always for the very quality PR and effort on this one.

@anchore/tools for a second review to merge

Signed-off-by: Christopher Phillips <32073428+spiffcs@users.noreply.github.com>

@kzantow kzantow left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some feedback / questions inline, but should note that we're going to be adding nested archive cataloging as one of the top priorities. Given that, it might be good to wait on this PR. Otherwise, just to reiterate the main things I see are:

  • I assume we're missing lots of electron binaries because of only checking a couple of file names, does that matter? should the electron cataloger do a more thorough scan of all executables instead of using the binary classifier?
  • we should avoid io.ReadAll unless necessary (all/most of the readers should already be seekable)
  • there seems to be a lot of functionality overlap with the javascript cataloger that we should somehow consolidate, unless I've misunderstood (e.g. inconsistencies like license lookup won't happen here, but might in the javascript cataloger version)

func parsePackageJSON(_ context.Context, _ file.Resolver, _ *generic.Environment, reader file.LocationReadCloser) ([]pkg.Package, []artifact.Relationship, error) {
defer internal.CloseAndLogError(reader, reader.Path())

contents, err := io.ReadAll(reader) //nolint:gocritic // json.Unmarshal requires the full document buffered

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parsePackageJSONFromContents call later is just unmarshaling, can it pass the reader instead?

newSimplePackageTaskFactory(dotnet.NewDotnetPortableExecutableCataloger, pkgcataloging.DeprecatedTag), //nolint:staticcheck // TODO: remove in syft v2.0
newSimplePackageTaskFactory(php.NewPeclCataloger, pkgcataloging.DeprecatedTag), //nolint:staticcheck // TODO: remove in syft v2.0
newSimplePackageTaskFactory(nix.NewStoreCataloger, pkgcataloging.DeprecatedTag), //nolint:staticcheck // TODO: remove in syft v2.0
//nolint:staticcheck // TODO: remove in syft v2.0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is this change necessary? there's an unfortunate inconsistency here now

func parseAsarArchive(_ context.Context, _ file.Resolver, _ *generic.Environment, reader file.LocationReadCloser) ([]pkg.Package, []artifact.Relationship, error) {
defer internal.CloseAndLogError(reader, reader.Path())

asarData, err := io.ReadAll(reader) //nolint:gocritic // ASAR is random-access by byte offset; the full archive must be buffered

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could get a seekable reader (most of the stereoscope/syft readers are seekable already, so this is free), does this need to ReadAll?

},
{
Class: "electron-binary-macos",
FileGlob: "**/Electron Framework",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are electron apps frequently named electron/Electron Framework? is this because people don't change the defaults? I wonder if this would be better served with a specialized cataloger that checks all executables in the electron cataloger package, below. This might also cover VS code if it scanned all binaries

},
{
// VS Code and other Electron apps on Linux that rename the binary to "code"
Class: "electron-binary-renamed-linux",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably be called something like visual-studio-code-binary and + -windows for the windows version. These could be merged together with the glob: **/{code,Code.exe}, I think and just be called visual-studio-code-binary

"**/resources/app/node_modules.asar", // Linux/Win VS Code style
).
WithParserByGlobs(parsePackageJSON,
"**/Contents/Resources/app/node_modules/*/package.json", // macOS

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these paths would be covered by the package json cataloger, which is not enabled by default for directory scans (I think there is a desire to identify "full OS" scans and use the same catalogers as images, so that could change at some point). But if the package json cataloger is enabled, does this result in duplicate results and we're just deduplicating later?

continue
}

virtualPath := fmt.Sprintf("%s:%s", reader.Path(), pkgPath)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right thing to do: we concat some Java nested JARs like this but these paths are only put in the Java Package metadata, not used for locations. Additionally, using the colon separator results in potential ambiguities. I'm fairly certain that Locations today only contain the path within the top-level source that was scanned, even for things we find in nested archives. From what I can tell NewVirtualLocationFromCoordinates is used to represent logical paths within the source, e.g. "/".

I am very soon planning on working on properly representing nested paths to support scanning nested archives more generically, so I hope I'm not missing some behavior we have that is different than my understanding. I wonder if we should actually wait to merge this PR until the nested archive work lands.

Private bool `json:"private"`
}

func parsePackageJSONFromContents(contents []byte, location file.Location) (pkg.Package, error) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is a duplicate of what the javascript cataloger is doing and should be using some kind of shared functionality. When processing nested archives lands, assuming ASAR files can be identified and processed as archives, I think this would automatically include those records.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants