Optimize AssemblyPatternBlock fromString method by apocalypse9949 · Pull Request #9228 · NationalSecurityAgency/ghidra

apocalypse9949 · 2026-05-27T07:34:01Z

The method split the input string into an array of substrings to parse hex bytes and called NumericUtilities.convertHexStringToMaskedValue with AtomicLong parameters, which created several intermediate strings and objects for each byte.

I optimized this method by replacing the object allocations, string splitting, and AtomicLong usage with a single-pass loop over the string that computes the byte mask and value directly.

This optimization speeds up assembly pattern block parsing from string representations and reduces unnecessary memory allocations.

I successfully ran AssemblyPatternBlockTest tests locally in the SoftwareModeling module.

Co-authored-by: apocalypse9949 <125962989+apocalypse9949@users.noreply.github.com>

nsadeveloper789

Thanks for the pull request; however, this is not ready for merge. I also have doubts about the benefits of pursuing this. Nevertheless, I've provided specific feedback for you, should you choose to continue.

nsadeveloper789 · 2026-05-27T12:40:21Z

-			mask[i] = (byte) msk.get();
-			vals[i] = (byte) val.get();
+		int p = pos;
+		while (p < str.length()) {


I'd like this to follow the form on Line 132, i.e., use a for loop.

nsadeveloper789 · 2026-05-27T12:48:57Z

+				char c = str.charAt(j);
+				m <<= 4;
+				v <<= 4;
+				if (c == 'X' || c == 'x') {


This should only match (upper) 'X'. The meaning of (lower) 'x' is one masked bit, not a nibble.

That's also telling, as support for, e.g., fromString("F[x1xx]") is dropped by this refactor. Some examples of that will come up in the SolverTest, and more in WildSleighAssemblerTest. There may be more in the various descendants of AbstractAssemblyTest (the per-ISA cases), but there are many, and I haven't examined them all. I'd recommend you run those test cases.
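To make the distinction concrete, here is a minimal sketch of a one-byte token parser that treats (upper) 'X' as a masked nibble and (lower) 'x' as a masked bit inside a bracketed group. It is not the Ghidra implementation: the class name MaskedByteSketch, the method parse, and the assumption that every bracket group holds exactly four bits are hypothetical, inferred only from the "F[x1xx]" example above.

```java
// Hypothetical sketch, not Ghidra code: parses one byte written as two
// "nibble items", where an item is a hex digit, 'X' (whole nibble masked out),
// or a bracketed group of four bits such as [x1xx] ('x' = one masked bit).
final class MaskedByteSketch {
    private MaskedByteSketch() {
    }

    /** Returns {mask, value}; a mask bit of 1 means the value bit is significant. */
    static int[] parse(String token) {
        int mask = 0;
        int value = 0;
        int bits = 0;
        int i = 0;
        while (i < token.length()) {
            char c = token.charAt(i);
            if (c == '[') {
                // Assumption: a bit group always spells out exactly four bits.
                int close = token.indexOf(']', i);
                if (close != i + 5) {
                    throw new IllegalArgumentException("Bad bit group in: " + token);
                }
                for (int j = i + 1; j < close; j++) {
                    char b = token.charAt(j);
                    mask <<= 1;
                    value <<= 1;
                    if (b == '0' || b == '1') {
                        mask |= 1;
                        value |= b - '0';
                    }
                    else if (b != 'x') {
                        throw new IllegalArgumentException("Bad bit '" + b + "' in: " + token);
                    }
                }
                i = close + 1;
            }
            else {
                mask <<= 4;
                value <<= 4;
                if (c != 'X') {
                    // Lower 'x' is not a hex digit, so it is rejected here.
                    int d = Character.digit(c, 16);
                    if (d < 0) {
                        throw new IllegalArgumentException("Bad nibble '" + c + "' in: " + token);
                    }
                    mask |= 0xF;
                    value |= d;
                }
                i++;
            }
            bits += 4;
        }
        if (bits != 8) {
            throw new IllegalArgumentException("Expected exactly one byte: " + token);
        }
        return new int[] { mask, value };
    }
}
```

Under these assumptions, parse("F[x1xx]") yields mask 0xF4 and value 0xF4, parse("XF") yields mask 0x0F and value 0x0F, and parse("xF") is rejected, because a lone lower 'x' outside a bracket group is not treated as a nibble wildcard.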

nsadeveloper789 · 2026-05-27T12:58:52Z

-		AtomicLong val = new AtomicLong();
 		int i = 0;
-		for (String hex : str.substring(pos).split(":")) {
-			NumericUtilities.convertHexStringToMaskedValue(msk, val, hex, 2, 0, null);


In general, I think this refactor sacrifices clarity for what is only a small performance improvement, and it's only realized in testing. (All call paths into fromString originate from a test method, currently.)

That said, the needless synchronization added by AtomicLongs here is noted. It turns out, this is the only call into convertHexStringToMaskedValue(), so it might be more appropriate to refactor this boundary to be more efficient. It's only ever being used to parse a byte, so you could simplify it. As for the "return values", you could choose from a few options, e.g.:

1. Use a single-element long array for each of msk and val. Very C like, not great, but acceptable.

2. Use a two-element long array, the first element for msk and the second for val.

3. Split the method into two variants (or one variant with a parameter to control what is returned), both return a long, but one returns msk and the other returns val. Saves some heap, but essentially has to parse twice.

4. Pick any of 1-3 above, but use a byte instead of a long, presuming you refactor it to parse only a byte at a time.

5. Refactor it to parse a byte at a time, and pack msk and val into a single int return value.
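As an illustration of the last option, the sketch below parses one byte at a given offset and returns msk and val packed into a single int, so no holder object is allocated. The class PackedMaskedByte, its method names, and the bit layout (mask in bits 15..8, value in bits 7..0) are assumptions made for this example; it handles only hex digits and 'X' and omits the bracketed bit notation for brevity.

```java
// Hypothetical sketch, not Ghidra code: parse two characters starting at
// `start` and pack the resulting mask and value into one int.
final class PackedMaskedByte {
    private PackedMaskedByte() {
    }

    /** Returns (mask << 8) | value for the two characters at start and start + 1. */
    static int parse(CharSequence s, int start) {
        int mask = 0;
        int value = 0;
        for (int i = start; i < start + 2; i++) {
            char c = s.charAt(i);
            mask <<= 4;
            value <<= 4;
            if (c != 'X') { // 'X' leaves the nibble masked out (mask and value stay 0)
                int d = Character.digit(c, 16);
                if (d < 0) {
                    throw new IllegalArgumentException("Bad hex digit '" + c + "' at " + i);
                }
                mask |= 0xF;
                value |= d;
            }
        }
        return (mask << 8) | value;
    }

    /** Extracts the mask byte from a packed result. */
    static byte mask(int packed) {
        return (byte) (packed >>> 8);
    }

    /** Extracts the value byte from a packed result. */
    static byte value(int packed) {
        return (byte) packed;
    }
}
```

A caller would do something like `int mv = PackedMaskedByte.parse(str, p); mask[i] = PackedMaskedByte.mask(mv); vals[i] = PackedMaskedByte.value(mv);`. The trade-off is that the packing convention has to be documented, since nothing in the int type itself says which half is which.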

As for the concern about using String.split: I don't think it's that big a cost, but it would be acceptable to use the p and newpos pattern to locate each substring needing parsing. You could then pass those as arguments into convertHexStringToMaskedValue so that you're not creating any new objects.
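One way to read the p and newpos suggestion is sketched below: walk the string with indexOf(':') and hand each [p, newpos) range to a range-based parser instead of creating substrings. The class SegmentScanSketch and its method are hypothetical, and Integer.parseInt(CharSequence, int, int, int) (available since Java 9) merely stands in for a range-aware convertHexStringToMaskedValue, so this version does not understand 'X' or bracketed bits.

```java
// Hypothetical sketch, not Ghidra code: locate ':'-separated segments by index
// and parse each range in place, without String.split or substring.
final class SegmentScanSketch {
    private SegmentScanSketch() {
    }

    /** Parses plain hex bytes separated by ':' starting at offset pos. */
    static byte[] parseHexBytes(String str, int pos) {
        // First pass: count segments so the result array can be sized up front.
        int count = 1;
        for (int i = pos; i < str.length(); i++) {
            if (str.charAt(i) == ':') {
                count++;
            }
        }
        byte[] out = new byte[count];
        int p = pos;
        for (int i = 0; i < count; i++) {
            int newpos = str.indexOf(':', p);
            if (newpos < 0) {
                newpos = str.length(); // last segment runs to the end of the string
            }
            // Stand-in for a range-based masked parser; throws
            // NumberFormatException on an empty or malformed segment.
            out[i] = (byte) Integer.parseInt(str, p, newpos, 16);
            p = newpos + 1;
        }
        return out;
    }
}
```

For example, parseHexBytes("val:12:AB:00", 4) returns the bytes 0x12, 0xAB, 0x00. The only allocation is the result array; the second scan could be avoided if the caller already knows the byte count.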

Optimize AssemblyPatternBlock fromString method

7e9a1a4

Co-authored-by: apocalypse9949 <125962989+apocalypse9949@users.noreply.github.com>

ryanmkurtz assigned nsadeveloper789 May 27, 2026

ryanmkurtz added the Feature: Assembler and Status: Triage (Information is being gathered) labels May 27, 2026

nsadeveloper789 requested changes May 27, 2026

apocalypse9949 wants to merge 1 commit into NationalSecurityAgency:master from apocalypse9949:pattern-optimization
