Optimize AssemblyPatternBlock fromString method#9228
Conversation
Co-authored-by: apocalypse9949 <125962989+apocalypse9949@users.noreply.github.com>
nsadeveloper789
left a comment
There was a problem hiding this comment.
Thanks for the pull request; however, this is not ready for merge. I also have doubts about the benefits of pursuing this. Nevertheless, I've provided specific feedback for you, should you choose to continue.
| mask[i] = (byte) msk.get(); | ||
| vals[i] = (byte) val.get(); | ||
| int p = pos; | ||
| while (p < str.length()) { |
There was a problem hiding this comment.
I'd like this to follow the form on Line 132, i.e., use a for loop.
| char c = str.charAt(j); | ||
| m <<= 4; | ||
| v <<= 4; | ||
| if (c == 'X' || c == 'x') { |
There was a problem hiding this comment.
This should only match (upper) 'X'. The meaning of (lower) 'x' is one masked bit, not a nibble.
That's also telling, as the support for, e.g., fromString("F[x1xx]") is dropped by this refactor. Some examples of that will come up in the SolverTest. More in WildSleighAssemblerTest. There may be more in the various descendants ofAbstractAssemblyTest (per-ISA) cases, but there are many, and I haven't examined them all. I'd recommend you run those test cases.
| AtomicLong val = new AtomicLong(); | ||
| int i = 0; | ||
| for (String hex : str.substring(pos).split(":")) { | ||
| NumericUtilities.convertHexStringToMaskedValue(msk, val, hex, 2, 0, null); |
There was a problem hiding this comment.
In general, I think this refactor sacrifices clarity for what is only a small performance improvement, and it's only realized in testing. (All call paths into fromString originate from a test method, currently.)
That said, the needless synchronization added by AtomicLongs here is noted. It turns out, this is the only call into convertHexStringToMaskedValue(), so it might be more appropriate to refactor this boundary to be more efficient. It's only ever being used to parse a byte, so you could simplify it. As for the "return values", you could choose from a few options, e.g.:
- Use a single-element
longarray for each ofmskandval. Very C like, not great, but acceptable. - Use a two-element
longarray, the first element formskand the second forval. - Split the method into two variants (or one variant with a parameter to control what is returned), both return a
long, but one returnsmskand the other returnsval. Saves some heap, but essentially has to parse twice. - Pick any of 1-3 above, but use a
byteinstead of along, presuming you refactor it to parse only a byte at a time. - Refactor it to parse a byte at a time, and pack
mskandvalinto a singleintreturn value.
As for the concern about using String.split: I don't think it's that big a cost, but it would be acceptable to use the p and newpos pattern to locate each substring needing parsing. You could then pass those as arguments into convertHexStringToMaskedValue so that you're not creating any new objects.
The method was splitting a string into an array of strings to parse hex bytes, calling
NumericUtilities.convertHexStringToMaskedValuewithAtomicLongparameters, which created multiple intermediate strings and objects for each byte array element.I optimized this method by replacing the object allocations, strings splitting, and
AtomicLongusage with an in-place single-pass loop over the string, computing the byte mask and value directly.This optimization speeds up assembly pattern block parsing from string representations and reduces unnecessary memory allocations.
I successfully ran
AssemblyPatternBlockTesttests locally in theSoftwareModelingmodule.