This is an archive of the discontinued LLVM Phabricator instance.

[Assignment Tracking][NFC] Use BitVectors as masks for SmallVectors rather than using DenseMaps
ClosedPublic

Authored by Orlando on Mar 8 2023, 1:54 AM.

Details

Summary

Rather than tracking 3 maps of {VariableID: SomeInfo} per block, use a BitVector indexed by VariableID to mask 3 vectors of SomeInfo.

BlockInfos now need to be initialised with a call to init which sets the BitVector width to the number of partially promoted variables in the function and fills the vectors with Top values.

Prior to this patch, in joinBlockInfo, it was necessary to insert Top values into the Join result for variables in A XOR B after joining the variables in A AND B. Now, because the vectors are pre-filled with Top values we need only join the variables A AND B and set the BitVector of tracked variables to A OR B.

The patch achieves an average of 0.25% reduction in instructions retired and a 1.1% max-rss for the CTMark suite in LTO-O3-g builds.

Diff Detail

Event Timeline

Orlando created this revision.Mar 8 2023, 1:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2023, 1:54 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
Orlando requested review of this revision.Mar 8 2023, 1:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2023, 1:54 AM

It might be worth including the rationale in the description/commit message, like you've done in other similar changes. I assume it is a performance improvement?

llvm/lib/CodeGen/AssignmentTrackingAnalysis.cpp
1001–1008

It's a matter of taste, but I find the algorithms approach easier to read. Alternatively, I think this loop can at least use the set_bits() range rather than find_first+find_next

1012–1016
1033–1036
1236–1244

I think this starts to get a little confusing, as there isn't a direct connection between the two arguments (Mask and M). Maybe they should be adjacent in the parameter list?

The only other suggestion I have is to hide all the data members in BlockInfo. There is already isVariableTracked and getDebugValue, I would just extend this to cover ever operation you need. All of the mask handling will then be adjacent and abstracted away, so the rest of the code reads easier and there is no danger of missing a mask check or update.

I'm leaving a bunch of edit suggestions, but feel free to ignore them! I was just going through the exercise of writing one possible implementation to prove to myself it is an actual improvement, YMMV.

1275–1277
1286
1290–1291
1303–1304
1323–1336
1520–1521
Orlando updated this revision to Diff 505830.Mar 16 2023, 9:10 AM
Orlando marked 10 inline comments as done.

Thank you @scott.linder for taking a look at this.

It might be worth including the rationale in the description/commit message, like you've done in other similar changes. I assume it is a performance improvement?

Yep there's an average of around 0.25% reduction in instructions retired and a 1.1% max-rss for the CTMark suite in LTO-O3-g builds (I'll update the description).

I'm leaving a bunch of edit suggestions, but feel free to ignore them! I was just going through the exercise of writing one possible implementation to prove to myself it is an actual improvement, YMMV

I appreciate the in-depth review and the suggestions all look good to me!

The diff has become quite large now so I'll could look at trying to factor out some of the suggestions into NFC patches?

Orlando edited the summary of this revision. (Show Details)Mar 16 2023, 9:11 AM
Orlando updated this revision to Diff 505832.Mar 16 2023, 9:12 AM

(this time with context)

scott.linder accepted this revision.Mar 20 2023, 12:09 PM

Thank you @scott.linder for taking a look at this.

It might be worth including the rationale in the description/commit message, like you've done in other similar changes. I assume it is a performance improvement?

Yep there's an average of around 0.25% reduction in instructions retired and a 1.1% max-rss for the CTMark suite in LTO-O3-g builds (I'll update the description).

I'm leaving a bunch of edit suggestions, but feel free to ignore them! I was just going through the exercise of writing one possible implementation to prove to myself it is an actual improvement, YMMV

I appreciate the in-depth review and the suggestions all look good to me!

The diff has become quite large now so I'll could look at trying to factor out some of the suggestions into NFC patches?

That is OK with me, however you prefer to land it! I am also OK with landing the previous version as-is, as I said I don't see any correctness issues

This revision is now accepted and ready to land.Mar 20 2023, 12:09 PM
This revision was landed with ongoing or failed builds.Mar 21 2023, 2:12 AM
This revision was automatically updated to reflect the committed changes.