This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][Falkor] Avoid HW prefetcher tag collisions (step 2)
ClosedPublic

Authored by gberry on Jul 13 2017, 10:39 AM.

Details

Summary

Avoid HW prefetcher instruction tag collisions in loops by inserting
MOVs to change the base address register of strided loads.

Diff Detail

Repository
rL LLVM

Event Timeline

gberry created this revision.Jul 13 2017, 10:39 AM
MatzeB added inline comments.Jul 16 2017, 10:26 AM
lib/Target/AArch64/AArch64FalkorHWPFFix.cpp
188 ↗(On Diff #106459)

///

214 ↗(On Diff #106459)

I generally prefer long instruction list being in AArch64InstrInfo. I think a good guide is asking yourself, what happens if someone adds a new Load/Store instruction: does this code need to be adapted and will the person adding the load/store be able to find this code?

AArch64InstrInfo, also already has various getMemXXX functions, at a first glance it looks like code duplication. Did you check whether this code could use the existing functions or the existing functions could be rewritten using this code?

684 ↗(On Diff #106459)

It looks like you only need the scavenger to find free registers. In this case I'd recommend using the slightly leaner/simpler LiveRegUnits class instead:

  • addLiveOuts() correctly handles pristines registers already.
  • You can iterate over the register class directly testing

available(Reg) && !MRI.isReserved(Reg) which avoids constructing intermediate bitsets (and hence saves you some malloc/free operations).

685–687 ↗(On Diff #106459)

Less auto please.

test/CodeGen/AArch64/falkor-hwpf-fix.mir
30 ↗(On Diff #106459)

I think this should be surrounded by ---, ...; looks like the llvm yaml parser is lenient.

gberry marked 3 inline comments as done.Jul 17 2017, 8:11 PM
gberry added inline comments.
lib/Target/AArch64/AArch64FalkorHWPFFix.cpp
214 ↗(On Diff #106459)

I considered this, but some of the info computed by this function is specific to the Falkor HW prefetcher tag implementation, so I figured if I'm going to need a giant switch op opcodes, I might as well compute everything I need to know in one place, even though some of it is a bit redundant.

To your point about adding instructions in the future: I don't think that applies in this case since the opcodes supported by Falkor are not going to change going forward (unless we add more pseudo load/store instructions I guess, in which case those would hopefully have been expanded by the time this pass runs).

685–687 ↗(On Diff #106459)

I got rid of the auto on 'MBB', but left it on the iterator since that seems fairly standard, and the variable is only used on the next line (which doesn't use 'auto').

gberry updated this revision to Diff 107014.Jul 17 2017, 8:12 PM

Update to address Matthias' comments

MatzeB accepted this revision.Jul 17 2017, 8:38 PM

LGTM

lib/Target/AArch64/AArch64FalkorHWPFFix.cpp
650–651 ↗(On Diff #107014)

Could move the declarations to the assignments.

214 ↗(On Diff #106459)

ok, makes sense.

This revision is now accepted and ready to land.Jul 17 2017, 8:38 PM
This revision was automatically updated to reflect the committed changes.