This is an archive of the discontinued LLVM Phabricator instance.

[ARM][ParallelDSP] Relax alias checks
ClosedPublic

Authored by samparker on Apr 23 2019, 8:25 AM.

Details

Summary

When deciding the safety of generating smlad, we checked for any writes within the block that may alias with any of the loads that need to be widened. This is overly conservative because it only matters when there's a potential aliasing write to a location accessed by a pair of loads.

Now we check for aliasing writes only once, during setup. If two loads are found to have an aliasing write between them, we don't add these loads to LoadPairs. This means that later during the transform, we can safely widened a pair without worrying about aliasing.

However, to maintain correctness, we also need to change the way that wide loads are inserted because the order is now important.

The MatchSMLAD method has also been changed, absorbing MatchReductions and AddMACCandidate to hopefully improve readability.

Diff Detail

Event Timeline

samparker created this revision.Apr 23 2019, 8:25 AM
SjoerdMeijer added inline comments.May 3 2019, 1:52 AM
lib/Target/ARM/ARMParallelDSP.cpp
387

Functions CheckRAWDeps and CheckWARDeps are the same, just the order of Base and Offset are different, and whether WarDeps or RawDeps are queried, but this could be passed in as an argument.

415

Can you elaborate on what this function is doing? My first reaction would be, given that we are looking at 2 loads, how come are we looking for RAW dependencies? Perhaps the function name is a bit misleading?

samparker marked 2 inline comments as done.May 3 2019, 2:13 AM
samparker added inline comments.
lib/Target/ARM/ARMParallelDSP.cpp
387

I will have a look into changing these.

415

Yeah, it is... even looking at it myself I'm getting confused - not a great sign! First we check is the dependency sets are the same, which means there's no write between the loads. The next checks allow the sets not to be equal as long as the dominating load is the only one with RAW dependency. This means we can safely schedule the write and then the combined base + offset pair.

samparker updated this revision to Diff 198450.May 7 2019, 5:47 AM

Greatly simplified the alias checks after they didn't seem to make much sense...

SjoerdMeijer added inline comments.May 8 2019, 2:06 AM
lib/Target/ARM/ARMParallelDSP.cpp
405

I think I am struggling with the algorithm here. Here, and in the line below, we get the "write sets" of the 2 loads. We then start iterating over them , and if a write does not occur in both sets then we say it is safe to pair. But is this enough?

samparker marked an inline comment as done.May 8 2019, 4:49 AM
samparker added inline comments.
lib/Target/ARM/ARMParallelDSP.cpp
405

Good point! No... we also need to check that it's possible to either move the first load forward or the second load backward.

samparker updated this revision to Diff 198804.May 9 2019, 5:58 AM

When we create a wide load, we pull the second load up to the first. So for aliasing worries, we only need to check whether there's a write between the two loads that would prevent us from hoisting the second load.

SjoerdMeijer accepted this revision.May 10 2019, 6:05 AM

This looks okay to me; just some nits inline.

lib/Target/ARM/ARMParallelDSP.cpp
323

nit: instruction -> instructions

344

nit: I don't think this condition can be true. The loads/reads are Simple loads (not atomics), and the only writes that can be loads are loads with atomic ordering.

368

nit: newline

618

Not even sure if it is possible, but looking at this condition I am wondering about i128s.

675

Don't think we need to check for hasOneUse again here; we only add loads with one use to the list. Asserts for BaseSExt and OffetSext might be useful.

This revision is now accepted and ready to land.May 10 2019, 6:05 AM
samparker marked 3 inline comments as done.May 13 2019, 2:01 AM
samparker added inline comments.
lib/Target/ARM/ARMParallelDSP.cpp
344

Makes sense, good catch!

618

The IR supports arbitrary bit widths, but we only support 32- and 64-bit macs.

675

Cheers, I'll remove the check - and we are asserting on SExt!

samparker closed this revision.May 17 2019, 2:20 AM

Committed in rL360567.

test/CodeGen/ARM/ParallelDSP/smlad11.ll