Page MenuHomePhabricator

[AMDGPU] Fix crash in SILoadStoreOptimizer
ClosedPublic

Authored by rampitec on Apr 1 2020, 3:07 PM.

Details

Summary

SILoadStoreOptimizer::checkAndPrepareMerge() expects base and
paired instruction to come in order and scans MBB from base to
the paired instruction. An original order can be changed if
there were a dependent instruction in between and base instruction
was moved.

Fixed by bailing the optimization. In theory it might be possible
still to perform a merge by swapping instructions, but on practice
it bails anyway because it finds dependency on that same instruction
which has resulted in the base move.

Diff Detail

Event Timeline

rampitec created this revision.Apr 1 2020, 3:07 PM

Related to D75741?

Probably it is the same bug. I also think this is a very rare situation which is not worth all the machinery around it, but the crash still needs to be fixed.

arsenm accepted this revision.Apr 2 2020, 7:36 AM

LGTM, but an additional MIR test wouldn't hurt

llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
893

s/times/time

This revision is now accepted and ready to land.Apr 2 2020, 7:36 AM

My fix has been hanging around for a little while now - so my memory of it is a little hazy.

However, the issue I saw (and attempted to fix) was similar to yours. Agreed that I think this is a rare occurrence, so perhaps your approach makes more sense (and is simpler/cheaper).
One question - the reason I went with the solution I did was that a previous iteration of the loop doing the optimisation was invalidating the iterator stored in a later list - which caused a crash - is your fix working around this by avoiding creating those potentially dependent lists in the first place?? (I wasn't 100% sure about the logic).

I guess one potential advantage with my approach is that you still get the potential optimisations at the expense of having to re-compute the lists for dependent instructions.

hliao added a subscriber: hliao.Apr 2 2020, 8:27 AM

My fix has been hanging around for a little while now - so my memory of it is a little hazy.

However, the issue I saw (and attempted to fix) was similar to yours. Agreed that I think this is a rare occurrence, so perhaps your approach makes more sense (and is simpler/cheaper).
One question - the reason I went with the solution I did was that a previous iteration of the loop doing the optimisation was invalidating the iterator stored in a later list - which caused a crash - is your fix working around this by avoiding creating those potentially dependent lists in the first place?? (I wasn't 100% sure about the logic).

I guess one potential advantage with my approach is that you still get the potential optimisations at the expense of having to re-compute the lists for dependent instructions.

The current implementation always *sink* instructions no matter whether it's a load or store. That makes the instruction movement inevitable. But, if we *lift* loads and sink stores, no instructions (or very few address calculation) need moving as uses of loads are still preceded by their loads.

My fix has been hanging around for a little while now - so my memory of it is a little hazy.

However, the issue I saw (and attempted to fix) was similar to yours. Agreed that I think this is a rare occurrence, so perhaps your approach makes more sense (and is simpler/cheaper).
One question - the reason I went with the solution I did was that a previous iteration of the loop doing the optimisation was invalidating the iterator stored in a later list - which caused a crash - is your fix working around this by avoiding creating those potentially dependent lists in the first place?? (I wasn't 100% sure about the logic).

I guess one potential advantage with my approach is that you still get the potential optimisations at the expense of having to re-compute the lists for dependent instructions.

It sounds like related but not exactly the same issue. In my case iterators are valid, but the order is swapped.

In general the idea to collect all merge lists in advance seems problematic. We may do a better job collecting one list at a time. It is also simpler, although slower.

rampitec updated this revision to Diff 254557.Apr 2 2020, 10:10 AM
rampitec marked an inline comment as done.

Fixed typo in the comment.
Added mir test.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 2 2020, 10:50 AM