This is an archive of the discontinued LLVM Phabricator instance.

[X86] Fix assert fails in pass X86AvoidSFBPass
ClosedPublic

Authored by pengfei on Dec 12 2018, 7:47 PM.

Details

Summary

Fixes https://bugs.llvm.org/show_bug.cgi?id=38743

The function removeRedundantBlockingStores is supposed to remove any blocking stores contained in each other in lockingStoresDispSizeMap.
But it currently looks only at the previous one, which will miss some cases that result in assert.

This patch refine the function to check all previous layouts until find the uncontained one. So all redundant stores will be removed.

Diff Detail

Event Timeline

pengfei created this revision.Dec 12 2018, 7:47 PM
pengfei edited the summary of this revision. (Show Details)Dec 12 2018, 7:50 PM
nikic added a subscriber: nikic.Dec 13 2018, 4:15 AM
nikic added inline comments.
lib/Target/X86/X86AvoidStoreForwardingBlocks.cpp
646

This can be a range-based for loop now.

test/CodeGen/X86/pr38743.ll
1

Could you please also include the generated assembly? This can be done by running utils/update_llc_test_checks.py test/CodeGen/X86/pr38743.ll. Possibly with a --llc-bin build/bin/llc argument, or similar.

pengfei updated this revision to Diff 178041.Dec 13 2018, 5:50 AM

Thanks for nikic's comments

pengfei marked 2 inline comments as done.Dec 13 2018, 5:52 AM
lsaba added a comment.Dec 13 2018, 9:59 AM

no comments from my side (LGTM)

I think your current test case has been invalidated by D55365. In the current form, it does not assert anymore, even without this fix. Would it be possible to tweak the test case to reproduce the assertion failure, possibly by writing out some memcpys in terms of load/stores in a way that resembled the previous lowering?

pengfei updated this revision to Diff 178192.Dec 14 2018, 12:23 AM
pengfei added a reviewer: nikic.

Thanks Nikita! Otherwise I will provide an invalid test.

nikic accepted this revision.Dec 14 2018, 11:36 AM

LGTM, with a small nit.

In general, I'm not sure whether the overall behavior here makes sense. Especially in the case where the blocking stores don't come from different BBs, it would make more sense to me to preserve the last (in program order) of the overlapping stores, rather than the smallest, as that's the store that will be forwarded. If the stores come from multiple predecessors (like here) it's less clear which choice is right.

But in any case, that would require larger changes and shouldn't block this assertion fix.

lib/Target/X86/X86AvoidStoreForwardingBlocks.cpp
649–654

I don't think the const on a non-pointer here is useful.

This revision is now accepted and ready to land.Dec 14 2018, 11:36 AM
pengfei updated this revision to Diff 178440.Dec 17 2018, 4:13 AM
pengfei marked an inline comment as done.

Remove useless const.

@nikic Thank you very much for your auditing and suggestions.
The test here was just trimmed from the bug repro and used to verify the fix to the assertion. However it exposed that it may be better if it just optimized inside the BB in some cases. We will reserach it later.

@craig.topper Can you help me to submit this patch? Thanks a lot.

lsaba added a comment.Dec 18 2018, 7:09 AM

LGTM, with a small nit.

In general, I'm not sure whether the overall behavior here makes sense. Especially in the case where the blocking stores don't come from different BBs, it would make more sense to me to preserve the last (in program order) of the overlapping stores, rather than the smallest, as that's the store that will be forwarded. If the stores come from multiple predecessors (like here) it's less clear which choice is right.

But in any case, that would require larger changes and shouldn't block this assertion fix.

The idea behind using the smallest blocking store was to not introduce a new SFB between the smallest store and the newly created loads which are still bigger than it (since they are the size of the latest blocking store), I agree that improvements can be made to this patch especially for identifying and handling the behavior of blocking stores that come from predecessing blocks.

This revision was automatically updated to reflect the committed changes.