This is an archive of the discontinued LLVM Phabricator instance.

[VE] Add regression test for D91151
Changes PlannedPublic

Authored by kaz7 on Nov 19 2020, 4:10 AM.

Details

Summary

Add regression test for D91151 regarding to a request. Using this test,
we can inspect the reason of D91151 problem.

Diff Detail

Unit TestsFailed

Event Timeline

kaz7 created this revision.Nov 19 2020, 4:10 AM
kaz7 requested review of this revision.Nov 19 2020, 4:10 AM
kaz7 added a comment.Nov 19 2020, 4:16 AM

Hi, @dblaikie. I generated a test case which causes a segmentation fault if we didn't apply D91151 (https://reviews.llvm.org/D91151) following your suggestion. I appreciate if you have more suggestions.

This test is generated from openmp/runtime/src/kmp_cancel.cpp. I removed several functions which don't cause problems. But it is still kind of big test.

Hi, @dblaikie. I generated a test case which causes a segmentation fault if we didn't apply D91151 (https://reviews.llvm.org/D91151) following your suggestion. I appreciate if you have more suggestions.

Were you testing with assertions enabled? I'd expect a test case to cause an assertion failure (when the "cast" was applied to an object that wasn't the intended type) rather than a segmentation fault.

kaz7 added a comment.Nov 19 2020, 1:08 PM

Hi, @dblaikie. I generated a test case which causes a segmentation fault if we didn't apply D91151 (https://reviews.llvm.org/D91151) following your suggestion. I appreciate if you have more suggestions.

Were you testing with assertions enabled? I'd expect a test case to cause an assertion failure (when the "cast" was applied to an object that wasn't the intended type) rather than a segmentation fault.

There is no assertions. Original code used cast<ConstantSDNode> and caused segmentation fault. I modified it to dyn_cast<...> in D91151 to avoid this segmentation fault. Then, you asked me a test case to show the segmentation fault actually happens. Therefore, I add it this time.

You suggested assert before the cast this time. However, this cast causes segmentation fault on real programs. It means adding assertion before the cast will simply cause assertion fault on real programs. I don't want to do that...

Hi, @dblaikie. I generated a test case which causes a segmentation fault if we didn't apply D91151 (https://reviews.llvm.org/D91151) following your suggestion. I appreciate if you have more suggestions.

Were you testing with assertions enabled? I'd expect a test case to cause an assertion failure (when the "cast" was applied to an object that wasn't the intended type) rather than a segmentation fault.

There is no assertions.

'cast' has an assertion built into it if the cast is not valid ( https://github.com/llvm-mirror/llvm/blob/master/include/llvm/Support/Casting.h#L250 ). If this dyn_cast has any effect (ie: if any execution of the program has different behavior with the dyn_cast compared to the cast) it should be true that the cast version of the code would've triggered an assertion on that same program.

Are you building/running the code with assertions enabled? (I'm not asking you to add a new assertion to the code - I'm asking if you have assertions enabled in your build so you'd see an assertion before/rather than a segmentation fault)

with assertions enabled/if the old code asserting, you may be able to simplify the test case down much further, as the assertion will ensure a more reliable/fast failure rather than some unspecified failure due to undefined behavior in the execution later on/elsewhere.

kaz7 added a comment.Nov 19 2020, 1:33 PM

There is no assertions.

'cast' has an assertion built into it if the cast is not valid ( https://github.com/llvm-mirror/llvm/blob/master/include/llvm/Support/Casting.h#L250 ). If this dyn_cast has any effect (ie: if any execution of the program has different behavior with the dyn_cast compared to the cast) it should be true that the cast version of the code would've triggered an assertion on that same program.

Are you building/running the code with assertions enabled? (I'm not asking you to add a new assertion to the code - I'm asking if you have assertions enabled in your build so you'd see an assertion before/rather than a segmentation fault)

I see. I misunderstood English. Thank you for correcting it.

with assertions enabled/if the old code asserting, you may be able to simplify the test case down much further, as the assertion will ensure a more reliable/fast failure rather than some unspecified failure due to undefined behavior in the execution later on/elsewhere.

That's make sens. However, cast mechanism is compilcated and I couldn't find any good way to simplify this test case. Anyway, this is an assertion message and some back trace from old code.

llc: /home/jam/llvm-upstream/llvm-project/llvm/include/llvm/Support/Casting.h:262: typename llvm::cast_retty<To, From>::ret_type llvm::cast(Y&) [with X = llvm::ConstantSDNode; Y = llvm::SDValue; typename llvm::cast_retty<To, From>::ret_type = llvm::ConstantSDNode*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
...
 #6 0x00007fbf5a26b859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x25859)
 #7 0x00007fbf5a26b729 (/lib/x86_64-linux-gnu/libc.so.6+0x25729)
 #8 0x00007fbf5a27cf36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
 #9 0x0000559f57608668 llvm::cast_retty<llvm::ConstantSDNode, llvm::SDValue>::ret_type llvm::cast<llvm::ConstantSDNode, llvm::SDValue>(llvm::SDValue&) /home/jam/llvm-upstream/llvm-project/llvm/include/llvm/Support/Casting.h:262:3
#10 0x0000559f5792a5af (anonymous namespace)::VEDAGToDAGISel::selectADDRzii(llvm::SDValue, llvm::SDValue&, llvm::SDValue&, llvm::SDValue&) /home/jam/llvm-upstream/llvm-project/llvm/lib/Target/VE/VEISelDAGToDAG.cpp:230:38
#11 0x0000559f5792965d (anonymous namespace)::VEDAGToDAGISel::CheckComplexPattern(llvm::SDNode*, llvm::SDNode*, llvm::SDValue, unsigned int, llvm::SmallVectorImpl<std::pair<llvm::SDValue, llvm::SDNode*> >&) /home/jam/llvm-upstream/build-debug/lib/Target/VE/VEGenDAGISel.inc:11266:100

with assertions enabled/if the old code asserting, you may be able to simplify the test case down much further, as the assertion will ensure a more reliable/fast failure rather than some unspecified failure due to undefined behavior in the execution later on/elsewhere.

That's make sens. However, cast mechanism is compilcated and I couldn't find any good way to simplify this test case. Anyway, this is an assertion message and some back trace from old code.

llc: /home/jam/llvm-upstream/llvm-project/llvm/include/llvm/Support/Casting.h:262: typename llvm::cast_retty<To, From>::ret_type llvm::cast(Y&) [with X = llvm::ConstantSDNode; Y = llvm::SDValue; typename llvm::cast_retty<To, From>::ret_type = llvm::ConstantSDNode*]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
...
 #6 0x00007fbf5a26b859 abort (/lib/x86_64-linux-gnu/libc.so.6+0x25859)
 #7 0x00007fbf5a26b729 (/lib/x86_64-linux-gnu/libc.so.6+0x25729)
 #8 0x00007fbf5a27cf36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
 #9 0x0000559f57608668 llvm::cast_retty<llvm::ConstantSDNode, llvm::SDValue>::ret_type llvm::cast<llvm::ConstantSDNode, llvm::SDValue>(llvm::SDValue&) /home/jam/llvm-upstream/llvm-project/llvm/include/llvm/Support/Casting.h:262:3
#10 0x0000559f5792a5af (anonymous namespace)::VEDAGToDAGISel::selectADDRzii(llvm::SDValue, llvm::SDValue&, llvm::SDValue&, llvm::SDValue&) /home/jam/llvm-upstream/llvm-project/llvm/lib/Target/VE/VEISelDAGToDAG.cpp:230:38
#11 0x0000559f5792965d (anonymous namespace)::VEDAGToDAGISel::CheckComplexPattern(llvm::SDNode*, llvm::SDNode*, llvm::SDValue, unsigned int, llvm::SmallVectorImpl<std::pair<llvm::SDValue, llvm::SDNode*> >&) /home/jam/llvm-upstream/build-debug/lib/Target/VE/VEGenDAGISel.inc:11266:100

With the use of CReduce & some manual reduction, I managed to get the test down to this:

define i32 @a(i32* %b) {
if.c:
        %d = load atomic i32, i32* %b monotonic, align 4
        switch i32 %d, label %e.default [
                i32 1, label %e.bb
                i32 2, label %e.f
                i32 4, label %e.g
                i32 0, label %if.end
        ]
e.bb:
        br label %if.end
e.f:
        br label %if.end
e.g:
        br label %if.end
e.default:
        br label %if.end
if.end:
        %ret.0 = phi i32
                [ 0, %e.default ],
                [ 6, %if.c ],
                [ 0, %e.g ],
                [ 1, %e.f ],
                [ 1, %e.bb ]
        ret i32 %ret.0
}

Though it'd be good if you could check this over and ensure this test makes sense for the failure - and see if there's any domain-specific knowledge you could apply to change if it could be modified/simplified/be more explicit about how/why this code is what's needed to reach the buggy case.

kaz7 added a comment.Nov 20 2020, 4:53 AM

With the use of CReduce & some manual reduction, I managed to get the test down to this:

Thank you so much. I wanted simplify this test like your example. I really appreciate that you show it. I also understand that I definitely need to try creduce and other tools.

Though it'd be good if you could check this over and ensure this test makes sense for the failure - and see if there's any domain-specific knowledge you could apply to change if it could be modified/simplified/be more explicit about how/why this code is what's needed to reach the buggy case.

I will do that. Your test case causes the identical problem. I can inspect it now since it is much simplified. Thanks so mcuh.

kaz7 updated this revision to Diff 306824.Nov 20 2020, 9:22 PM

Update regression test and rebase it. Need to inspect behavior of VEISelDAGToDAG.

kaz7 planned changes to this revision.Nov 20 2020, 9:22 PM
kaz7 added a comment.EditedDec 4 2020, 4:50 PM

I've been inspecting why this problem happens iff the case of atomic_load_32 and switch. This problem doesn't happen if we have only atomic_load_32 here. The reason of this is that our implementation. We have optimizing ISel patterns for (i64 (zext (atomic_load_32 ...))) and (i64 (sext (atomic_load_32 ...))), but not for (i64 (anyext (atomic_load_32 ...))). This is the reason why this problem happens iff we have switch here. This switch is converted to br_cc and a jump-table. Our implementation of br_cc requires zext to i64 here (This is another point for optimization in future).

Then, we have following patterns. Why does this problem happen iff we have atomic_load_32 and zext?

def: Pat<(atomic_load_8 ADDRrri, ...), ...>;
def: Pat<(atomic_load_8 ADDRzii, ...), ...>;   <-- problem doesn't happen here
def: Pat<(atomic_load_32 ADDRrri, ...), ...>;  <-- This matches for `(anyext (atomic_load_32 ...))`
def: Pat<(atomic_load_32 ADDRzii, ...), ...>;
def: Pat<(i64 (zext (atomic_load_8 ADDRrri, ...))), ...>;
def: Pat<(i64 (zext (atomic_load_8 ADDRzii, ...))), ...>;  <-- problem happens here
def: Pat<(i64 (zext (atomic_load_32 ADDRrri, ...))), ...>;
def: Pat<(i64 (zext (atomic_load_32 ADDRzii, ...))), ...>;  <-- This will match for `(zext (atomic_load_32 ...))`

This is not clear for me, but ISel works like this: if ISel is searching matching pattern for (atomic_load_32 ...), ISel skips patterns for atomic_load_8 completely. However, if ISel is searching for (i64 (zext (atomic_load_32 ...))), ISel inspect (i64 (zext (atomic_load_8 ...))) patterns throughly and hit (i64 (zext (atomic_load_8 ADDRzii, ...))) pattern. This cause the problem.

As a conclusion, I can say this regression test is enoughly simplified and it shows an essential of this problem. So, I'd like to apply this patch and use it to detect future problems.

Thank you very much, @dblaikie, to help us!

kaz7 edited the summary of this revision. (Show Details)Dec 4 2020, 4:51 PM
kaz7 updated this revision to Diff 309699.Dec 4 2020, 4:52 PM

Rebase to master.

kaz7 planned changes to this revision.EditedDec 4 2020, 5:18 PM

Hmm, I try to simplify this test to (i64 (zext (atomic_load_32 ...))). But, this avoids the problem, unfortunately. I need to inspect deeper. Chain is doing something bad for the case of load_atomic_32 and switch.