Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
To handle the various undef cases this ended up being a lot bulkier than I hoped - I'd love to hear if anyone has any suggestions on how we could use ConstantExpr/PatternMatch to simplify this as I imagine we could use such functionality in other places as we improve non-uniform vector support.
Would something along the lines of ConstantExpr::getAdd(LC, RC) and then matching the result against a Width splat work? As C + undef already gets folded to undef.
I did wonder if a ConstantExpr::isUndef() help could be added to return a bool/vectorbool that could be then used to select the 'matching' values (e.g. undef + 1 -> 31 + 1 for i32 rotate).
I've removed the undef handling from this patch - we can revisit that later once we have a better idea of how to achieve it (see D88687).
It is a miscompile--or, at least a fingerprint is being miscomputed. The failure check is here:
https://boringssl.googlesource.com/boringssl/+/refs/heads/master/crypto/hrss/hrss_test.cc#165
But that is obviously somewhat removed from where things actually went wrong, which is what I'm (rather slowly, unfortunately) trying to figure out.
I think there were some related fixes afterwards. Is the issue still present in master?
The powerpc64le issue is still present as of 51ff04567b2f8d06b2062bd3ed72eab2e93e4466.
Adding @nemanjai in case he can find anything obvious from this commit or D88834
Here is a relatively simple, self-contained testcase. Of interest is that the entire loop needs to be present, and completely unwound for this bug to show. It looks to me like either next_carry or carry is mis-calculated or stale on the ninth iteration which fills the second-to-last element.
Failed Correct Actual 0: c71c71c71c71c9fc c71c71c71c71c9fc 1: 5b91f6e791f6e791 5b91f6e791f6e791 2: 6000000000000005 6000000000000005 3: 9e79e79e79e79ed8 9e79e79e79e79ed8 4: 4bf2c78af2c7816e 4bf2c78af2c7816e 5: a53d3d3d3d3d3d3d a53d3d3d3d3d3d3d 6: 7555555555555558 7555555555555558 7: cf3cf3cf3cf3cfcd cf3cf3cf3cf3cfcd 8: d3cfa5effb5effcf d3cfa5effb5effcf 9: 0000000000000000 0000000000000001 10: 0000000000000000 0000000000000000