This change fixes a regression in r279537 and
makes getRawSubclassData behave like r279536.
Without this change, the fp128-g.ll test case will have an
infinite loop involving SoftenFloatRes_LOAD.
Details
Diff Detail
Event Timeline
include/llvm/CodeGen/SelectionDAGNodes.h | ||
---|---|---|
1142 | s/new node/new nodes/ But probably better would be simply "has to match nodes with debug info with nodes without debug info". | |
1152 | I'll have to defer to Chandler on whether or not this is legal type punning, although it strikes me as probably not? | |
test/CodeGen/X86/fp128-g.ll | ||
2 | Please add a comment in this file indicating what it's testing. This is important so that someone without context to this thread who hits a failure in this file can understand what is going on. | |
2 | This seems like a lot of testcase for this bug. If all of these tests are necessary to cover cases here, please add comments explaining why they're different. |
include/llvm/CodeGen/SelectionDAGNodes.h | ||
---|---|---|
1152 | Worse, I think this will get the wrong answer on big endian machines.... Anyways, I think you want to do two memcpy's here: first into a union that looks the same the class one containing RawSDNodeBits, and use that to clear the HasDebugValue bit, and then from your local copy into the uint16_t that you return. |
Could we do it without two memcpy?
I am not sure what could go wrong on big endian machines.
Before this change, the memcpy gave Data different values on big and little endian machines. That was acceptable, right?
As long as the anonymous union injects names to a local stack the same way as to a struct, we should get/set the same HasDebugValue bit before and after memcpy. To make it completely independent from local stack, I can declare a local struct like the following to enclose the anonymous union. Wouldn't the following code return the same uint16 value before and after change on a big endian machine?
unsigned getRawSubclassData() const { - uint16_t Data; - memcpy(&Data, &RawSDNodeBits, sizeof(RawSDNodeBits)); - return Data; + struct SDNodeBitsContainer { + union { + uint16_t Data; + char RawSDNodeBits[sizeof(uint16_t)]; + SDNodeBitfields SDNodeBits; + }; + } tmp; + memcpy(&tmp.Data, &RawSDNodeBits, sizeof(RawSDNodeBits)); + tmp.SDNodeBits.HasDebugValue = 0; + return tmp.Data; }
Why? They're not going to end up as runtime calls to memcpy...
I am not sure what could go wrong on big endian machines.
Before this change, the memcpy gave Data different values on big and little endian machines. That was acceptable, right?
Yes, that is fine. I also think your proposed patch is OK as well on second thought.
As long as the anonymous union injects names to a local stack the same way as to a struct, we should get/set the same HasDebugValue bit before and after memcpy. To make it completely independent from local stack,
My concern wasn't about stack. I was worried about which bit you end up setting to clear the debug value, but even though the target of the first memcpy is a uint16_t, because it is a memcpy you get the bytes in the right order.
Anyways, I still think it would be much more clear to have a local union that mirrors the one in the class to fix a bit, and then copy out to a uint16_t that you return.
Please take a look of new Diff 78889. It changes more lines, but should make the union type safer and easier to access. Thanks.
include/llvm/CodeGen/SelectionDAGNodes.h | ||
---|---|---|
476–494 | This change seems a really big change and completely outside of the scope I suggested. I don't particularly like it (it seems to make things less clear) and I'm not sure what motivates it. Could you instead try the approach I suggested? |
include/llvm/CodeGen/SelectionDAGNodes.h | ||
---|---|---|
476–494 | I thought this larger change would meet your previous suggestion to It hit me that the trouble in mirroring the structure was anonymous union, Or, could you show me what to change the following version in my previous comment? unsigned getRawSubclassData() const { struct SDNodeBitsContainer { union { uint16_t Data; char RawSDNodeBits[sizeof(uint16_t)]; SDNodeBitfields SDNodeBits; }; } tmp; memcpy(&tmp.Data, &RawSDNodeBits, sizeof(RawSDNodeBits)); tmp.SDNodeBits.HasDebugValue = 0; return tmp.Data; } |
Chandler,
Stephen and I guessed that the following diff is what you want.
--- include/llvm/CodeGen/SelectionDAGNodes.h (revision 287585) +++ include/llvm/CodeGen/SelectionDAGNodes.h (working copy) @@ -415,6 +415,7 @@ class SDNodeBitfields { friend class SDNode; friend class MemIntrinsicSDNode; + friend class MemSDNode; uint16_t HasDebugValue : 1; uint16_t IsMemIntrinsic : 1; @@ -1134,11 +1135,19 @@ return MMO->getAlignment(); } - /// Return the SubclassData value, which contains an + /// Return the SubclassData value, without HasDebugValue. This contains an /// encoding of the volatile flag, as well as bits used by subclasses. This /// function should only be used to compute a FoldingSetNodeID value. + /// The HasDebugValue bit is masked out because CSE map needs to match + /// nodes with debug info with nodes without debug info. unsigned getRawSubclassData() const { uint16_t Data; + union { + char RawSDNodeBits[sizeof(uint16_t)]; + SDNodeBitfields SDNodeBits; + }; + memcpy(&RawSDNodeBits, &this->RawSDNodeBits, sizeof(this->RawSDNodeBits)); + SDNodeBits.HasDebugValue = 0; memcpy(&Data, &RawSDNodeBits, sizeof(RawSDNodeBits)); return Data; }
Assuming the memcpy will be optimized as much as possible,
it should generate similar code to my 2nd diff in
https://reviews.llvm.org/D26942?id=78800
Please let us know which version you like or tell us the exact code.
Next Android llvm upgrade is pending on this bug fix.
Thanks.
Stephen and I guessed that the following diff is what you want.
FWIW I am happy to approve that diff. I think that's what Chandler was after (and I'm also happy with it).
test/CodeGen/X86/fp128-g.ll | ||
---|---|---|
2 | The problem appeared only when compiled with -g. | |
24 | It's just a minimal check that debug info was not lost, |
This change seems a really big change and completely outside of the scope I suggested. I don't particularly like it (it seems to make things less clear) and I'm not sure what motivates it. Could you instead try the approach I suggested?