This is an archive of the discontinued LLVM Phabricator instance.

Reland "[WebAssembly] Emulate v128.const efficiently""
ClosedPublic

Authored by tlively on Oct 2 2020, 9:03 PM.

Details

Summary

This reverts commit 432e4e56d3d2, which reverted 542523a61a21. Two issues from
the original commit have been fixed. First, MSVC does not like when std::array
is initialized from a braced init list contained within a parenthesized
expression list, so this commit switches to using the more portable double
braces. Second, there was a subtle endianness bug that prevented the original
commit from working correctly on big-endian machines, which has been fixed by
switching to using endianness-agnostic bit twiddling instead of type punning.

Diff Detail

Event Timeline

tlively created this revision.Oct 2 2020, 9:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 2 2020, 9:03 PM
tlively requested review of this revision.Oct 2 2020, 9:03 PM

Thanks all for your help diagnosing and fixing these issues! I decided to go with the bit twiddling solution rather than the corrected type punning solution because it seems simpler overall to not have to think about endianness.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1589–1602

This is the only part that has changed from the previous revision.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1589–1590

MSVC doesn't work with the parens even with the extra braces.

dweber added inline comments.Oct 2 2020, 9:16 PM
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1589–1590

Right. It shouldn't have parens.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1600

Can LaneBits be 64?

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
33

We can remove this now, I think.

tlively updated this revision to Diff 295959.Oct 2 2020, 9:49 PM
  • Make initializations even more MSVC-friendly
tlively marked 2 inline comments as done and an inline comment as not done.Oct 2 2020, 9:50 PM
tlively added inline comments.
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1600

Yes, if the vector is already an v2i64.

I can confirm that at least llvm/test/CodeGen/WebAssembly/simd-build-vector.ll is good on AIX (big-endian Power) with this patch (at least with the maskTrailingOnes<uint64_t>(LaneBits)) change.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1600

Okay, I suggest using maskTrailingOnes<uint64_t>(LaneBits) to avoid undefined behaviour:

diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index ec62f2a..b2913b6 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -30,8 +30,8 @@
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/IntrinsicsWebAssembly.h"
 #include "llvm/Support/Debug.h"
-#include "llvm/Support/Endian.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/MathExtras.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Target/TargetOptions.h"
 using namespace llvm;
@@ -1597,7 +1597,8 @@ SDValue WebAssemblyTargetLowering::LowerBUILD_VECTOR(SDValue Op,
         auto Shift = LaneBits * (I % HalfLanes);
         auto Val = cast<ConstantSDNode>(Lane.getNode())->getZExtValue();
         I64s[I / HalfLanes] |= Val << Shift;
-        ConstLaneMasks[I / HalfLanes] |= ((1ULL << LaneBits) - 1) << Shift;
+        ConstLaneMasks[I / HalfLanes] |= maskTrailingOnes<uint64_t>(LaneBits)
+                                         << Shift;
       }
     }
     // Check whether all constant lanes in the second half of the vector are
  • Make initializations even more MSVC-friendly

The patch description should be updated to reflect what the issue with MSVC actually was.

dweber added inline comments.Oct 3 2020, 12:15 PM
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1600

@tlively I think you need to add test cases for each integer type with all of the variants of 0xdeadbeef. This logic has so many branches, it would have been easier to read if it merely used if statements to compare integer types for the appropriate shifts.

dweber added inline comments.Oct 3 2020, 12:32 PM
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1584–1633

@tlively this is a real edge case, but I think you need to verify that the integer type is a representable native integer type. If APInt causes VecT.isInteger() to return true, you'll have a colossal headache with supporting integers that are 27 bits.

dweber added a comment.Oct 3 2020, 2:17 PM

Fix for the integer check. Proposal for a better, documented alternative solution.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1584–1633
1589–1602
dweber added a comment.Oct 3 2020, 2:42 PM

Side note: I filed a bug request on the operator= behavior for ulittle64_t. It has an explicit constructor that takes in any value in native endianness, but operator= is defined to convert endianness. My guess is because the operator= doesn't return ulittle64_t&, it can't construct properly through operator=. Issue is here: https://bugs.llvm.org/show_bug.cgi?id=47719

Thanks for your continued work on this @dweber! I still think this solution is simpler than the type punning solution, though, because it saves readers from having to go figure out what ulittle64_t is and how it works.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1584–1633

This operation lowering occurs only after type legalization, so we know that the only machine value types we need to consider are those that WebAssembly supports. I also don't think this suggested change changes the semantics. Am I missing something?

1589–1590

Thanks!

1600

Is there UB here? 1ULL << 64 is well defined to be 0ULL [1], and 0ULL - 1 is well defined to be the max int for that type [2]. The reason I'm pushing back here is that I want to save readers of this code from having to go look up what exactly maskTrailingOnes does.

[1] https://en.cppreference.com/w/cpp/language/operator_arithmetic#Bitwise_shift_operators:~:text=For%20unsigned%20a%2C%20the%20value%20of%20a,of%20the%20destination%20type%20are%20discarded).

[2] https://en.cppreference.com/w/cpp/language/operator_arithmetic#Overflows:~:text=Unsigned%20integer%20arithmetic%20is%20always%20performed%20modulo%202n

  • Make initializations even more MSVC-friendly

The patch description should be updated to reflect what the issue with MSVC actually was.

I think the description still applies. Were you thinking of mentioning the specific error message? "C2100: illegal indirection" doesn't seem to give any more information than is already in the description.

efriedma added inline comments.Oct 5 2020, 6:11 PM
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1600

I think you missed the part where it says "if the value of the right operand is negative or is greater or equal to the number of bits in the promoted left operand, the behavior is undefined".

I think the description still applies. Were you thinking of mentioning the specific error message? "C2100: illegal indirection" doesn't seem to give any more information than is already in the description.

I was referring to the following.

Quote:

First, MSVC does not like when std::array is initialized with only single braces [ ... ]

MSVC does not like when std::array is initialized from a braced init list contained within a parenthesized expression list.

tlively edited the summary of this revision. (Show Details)Oct 9 2020, 2:38 PM
tlively updated this revision to Diff 297362.Oct 9 2020, 4:34 PM
tlively edited the summary of this revision. (Show Details)
  • Address review comments
tlively added inline comments.Oct 9 2020, 5:37 PM
llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
1600

Ugh, yep. Thank you! Switched to using maskTrailingOnes.

Thanks. This matches what I tested with a big endian host system. I'm not sure if @dweber agrees that their comments have been addressed though.

Thanks. This matches what I tested with a big endian host system. I'm not sure if @dweber agrees that their comments have been addressed though.

I'm pretty okay with this change. It took me a minute to wrap my head around because it wasn't immediately obvious how it was placing the integers inside the 64s, but overall it's fine. The only thing I think is missing is a safety check on the bitwise or operation inside the loop. Before it implicitly extracted the least significant bits by byte size meaning there was no room for a value to exceed range (e.g. bits set above the shift before the or). I spoke with @tlively offline about this, and he said he would add an assertion to make sure the value is in range. With that change, this has my blessing.

tlively updated this revision to Diff 297752.Oct 12 2020, 8:49 PM
  • Add masking of lane value and new tests

@dweber, you were totally right that this needed to be masked. In particular, when the lane contains a negative number, getZExtValue returns a very large 64-bit constant that needs to be truncated. Thanks for pressing me on that!

  • Add masking of lane value and new tests

@dweber, you were totally right that this needed to be masked. In particular, when the lane contains a negative number, getZExtValue returns a very large 64-bit constant that needs to be truncated. Thanks for pressing me on that!

Yeah... I had an eerie feeling something like that could happen -- I just couldn't put my finger on it. Really. Thanks go to you for taking care of this.

Do you know if it would still be the case if you were using getLimitedValue instead of getZExtValue? I want to say yes, but I'm not entirely sure how these two work.

Do you know if it would still be the case if you were using getLimitedValue instead of getZExtValue? I want to say yes, but I'm not entirely sure how these two work.

Yes, getLimitedValue bottoms out with this line of code: return ugt(Limit) ? Limit : getZExtValue(); so they are identical in this situation.

Do you know if it would still be the case if you were using getLimitedValue instead of getZExtValue? I want to say yes, but I'm not entirely sure how these two work.

Yes, getLimitedValue bottoms out with this line of code: return ugt(Limit) ? Limit : getZExtValue(); so they are identical in this situation.

That answers all of my questions... now to figure out where I press the checkbox to approve.

dweber accepted this revision.Oct 12 2020, 9:33 PM
This revision is now accepted and ready to land.Oct 12 2020, 9:33 PM
This revision was landed with ongoing or failed builds.Oct 12 2020, 9:37 PM
This revision was automatically updated to reflect the committed changes.