This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] More uses of uint8_t for single byte values
ClosedPublic

Authored by aheejin on Mar 1 2018, 6:24 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

aheejin created this revision.Mar 1 2018, 6:24 PM
sbc100 accepted this revision.Mar 1 2018, 9:12 PM
This revision is now accepted and ready to land.Mar 1 2018, 9:12 PM
This revision was automatically updated to reflect the committed changes.

This broke test/MC/WebAssembly/global-ctor-dtor.ll in the waterfall. I checked the output but don't have any idea why it's causing differences in relocation and function information (I'm not familiar with the linker). Do you know why?

ncw added a subscriber: ncw.Mar 2 2018, 4:36 AM

I saw the build failures too, when I was trying to land some patches today. I feel I can't land until the tests pass, even if the failures weren't caused by me.

This commit is causing the function body to change from:

from: 024041818080800041004180808080001081808080000D000F0B00000B
to:   02C00041818080800041004180808080001081808080000D000F0B00000B

That corresponds to block = 0x02 <blocktype> where blocktype has changed from being 0x40 (which was correct) to 0xC000 which is incorrect - and is -0x40 in SLEB encoding.

There's something fishy in this code even to begin with - the old code had ExprType::Void = -0x40 when the value to be written out is +0x40, and yet ExprType::I32 = -0x01 was used to write out a value of +0x7F. Somehow the encoding was treating void differently!?!

In the new code, it's been changed to ExprType::Void = +0x40 (corresponding to the byte value that's correct) and ExprType::I32 = +0x7F (again the same as the byte that's to be written out).

We could just revert this commit, or maybe spend another half hour trying to understand how on earth the old code was working...

ncw added a comment.Mar 2 2018, 5:22 AM

OK, I get it. The writer is writing out operands as SLEB128 still, not as uint8_t.

So the old code had Void = -0x40, which encodes in SLEB to 0x40, and had I32 = -0x01 which encodes in SLEB to 0x7F.

This patch changed the values to their actual byte values, but they'll be mangled when they go through the SLEB operand writer (WebAssemblyMCCodeEmitter, branch for Info.OperandType == WebAssembly::OPERAND_SIGNATURE).

ncw added a comment.Mar 2 2018, 5:24 AM

This broke test/MC/WebAssembly/global-ctor-dtor.ll in the waterfall. I checked the output but don't have any idea why it's causing differences in relocation and function information (I'm not familiar with the linker). Do you know why?

The relocs were changing because the operand went from 0x40 (correct value) to 0xC000 (SLEB encoding of 0x40). Because the length of the sequence changed, the addresses of all the relocs were shifted.

ncw added a comment.Mar 2 2018, 6:00 AM

For now, I'll revert the bad commit; if you want to re-apply it, you'll have to apply this change with it, so that SLEB encoding isn't applied to the uint8_t bytes:

diff --git a/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCCodeEmitter.cpp b/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCCodeEmitter.cpp
index 77744e53d62..0d35806229a 100644
--- a/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCCodeEmitter.cpp
+++ b/lib/Target/WebAssembly/MCTargetDesc/WebAssemblyMCCodeEmitter.cpp
@@ -93,7 +93,9 @@ void WebAssemblyMCCodeEmitter::encodeInstruction(
         } else if (Info.OperandType == WebAssembly::OPERAND_GLOBAL) {
           llvm_unreachable("wasm globals should only be accessed symbolicly");
         } else if (Info.OperandType == WebAssembly::OPERAND_SIGNATURE) {
-          encodeSLEB128(int64_t(MO.getImm()), OS);
+          assert(MO.getImm() > 0 && (MO.getImm() & ~0x3f) == 0x40 &&
+                 "Signature must be pre-encoded negative single-byte SLEB");
+          OS << uint8_t(MO.getImm());
         } else {
           encodeULEB128(uint64_t(MO.getImm()), OS);
         }

Thank you so much for looking into this! Resubmitted in D44034.