This is an archive of the discontinued LLVM Phabricator instance.

[MC] Fix undefined behavior in MCInstPrinter::formatHex
ClosedPublic

Authored by JDevlieghere on Sep 5 2019, 12:52 PM.

Details

Summary

Passing INT64_MIN to MCInstPrinter::formatHex triggers undefined behavior because the negation of -9223372036854775808 cannot be represented in type 'int64_t' (aka 'long long'). This patch puts a workaround in place to just print the hex value directly, without the leading -. A possible alternative involves using a small helper functions that uses (implementation) defined conversions to achieve the desirable value:

static int64_t helper(int64_t V) {
  auto U = static_cast<uint64_t>(V);
  return V < 0 ? -U : U;
}

The underlying problem is that MCInstPrinter::formatHex(int64_t) returns a format_object<int64_t> and should really return a format_object<uint64_t>. However, that's not possible because formatImm needs to be able to print both as decimal (where a signed is required) and hex (where we'd prefer to always have an unsigned).

format_object<int64_t> formatImm(int64_t Value) const {
  return PrintImmHex ? formatHex(Value) : formatDec(Value);
}

Diff Detail

Event Timeline

JDevlieghere created this revision.Sep 5 2019, 12:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 5 2019, 12:52 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
shafik added a comment.Sep 5 2019, 1:01 PM

It is not obvious what the change in result is expected to be so a test would be helpful.

I'm confused - the patch description mentions passing unsigned min to a function that takes a signed value - if there's UB, it's in the caller so can't really be fixed in the callee, right? (or is the UB still present with passing int32 max too?)

JDevlieghere edited the summary of this revision. (Show Details)
  • Add test
  • Fix description to say INT64_MIN instead of UINT64_MIN
  • Update test (Forgot to stage before updating the patch)

Couldn't this value be hardcoded? The maximum value of a 64 bit signed integer is a known value, right? We could hardcode that string?

  • Hard-code the value

I don't know what it is with me today, but this patch seems to be suffering a lot from (un)staged changes...

dblaikie accepted this revision.Sep 5 2019, 4:28 PM

Seems fairly reasonable to me - test might be able to be cleaned up a bit. Code might be cleaned up a bit (either now or in a follow-up commit)

llvm/lib/MC/MCInstPrinter.cpp
88–93

There's a lot of else-after-return here, and maybe a special case might work here:

if (Value < 0) {
  if (Value == min)
    return constant
  return format("-0x%"... -Value);
}
return format(...)

(this means only one test for the positive case, rather than 2)

But all that might be better as a separate cleanup patch.

llvm/unittests/MC/MCInstPrinter.cpp
61–64

Could probably put this in a test fixture, rather than a singleton? (but if there's nearby prior-art, I'm OK with it) (oh, the Printer class itself could be the test fixture class)

This revision is now accepted and ready to land.Sep 5 2019, 4:28 PM
This revision was automatically updated to reflect the committed changes.