This is an archive of the discontinued LLVM Phabricator instance.

[NFC, CFI] Use unsigned type for CFI register values in parser/streamer code
Needs RevisionPublic

Authored by RamNalamothu on May 11 2021, 1:19 AM.

Details

Summary

While MCCFIInstruction uses proper type for CFI register values i.e. unsigned,
the interfacing code to that in parsers and streamers are feeding signed
arguments. This fixes the inconsistency and makes the interfacing code also
to use an unsigned type for CFI register values.

Diff Detail

Event Timeline

RamNalamothu created this revision.May 11 2021, 1:19 AM
RamNalamothu requested review of this revision.May 11 2021, 1:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2021, 1:19 AM
RamNalamothu edited the summary of this revision. (Show Details)May 11 2021, 1:37 AM
MaskRay added inline comments.
llvm/lib/MC/MCParser/MasmParser.cpp
1832
epastor added inline comments.May 11 2021, 12:57 PM
llvm/lib/MC/MCParser/MasmParser.cpp
1832

I have no objection, this seems good to add here.

Would it be reasonable to just clean up all the types in these functions? I noted the ones I identified.

llvm/include/llvm/MC/MCStreamer.h
983

Should this just be uint64_t?

984

Same

993

Same

llvm/lib/MC/MCParser/AsmParser.cpp
1534

The Absolute in this case is just referring to "not relative", right? The Res value from evaluateAsAbsolute may be negative, and this cast will result in some large unsigned value for that case.

Maybe we just add an assert(Value > 0)? Or as you note a variant of evaluateAsAbsolute for unsigned which does the check could work.

llvm/lib/MC/MCParser/MasmParser.cpp
1844

Same

MaskRay added inline comments.May 11 2021, 2:24 PM
llvm/lib/MC/MCAsmStreamer.cpp
62

If you are changing the interface, you can rename the function to camelCase in the meanwhile.

Address the review comments.

RamNalamothu marked 7 inline comments as done.May 12 2021, 1:57 AM

Would it be reasonable to just clean up all the types in these functions? I noted the ones I identified.

Since these are virtual interfaces and will need changes in other places as well, I would defer that to an another patch.

llvm/lib/MC/MCParser/AsmParser.cpp
1534

I went ahead with an assert. Thank you.

For this patch, I feel it is a good point to stop here before getting into MCExpr and changing all variants of evaluateAsAbsolute() and probably parts of the interfacing code.

RamNalamothu marked an inline comment as done.

Missed changes in the previous update(s).

scott.linder added inline comments.May 12 2021, 8:39 AM
llvm/lib/MC/MCParser/MasmParser.cpp
1839

Looking at this again, it seems like at the same time you would make this change, you would probably just make parseAbsoluteExpression(uint64_t) virtual in the base class to match the signed variant? It seems odd to have an overload of parseAbsoluteExpression only in this derived class, so maybe a TODO at the declaration to clarify for future readers would be helpful?

1844

I don't think I considered the context fully here, as this will fire on pretty easy-to-write and reasonable user inputs, right? Any absolute expression in a CFA directive which ends up being negative will cause an assert.

Maybe this should be if (Value >= 0) return Error(...); instead?

RamNalamothu marked an inline comment as done.May 12 2021, 9:21 AM
RamNalamothu added inline comments.
llvm/lib/MC/MCParser/MasmParser.cpp
1839

I thought about it but while trying to limit the changes, I didn't explore it fully thinking that might trigger more changes but apparently not many changes. Will make it virtual in base class.

Yes, a TODO should have been helpful.

1844

If call sites invoke parseAbsoluteExpression(uint64_t) and expect a positive return value from an expression which actually yields negative value, I feel this check better be an assert so that those scenarios can be fixed appropriately.

RamNalamothu marked an inline comment as done.

Add virtual method parseAbsoluteExpression(uint64_t) to MCAsmParser.

RamNalamothu marked an inline comment as done.May 12 2021, 10:03 AM

the interfacing code in parsers and streamers are using both signed and unsigned type in different parts.

Where is uint64_t used? Most places use int64_t.

RamNalamothu edited the summary of this revision. (Show Details)May 12 2021, 6:57 PM

the interfacing code in parsers and streamers are using both signed and unsigned type in different parts.

Where is uint64_t used? Most places use int64_t.

For instance in AsmParser::parseRegisterOrRegisterNumber() and type cast in MCStreamer::emitCFIDefCfaRegister().

Anyway, I have changed the summary as the above scenarios are minor in the entire changes.

MaskRay requested changes to this revision.EditedMay 12 2021, 8:04 PM

the interfacing code in parsers and streamers are using both signed and unsigned type in different parts.

Where is uint64_t used? Most places use int64_t.

For instance in AsmParser::parseRegisterOrRegisterNumber() and type cast in MCStreamer::emitCFIDefCfaRegister().

Anyway, I have changed the summary as the above scenarios are minor in the entire changes.

Both AsmParser::parseRegisterOrRegisterNumber() and MCStreamer::emitCFIDefCfaRegister() uses int64_t.

OK, I am now convinced that most places use int64_t.
Using unsigned type to represent "negative values aren't allowed" is not a good idea.

https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es106-dont-try-to-avoid-negative-values-by-using-unsigned

This revision now requires changes to proceed.May 12 2021, 8:04 PM
RamNalamothu added a comment.EditedMay 12 2021, 9:06 PM

the interfacing code in parsers and streamers are using both signed and unsigned type in different parts.

Where is uint64_t used? Most places use int64_t.

For instance in AsmParser::parseRegisterOrRegisterNumber() and type cast in MCStreamer::emitCFIDefCfaRegister().

Anyway, I have changed the summary as the above scenarios are minor in the entire changes.

Both AsmParser::parseRegisterOrRegisterNumber() and MCStreamer::emitCFIDefCfaRegister() uses int64_t.

bool AsmParser::parseRegisterOrRegisterNumber(int64_t &Register,          ---> signed for register
                                              SMLoc DirectiveLoc) {
  unsigned RegNo;                                                         ---> unsigned for register

  if (getLexer().isNot(AsmToken::Integer)) {
    if (getTargetParser().ParseRegister(RegNo, DirectiveLoc, DirectiveLoc))
      return true;
    Register = getContext().getRegisterInfo()->getDwarfRegNum(RegNo, true);
  } else
    return parseAbsoluteExpression(Register);

  return false;
}

OK, I am now convinced that most places use int64_t.
Using unsigned type to represent "negative values aren't allowed" is not a good idea.

Looks like I couldn't get the intent of this patch expressed better earlier.

This patch is based on the fact that we don't use negative numbers to represent registers in CFI, meaning DWARF register mappings always use positive numbers or at least that is what I understood. Based on that, we never emit

.cfi_register -2020, 2050 ---> value of register -2020 is saved in register 2050.

So, doesn't it make sense to use an unsigned type for a variable which is not expected to have negative numbers?
And that is what this patch is trying to achieve.

Otherwise, I think, we should be adding type casts similar to the one below

void MCStreamer::emitCFIDefCfa(int64_t Register, int64_t Offset) {
   ....
   CurFrame->CurrentCfaRegister = static_cast<unsigned>(Register);
}

at all the call sites, in the streamers/parsers being modified in this patch, to the MCCFIInstruction::create* methods because they expect unsigned arguments. For e.g.

void MCStreamer::emitCFIRegister(int64_t Register1, int64_t Register2) {
  ......
  MCCFIInstruction Instruction =
    MCCFIInstruction::createRegister(Label, static_cast<unsigned>(Register1), static_cast<unsigned>(Register2));
 .....
}