Clang generates a '%' prefix for some registers in CFI directives. E.g.
".cfi_register lr, r12" becomes ".cfi_register lr, %r12" after
processing.
Details
- Reviewers
hfinkel nickdesaulniers MaskRay - Group Reviewers
Restricted Project - Commits
- rGb2f663073917: [PowerPC] Allow a '%' prefix for registers in CFI directives
Diff Detail
- Repository
- rG LLVM Github Monorepo
Time | Test | |
---|---|---|
400 ms | linux > HWAddressSanitizer-x86_64.TestCases::sizes.cpp Script:
--
: 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang --driver-mode=g++ -m64 -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mcmodel=large -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions /mnt/disks/ssd0/agent/llvm-project/compiler-rt/test/hwasan/TestCases/sizes.cpp -nostdlib++ -lstdc++ -o /mnt/disks/ssd0/agent/llvm-project/build/projects/compiler-rt/test/hwasan/X86_64/TestCases/Output/sizes.cpp.tmp
| |
2,240 ms | linux > libarcher.races::task-taskwait-nested.c Script:
--
: 'RUN: at line 13'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-taskwait-nested.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-taskwait-nested.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/deflake.bash /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-taskwait-nested.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-taskwait-nested.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-taskwait-nested.c
|
Event Timeline
I was thinking "would it be better to move this logic down into PPCAsmParser::MatchRegisterName? Well, I would suppose that would depend on if the % prefix on registers is valid in other contexts, for instance maybe other places that directly call PPCAsmParser::MatchRegisterName rather than `PPCAsmParser::tryParseRegister which you've modified here.
Right away, the next caller of PPCAsmParser::MatchRegisterName I found is in PPCAsmParser::ParseOperand:
1480 switch (getLexer().getKind()) { 1481 // Special handling for register names. These are interpreted 1482 // as immediates corresponding to the register number. 1483 case AsmToken::Percent: 1484 Parser.Lex(); // Eat the '%'. 1485 unsigned RegNo; 1486 int64_t IntVal; 1487 if (MatchRegisterName(RegNo, IntVal)) 1488 return Error(S, "invalid register name"); ... 1553 switch (getLexer().getKind()) { 1554 case AsmToken::Percent: 1555 Parser.Lex(); // Eat the '%'. 1556 unsigned RegNo; 1557 if (MatchRegisterName(RegNo, IntVal))
So it looks like it's currently idiosyncratic that some callers lex the % and others do not. I highly doubt that's intentional. In that case, would you mind please sinking the lexing of % into PPCAsmParser::MatchRegisterName, then removing the existing % lexing, that way all sites that expect to parse registers conditionally with % prefixed do so in one place? I guess the existing sites expect there to be a % unconditionally; I suspect it should be conditional, at least to match GNU as portability concerns.
When I do that, I get several register parsing failures, including for this testcase. I wanted to restrict this change to the CFI directives (this function seems to be used by those instead of normal functions). X86 has similar code in the same place.
I think what may be happening is that the MatchRegisterName function is looking at the token that's sometimes already lexed, and so isn't always "current" when the function calls.
Oh?
diff --git a/llvm/lib/Target/PowerPC/AsmParser/PPCAsmParser.cpp b/llvm/lib/Target/PowerPC/AsmParser/PPCAsmParser.cpp index 458edf71d6c8..5e4d17ec9a13 100644 --- a/llvm/lib/Target/PowerPC/AsmParser/PPCAsmParser.cpp +++ b/llvm/lib/Target/PowerPC/AsmParser/PPCAsmParser.cpp @@ -1201,6 +1201,8 @@ bool PPCAsmParser::MatchAndEmitInstruction(SMLoc IDLoc, unsigned &Opcode, } bool PPCAsmParser::MatchRegisterName(unsigned &RegNo, int64_t &IntVal) { + if (getParser().getTok().is(AsmToken::Percent)) + getParser().Lex(); // Eat the '%'. if (getParser().getTok().is(AsmToken::Identifier)) { StringRef Name = getParser().getTok().getString(); if (Name.equals_lower("lr")) { @@ -1481,7 +1483,6 @@ bool PPCAsmParser::ParseOperand(OperandVector &Operands) { // Special handling for register names. These are interpreted // as immediates corresponding to the register number. case AsmToken::Percent: - Parser.Lex(); // Eat the '%'. unsigned RegNo; int64_t IntVal; if (MatchRegisterName(RegNo, IntVal)) @@ -1552,7 +1553,6 @@ bool PPCAsmParser::ParseOperand(OperandVector &Operands) { int64_t IntVal; switch (getLexer().getKind()) { case AsmToken::Percent: - Parser.Lex(); // Eat the '%'. unsigned RegNo; if (MatchRegisterName(RegNo, IntVal)) return Error(S, "invalid register name");
passes all tests for me, other than this newly added one; for some reason we get an additional newline before the blr instruction:
llvm-mc -triple powerpc64le-unknown-unknown /android0/llvm-project/llvm/test/MC/PowerPC/cfi-register-directive-parse.s .text .globl __test1 __test1: .cfi_startproc mflr 12 .cfi_register lr, r12 blr .cfi_endproc
(Though maybe that's a bug in MCAsmStreamer for the assembler .cfi directives, or how registers are printed? Maybe DwarfRegNumForCFI is related?)
PPCAsmParser::ParseOperand also has a comment that %rNN is used for ELF but not Macho; maybe this should be conditioned on isDarwin()? Though that comment and the current implementation of PPCAsmParser::ParseOperand don't look like they match to me. Also, the above print out is curious to me how r12 gets printed without r prefix for mflr. LLVM looks a little inconsistent here in how PPC asm register operands are printed.
I'm not sure how much ppc Darwin is much relevant anymore. If anyone cares, we can always make the % token consumption conditional.