Currently this is done rather hackily by letting it be parsed as an
MCSymbolRefExpr whose name we then validate. The logic is also split
across isFenceArg and addFenceArgOperands as a result. Instead, use a
custom parser so we look at the identifier token directly and construct
the immediate at the same time. As well as being cleaner, this also
allows us to give better error messages.
Details
- Reviewers
craig.topper luismarques asb
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp | ||
---|---|---|
1688 | I see this SMLoc::getFromPointer(S.getPointer() - 1) repeated a lot in the assembly parser. Is this doing something I don't understand and setting a valid end location or are we just frequently setting an invalid end location? |
llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp | ||
---|---|---|
1688 | Hm, indeed, this seems to be a consistent misunderstanding in the parser here. Normally you'd lex and _then_ get E based on the _new_ location, but this creates a -1-sized range. The -1 itself is also odd, though seems to be rather pervasive across the backends; SMRange is meant to be half-open, so the -1 is already accounted for. |
llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp | ||
---|---|---|
1688 | You don't even need to Lex the next token. There's a getEndLoc() on the Token class. |
llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp | ||
---|---|---|
1688 | In this case, yeah, though some cases in this file are more complex and so would likely need to use that kind of pattern instead. |
I see this SMLoc::getFromPointer(S.getPointer() - 1) repeated a lot in the assembly parser. Is this doing something I don't understand and setting a valid end location or are we just frequently setting an invalid end location?