This patch adds support in the MC layer for parsing and assembling the 4-operand add instruction needed for TLS addressing. This also involves parsing the %tprel_hi, %tprel_lo and %tprel_add operand modifiers.
GCC/binutils allows add a0, a0, tp, 0. I've submitted a bug against binutils  which attempts to clarify if that should actually be supported or not. Depending on the feedback, this patch's requirement that %tprel_add needs to always be present might need to be revised.
I am a bit curious about this function because it looks very similar to RISCVAsmParser::parseOperandWithModifier(OperandVector &Operands), which ends being used when attempting to parse an immediate that starts with %.
My understanding is that isTPRelAddSymbol should filter wrong cases without the need of a custom parser but I think you wanted a more precise diagnostic (as shown in the relevant testcases below)? Is my understanding correct or I am missing other reasons to need a custom parser here?
Yes, I did pretty much copy the function. I will check just using isTPRelAddSymbol with parseOperandWithModifier as the parsing method and see if I can get some decent error messages out of it. If non-%tprel_add operands are valid as @luismarques suggested might be the case then this would be irrelevant anyway.
They've clarified in that bug report that it's not intended and should be fixed, so we don't need to support it.
For consistency with Match_InvalidSImm12 (along with my %got_pcrel_hi change and your Match_InvalidTPRelAddSymbol) I'd drop the () from the error message.
We also need a fixup_riscv_relax depending on STI.getFeatureBits()[RISCV::FeatureRelax].
Maybe add a comment explaining why we don't expect to see it? Also, perhaps a separate llvm_unreachable that's more useful?
I assume this is coming from your local rebasing of my patches. Note that GOT_HI is *not* a relax candidate so this is incorrect.
To clarify why: HI20/LO12 and PCREL_HI20/PCREL_LO12 pairs are relaxed to use GP, CALL/CALL_PLT are relaxed to a single (C.)JAL(R), and TPREL_HI20/TPREL_ADD/TPREL_LO12 is relaxed to a single TP-relative instruction (the LUI and ADDI disappear, and the LO12's instruction's base register is changed to TP).
Since the GOT is almost certainly not within 2K of PC, there's no opportunity to shorten the sequence for PIC. In theory, if you're linking -fPIC code into an executable, you could relax the indirection via the GOT (similarly if you're using -Bsymbolic and the symbol is defined in the output), and also for an executable you could go further and apply the same GP relaxation. However, these are both fairly niche, unlike the other relaxations, hence why they don't exist.