This patch introduces LI.S and LI.D pseudo instructions with floating point operands.
I believe that we can achieve the same thing without introducing new floating-point operands. We could add the AsmToken::Real case to the switch statements inside parseImm() & parseOperand(). This way the generic parser would parse the floats/doubles, saving the result in an int64_t. After that it's just a matter of re-interpreting the bits as a float or double depending on the instruction that we expand, and saving the new bits in an int64_t again. This would allow us to utilize the backtracking from the generic matcher in the future. Of course, we wouldn't be able to print() the operands with their types. However, we only care about their value, not their types, so there's no harm done.
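As a rough illustration of the re-interpretation step (standalone C++, not LLVM code): the helper names below are made up for this sketch, and it assumes the value stored in the int64_t is the bit pattern of an IEEE-754 double.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not part of MipsAsmParser.

// Return the bit pattern of a double as a 64-bit integer.
inline uint64_t doubleToBits(double Value) {
  uint64_t Bits;
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return Bits;
}

// Inverse of doubleToBits().
inline double bitsToDouble(uint64_t Bits) {
  double Value;
  std::memcpy(&Value, &Bits, sizeof(Value));
  return Value;
}

// For an li.s-style expansion: take the stored double bits, convert the
// value to single precision, and return the bit pattern of that float.
inline uint32_t doubleBitsToFloatBits(uint64_t DoubleBits) {
  float Narrowed = static_cast<float>(bitsToDouble(DoubleBits));
  uint32_t Bits;
  std::memcpy(&Bits, &Narrowed, sizeof(Bits));
  return Bits;
}
```

For an li.d-style expansion the stored bits could be used unchanged; only the single-precision case needs the narrowing conversion.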
Please provide the whole context as per http://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface
I haven't fully read the patch yet but there are some high-level things to talk about.
There's a good reason to avoid adding the new MipsOperand kinds which is that it's important that the MipsOperand kinds do not overlap. We had some really big problems with this a year or two ago and I ended up rewriting a lot of the parsing and MipsOperand handling to resolve it. This patch currently re-introduces the root cause of those problems which I'm keen to avoid doing.
The problem with overlapping MipsOperand kinds is that there's only one chance to get the operand kind correct and there's no support for backtracking. In the original problem we had two 'add.d' instructions with one accepting FGR64 registers, and the other accepting AFGR64. These two register classes use the same set of names ($f0, $f1, ...). Our matcher table contained both possibilities with the FGR64 one appearing first. What happened was, the ParserMethod for FGR64 would be called and would create three MipsOperand of kind k_Register with, for example, registers D0, D1, and D2. It would then check the feature bits and reject this match because the feature bits said that the FPU was 32-bit (and therefore D0_64, D1_64, and D2_64 were needed instead). Then it would try the AFGR64 case but because there's no backtracking, we still had the same MipsOperand's. The PredicateMethod would always return false for these because D0/D1/D2 are not members of AFGR64 and we would therefore reject this match too. Having found no match, we would then reject the input and error out. Variants of this problem affected the majority of our instruction set in one way or another.
This problem almost manifests in this patch for inputs like:
li.s $2, 1
If it weren't for the ParserMethod's (which we ought to remove, we need to keep them to a minimum because they cause the above problems), the '1' would become a MipsOperand of kind k_Immediate which would never pass the PredicateMethod.
I'd approach this in a similar way to what Vasileios is describing but more towards the way k_RegisterIndex works. I'd have a single k_Immediate operand that separately holds integer and APFloat values, as well as a bitfield indicating which types are possible. I'd then have appropriate MipsOperand::Create*Imm functions that tell it which types are possible, with '1' being valid for integer, float, and double, while '1.1' would only be valid for float and double. Finally, I'd have predicate methods that test the appropriate encoding (if it's valid) and a render method for each encoding to add the appropriately converted operand to the instruction.
This is a lot easier to explain in person and with diagrams :-). Did that make sense?
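In lieu of diagrams, a minimal standalone sketch of the single-immediate idea. Everything here is hypothetical: ImmOperand, ImmKind, and the create/predicate/render names are invented for illustration, and a plain double stands in for APFloat.

```cpp
#include <cstdint>

// Bitfield of the interpretations an immediate may take (hypothetical).
enum ImmKind : unsigned {
  Imm_Int = 1u << 0,
  Imm_Float = 1u << 1,
  Imm_Double = 1u << 2,
};

// Hypothetical stand-in for a single k_Immediate operand.
struct ImmOperand {
  int64_t IntVal; // integer interpretation, if Imm_Int is set
  double FPVal;   // floating-point interpretation (stands in for APFloat)
  unsigned Kinds; // bitfield of ImmKind values

  // '1' in the source: valid as integer, float, and double.
  static ImmOperand createIntImm(int64_t V) {
    return {V, static_cast<double>(V), Imm_Int | Imm_Float | Imm_Double};
  }
  // '1.1' in the source: valid as float and double only.
  static ImmOperand createFPImm(double V) {
    return {0, V, Imm_Float | Imm_Double};
  }

  // Predicate methods: one per encoding.
  bool isIntImm() const { return (Kinds & Imm_Int) != 0; }
  bool isFloatImm() const { return (Kinds & Imm_Float) != 0; }
  bool isDoubleImm() const { return (Kinds & Imm_Double) != 0; }

  // Render methods: one per encoding, converting on the way out.
  float renderAsFloat() const { return static_cast<float>(FPVal); }
  double renderAsDouble() const { return FPVal; }
};
```

Because the operand records every interpretation it could take, the matcher can try li.s, li.d, and integer candidates against the same operand without needing to re-parse it.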
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
558 | Why did you remove RegKind_FGR? Without it, you can't match things like: mfc1 $2, $3 which is equivalent to: mfc1 $2, $f3 | |
lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h | ||
128–216 ↗ | (On Diff #39399) | I don't understand this code. We don't have 8-bit floating point to my knowledge and li.s/li.d should be able to handle normal 32-bit and 64-bit floating point values respectively. |
lib/Target/Mips/MipsInstrFPU.td | ||
560–579 | Formatting. It looks like you may have used clang-format on a tablegen file which won't work correctly. | |
test/MC/Mips/li.s.s | ||
5–7 ↗ | (On Diff #39399) | Hmm, it looks like we have some rounding here. I think it should be 0x3f8fcd35 but GAS emits the same output. |
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
556 | RegKind_FGR is removed because we didn't find another way to distinguish RegKind_GPR and RegKind_FGR in |
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
556 |
This should work: bool isStrictlyFGRAsmReg() { return isRegIdx() && RegIdx.Kind == RegKind_FGR && RegIdx.Index <= 31; } This will only be true when an FGR is the only option (i.e. the source said '$f4'). You'll also need to subclass FGR32/FGR64/AFGR64 in tablegen, override the predicate method to isStrictlyFGRAsmReg, and use those new operands in the li.s/li.d pseudos.
$4 is a RegKind_Numeric because $4 is ambiguous without additional context. It could be A0, F4, D4, D4_64, FCC4, COP24 (register 4 in the COP2 set, we should probably add an underscore to the name), etc. depending on the mnemonic and which operand it appears in. The same is true of $f4 and RegKind_FGR to a lesser extent. It's still ambiguous which register '$f4' refers to and could be F4, D4, or D4_64 but it definitely isn't A0, COP24, etc. The main thing here is that MipsOperand describes what the operand _might_ be rather than what it really is (we figure out what it _is_ at a later point in the assembler). With that in mind, MipsOperand::RegIdxOp::Kind is a bitfield representing all the possible interpretations of the operand by any instruction in our instruction set. There are some instructions where $4 is a floating point register, therefore RegKind_FGR must be part of RegKind_Numeric. The ambiguity is resolved by the match table in the AsmMatcher. For each matchable, we call a particular predicate on each operand (specified in the tablegen definitions) and the first one to find that all the predicates are true is a match. When you have multiple matchables that can accept the same operands, the one that appears first in the table is chosen. I expect that LoadImmDoubleFGR appears first in your table so $4 is accepted there and LoadImmDoubleGPR never has the chance to match. We need to either make the two cases distinct (see isStrictlyFGRAsmReg() above) or control the sort order of the table. | |
lib/Target/Mips/MipsInstrFPU.td | ||
602–616 | These should have the appropriate FGR_32, FGR_64, and HARDFLOAT adjectives. | |
614 | I don't think you mean FGR32Opnd here. You need FGR64Opnd for a 64-bit FPU and AFGR64Opnd for a 32-bit FPU. |
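A toy model of the first-match behaviour described above may help. It is a simplified standalone sketch: the RegKind values, the RegOperand and Matchable types, and firstMatch() are hypothetical, and RegKind_Numeric is reduced to just GPR and FGR here.

```cpp
#include <string>
#include <vector>

// Simplified bitfield of possible register interpretations (hypothetical).
enum RegKind : unsigned {
  RegKind_GPR = 1u << 0,
  RegKind_FGR = 1u << 1,
  RegKind_Numeric = RegKind_GPR | RegKind_FGR, // '$4' could be either
};

struct RegOperand {
  unsigned Kind;
  unsigned Index;
  bool isGPRAsmReg() const { return (Kind & RegKind_GPR) && Index <= 31; }
  bool isFGRAsmReg() const { return (Kind & RegKind_FGR) && Index <= 31; }
  // True only when FGR is the sole interpretation (the source said '$f4').
  bool isStrictlyFGRAsmReg() const {
    return Kind == RegKind_FGR && Index <= 31;
  }
};

struct Matchable {
  std::string Name;
  bool (RegOperand::*Predicate)() const;
};

// The first table entry whose predicate accepts the operand wins.
std::string firstMatch(const std::vector<Matchable> &Table,
                       const RegOperand &Op) {
  for (const Matchable &M : Table)
    if ((Op.*M.Predicate)())
      return M.Name;
  return "";
}
```

With the loose isFGRAsmReg predicate on the first entry, an operand created from '$4' is claimed by LoadImmDoubleFGR; swapping in isStrictlyFGRAsmReg lets it fall through to LoadImmDoubleGPR while '$f4' still matches the FGR entry.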
Hi,
I'm unlikely to be able to look at this in the near future. I've added the new Mips code-owner (Simon) to the review.
In several places there is the expression 'FirstReg + 1'. This is unsafe as tablegen does not order the registers in the namespace as you would expect. Instead, write a helper function to compute the next register.
Currently, in lib/Target/Mips/MipsGenRegisterInfo.inc in the build directory, the next register after a3 is ac0, not t0.
The second issue is that mthc1 is only available for MIPS32r2 or later. For earlier revisions, the expansion is to use two mtc1s, accessing the next numerical register and treating f0 as the "next register" in the f31 case.
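A minimal sketch of such a helper, in standalone C++. The Reg enum below is a toy with only a handful of registers, deliberately ordered so that AC0 follows A3 as in the generated file; the real helper would need to cover every register the macros accept.

```cpp
#include <cassert>

// Toy register enum (hypothetical). Note AC0 sits between A3 and T0, so
// plain enum arithmetic does not give the architectural successor.
enum Reg : unsigned { A2, A3, AC0, T0, F0, F1, F30, F31 };

// Map each register to its architectural successor explicitly instead of
// relying on the enum order.
Reg nextReg(Reg R) {
  switch (R) {
  case A2:
    return A3;
  case A3:
    return T0;
  case F0:
    return F1;
  case F30:
    return F31;
  case F31:
    return F0; // f0 is treated as the "next register" after f31
  default:
    assert(false && "Unknown register in assembly macro expansion!");
    return R;
  }
}
```

The explicit switch is verbose, but it keeps the expansion correct even if the generated enum order changes.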
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
197–201 | There appears to be spurious white space on the line before this prototype, please remove it when you're committing. | |
921 | Here too. | |
956 | Whitespace here too. | |
2196 | Here as well. | |
2750–2756 | Please adjust the comment to say that this is a conversion of a double in a uint64_t to a float in a uint32_t, retaining the bit pattern of a float. | |
2805 | Indentation, this needs another space before the TOut.emitRRX(Mips::LWC1, ... | |
2830–2836 | Nm. | |
2874–2875 | Rather than using FirstReg + 1, instead write a helper function to compute the next register. It is unsafe to rely on tablegen ordering the registers in the expected manner. Currently with FirstReg == Mips::A3, FirstReg + 1 is Mips::AC0, not Mips::T0. | |
lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp | ||
714 | Spurious newline. | |
lib/Target/Mips/MipsInstrFPU.td | ||
606–608 | This needs a HARDFLOAT predicate. | |
test/MC/Mips/macro-li.d.s | ||
268 | Spurious whitespace at the end of this line. | |
test/MC/Mips/macro-li.s.s | ||
32 | Spurious whitespace at the end of this line. | |
86 | Spurious whitespace at the end of this line. |
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
2759 | Two things: This function also needs to handle floating point registers. Rather than returning 0, instead call llvm_unreachable("Unknown register in assembly macro expansion!"); | |
2919 | This block has the wrong condition. This should check that the lower 32 bits are zero and the high part can be loaded with a single instruction. | |
2921–2926 | Add a FIXME comment here noting that in the case where the constant is zero, we can load the register directly from the zero register. | |
2923 | This condition is not required. | |
2927 | This also needs a check for hasMips64() before checking hasMips32r2(). In the mips64 case, we use dmtc1. | |
2928–2929 | If you've reverse engineered this from the output of gas, you've hit upon a latent gas bug. On a MIPS32 system (or a MIPS64 system executing MIPS32 code), writing to a 64-bit FPU register must be done in a specific order: use mtc1 to write the lower 32 bits, then use mthc1 to write the upper 32 bits. mtcX instructions leave the upper 32 bits *UNPREDICTABLE*. Reverse the order of these two lines. | |
2931–2932 | See my comment about nextReg. Also, indentation is incorrect. | |
2969–2970 | Rather than Mips::LDC1, this should be (IsFPU64 ? Mips::LDC164 : Mips::LDC1) to get the correct instruction. Add a FIXME comment here noting that this expansion is incorrect for mips1, it should expand to two word loads. | |
test/MC/Mips/macro-li.d.s | ||
2 | Can you also add a RUN line for mips32 and add appropriate tests? |
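The feature checks and the write ordering requested above can be summarised in a small standalone sketch. The TargetInfo struct and planDoubleMove() are hypothetical names; the function only lists mnemonics in emission order and does not model operands.

```cpp
#include <string>
#include <vector>

// Hypothetical description of the subtarget features that matter here.
struct TargetInfo {
  bool HasMips64;
  bool HasMips32r2;
};

// Mnemonics, in emission order, for moving a 64-bit bit pattern held in
// GPRs into a floating point register.
std::vector<std::string> planDoubleMove(const TargetInfo &T) {
  // Check mips64 first: a single dmtc1 moves all 64 bits.
  if (T.HasMips64)
    return {"dmtc1"};
  // MIPS32r2: write the lower 32 bits with mtc1 first, then the upper
  // 32 bits with mthc1, because mtc1 leaves the upper half unpredictable.
  if (T.HasMips32r2)
    return {"mtc1", "mthc1"};
  // Earlier revisions: two mtc1s, one per 32-bit register of the pair.
  return {"mtc1", "mtc1"};
}
```

Testing hasMips64() before hasMips32r2() matters because a MIPS64 subtarget also satisfies the MIPS32r2 check in the usual feature hierarchy, which is an assumption of this sketch.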
I missed something the first time around reviewing this. The usage of .lit4 & .lit8 is conditional on the usage of the sdata section. For now, you can remove the .litX handling and just use .rodata in all cases. There's a longer explanation inline.
Thanks,
Simon
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
2726–2727 | This is almost correct. GAS will assemble "li.d $f31, 2.5" into "lui $at, 0x4004; mtc $at, $f0; mtc $zero, $f31". You'll need to expand out the check or check for that specific case. | |
2761 | Stray space at the end of the line. | |
2833–2839 | I missed this the first time around. The usage of .lit4 & .lit8 is permitted when: a) the small data section is in use, and b) the size of the constant is within the size threshold for the small data section. If the small data section cannot be used, the constant is located within the .rodata section. For the moment, just change this to always use the .rodata section, and add "FIXME: Enhance this expansion to use the .lit4 & .lit8 sections where appropriate." This avoids having to modify MipsTargetObjectFile.cpp with arguably unrelated changes. | |
2958–2966 | See my comment about the .lit4 section. | |
test/MC/Mips/macro-li.d.s | ||
306 | Please put a newline at the end of the file. |
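For when the FIXME is eventually addressed, the section choice described above amounts to something like the following standalone sketch. literalSection() and its parameters are hypothetical, and "within the size threshold" is assumed to mean less than or equal to it.

```cpp
#include <string>

// Hypothetical helper: pick the section for an li.s (4-byte) or li.d
// (8-byte) constant.
std::string literalSection(bool SmallDataInUse, bool IsDouble,
                           unsigned SmallDataThreshold) {
  unsigned Size = IsDouble ? 8u : 4u;
  // .lit4/.lit8 are only permitted when the small data section is in use
  // and the constant fits within its size threshold.
  if (SmallDataInUse && Size <= SmallDataThreshold)
    return IsDouble ? ".lit8" : ".lit4";
  // Otherwise the constant lives in .rodata.
  return ".rodata";
}
```

Always returning ".rodata" is the conservative subset of this logic, which is what the review asks for at this stage.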
A few small things. Mostly the changes necessary are along the lines of differentiating between PIC and non-PIC. I've written them inline. The other minor change is that you should remove the stub implementation of the literal section and submit that afterwards. It will be easier to implement/review then.
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
2809–2815 | Rather than going through void pointers, use BitsToDouble & co. from llvm/Support/MathExtras.h. | |
2850–2867 | This section isn't quite right. In the PIC case, we have to load the address of the constant via the GOT, with a R_MIPS_GOT16 relocation attached to an lw. Then load the constant with ldc with a relocation type R_MIPS_LO16 attached to it. In the non-pic case, it's lui with a R_MIPS_HI16 relocation, then a ldc1 with a R_MIPS_LO16. | |
2881–2886 | This is the expected behaviour for loading a 64bit value in 32bit GPRs. For 64 bit GPRs we have to load it into a single register. Test the ABI to determine if we should load into one or two registers. | |
2910–2936 | This hunk needs to be re-arranged slightly. The first step is to generate and emit the upper portion of the constant's address for non-PIC, or load the address of the constant from the GOT. Both cases set up $at. The second step is to build an expression that refers to the base address of the constant. Then finally, if the ABI is N32 or N64, emit ld; otherwise, since it's O32, emit two loads. | |
2950 | This should be if (isABI_N32() || isABI_N64()) as the instruction expansion is dependent on the ABI. | |
2980–2996 | This section isn't quite right. In the PIC case, we have to load the address of the constant via the GOT, with a R_MIPS_GOT16 relocation attached to an ld. Then load the constant with ldc with a relocation type R_MIPS_LO16 attached to it. In the non-pic case, it's lui with a R_MIPS_HI16 relocation, then a ldc1 with a R_MIPS_LO16. | |
test/MC/Mips/macro-li.s | ||
1 ↗ | (On Diff #88364) | This file can be kept, it's testing a different macro. |
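The PIC and non-PIC sequences for loading a double into an FPR, as described in the 2850–2867 comment, can be written down as data. This is a standalone sketch with hypothetical names (Step, planConstantLoad); it covers only the 32-bit address case, and it uses ldc1 for the final load in both variants.

```cpp
#include <string>
#include <vector>

// One emitted instruction and the relocation attached to it (hypothetical).
struct Step {
  std::string Mnemonic;
  std::string Reloc;
};

// Instruction/relocation pairs for loading a double constant from memory
// into an FPR, with $at holding the address set up by the first step.
std::vector<Step> planConstantLoad(bool IsPIC) {
  if (IsPIC)
    // PIC: fetch the constant's address from the GOT, then load through it.
    return {{"lw", "R_MIPS_GOT16"}, {"ldc1", "R_MIPS_LO16"}};
  // Non-PIC: materialise the upper half of the address, then load with
  // the lower half as the offset.
  return {{"lui", "R_MIPS_HI16"}, {"ldc1", "R_MIPS_LO16"}};
}
```

Both variants end in the same ldc1 with a R_MIPS_LO16 relocation; only the way $at is set up differs.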
Made additional ABI checks when loading floats to FPRs, doubles to GPRs and doubles to FPRs.
We've mostly compared it to GCC, the caveat being that GCC generates floats using lui/ori in GPRs and then moves them to FPRs with mtc1 if necessary,
whereas we always place constants into .rodata, and have relocations accordingly.
Another difference is that despite invoking GCC without -fpic flag for n64, it generates code that has R_MIPS_GOT16 relocations.
Implemented that part to use %higher, %highest relocations to fetch the address.
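For reference, a standalone sketch of how a 64-bit address splits into the %highest, %higher, %hi and %lo pieces and how a lui/daddiu/dsll style sequence rebuilds it. The struct and function names are hypothetical; the rounding constants compensate for each 16-bit piece being sign-extended when it is added.

```cpp
#include <cstdint>

// The four 16-bit pieces of a 64-bit address (hypothetical container).
struct AddrParts {
  uint16_t Highest, Higher, Hi, Lo;
};

// Split an address the way %highest/%higher/%hi/%lo do. Each rounding
// constant adds 0x8000 at every lower 16-bit boundary so that the later
// sign-extended additions cancel out.
AddrParts splitAddress(uint64_t A) {
  return {static_cast<uint16_t>((A + 0x800080008000ULL) >> 48),
          static_cast<uint16_t>((A + 0x80008000ULL) >> 32),
          static_cast<uint16_t>((A + 0x8000ULL) >> 16),
          static_cast<uint16_t>(A)};
}

// Sign-extend a 16-bit immediate to 64 bits, as the hardware does.
uint64_t signExtend16(uint16_t V) {
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int16_t>(V)));
}

// Rebuild the address in the order an instruction sequence would.
uint64_t rebuildAddress(const AddrParts &P) {
  uint64_t V = signExtend16(P.Highest) << 16; // lui with %highest
  V += signExtend16(P.Higher);                // daddiu with %higher
  V <<= 16;                                   // dsll by 16
  V += signExtend16(P.Hi);                    // daddiu with %hi
  V <<= 16;                                   // dsll by 16
  V += signExtend16(P.Lo);                    // %lo as the load offset
  return V;
}
```

All arithmetic is done on uint64_t, so wrap-around is well defined and the round trip holds for any address.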
This needs a bit more work. The problem is that the address for the constant loads is the address of the .rodata section, not the address of the constant in the .rodata section. I've spelled out the changes required inline.
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
2781 | Can you make this a private class member of MipsAsmParser instead? It avoids having to explicitly pass STI, ATReg, IsPicEnabled. | |
2882–2896 | See my comment about generating the addresses correctly. | |
2930–2942 | See my comment about generating addresses correctly. | |
2982–2994 | These two hunks need to be reversed and modified slightly. The problem is that you're generating the symbol with the name of the ReadOnlySection. The linker will resolve this to the start of the ReadOnlySection and rewrite the relocs to that value. Instead, you need to create a symbol visible to this object, switch to the ReadOnlySection, then emit the symbol and data. | |
test/MC/Mips/macro-li.d.s | ||
2 | Small change here for both files. Can you use the label O32-N32-(NO-)PIC as required? It makes it more obvious that the check lines are for O32 and N32. |
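To see why the section name is the wrong symbol to relocate against, here is a toy standalone model of a section (the Section type and its methods are hypothetical and unrelated to the MC layer's classes). A label emitted immediately before the data records the constant's own offset, whereas the section's start is always offset 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy read-only section: raw bytes plus the offsets of any labels.
struct Section {
  std::vector<uint8_t> Data;
  std::map<std::string, std::size_t> Labels;

  // Bind a label to the current end of the section.
  void emitLabel(const std::string &Name) { Labels[Name] = Data.size(); }

  // Append a 64-bit value, least significant byte first (toy byte order).
  void emitInt64(uint64_t V) {
    for (int I = 0; I < 8; ++I)
      Data.push_back(static_cast<uint8_t>(V >> (8 * I)));
  }
};
```

If an earlier constant already occupies the first eight bytes, a second constant labelled and emitted afterwards sits at offset 8; a relocation against the section start would load the earlier constant instead.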
LGTM with the fixmes added and the whitespace nit addressed.
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
2806 | Sorry, I missed something here. For O32 and N32 all addresses are 32 bits. For N64 addresses are 64 bits unless -msym32 is used, in which case symbols are 32 bits. Unfortunately, I've found something questionable in the GNU assembler, which is that li.d assumes that the address of the temporary symbol used to load the constant is always 32 bits in the non-pic case. Can you attach two fixmes here? The first is that our assembler is technically correct but gives a different result to gas, but gas is incomplete there (it has a fixme noting it doesn't work with 64-bit addresses). The second is that with -msym32 the address expansion for N64 should probably use the O32 / N32 case. It's safe to use the 64-bit address expansion as the symbol's value is considered sign extended. | |
2943–2944 | FIXME: This method is too general. In principle we should compute the number of instructions required to synthesize the immediate inline compared to synthesizing the address inline and relying on non .text sections. For static O32 and N32 this may yield a small benefit, for static N64 this is likely to yield a much larger benefit as we have to synthesize a 64-bit address to load a 64-bit value. | |
test/MC/Mips/macro-li.d.s | ||
357 | Spurious whitespace at the end of this line. |
I haven't looked too closely at the output of the macro since I don't know what it should be but it looks like the issues I raised earlier in the review are no longer there. I just have one nit about nextReg().
lib/Target/Mips/AsmParser/MipsAsmParser.cpp | ||
---|---|---|
2725 | I'd recommend a comment mentioning the D0 + 1 == F1 and F1 + 1 == D1 quirks. It makes sense when you're thinking of it in the context of where it's being used but F1 + 1 == D1 in particular isn't very obvious without that context. It would be reasonable for a reader to expect F1 + 1 == F2. |