Allow instructions such as 'cmp w0, #(end - start)' by folding the
expression into a constant. For ELF, we fold only if the symbols are in
the same section. For MachO, we fold if the expression contains only
symbols that are not linker visible.
Details
Diff Detail
Event Timeline
This is relaxing the error-checking in the asm parser quite significantly, so please look for any edge cases that I might have missed in the tests.
Also, I'm not sure this is doing the right thing for Darwin, any guidance there is more than welcome.
Thanks.
@t.p.northover should have a look at the Darwin relocation, but the rest looks fine for me.
Nice set of tests, thanks!
| lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp | ||
|---|---|---|
| 3601 | Maybe hacky, but can't you just count Loc[] and return 2 | 3 depending on the size? | |
| lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp | ||
|---|---|---|
| 3601 | Right, I'll try to fix that. | |
I think the MachO logic is pretty problematic:
- Because of .subsections_via_symbols a relocation would usually be needed for 2 assembler-local symbols if there's a linker-visible one between them.
- I think ARM64_RELOC_UNSIGNED is supposed to set all bits of the data it covers, not parse and interpret the instruction it's in the middle of.
I don't think there's an appropriate MachO relocation for this either, so we probably shouldn't try.
How does the ELF logic deal with attempts to perform subtractions across sections (the equivalent of the first MachO problem)?
How does the ELF logic deal with attempts to perform subtractions across sections (the equivalent of the first MachO problem)?
Ah, I see there's logic in ELFObjectWriter.cpp and it really is checking the section, so MachO probably can't benefit.
We probably want to diagnose attempts in MachO where MCAssembler::getAtom is different (or null) for each symbol. With that restriction, we can even cope if one of them is the defining (visible) symbol of the atom.
I agree with Renato that the ELF side looks good.
Some suggestions the tests:
- All of the instructions are either add or cmp and all use the w registers. It may be worth varying these a bit more to cover the other cases within the switch(Inst.getOpcode())
- The labels used in the instructions look to be backwards references to labels already defined. It would be good to have some tests that use forward references to labels that are defined later in the file.
- All the Darwin tests are of the form (label1 - label2 + constant), are there other forms that can be expressed?
I note that the label expression must evaluate to a positive value. For example #(end - start) is ok, but #(start - end) or #-(end-start) is not. I don't think that there is much that you can do about this in the general case as IIRC the value of the expression is evaluated after the fragments are fixed and it is too late to easily change an ADD into a SUB or vice-versa.
To the best of my knowledge the Mach-O side looks ok, but I agree with Renato that someone more familar with Mach-O should take a look.
Those are very good suggestions, thanks.
- All the Darwin tests are of the form (label1 - label2 + constant), are there other forms that can be expressed?
I'm actually not proud of the Darwin tests, but I wanted to get an initial round of feedback before diving too deep. From what I've played with so far, we can't express (label1 - label2) >> constant (at least not out of the box; I don't know how we'd add support for that). Also, since we're generating relocations, we can't diagnose a lot of cases where we error out for ELF - I'm not sure what to do about these, at the moment I'm just hoping the linker will error out in a helpful way, or maybe magically handle them by moving atoms around.
I note that the label expression must evaluate to a positive value. For example #(end - start) is ok, but #(start - end) or #-(end-start) is not. I don't think that there is much that you can do about this in the general case as IIRC the value of the expression is evaluated after the fragments are fixed and it is too late to easily change an ADD into a SUB or vice-versa.
I should at least add a test-case diagnosing this though.
Thanks for the feedback.
Fix Darwin to allow label arithmetic only with assembler-private labels and update the Darwin tests accordingly. Also update the ELF tests as per the review comments.
| lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp | ||
|---|---|---|
| 735 | I think this is better written as: if (auto *CE = dyn_cast<MCConstantExpr>(Expr)) return CE->getValue() >= 0 && CE->getValue() <= 0xfff; return true; | |
- Address Saleem's comment (thanks!)
- Get Darwin to error out on invalid values obtained from label arithmetic, i.e. negative values or values that are too large to fit. I did this by overriding the processFixupValue method in DarwinAArch64AsmBackend. It now calls adjustFixupValue with a legitimate MCContext, so it can dump errors (just like the ELF backend does).
I think this is better written as: