Try to resolve the difference of two symbols in consecutive MCDataFragments.
This is important for an idiom like "foo:instr; .if . - foo; instr; .endif"
(https://bugs.llvm.org/show_bug.cgi?id=43795).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 40029 Build 40104: arc lint + arc unit
Event Timeline
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
517 | Move these closer to their use below. It is nice to have them all declared together, but it would be nicer to bail as early as possible without doing more work than necessary. | |
564 | This patch adds checks for thumb, but the test case doesn't use a thumb target triple. Consider adding tests that exercise the newly added code. | |
569 | MicroMips? | |
576 | It looks like this block was copied from L529-L548. Is it possible to adjust the condition on L529 and addend calculation on L532 to support this case, rather than duplicating so much code? Looks like this repeats again below. Is it possible to refactor some of this, maybe into its own method, for better code reuse? | |
llvm/test/MC/AsmParser/directive_if_with_dot_symbol.s | ||
1 ↗ | (On Diff #226704) | Is it possible to add tests for other ISA's, too? |
6 ↗ | (On Diff #226704) | We're checking that an error occurs? Wouldn't a test for this new feature test that an error does not occur? Or am I misunderstanding the test? |
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
564 | It's actually checking if SA is flagged with .thumb_func, although adding .thumb_func right before label 9997 does not make the check become true? | |
569 | It seems micromips has been used through out the file, will keep it for consistency. | |
llvm/test/MC/AsmParser/directive_if_with_dot_symbol.s | ||
1 ↗ | (On Diff #226704) | I have not figured out an example with other ISA. The reason is MCFragment is transparent to assembly so I do not know an easy way to force two adjacent labels to be assigned to two MCFragment in the assembly file. For the same reason, the ISA extension right after this line is needed for the integrated assembler to place 9997 and the .if directive into two different fragments in the same section. |
6 ↗ | (On Diff #226704) | Yes, you are correct. I think I uploaded an incomplete version. Updated it. |
I'd like to see some more test cases for the error cases. Off the top of my head I can think of these cases:
// Align fragment in between MCDataFragments 9997: nop ; .align 4 nop .if . - 9997b == 4 ;
// Fill (I think) fragment in between MCDataFragments 9997: nop ; .space 4 nop .if . - 9997b == 4 ;
// Relaxable (in Thumb) fragment in between MCDataFragments 9997: nop ; b external nop .if . - 9997b == 4 ;
// Constant pool between MCDataFragments, 9997: ldr r0,=0x12345678 ; .ltorg nop .if . - 9997b == 4 ;
There is also something called a MCCompactEncodingInstFragment that would cause this to fail, but I think that only exists as part of NaCl bundle locking, which I think isn't too much of a concern.
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
556–578 | I think it would be useful to add a comment explaining this line and what kind of expressions it is expected to support. It isn't intuitive given | |
llvm/test/MC/AsmParser/directive_if_offset.s | ||
1 ↗ | (On Diff #226839) | Be careful with module dependencies and targets.
if not 'X86' in config.root.targets: config.unsupported = True Which is backed up by all of the tests in the directory using X86 backend triples. I think that this test if it remains in this location will need to be rewritten in x86. If the test needs Arm I think it ought to go in the MC/ARM directory. |
1 ↗ | (On Diff #226839) | The test fails when there is asm output from llvm-mc. llvm-mc -filetype=obj -triple=armv7a-linux-gnueabihf directive_if_offset.s -o /dev/null echo $? 0 However when using the assembler output, this fails: llvm-mc --triple armv7a-linux-gnueabihf directive_if_offset.s -filetype=asm -o /dev/null directive_if_offset.s:7:5: error: expected absolute expression .if . - 9997b == 4 ; It may be that there is no way to resolve this in the asm output as the data structures might not exist. It will be worth a check to see though. I think that differences between the asm and obj outputs of llvm-mv are frowned on, but I don't think that they are necessarily fatal if coming from a assembler file and not something generated by clang. This might be a problem if the .if exists in inline assembler, not that I'd recommend anyone do that. |
5 ↗ | (On Diff #226839) | Is this line significant for the test? If it isn't then it is worth taking it out. |
8 ↗ | (On Diff #226839) | Unless the location is particular significant I'd recommend just "error: expected absolute expression". This will make it a bit less sensitive if the change is added to. |
10 ↗ | (On Diff #226839) | You could use llvm-objdump -d on the ELF output to test that the correct instruction had been generated. Normally we'd use llvm-mc -filetype=asm and use FileCheck on that but that isn't working in this case. |
When I run check-llvm with this patch applied I get 7 failures:
LLVM :: DebugInfo/Mips/delay-slot.ll LLVM :: DebugInfo/Mips/dsr-fixed-objects.ll LLVM :: DebugInfo/Mips/dsr-non-fixed-objects.ll LLVM :: DebugInfo/Mips/fn-call-line.ll LLVM :: MC/AsmParser/directive_fill_2.s LLVM :: MC/MachO/reloc-diff.s LLVM :: MC/X86/expand-var.s
This is applied to master revision 12c9ffd108345f643df98dfa8653af1a4311ed86 and the tests don't fail with master. Is it possible your most recent changes have broken something? I'm testing a release build of LLVM with clang, assertions enabled.
There are also some test cases for other ISA's in https://bugs.llvm.org/show_bug.cgi?id=41825.
Yes, I am aware of failure on LLVM :: MC/MachO/reloc-diff.s and working on it, but my build does not have assertions on so I haven't seen the other failures. Thanks for verifying, I will take a look on them.
Thanks for providing these test cases! It seems the third and fourth case do not produce any errors as expected (even with arm gcc). Will dig more.
The latest patch should address these issues. It seems other than .if conditions, llvm::MCExpr::evaluateAsAbsolute is allowed to not immediately resolve the difference of two symbols in adjacent fragments, such as ".long _external_def - _local_def" in llvm/test/MC/MachO/reloc-diff.s. They will be resolved later when finalizing the layout of the object file. The newer version of the code essentially passes an additional flag to tell the function whether the substraction to evaluate is an if condition.
The cause of 41825 is different from the one this patch tries to solve. But their solution looks similar enough so I have added code to handle those as well.
llvm/test/MC/AsmParser/directive_if_offset.s | ||
---|---|---|
1 ↗ | (On Diff #226839) | I have moved the tests to MC/ARM directory. |
10 ↗ | (On Diff #226839) | I did verify myself with -filetype=obj and llvm-objdump -disassemble, although I can't seem to trigger if (Asm->isThumbFunc(&SA)) Addend |= 1; and therefore cannot the else branch here. Any thoughts on this? Thanks. |
I think that this needs some wider review for a couple of reasons. It looks like this is heading towards a special case evaluation just for .if. I'm not comfortable going too much further in the approval process as this is exceeding my knowledge of MC, and I'd like see some supporting opinions on whether this is the right thing to do. Will be worth adding some more reviewers, particularly someone familiar with Mach-O. You may need an RFC to the llvm-dev mailing list to draw some attention.
I'd also like to see if there is a way of implementing this without special casing .if. That is a pragmatic solution for a specific problem, but it does make the code harder to understand so it has a cost. Looking at the reloc-diff.s test case the commit log says:
MC/Mach-O: Use the SECTDIFF relocation type for (A - B + constant) where A is external. - I'm not sure why, but this is what 'as' does.
I've not been able to easily find a definition of what SECTDIFF is, but it appears to be a relocation that supports subtraction, I don't think that there is an equivalent in ELF. It maybe that the case can be resolved by not doing the folding for MachO.
some other specific comments.
- Some of the examples I gave would only fail in Thumb as Arm instructions are all the same size, whereas some Thumb-2 instructions are assembled initially as the 2-byte narrow form, and are relaxed to the wider 4-byte form if the instruction is out of range. You'll need something like -triple=thumbv7a-linux-gnueabihf to see the errors. I'd not expect some of these to fail on GNU as (that is a 2-pass assembler).
- If you can make an X86_64 test I recommend doing so as few contributors will have this optionally removed. Substantially more have ARM optionally removed, as do several buildbots.
llvm/include/llvm/MC/MCExpr.h | ||
---|---|---|
60 ↗ | (On Diff #227553) | There is a comment in evaluateAsAbsolute // Setting InSet causes us to absolutize differences across sections and that // is what the MachO writer uses Addrs for. I think it would be useful to have something similar for IsCond. |
llvm/lib/MC/MCExpr.cpp | ||
491 | When there is more than one boolean parameter it can get difficult to track which is which, can you annotate the call sites with literal values with a comment? I've left a comment on the ones that I've noticed. /* InSet */ true, /* IsCond */ false. | |
499 | /* IsCond */ false. | |
582 | If I've understood the code correctly I think we could be a bit more specific here: // When there is no layout our ability to resolve differences between symbols is // limited. In specific cases where the symbols are both defined in consecutive // MCDataFragments the difference can be calculated. This is important for an // idiom like foo:instr; .if . - foo; instr; .endif // We cannot handle any kind of case where the difference may change due to // layout. | |
584 | It is difficult to see how it would be possible to resolve .if conditions at layout time in a single pass assembler. In theory the assembler could evaluate all conditional blocks and select between them at layout time, if such a layout could be converged on. | |
699 | /* InSet */ false, /* IsCond */ false. | |
705 | /* InSet */ true, /* IsCond */ false. | |
746 | /* IsCond */ false. | |
746 | We have IsCond passed in as a parameter to evaluateAsRelocatableImpl, so even if it is passed in true, we set it to false here? Is it important for the value to be false here? If so then it implies that IsCond might not be specific enough a name. If it just doesn't matter then can we pass in IsCond here? | |
776 | /* IsCond */ false. |
Address some concerns.
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
556–578 | This is inherited from the original code, except it was before the check for Layout originally. I am not exactly sure why this check is needed in the first place. In my opinion this is redundant with the check for Layout, as we can always calculate the differences of two symbols based on the order of the sections (MCAsmLayout::getSectionOrder ()) they are in and their offset within its own section respectively if the layout is provided., but I do need this check anyway for our case. Any thoughts on why this check is necessary? | |
584 | Thanks for the clarification, although I am not sure I follow. The code looks iterative to me https://llvm.org/doxygen/MCAssembler_8cpp_source.html#l00785. I was thinking as the loop iterates and calls layoutOnce, we can relax (if needed) and calculate the sizes of the fragments before a .if statement and resolve the condition there. But I am not completely convinced by myself that it is doable. Also there some cases like the one below will create more complexity. foo: jump to bar This creates a loop of dependency as depending on the instruction selected in the if-else block, the size of the jump instruction may change due to the number of bits it need to specify the offset, which in turn affects which instruction should be chosen. | |
746 | Yes, thanks for the catching this. IsCond is never used here so I simply replaced it false. But passing in IsCond is a better choice. |
I have sent an RFC and am waiting for comments.
I'd also like to see if there is a way of implementing this without special casing .if. That is a pragmatic solution for a specific problem, but it does make the code harder to understand so it has a cost. Looking at the reloc-diff.s test case the commit log says:
MC/Mach-O: Use the SECTDIFF relocation type for (A - B + constant) where A is external. - I'm not sure why, but this is what 'as' does.I've not been able to easily find a definition of what SECTDIFF is, but it appears to be a relocation that supports subtraction, I don't think that there is an equivalent in ELF. It maybe that the case can be resolved by not doing the folding for MachO.
some other specific comments.
- Some of the examples I gave would only fail in Thumb as Arm instructions are all the same size, whereas some Thumb-2 instructions are assembled initially as the 2-byte narrow form, and are relaxed to the wider 4-byte form if the instruction is out of range. You'll need something like -triple=thumbv7a-linux-gnueabihf to see the errors. I'd not expect some of these to fail on GNU as (that is a 2-pass assembler).
Thanks for the clarification. I have included all the four test cases.
- If you can make an X86_64 test I recommend doing so as few contributors will have this optionally removed. Substantially more have ARM optionally removed, as do several buildbots.
Yes, I agree. Will try to make test cases for x64.
Thanks for the update. I'll wait a few days to see what comments we get. If there are none then I guess there aren't any strong objections.
To clarify the remarks about resolving .if at layout time. I think that there are two major obstacles.
- MC assembles instructions once, if the condition in the .if is not satisfied the contents of the block aren't even parsed. For example, the following will assemble if the .if condition fails, but will fail to parse if the .if condition passes. We'd need to change MC to either reparse (effectively making it a multipass assembler), or to parse and remember all parts. This is doable but it isn't a small change.
.text .if 0 You aren't parsing me are you? .else nop .endif
- The second problem, and not one unique to llvm-mc, is that it is possible to write a program that doesn't converge, something like (not testable as it needs relaxation and late evaluation of .if):
label: beq after // In thumb 2 this 2-byte branch will be relaxed to a 4-byte branch if out of range. .if . - label == 2 .space 1024 * 1024 // sufficient to make after out of range .endif after: nop
In pass 1, beq is 2-bytes in size, the .if passes, beq is then relaxed to 4-bytes, which would make the .if fail, which then makes beq 2-bytes ...
The interaction with relaxation could be fixed by making all size increases permanent. However I think it is possible to write multiple .if conditions that conflict. In Arm's old 2 - pass assembler (pass -1 find all sizes so layout is known, pass-2 encode instructions knowing layout), we had an error message when the equivalent of .if evaluated differently in subsequent passes. In summary, I think it would be a major change to MC to support something like late evaluation of .if in a reliable way.
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
584 | See also b/132538429.
|
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
482 | You can move this declaration down closer to where it is used. No need to construct if the below conditional returns early. |
Thanks for all the clarification and examples. I was wondering if there would be an easier way for late evaluate of .if condition so we could avoid the current implementation, which I totally agree is not straightforward and somehow ad-hoc. But I guess a major haul to MC seems to be the only alternative based on what you and Nick said, which is probably an overkill for the particular issue this patch tries to solve. However, this patch will not solve the second case you brought up, so how likely do you think we would encounter such cases, and should we consider the multiple-pass solution to future-proof these cases if they happen frequently enough, or maybe we could rewrite the assembly instead to avoid such complexity?
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
584 | Yeah, thanks for the explanation. I wonder if GAS would be able to support all these paradoxes. |
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
591 | Thanks for the clarification. I have adjusted my code accordingly. |
llvm/test/MC/AsmParser/directive_if_offset.s | ||
---|---|---|
5 ↗ | (On Diff #226839) | This line is required to reproduce this issue, as it changes the subtarget and forces the creation of a new MCDataSegment (https://llvm.org/doxygen/MCObjectStreamer_8cpp_source.html#l00157). I found a similar directive for i386, .arch + cpu_type (https://sourceware.org/binutils/docs/as/i386_002dArch.html#i386_002dArch), although the parser does not support it. To have a similar test case on x86, it seems we would have to first support that directive. However, I could not find any uses of this directive in llvm or Linux kernel, so I assume it is not commonly used. Will keep the test case for arm for now. |
To summarise where I think we are right now.
- D76002 fixes a problem affecting within MCDataFragment eager evaluation of .if expressions. Specifically when there is a label that would be inserted into the MCDataFragment, but at the time of encountering the label the fragment hadn't been created. MC will attempt to reuse an MCDataFragment for new instructions, see CanReuseDataFragment() in MCObjectStreamer.cpp. As noted earlier when the Subtarget changes a new MCDataFragment is created. In the majority of cases this is done via a .arch or .arch_extension directive.
- This patch extends the eager evaluation to cope with two adjacent MCDataFragments. My understanding is that this only occurs in the following circumstances:
- When bundle locking and --mc-relax-all is used, there is a complicated 1 instruction per fragment + fragment merging that goes on here. This is only used in NaCl which I'm not sure what the status of in Chrome is. I think it is at best deprecated.
- When there is a Subtarget change between MCDataFragments.
- In all other cases such as .align, .space and a relaxable instruction there is a separate non MCDataFragment created so we cannot fix these up.
- The patch restricts the eager evaluation to .if as some asm backends do not want the expressions between fragments evaluated eagerly in some cases.
From the perspective of the linux kernel, is D76002 sufficient? For example if the Subtarget changing directives are used in such a way that they don't create new MCDataFragments in a sensitive location then we may not need this. For example the following will assemble with D76002
.arch_extension sec // Outside of .text .text 9997: nop .if . - 9997b == 0 // CHECK-NOT: error: expected absolute expression orr r1, r1, #1 .else orr r1, r1, #2 .endif
As will:
.text .arch_extension sec nop 9997: nop .if . - 9997b == 0 // CHECK-NOT: error: expected absolute expression orr r1, r1, #1 .else orr r1, r1, #2 .endif
Could the examples in the kernel be altered to not require this? It does seem like we are writing a lot of code for a small number of easily resolved cases.
I note that it is possible to write a contrived example that this patch can't handle (needs --triple=armv8a-linux-gnu as crypto and crc are V8 extensions) although I'm not suggesting implementing support for it.
.text nop 9997: .arch_extension crypto nop .arch_extension crc nop .if . - 9997b == 0 // CHECK-NOT: error: expected absolute expression orr r1, r1, #1 .else orr r1, r1, #2 .endif .text nop 9997: .arch_extension crypto nop .arch_extension crc nop .if . - 9997b == 0 // CHECK-NOT: error: expected absolute expression orr r1, r1, #1 .else orr r1, r1, #2 .endif
if.s:11:5: error: expected absolute expression
.if . - 9997b == 0
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
585 | With the new information coming from D70062 I think we need to make this comment more specific. For example: When the Subtarget is changed a new MCDataFragment is created. This handles the case of foo: instr; .arch_extension ext; instr .if . - foo | |
llvm/test/MC/ARM/directive_if_offset.s | ||
5 ↗ | (On Diff #228751) | Can we add a comment to explain the importance of .arch_extension, such as: |
Thanks for the summary.
I'll follow up on this. There's likely a lot of code that can be dropped if that's the case.
- When there is a Subtarget change between MCDataFragments.
- In all other cases such as .align, .space and a relaxable instruction there is a separate non MCDataFragment created so we cannot fix these up.
- The patch restricts the eager evaluation to .if as some asm backends do not want the expressions between fragments evaluated eagerly in some cases.
From the perspective of the linux kernel, is D76002 sufficient?
I can still reproduce https://github.com/ClangBuiltLinux/linux/issues/742 with https://reviews.llvm.org/D70062 applied.
For example if the Subtarget changing directives are used in such a way that they don't create new MCDataFragments in a sensitive location then we may not need this. For example the following will assemble with D76002
.arch_extension sec // Outside of .text .text 9997: nop .if . - 9997b == 0 // CHECK-NOT: error: expected absolute expression orr r1, r1, #1 .else orr r1, r1, #2 .endifAs will:
.text .arch_extension sec nop 9997: nop .if . - 9997b == 0 // CHECK-NOT: error: expected absolute expression orr r1, r1, #1 .else orr r1, r1, #2 .endifCould the examples in the kernel be altered to not require this? It does seem like we are writing a lot of code for a small number of easily resolved cases.
The case from the kernel is testing that a single instruction is 2B rather than 4B (narrow vs wide). See https://github.com/ClangBuiltLinux/linux/blob/de620fb99ef2bd52b2c5bc52656e89dcfc0e223a/arch/arm/include/asm/assembler.h#L255-L270.
I don't see any subarch directives there.
I note that it is possible to write a contrived example that this patch can't handle (needs --triple=armv8a-linux-gnu as crypto and crc are V8 extensions) although I'm not suggesting implementing support for it.
.text nop 9997: .arch_extension crypto nop .arch_extension crc nop .if . - 9997b == 0 // CHECK-NOT: error: expected absolute expression orr r1, r1, #1 .else orr r1, r1, #2 .endif .text nop 9997: .arch_extension crypto nop .arch_extension crc nop .if . - 9997b == 0 // CHECK-NOT: error: expected absolute expression orr r1, r1, #1 .else orr r1, r1, #2 .endifif.s:11:5: error: expected absolute expression
.if . - 9997b == 0
If we need to support cases like this, we can probably add a while loop and check if the fragments between two symbols are all MCDataFragments and sum their sizes up.
The case from the kernel is testing that a single instruction is 2B rather than 4B (narrow vs wide). See https://github.com/ClangBuiltLinux/linux/blob/de620fb99ef2bd52b2c5bc52656e89dcfc0e223a/arch/arm/include/asm/assembler.h#L255-L270.
I don't see any subarch directives there.
This issue happens when the ALT_UP macro is expanded in files like arch/arm/mm/proc-v7.S. This is how I reproduce the code of interest
clang -I./arch/arm/include -I./arch/arm/include/generated -I./include -I./arch/arm/include/uapi -I./arch/arm/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -DASSEMBLY --target=arm-linux-gnueabihf -DLINUX_ARM_ARCH=7 -march=armv7-a -include asm/unified.h -S arch/arm/mm/proc-v7.S &> clang-proc-v7.s
That's unfortunate. I've found out how to reproduce the original issue:
clang -I./arch/arm/include -I./arch/arm/include/generated -I./include -I./arch/arm/include/uapi -I./arch/arm/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__ASSEMBLY__ --target=arm-linux-gnueabihf -D__LINUX_ARM_ARCH__=7 -march=armv7-a -include asm/unified.h -c -o /tmp/proc-v7.o arch/arm/mm/proc-v7.Sclang -I./arch/arm/include -I./arch/arm/include/generated -I./include -I./arch/arm/include/uapi -I./arch/arm/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__ASSEMBLY__ --target=arm-linux-gnueabihf -D__LINUX_ARM_ARCH__=7 -march=armv7-a -include asm/unified.h -c -o /tmp/proc-v7.o arch/arm/mm/proc-v7.S
If I extract out just the small part of the test case it works even without 76002.
The case from the kernel is testing that a single instruction is 2B rather than 4B (narrow vs wide). See https://github.com/ClangBuiltLinux/linux/blob/de620fb99ef2bd52b2c5bc52656e89dcfc0e223a/arch/arm/include/asm/assembler.h#L255-L270.
I don't see any subarch directives there.
They are there, I preprocessed to get a rather large output file including:
.globl cpu_v7_dcache_clean_area ; .align 0 ; cpu_v7_dcache_clean_area: 9998: nop @ MP extensions imply L1 PTW .equ up_b_offset, 1f - 9998b ; .pushsection ".alt.smp.init", "a" ; .long 9998b ; b . + up_b_offset ; .popsection ret lr 1: dcache_line_size r2, r3 2: mcr p15, 0, r0, c7, c10, 1 @ clean D entry add r0, r0, r2 subs r1, r1, r2 bhi 2b dsb ishst ret lr .type cpu_v7_dcache_clean_area, %function; .size cpu_v7_dcache_clean_area, .-cpu_v7_dcache_clean_area .arch_extension sec .globl cpu_v7_smc_switch_mm ; .align 0 ; cpu_v7_smc_switch_mm: stmfd sp!, {r0 - r3} movw r0, #:lower16:(((1) << 31) | ((0) << 30) | (((0) & 0x3F) << 24) | ((0x8000) & 0xFFFF)) movt r0, #:upper16:(((1) << 31) | ((0) << 30) | (((0) & 0x3F) << 24) | ((0x8000) & 0xFFFF)) smc #0 ldmfd sp!, {r0 - r3} b cpu_v7_switch_mm .type cpu_v7_smc_switch_mm, %function; .size cpu_v7_smc_switch_mm, .-cpu_v7_smc_switch_mm ... ... 9998: orr r1, r1, #((0 << 0) | (1 << 6))|(1 << 1)|(1 << 5)|(1 << 3) .pushsection ".alt.smp.init", "a" ; .long 9998b ;9997: orr r1, r1, #((1 << 0) | (1 << 6))|(3 << 3) ; .if . - 9997b == 2 ; nop ; .endif ; .if . - 9997b != 4 ; .error "ALT_UP() content must assemble to exactly 4 bytes"; .endif ; .popsection
This confused me for a bit, I think something like the following is happening.
.pushsection ".alt.smp.init"
// some instructions with Subtarget X
.popsection
.arch_extension sec or anything to change the subtarget
more stuff
.pushsection ".alt.smp.init"
// we are returning to .alt.smp.init but our subtarget is now X +sec
next instruction will start a new fragment
.popsection
It is getting late over here so I need to go home. Would jcai19 or yourself be able to investigate to confirm, check further? We would need to start a new MCDataFragment within .alt.smp.init for the new SubTarget, but I'd expect it all to happen before the 9997: in the failing case.
Making this work only on ".if" is IMO a non-starter, at least without understanding why. So I looked into what broke with llvm/test/MC/MachO/reloc-diff.s.
Basically, with LLVM's current representation, assuming that consecutive fragments are never moved with respect to each-other is invalid. In both ELF and Mach-O. In ELF, we have the ability to use numbered subsections that can insert new fragments between existing fragments. In Mach-O, fragments can effectively be turned into subsections with the 'subsections_via_symbols' directive. Committing this patch as-is would both be ugly and wrong, since we'd allow computing nonsense offsets -- even if we only restrict it to ".if".
I think it's a mistake that it works like that, and I'm going to spend a little bit of time to see if it'll be easy to fix this representational mistake in LLVM (in a separate patch), so that we have fragment lists which _are_ guaranteed to be kept exactly in the order they appear.
For this review, I have two requests:
- Undo all the ".if"-only hacks. (understanding that it will cause some tests to fail, for now).
- Fix the code to support arbitrary numbers of fragments between the symbols, of kinds MCFillFragment and MCDataFragment (and maybe others if they are fixed size -- I haven't looked through all of the kinds). Probably best would be to introduce a helper function to calculate the delta between two arbitrary symbols, given an optional layout (and either return an answer or a failure indication). This should make AttemptToFoldSymbolOffsetDifference significantly more straightforward.
I believe these examples should work, and will after making the latter change:
1: .arch armv7a nop .arch armv4 nop .arch armv7a nop .if . - 1b != 0 .word 0x12345678 .endif 2: nop .zero 0x10000 nop .if . - 2b == 4 .word 0x12345678 .endif
Thanks for the clarification. It will be great if we can remove the restriction to ".if". I will making changes accordingly while the representation of subsections are being changed.
Support arbitrary numbers of fragments between the symbols of kind
MCDataFragment. Getting the size of MCFillFragments seems to be less
straightforward https://llvm.org/doxygen/MCAssembler_8cpp_source.html#l00287.
@jyknight while working on a different issue, I happened to take a look at the implementation of subsection (MCSection::getSubsectionInsertionPoint) and from what I understood, fragments in subsections were always placed after fragments not in any subsection., and fragments in the same subsection (or not in any subsection) were always inserted in order and separate from fragments in other subsections. Is it safe to assume what between two fragments either not in any subsections or in the same subsection will be remain unchanged once they are inserted to the fragment list of a section? If so, can we resolve differences of two labels in fragments with such restriction?
Limit the scope to fragments in the same subsection (fragments not in any subsections are placed in subsection 0). Fragments are inserted into each subsection in order and the difference of their offsets should be fixed once they are inserted. This is experimental as it breaks many tests. Will address the test failure if this idea is proven correct.
(1) Fixed a bug while rebasing the patch last time. The failed unit tests dropped to 2 after the fix.
(2) Updated the 2 failed unit tests.
Worth mentioning that this is needed by (this assembly file does not assemble with the integrated assembler as of today):
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- LLVM=1 LLVM_IAS=1 defconfig arch/arm/mm/proc-v7.o
llvm/include/llvm/MC/MCFragment.h | ||
---|---|---|
74 ↗ | (On Diff #286445) | Move below LayoutOrder. sizeof(MCDataFragment) will not change Alternatively, disable the folding for Mach-O because this is too subtle (if getSubsectionViaSymbols()) |
llvm/lib/MC/MCExpr.cpp | ||
586 | Consider swapping then and else branches to avoid ! | |
llvm/test/MC/ARM/directive_if_offset.s | ||
1 ↗ | (On Diff #286445) | In this directory file-name.s is more common. What about directive-if-sub.s? sub is more meaningful than offset. Add a llvm-mc -triple armv7a-linux-gnueabihf %s -o /dev/null 2>&1 test to show that -filetype=asm does not work (MCAssembler * is null so, but this is less of an issue) |
llvm/test/MC/ARM/directive_if_offset_error.s | ||
1 ↗ | (On Diff #286445) | See ELF/reloc-directive.s You can use --defsym=ERR=1 to merge the tests into directive-if-sub.s |
llvm/test/MC/ARM/thumb2_directive_if_offset_error.s | ||
1 ↗ | (On Diff #286445) | ditto |
llvm/include/llvm/MC/MCFragment.h | ||
---|---|---|
74 ↗ | (On Diff #286445) | Thanks for all the comments! Can you please elaborate this one? Are you suggesting to move SubsectionNumber right after LayoutOrder? |
llvm/test/MC/ARM/directive_if_offset_error.s | ||
1 ↗ | (On Diff #286445) | --defsym=ERR=1 does not seem to work if I move the code from this file into directive-if-subtraction.s, as the run commands still fail. Maybe I am missing something? Also directive-if-subtraction.s requires armv7a so this run command will fail once I make the move. |
llvm/test/MC/ARM/directive_if_offset_error.s | ||
---|---|---|
1 ↗ | (On Diff #286445) | See ELF/reloc-directive.s for an example. # RUN: not llvm-mc ....... | FileCheck %s --check-prefix=ERR normal assembly .ifdef ERR # ERR: {{.*}}.s:[[#@LINE+1]]:10: error: expected comma error line .endif I think it is clearer to place working and non-working examples in one file. |
llvm/include/llvm/MC/MCFragment.h | ||
---|---|---|
74 ↗ | (On Diff #286445) | SubsectionNumber looks too subtle. I'd hope we just remove the variable, and avoid the computation for Mach-O. Happy to hear what @jyknight will say, and whether this is a reasonable (imperfect) approach. Personally I think it is mostly good if SubsectionNumber is removed. |
llvm/include/llvm/MC/MCFragment.h | ||
---|---|---|
73 ↗ | (On Diff #288471) | Second line needs three slashes and punctuation. |
llvm/lib/MC/MCExpr.cpp | ||
544 | Could this be a static function accepting (const MCAssembler *Asm, const MCSymbol &SA, const MCSymbolRefExpr *&A, const MCSymbolRefExpr *&B), rather than a closure? | |
552 | drop extra parens? | |
553–554 | return FinalizeFolding(); | |
595 | Does the subexpression (SA.getOffset() - SB.getOffset()) change throughout the loop? If not, consider initializing Offset to that value. | |
596–597 | return FinalizeFolding(); | |
599–600 | This can be 3 lines rather than four by swapping the condition: if (... != ...) return Offset ... Probably can drop the extra parens around the cast, too. | |
610 | Check spelling and punctuation here. :set spell in vim. |
llvm/include/llvm/MC/MCFragment.h | ||
---|---|---|
73 ↗ | (On Diff #288471) | Punctuation. (Period at end of sentence in comment). |
llvm/lib/MC/MCExpr.cpp | ||
586 | I agree; prefer: if x: foo() else: bar() to: if !x: bar() else: foo() if with a negated conditional is ok when there is only a then-clause. If there's an else-clause, then it's a code smell. | |
610 | Punctuation (Period at end of sentence in comment). |
Thanks for the update and sorry to take so long to comment. I can't see anything immediately wrong and I think limiting this to fragments in the same subsection makes sense.
llvm/lib/MC/MCExpr.cpp | ||
---|---|---|
606 | It would be good to have a comment here as we have Offset and getOffset() meaning two different things. IIUC getOffset() is really getOffsetWithinFragment(). Perhaps use Displacement instead of Offset as the accumulating variable name. For example: |
With this patch applied, I no longer observe the error from https://github.com/ClangBuiltLinux/linux/issues/742 for 32b ARM, though I can't do a full kernel build with Clang's IA yet due to missing support for the adrl pseudo instruction. (https://github.com/ClangBuiltLinux/linux/issues/430, https://bugs.llvm.org/show_bug.cgi?id=24350) in order to boot test. I did boot test x86 (32b and 64b) and arm64 Linux kernels with this change just fine (using Clang's integrated assembler).
It looks like there may have been unresolved comments from @MaskRay . I'm also curious whether @jyknight or @psmith had parting thoughts? (@jcai19 maybe wait a week for their feedback?)
I am waiting on @jyknight's opinion about the potentially subtle unsigned SubsectionNumber = 0; (at the very least, if we want to keep it, it should be reordered as my comment says, and don't repeat the name of the variable)
Hi @MaskRay, thanks for the comment. By reordering, do you mean moving the definition of SubsectionNumber right after LayoutOrder? I do not understand what difference that would make, would you care to explain a little bit more? Also what by not repeating the name of the variable, I assume you were referring to the comment right above its definition?
Moved the definition of SubsectionNumber to avoid padding bytes. Also updated the comments.
You can move this declaration down closer to where it is used. No need to construct if the below conditional returns early.