This is an archive of the discontinued LLVM Phabricator instance.

Teach TargetInstrInfo::getInlineAsmLength to parse .space directives with integer arguments
ClosedPublic

Authored by asb on Sep 13 2017, 3:28 AM.

Details

Summary

It's currently quite difficult to test passes like branch relaxation, which requires branches with large displacement to be generated. The .space assembler directive makes it easy to create arbitrarily large basic blocks, but getInlineAsmLength is not able to parse it and so the size of the block is not correctly estimated. Other backends (AArch64, AMDGPU) introduce options just for testing that artificially restrict the ranges of branch instructions (e.g. aarch64-tbz-offset-bits). Although parsing a single form of the .space directive feels inelegant, it does allow a more direct testing approach.

This patch adapts the .space parsing code from Mips16InstrInfo::getInlineAsmLength and removes it now the extra functionality is provided by the base implementation. CC @fhahn, @rengolin as this change might motivate dropping aarch64-{tbz,cbz,bcc}-offset-bits and updating the tests to use .space. I want to move this functionality to the generic getInlineAsmLength as 1) I need the same for RISC-V, and 2) I feel other backends will benefit from more direct testing of large branch displacements.

Diff Detail

Repository
rL LLVM

Event Timeline

asb created this revision.Sep 13 2017, 3:28 AM
sdardis requested changes to this revision.Sep 13 2017, 5:11 AM
sdardis added inline comments.
lib/CodeGen/TargetInstrInfo.cpp
104–107 ↗(On Diff #115007)

This code has unexpected behaviour in that it accepts a mixture of zero or more things that look like instructions or comments ending with an optional .space <int> directive. If the .space directive is parsed and there is not zero or more spaces followed by the null byte, the value of space is discarded even if it was a positive number.

This bug was present in the original version, but it didn't occur to my knowledge as ".space <num>" was the only form used as far as I can see.

I think it is better to parse this properly so that people don't get tripped up over when trying to use this.

Aside: we may also want to handle .align, .fill at some point.

This revision now requires changes to proceed.Sep 13 2017, 5:11 AM
asb updated this revision to Diff 115059.Sep 13 2017, 9:42 AM
asb edited edge metadata.
asb marked an inline comment as done.

Update to address review comments from @sdardis (thanks, good catch).

The following will now produced a block with size 300+instlength:

.space 100 # comment
.space 200
addi r1, r2, 3

Unsupported parameters to space will still result in the directive being ignored, e.g. .space 1 << 20, .space 200+56. I'm wary of implementing any more .size variants, we don't want to reimplement the assembly parser and the motivating use case really is to make testing easier.

asb updated this revision to Diff 115060.Sep 13 2017, 9:44 AM

Sorry for the noise, the previous version left out part of the patch.

sdardis accepted this revision.Sep 15 2017, 5:56 AM

This LGTM but the original parser suffers from a bug / efficiency issue where we don't advance Str but do change state. This causes things like ".space 1000;nop;.space 1000"--when we permit the target specific separator to come after a .space <num> directive--to be measured as 1008 bytes rather than 2004.

Given this particular piece of code has existed for 10 years in this state, I don't think it's really an issue for this patch.

lib/CodeGen/TargetInstrInfo.cpp
83–85 ↗(On Diff #115060)

We implement a special case of the .space directive which takes only a single integer argument in base 10 that is the size in bytes. This is a restricted form of the GAS directive in that we only interpret simple--i.e. not a logical or arithmetic expression--size values without the optional fill value. This is primarily used for creating arbitrary sized inline asm blocks for testing purposes.

94 ↗(On Diff #115060)

Bug here: we don't advance the parser to the character past the '\n' or MAI.getSeparatorString() but we do change state...

101 ↗(On Diff #115060)

...so here we're looking at the '\n' or start of MAI.getSeparatorString() but our state says we're at the start of an instruction. I think this should be addressed in a separate patch.

111 ↗(On Diff #115060)

We should also handle MAI.getSeparatorString() here but that relies on implementing this function as better parser.

This revision is now accepted and ready to land.Sep 15 2017, 5:56 AM
This revision was automatically updated to reflect the committed changes.