This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Implement -mimplicit-it assembler option
ClosedPublic

Authored by olista01 on Jul 25 2016, 9:23 AM.

Details

Summary

This option, compatible with gas's -mimplicit-it, controls the
generation/checking of implicit IT blocks in ARM/Thumb assembly.

This option allows two behaviours that were not possible before:

  • When in ARM mode, emit a warning when assembling a conditional instruction that is not in an IT block. This is enabled with -mimplicit-it=never and -mimplicit-it=thumb.
  • When in Thumb mode, automatically generate IT instructions when an instruction with a condition code appears outside of an IT block. This is enabled with -mimplicit-it=thumb and -mimplicit-it=always.

The default option is -mimplicit-it=arm, which matches the existing
behaviour (allow conditional ARM instructions outside IT blocks without
warning, and error if a conditional Thumb instruction is outside an IT
block).

The general strategy for generating IT blocks in Thumb mode is to keep a
small list of instructions which should be in the IT block, and only
emit them when we encounter something in the input which means we cannot
continue the block. This could be caused by:

  • A non-predicable instruction
  • An instruction with a condition not compatible with the IT block
  • The IT block already contains 4 instructions
  • A branch-like instruction (including ALU instructions with the PC as the destination), which cannot appear in the middle of an IT block
  • A label (branching into an IT block is not legal)
  • A change of section, architecture, ISA, etc
  • The end of the assembly file.

Some of these, such as change of section and end of file, are parsed
outside of the ARM asm parser, so I've added a new virtual function to
AsmParser to ensure any previously-parsed instructions have been
emitted. The ARM implementation of this flushes the currently pending IT
block.

We now have to try instruction matching up to 3 times, because we cannot
know if the current IT block is valid before matching, and instruction
matching changes depending on the IT block state (due to the 16-bit ALU
instructions, which set the flags iff not in an IT block). In the common
case of not having an open implicit IT block and the instruction being
matched not needing one, we still only have to run the matcher once.

I've removed the ITState.FirstCond variable, because it does not store
any information that isn't already represented by CurPosition. I've also
updated the comment on CurPosition to accurately describe it's meaning
(which this patch doesn't change).

Diff Detail

Repository
rL LLVM

Event Timeline

olista01 updated this revision to Diff 65353.Jul 25 2016, 9:23 AM
olista01 retitled this revision from to [ARM] Implement -mimplicit-it assembler option.
olista01 updated this object.
olista01 added a reviewer: rengolin.
olista01 set the repository for this revision to rL LLVM.
olista01 added a subscriber: llvm-commits.
t.p.northover added inline comments.
include/llvm/MC/MCParser/MCTargetAsmParser.h
223 ↗(On Diff #65353)

Lower case 'f' would be better.

lib/Target/ARM/AsmParser/ARMAsmParser.cpp
59 ↗(On Diff #65353)

The 'm' is more an artefact of the GCC compatible driver naming conventions. In LLVM it should probably be "arm-implicit-it".

176 ↗(On Diff #65353)

I think this is probably trying too hard to match the encoding, especially as it has to be mangled anyway. I'd suggest a simple left-to-right bitfield (so ITT -> 0b1, ITTT ->0b11, ITTE -> 0b10).

I think that would make quite a few places easier to understand:

  • ParseInstruction wouldn't have to deal with quite so much weirdness
  • processInstruction could call getITEncoding (which would be about as simple as it is now).
  • rewind becomes just a right shift.
  • extend becomes a left shift + or.
8985 ↗(On Diff #65353)

I think you're missing some here. Grepping for "[^r]GPR[^n]" in ARMInstrThumb2.td suggests at least t2LDRB_PRE and friends.

I think it would be more robust to iterate through the operands looking for PC (plus checks for a few special instructions). It ought to be possible to use the MCInstrDesc to determine whether there's a write to PC</vigorous hand-waving>.

9033 ↗(On Diff #65353)

I'm not really convinced we need to rewind at all. Currently, we do the following (in both places it gets called):

  • Extend with a tautologous ITState.Cond
  • Hack in the real condition on success.
  • rewind on failure.

What can't we just extend with the correct condition when we know it's sensible?

9062–9065 ↗(On Diff #65353)

The condition argument actually seems to be entirely unused (except for an assertion). I'd split this function into startImplicitITBlock and extendImplicitITBlock, neither of which take a condition. Especially given its implementation is completely split anyway.

olista01 marked 4 inline comments as done.Jul 26 2016, 5:49 AM
olista01 added inline comments.
lib/Target/ARM/AsmParser/ARMAsmParser.cpp
176 ↗(On Diff #65353)

This encoding has the advantage of also encoding the length of the block. In your scheme, IT and ITE would have the same encoding, so we would have to record the block length in the parser as another operand, so this wouldn't really simplify things.

Using getITMaskEncoding in processInstruction simplifies that function (and avoids repeating the xoring code), so all of the complexity is now contained within the functions operating on ITState.

8985 ↗(On Diff #65353)

t2LDRB_PRE doesn't need to be in this list, as it is UNPREDICTABLE when Rn is PC. Unfortunately, this is one of the many UNPREDICTABLE instructions we don't currently diagnose.

However, I agree that it will be more robust to check the MCInstrDesc instead of having a hard-coded list of instructions.

There is MCInstrDesc::mayAffectControlFlow, which does almost what we want, but unfortunately can't be completely accurate without some instruction-specific knowledge, which I've had to keep in this function.

9033 ↗(On Diff #65353)

The problem here is that instruction matching changes depending on the IT state. There are checks in checkTargetMatchPredicate (called from MatchInstructionImpl), which cause the correct Thumb2 arithmetic instructions to be selected, depending on the IT state. If we didn't extend the IT block before matching, this would select the instructions that are valid outside an IT block, which we couldn't then extend the IT block to cover without changing their flag-setting behaviour.

However, instruction matching depends only on the presence of an IT block, not the condition of it, so we can hack in the condition later, and don't have to re-run matching with the two (when extending an existing block) or 15 (when creating a new block) possible conditions.

olista01 updated this revision to Diff 65501.Jul 26 2016, 5:52 AM
olista01 added a reviewer: t.p.northover.
olista01 removed rL LLVM as the repository for this revision.
olista01 marked an inline comment as done.
  • Fix case of flushPendingInstructions
  • Rename backend option to -arm-implicit-it
  • Refactor isITBlockTerminator to use MCInstrDesc
  • Use getITMaskEncoding in processInstruction
t.p.northover accepted this revision.Jul 26 2016, 6:47 AM
t.p.northover edited edge metadata.

Thanks Oli. I think this looks fine now.

Tim.

lib/Target/ARM/AsmParser/ARMAsmParser.cpp
176 ↗(On Diff #65501)

OK, I was just considering the later instructions. That makes reasonable sense.

9036 ↗(On Diff #65501)

Yes, sorry. Just considered that possibility this morning.

This revision is now accepted and ready to land.Jul 26 2016, 6:47 AM
This revision was automatically updated to reflect the committed changes.