Page MenuHomePhabricator

[X86] Remove the _alt forms of XOP VPCOM and AVX512 VPCMP instructions. Use a combination of custom printing and custom parsing to achieve the same result and more
ClosedPublic

Authored by craig.topper on Mar 14 2019, 5:00 PM.

Details

Summary

Previously we had a regular form of the instruction used when the immediate was 0-7. And _alt form that allowed the full 8 bit immediate. Codegen would always use the 0-7 form since the immediate was always checked to be in range. Assembly parsing would use the 0-7 form when a mnemonic like vpcomtrueb was used. If the immediate was specified directly the _alt form was used. The disassembler would prefer to use the 0-7 form instruction when the immediate was in range and the _alt form otherwise. This way disassembly would print the most readable form when possible.

The assembly parsing for things like vpcomtrueb relied on splitting the mnemonic into 3 pieces. A "vpcom" prefix, an immediate representing the "true", and a suffix of "b". The tablegenerated printing code would similarly print a "vpcom" prefix, decode the immediate into a string, and then print "b".

The _alt form on the other hand parsed and printed like any other instruction with no specialness.

With this patch we drop to one form and solve the disassembly printing issue by doing custom printing when the immediate is 0-7. The parsing code has been tweaked to turn "vpcomtrueb" into "vpcomb" and then the immediate for the "true" is inserted either before or after the other operands depending on at&t or intel syntax.

I'd rather not do the custom printing, but I tried using an InstAlias for each possible mnemonic for all 8 immediates for all 16 combinations of element size, signedness, and memory/register. The code emitted into printAliasInstr ended up checking the number of operands, the register class of each operand, and the immediate for all 256 aliases. This was repeated for both the at&t and intel printer. Despite a lot of common checks between all of the aliases, when compiled with clang at least this commonality was not well optimized. Nor do all the checks seem necessary. Since I want to do a similar thing for vcmpps/pd/ss/sd which have 32 immediate values and 3 encoding flavors, 3 register sizes, etc. This didn't seem to scale well for clang binary size. So custom printing seemed a better trade off.

I also considered just using the InstAlias for the matching and not the printing. But that seemed like it would add a lot of extra rows to the matcher table. Especially given that the 32 immediates for vpcmpps have 46 strings associated with them.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Mar 14 2019, 5:00 PM
RKSimon added inline comments.Mar 15 2019, 3:59 AM
lib/Target/X86/AsmParser/X86AsmParser.cpp
2517 ↗(On Diff #190752)

Is the plan to reuse this for the upcoming vcmpps/pd/ss/sd support?

lebedev.ri added inline comments.Mar 15 2019, 8:42 AM
test/tools/llvm-mca/X86/BdVer2/resources-xop.s
242 ↗(On Diff #190752)

So now that the immediate matters, should i add back these tests for the rest of immediates?

RKSimon added inline comments.Mar 15 2019, 9:18 AM
test/tools/llvm-mca/X86/BdVer2/resources-xop.s
242 ↗(On Diff #190752)

You can if you want, although IIRC the only thing that could show different behaviour is true/false - and we don't emit those.

lebedev.ri added inline comments.Mar 15 2019, 9:29 AM
test/tools/llvm-mca/X86/BdVer2/resources-xop.s
242 ↗(On Diff #190752)

What i mean is, there does not seem to be other test changes other than these,
so is there test coverage for printing the VPCOM with imm != 0?
I'm guessing no, and that is why i think the tests should be added.

craig.topper marked 3 inline comments as done.Mar 15 2019, 9:52 AM
craig.topper added inline comments.
lib/Target/X86/AsmParser/X86AsmParser.cpp
2517 ↗(On Diff #190752)

Yes.

test/tools/llvm-mca/X86/BdVer2/resources-xop.s
242 ↗(On Diff #190752)

There are MC tests for immediates other than 0, but they all used a special mnemonic and we verify the special mnemonic gets printed back out.

The only observable change in this patch is that if you use an explicit immediate of 0-7, that will now be printed with the special mnemonic.

Extend to cover VPCMP as well. Use TSFlags to make some of the printing decisions rather than switching on the enum. I think this should be better than the jump tables that switching on the enum would require.

Herald added a project: Restricted Project. · View Herald TranscriptMar 16 2019, 3:57 PM
craig.topper retitled this revision from [X86] Remove the _alt forms of XOP VPCOM instructions. Use a combination of custom printing and custom parsing to achieve the same result and more to [X86] Remove the _alt forms of XOP VPCOM and AVX512 VPCMP instructions. Use a combination of custom printing and custom parsing to achieve the same result and more.Mar 17 2019, 1:15 AM
RKSimon accepted this revision.Mar 17 2019, 11:29 AM

LGTM - if possible I'd probably prefer to have the XOP + AVX512 cases committed separately (not its not that vital).

This revision is now accepted and ready to land.Mar 17 2019, 11:29 AM
This revision was automatically updated to reflect the committed changes.