Page MenuHomePhabricator

[X86] Add MC level selection support for SHLD (64-bit only)
Needs ReviewPublic

Authored by avt77 on Nov 29 2017, 6:52 AM.

Details

Summary

Depends on D43813. Depends on D41278. We'd like to be albe properly to select SHLD/SHRD instrs or alternative code sequences in dependence on target CPUs. This patch shows the possibility to do it. It supports only one intruction: SHLD(16/32/64)rri8 but it shows how we could do it. If the suggested way is acceptable for us I'll continue with new extended patches.

I tried to split the patch on 2 steps: the debug extension and the support of SHLD itself. Unfortunately I could not find any usage of Machine Combiner except mul-lohi.ll test but this test is not informative enough to show the improvements in the debug print. On the other hand X86 target code does not have any alternative MC code patterns to show the created changes inside existing MC DEBUG() expressions. As result I created this rather big patch including all changes at once.

Most probably I'll create a new extended patch when (and if) we accept this one as initial way to resolve the issue.

Diff Detail

Event Timeline

avt77 created this revision.Nov 29 2017, 6:52 AM
avt77 updated this revision to Diff 129636.Jan 12 2018, 8:48 AM
avt77 retitled this revision from [X86] [Demo version] Add MC level selection support for SHLD/SHRD to [X86] Add MC level selection support for SHLD (64-bit only).

I removed all changes related to debug logging: they are published as D41278. In addition I completely implemented the possible substitutions for SHLD64mri8, SHLD64rrCL and SHLDrri8. I did not implement the same for SHLDmrCL because from my point of view the alternative code sequence is too long. Maybe I'm wrong?

Next steps I'm going todo:

  • Clang bootstrap with this patch included;
  • 32/16-bit implementation;
  • SHRD implementation.

But of course it'd be nice to commit 64-bit bootstrapped version before other changes.

Please can you make this dependent on D41278 to show the debug output that that patch will add? We should then be able to approve D41278.

Adding missing llvm-commits subscriptions

avt77 edited the summary of this revision. (Show Details)Jan 16 2018, 3:29 AM

I bootstrapped the compiler with this patch but unfortunatley there was not any machine-combiner activity. It means we need special test(s) to validate the given substitution.

avt77 updated this revision to Diff 130404.Jan 18 2018, 7:00 AM

I added special switch "mc-shld-enabled" to allow the job of this patch. As result we have only one changed test specially added by me previously. And now the whole patch is rather small.

The BDVER tests are changing despite not having a scheduler model attached - I'm not sure we want that, if there isn't scheduler model attached to a cpu we shouldn't use defaults as that's precisely what the feature flags are fighting against.

avt77 updated this revision to Diff 130592.Jan 19 2018, 5:25 AM

I removed the issue with BDVER1 tests. Now we have one diff for GENERIC CPU but I'm sure it's OK.

avt77 updated this revision to Diff 135265.Feb 21 2018, 8:06 AM

I re-based the sources and added a new test to demonstarte new debug output.

Gerolf added a subscriber: Gerolf.Feb 21 2018, 10:55 AM
Gerolf added inline comments.
lib/CodeGen/MachineCombiner.cpp
189 ↗(On Diff #135265)

Shouldn't a case like this be handled by computeOperandLatency?

418 ↗(On Diff #135265)

What is the point? Now the return values for if conditions are in-consistent and I find this harder to read.

570 ↗(On Diff #135265)

Why needed for this commit?

599 ↗(On Diff #135265)

Could you commit the size checks separately?

lib/Target/X86/X86InstrInfo.cpp
3809

Should the be part of isMCShldEnabled?

11620

IsMCShldEnabled?

test/CodeGen/X86/schedule-x86-64-shld.ll
1–2

Why do you need to change these test cases?

avt77 added inline comments.Feb 22 2018, 3:09 AM
lib/CodeGen/MachineCombiner.cpp
570 ↗(On Diff #135265)

Because the signature of print was changed with additional default arg and as result we did not pass TII here but w/o TII we can't print instructions' names. Is it OK?

599 ↗(On Diff #135265)

Should I combine this with changes in doSubstitute? And I think it should be called like "Proper support of OptSize in Machine Combiner". Is it OK?

test/CodeGen/X86/schedule-x86-64-shld.ll
1–2

Because now we have special switch to allow shld substitution. Or do you mean we should add 3 RUNs here?

avt77 updated this revision to Diff 135411.Feb 22 2018, 6:32 AM

I fixed all issues raised by Gerolf except 3 questions: I put my answers/comments on them.

avt77 updated this revision to Diff 135886.Feb 26 2018, 5:21 AM

I extracted the changes related to OptSize: they will be published in a separate patch.
In addition I extracted changes related to debug output: they were committed as a last step to close D41278.
As result we have shorter patch and all requirements from Gerolf were resolved.

avt77 edited the summary of this revision. (Show Details)Feb 28 2018, 4:53 AM
avt77 updated this revision to Diff 139421.Mar 22 2018, 2:37 AM

I re-based sources to reflect commit of D43813.

RKSimon resigned from this revision.Feb 19 2019, 10:31 AM