Page MenuHomePhabricator

[X86] Allow load folding into PUSH instructions

Authored by mkuper on Jul 19 2015, 3:54 AM.



Adds PUSHes to the folding tables.
This also required a fix to the TD definition, since the memory forms of the PUSH/POP instructions did not have the right mayLoad/mayStore flags.

Diff Detail

Event Timeline

mkuper updated this revision to Diff 30117.Jul 19 2015, 3:54 AM
mkuper retitled this revision from to [X86] Allow load folding into PUSH instructions.
mkuper updated this object.
mkuper added reviewers: spatel, RKSimon.
mkuper added a subscriber: llvm-commits.
RKSimon edited edge metadata.Jul 19 2015, 4:05 AM

Please can you test x86_64 as well - 32-bit pushes on 64-bit targets are rather awkward.....

You've updated the pop definitions as well - is your plan to fix these in a separate patch? In which case maybe not alter them here?

Unfortunately, aside from 32-bit pushes on 32-bit platforms, I'm not aware of anything that actually generates memory-form pushes at the moment. But if at any point we start doing that, I think having folding from the start would be a good thing. If you can suggest scenarios where that's generated, I'll happily add a test.

Regarding pops - I wasn't originally planning on touching them, but it seems like the current definition is obviously wrong (a POP64rmm may store...), and I'd rather not leave bugs to lurk. :-)
I can fix it in a separate patch, but, as above - I'm not sure how to test that, since we don't actually generate them.

spatel added inline comments.Jul 20 2015, 3:22 PM

Please check the feature flag directly rather than a CPU: -mattr=call-reg-indirect?

Also, can this be loosened to -mtriple=i686-unknown-unknown? There's nothing Windows-specific here?

mkuper added inline comments.Jul 21 2015, 1:33 AM

I'll change the feature flag, thanks.

And I'd prefer to leave windows, because the test is somewhat sensitive to the way the stack frame gets constructed. E.g. i686-pc-linux will not work. Given that, I'd rather not rely on unknown-unknown.

spatel added inline comments.Jul 21 2015, 8:47 AM

Missing period at end of sentence.


Spacing looks off here. Hoist Opcode into a local if 80-col is a problem?


As suggested earlier, please put the pop changes into their own commit with a checkin comment that explains the change and why there is no test (unless Simon or someone else knows how to test this).



Seems like 'call-reg-indirect' could use a more general name since it's being used with more than just calls now. But that can be a separate change. Add a TODO comment in the td file?

mkuper updated this revision to Diff 30340.Jul 22 2015, 4:37 AM
mkuper edited edge metadata.

Thanks, Simon, Sanjay.

I've removed the POP part, will commit that separately.
Regarding renaming the feature - will we still need an alias for the old name?

RKSimon accepted this revision.Jul 22 2015, 8:33 AM
RKSimon edited edge metadata.

Thanks Michael, this looks ready to go. Can you confirm what testing you've done with larger codebases that have a better chance of causing PUSH to be generated?

This revision is now accepted and ready to land.Jul 22 2015, 8:33 AM
spatel edited edge metadata.Jul 22 2015, 8:37 AM

Regarding renaming the feature - will we still need an alias for the old name?

My understanding is that we make no compatibility guarantees for developer tool command-line options, so unless this is exposed in clang, we're free to change it.

That said, I don't see that policy officially documented, so we should probably confirm with someone else. :)


Simon, I've done testing with spec cpu2k, did not have any issues.
Sanjay, I've confirmed it with Chandler. Will do the name-change in a separate patch.

This revision was automatically updated to reflect the committed changes.