This is an archive of the discontinued LLVM Phabricator instance.

Prefer AVX512 memcpy when applicable
ClosedPublic

Authored by davezarzycki on Sep 21 2019, 3:12 AM.

Details

Summary

When AVX512 is available and the preferred vector width is 512-bits or more, we should prefer AVX512 for memcpy().

https://bugs.llvm.org/show_bug.cgi?id=43240

Diff Detail

Event Timeline

davezarzycki created this revision.Sep 21 2019, 3:12 AM

Do we have a similar test for memset that should be updated?

llvm/lib/Target/X86/X86ISelLowering.cpp
2156 ↗(On Diff #221161)

VLX shouldn’t be needed. That’s only for 128/256 bit vectors.

The patch no longer depends on the AVX512 vector length extension.

Do we have a similar test for memset that should be updated?

Hi @craig.topper – I can try and update a test of memset, but I don't see one that is trying to be comprehensive. For whatever it may be worth, the test suite passes (but that's not an accomplishment). Do you have a request?

I added some memset tests in r372494 and the RUN line changes here. Can you rebase?

Hi @craig.topper – Thanks for writing the memset tests. I've expanded the non-zero one to also handle different preferred vector sizes.

craig.topper added inline comments.Sep 22 2019, 10:44 AM
llvm/test/CodeGen/X86/memset-nonzero.ll
512 ↗(On Diff #221217)

This doesn't look like it was generated by the update_llc_test_checks script.

davezarzycki added inline comments.Sep 22 2019, 11:24 AM
llvm/test/CodeGen/X86/memset-nonzero.ll
512 ↗(On Diff #221217)

How so? I tried running update_llc_test_checks against the test and nothing changed.

craig.topper added inline comments.Sep 22 2019, 11:46 AM
llvm/test/CodeGen/X86/memset-nonzero.ll
512 ↗(On Diff #221217)

I've never seen the script mix prefixes like this. As far as I know it always creates a block with the same prefix separated by a blank line from the other prefixes.

RKSimon added a subscriber: RKSimon.
RKSimon added inline comments.
llvm/test/CodeGen/X86/memset-nonzero.ll
512 ↗(On Diff #221217)

Something's definitely gone wrong here, it might be that you've not set the --llc-binary args correctly? I'd recommend manually deleting the all the 'AVX512' checks from all these cases and then regenerating to see what happens.

davezarzycki marked an inline comment as done.

This patch has now been run through update_llc_test_checks.

This revision is now accepted and ready to land.Sep 22 2019, 2:55 PM