When AVX512 is available and the preferred vector width is 512 bits or more, we should prefer AVX512 for memcpy().
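A minimal sketch of the condition described above, for illustration only. `SubtargetInfo` and `shouldUse512BitMemcpy` are hypothetical names rather than LLVM's actual classes or functions, and the 64-byte size threshold (one full 512-bit vector) is an assumption that the summary does not state.

```cpp
// Hypothetical stand-in for the relevant subtarget queries (not LLVM's API).
struct SubtargetInfo {
  bool HasAVX512;             // AVX512 is available
  unsigned PreferVectorWidth; // preferred vector width, in bits
};

// Returns true when a memcpy of SizeInBytes should be lowered using 512-bit
// vector loads and stores. Assumption: the copy must cover at least one full
// 512-bit (64-byte) vector for the wide path to make sense.
constexpr bool shouldUse512BitMemcpy(SubtargetInfo ST,
                                     unsigned long long SizeInBytes) {
  return SizeInBytes >= 64 && ST.HasAVX512 && ST.PreferVectorWidth >= 512;
}
```

With this sketch, a target that has AVX512 but prefers 256-bit vectors would stay on the narrower path.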
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Do we have a similar test for memset that should be updated?
llvm/lib/Target/X86/X86ISelLowering.cpp | ||
---|---|---|
2156 | VLX shouldn’t be needed. That’s only for 128/256-bit vectors. |
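To illustrate the point about VLX, here is a small sketch of which feature gates which vector width. `Avx512Features` and `canUseAVX512AtWidth` are hypothetical names invented for this example. It relies only on the fact that AVX512VL extends EVEX-encoded instructions to 128-bit and 256-bit vectors, while 512-bit operation needs only the base AVX512F feature.

```cpp
// Hypothetical feature flags for illustration (not LLVM's subtarget class).
struct Avx512Features {
  bool HasAVX512F; // base AVX512 foundation
  bool HasVLX;     // AVX512VL: EVEX encodings for 128/256-bit vectors
};

// Can AVX512 (EVEX-encoded) instructions operate on vectors of this width?
constexpr bool canUseAVX512AtWidth(Avx512Features F, unsigned Bits) {
  return Bits == 512                  ? F.HasAVX512F
         : (Bits == 128 || Bits == 256) ? (F.HasAVX512F && F.HasVLX)
                                        : false;
}
```

Under this model, a 512-bit memcpy lowering has no reason to check VLX at all.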
Hi @craig.topper – I can try to update a memset test, but I don't see one that tries to be comprehensive. For what it's worth, the test suite passes (though that's not much of an accomplishment). Do you have a specific request?
Hi @craig.topper – Thanks for writing the memset tests. I've expanded the non-zero one to also handle different preferred vector sizes.
llvm/test/CodeGen/X86/memset-nonzero.ll | ||
---|---|---|
512 | This doesn't look like it was generated by the update_llc_test_checks script. |
512 | How so? I tried running update_llc_test_checks against the test and nothing changed. |
512 | I've never seen the script mix prefixes like this. As far as I know it always creates a block with the same prefix separated by a blank line from the other prefixes. |
512 | Something's definitely gone wrong here. Could it be that you've not set the --llc-binary argument correctly? I'd recommend manually deleting all the 'AVX512' checks from these cases and then regenerating to see what happens. |