SROA sometimes preserves MD_mem_parallel_loop_access and MD_access_group metadata on loads/stores, and sometimes fails to do so. This change adds copying of the MD after other CreateAlignedLoad/CreateAlignedStores. Also fix a case where the metadata was being copied from a load, rather than the store.
Added a LIT test to catch one case.
Details
Diff Detail
Event Timeline
re-uploading to see if the unrelated Windows build failures go away, as they aren't related at all
Not an expert on llvm.access.group, so someone please chime in with more thoughts. My one comment here is that since this impacts more than a load, I'd like to see test cases added for all the relevant behavior (including store).
I really don't know enough to recreate the paths through all the places. The LIT test that I submitted is derived from an actual failure to propagate llvm.access.group in our compiler (and the change in visitIntrinsicInst(IntrinsicInst &II) from LI to SI). The others I just added to be consistent with existing calls to propagate the metadata that were already present.
llvm.access.group is used with llvm.loop.parallel_accesses to mark loads/stores that don't have loop carried memory dependencies
llvm/lib/Transforms/Scalar/SROA.cpp | ||
---|---|---|
2478 | NewAI is the alloca instruction; it does not have parallel loop access or access group metadata which only apply to memory accesses. | |
2538 | Did you mean LI instead of NewAI? | |
llvm/test/Transforms/SROA/mem-par-metadata-sroa-cast.ll | ||
8–9 | This tests the presplitLoadsAndStores case. Shouldn't it also have split the store? |
Removed the copies of metadata from NewAI, as they can't contain any metadata. Only keep the ones from the load/stores
llvm/test/Transforms/SROA/mem-par-metadata-sroa-cast.ll | ||
---|---|---|
8–9 | The store is replaced with: %0 = bitcast i32 %X371 to float %1 = bitcast i32 %X373 to float Should I add that to the LIT test? The important part was the loss of the llvm.access.group from the store |
llvm/lib/Transforms/Scalar/SROA.cpp | ||
---|---|---|
2478 | This one might have made sense | |
llvm/test/Transforms/SROA/mem-par-metadata-sroa-cast.ll | ||
8–9 | The bitcasts belong to the load. I guess SROA determined that the store is dead (stack memory before returning) and removed it. Not an indication of a good test case. I suggest to make %PREV a function pointer so this does not happen. Attach a separate !llvm.access.group !1 to the store and test it as well. Most other instructions in this function seems to be irrelevant. |
llvm/lib/Transforms/Scalar/SROA.cpp | ||
---|---|---|
2478 | Yes, you are right. I will pass in LI | |
llvm/test/Transforms/SROA/mem-par-metadata-sroa-cast.ll | ||
8–9 | If I make %PREV a function pointer, then the cast isn't done: compiler explorer |
Restored the copy of metadata for vectorized loads, passing in LI.
Trimmed the LIT test slightly
LGTM
llvm/test/Transforms/SROA/mem-par-metadata-sroa-cast.ll | ||
---|---|---|
8–9 | OK, I also tried a bit to get SROA to re-emit a store (and therefore test that the store's metadata is preserved), but without success. |
NewAI is the alloca instruction; it does not have parallel loop access or access group metadata which only apply to memory accesses.