addrspacecast X addrspace(M)* to Y addrspace(N)*
-->
bitcast X addrspace(M)* to Y addrspace(M)*
addrspacecast Y addrspace(M)* to Y addrspace(N)*
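As a concrete sketch of the pattern (the types and address-space numbers here are chosen for illustration and are not taken from the patch):

```llvm
; Before: one cast that changes both the pointee type and the address space.
%p = addrspacecast i32 addrspace(3)* %x to float addrspace(0)*

; After canonicalization: the type change is peeled off into a bitcast that
; stays in the source address space, followed by a pure addrspacecast.
%b = bitcast i32 addrspace(3)* %x to float addrspace(3)*
%p = addrspacecast float addrspace(3)* %b to float addrspace(0)*
```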
This canonicalization makes NVPTXFavorNonGenericAddrSpaces more effective, and can potentially benefit other optimizations as well.
This patch is based on D2186 with fixes and more tests:
- Fix an issue in D2186 that caused InstCombine to run into an infinite loop
- Add the failing test case as @canonicalize_addrspacecast in test/Transforms/InstCombine/addrspacecast.ll
- Perform the bitcast before the addrspacecast, because addrspacecasts from non-generic to generic address spaces can be folded into load/store instructions
- Updated all affected tests. One affected test (@test2_addrspacecast) in memcpy-from-global.ll is actually optimized better by this canonicalization: the alloca %T is transformed into alloca [128 x i8]
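To illustrate why the addrspacecast should come last, a hypothetical fold of a non-generic-to-generic cast into a load (IR sketch; the pointer names and address spaces are made up for this example):

```llvm
; Before: load through a generic pointer derived from shared memory.
%g = addrspacecast float addrspace(3)* %b to float*
%v = load float, float* %g

; After folding: load directly from the shared address space,
; which lets the backend emit a cheaper, space-specific access.
%v = load float, float addrspace(3)* %b
```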