Previously, many StoreRetval instructions with undef operands were generated
on the NVPTX target when a large struct was returned by value. This resulted in
many unneeded st.param.* instructions in the final assembly. This patch solves
the issue by eliminating such stores in the NVPTX-specific part of the DAG combiner.
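
To illustrate the idea only, here is a minimal, self-contained sketch of dropping return-value stores that carry nothing but undef. It is not the patch's code and does not use LLVM's SelectionDAG API: `ToyStoreRetval`, `storesOnlyUndef`, and `dropUndefStores` are hypothetical names, `std::nullopt` stands in for an undef operand, and the rule that a store is removed only when every value it stores is undef is an assumption made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a StoreRetval node: a byte offset into the return
// slot plus the values to store. std::nullopt models an undef operand.
struct ToyStoreRetval {
    std::size_t offset;
    std::vector<std::optional<int>> values;
};

// True when every stored value is undef, i.e. the store writes nothing useful.
// A store with an empty value list is conservatively kept.
bool storesOnlyUndef(const ToyStoreRetval &store) {
    return !store.values.empty() &&
           std::all_of(store.values.begin(), store.values.end(),
                       [](const std::optional<int> &v) { return !v.has_value(); });
}

// Returns the stores that survive the toy "combine", in their original order.
// Assumption: a partially undef vector store is kept as a whole.
std::vector<ToyStoreRetval> dropUndefStores(const std::vector<ToyStoreRetval> &stores) {
    std::vector<ToyStoreRetval> kept;
    for (const ToyStoreRetval &store : stores) {
        if (!storesOnlyUndef(store)) {
            kept.push_back(store);
        }
    }
    return kept;
}
```

In this model, a struct whose fields are mostly undef yields stores only for the fields with known values, which corresponds to fewer st.param.* lines in the emitted PTX.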
Event Timeline
LGTM in principle, modulo tests as they need to actually check that we do not produce unwanted stores.
Also, the change is unlikely to affect the actual GPU code, only PTX, as ptxas is smart enough to eliminate undef stores: https://godbolt.org/z/fdb1hGdxW
llvm/test/CodeGen/NVPTX/store-retval.ll | ||
---|---|---|
33–38 | This only verifies that the known values were stored, but the test would still pass on the current code which also stores the undef values. |
@tra Thanks for your review! Fixed tests.
> Also, the change is unlikely to affect the actual GPU code, only PTX, as ptxas is smart enough to eliminate undef stores: https://godbolt.org/z/fdb1hGdxW

Hmm, that's an interesting observation. Still, I think cleaning up the generated PTX is valuable in its own right.
llvm/test/CodeGen/NVPTX/store-retval.ll | ||
---|---|---|
33–38 | Fixed, thanks! |
> This only verifies that the known values were stored, but the test would still pass on the current code which also stores the undef values.

You do need a handful of CHECK-NOT lines for st.param before and after the stores of valid values.