This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Set Fast flag for unaligned memory accesses
ClosedPublic

Authored by luke on May 17 2023, 5:14 AM.

Details

Summary

The +unaligned-scalar-mem and +unaligned-vector-mem features were added in
D126085 and D149375 respectively to allow subtargets to indicate that
they supported misaligned loads/stores with "sufficient" performance.
This is separate from whether or not the target actually supports
misaligned accesses, which could be determined from Zicclsm.

This patch enables the Fast flag under the assumption that any subtarget
that declares support for +unaligned-*-mem will want to opt into
optimisations that take advantage of misaligned scalar accesses, such as
store merging.

Diff Detail

Event Timeline

luke created this revision.May 17 2023, 5:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 17 2023, 5:14 AM
luke requested review of this revision.May 17 2023, 5:14 AM
luke planned changes to this revision.Jun 6 2023, 2:09 AM

Returning for revision as this affects codegen in benchmarks quite significantly and I would like to evaluate the changes first

Reverse ping.

I spent some time looking at this over the last few days, and I think this makes sense to get landed. I do want to request that you also enabled the Fast flag for the vector analogy flag. From what I can tell, it causes no additional test diff, but doing them together is likely safer for two reasons. First, it corresponds to the tested downstream config. Second, a number of the merge transforms are in terms of comparisons of "how fast" operations are. Thus having one set and not the other can result in odd results.

Reverse ping.

I spent some time looking at this over the last few days, and I think this makes sense to get landed. I do want to request that you also enabled the Fast flag for the vector analogy flag. From what I can tell, it causes no additional test diff, but doing them together is likely safer for two reasons. First, it corresponds to the tested downstream config. Second, a number of the merge transforms are in terms of comparisons of "how fast" operations are. Thus having one set and not the other can result in odd results.

If I remember right setting fast for vector has some effect on memcpy and memset expansion.

Reverse ping.

I spent some time looking at this over the last few days, and I think this makes sense to get landed. I do want to request that you also enabled the Fast flag for the vector analogy flag. From what I can tell, it causes no additional test diff, but doing them together is likely safer for two reasons. First, it corresponds to the tested downstream config. Second, a number of the merge transforms are in terms of comparisons of "how fast" operations are. Thus having one set and not the other can result in odd results.

If I remember right setting fast for vector has some effect on memcpy and memset expansion.

It does, but it's currently pretty indirect. Our maximum size uses for memset/memcpy lowering is currently i64. The store merging code then kicks in to form broader loads and stores. As a result, the impact on mem* lowering is the impact on store merging and load combining. This change does enable merging for unaligned cases, and thus does impact the mem* lowering.

See test/CodeGen/RISCV/rvv/memset-inline.ll (once this patch is rebased)

luke updated this revision to Diff 542938.Jul 21 2023, 8:30 AM

Rebase and enable fast flag for vector accesses too

luke retitled this revision from [RISCV] Set Fast flag for unaligned scalar memory accesses to [RISCV] Set Fast flag for unaligned memory accesses.Jul 21 2023, 8:32 AM
luke edited the summary of this revision. (Show Details)

LGTM

p.s. I have a patch to implement getOptimalMemOpType which I'm drafting, and will post once this lands.

reames accepted this revision.Jul 21 2023, 12:02 PM
This revision is now accepted and ready to land.Jul 21 2023, 12:02 PM
This revision was landed with ongoing or failed builds.Jul 24 2023, 2:59 AM
This revision was automatically updated to reflect the committed changes.