This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Improve detection of unmergable stores, based on type size (WIP)
Needs ReviewPublic

Authored by fhahn on Apr 2 2019, 8:45 AM.

Details

Reviewers
niravd
Summary

Note: this is WIP, there is still one case where we exit early
unnecessarily.

By constructing the VT for a merged store upfront and using that to
check if merging is legal, we can exit early in some more cases.

This speeds up some variations of PR41263, where we have lots of i64
stores.

Event Timeline

fhahn created this revision.Apr 2 2019, 8:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 2 2019, 8:45 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
niravd added a comment.Apr 2 2019, 6:01 PM

It's not sufficient to check if you can merge two stores into a valid node; there are backends where you need 4 or more to get a legal merged store.

If you look at target-specific implementations of CanMergeStoresTo it essentially serves as a context-specific find maximum store which is what we need here. If you massage that interface a bit you can fold most of this check in there.

fhahn added a comment.Apr 3 2019, 2:55 PM

It's not sufficient to check if you can merge two stores into a valid node; there are backends where you need 4 or more to get a legal merged store.

If you look at target-specific implementations of CanMergeStoresTo it essentially serves as a context-specific find maximum store which is what we need here. If you massage that interface a bit you can fold most of this check in there.

Yeah, that might be a better approach. IIUC we would need the (potentially separate) limits for integer types, floating point types and vector types. And accounting for when we can convert to vector types. Does that make sense?

fhahn added a comment.May 3 2019, 8:48 AM

It's not sufficient to check if you can merge two stores into a valid node; there are backends where you need 4 or more to get a legal merged store.

If you look at target-specific implementations of CanMergeStoresTo it essentially serves as a context-specific find maximum store which is what we need here. If you massage that interface a bit you can fold most of this check in there.

Sorry for the long delay with getting back to this one! I had a look at the X86 and AArch64 implementations of CanMergeStoresTo and it looks like they are not too helpful with getting a reasonable upper bound. They return true for functions without noimplicitfloat and otherwise limit the size to 64 bits or 32 bits for X86 32 bit mode. I think one thing that's more important to check is whether the merged type is legal for the target. This will weed out stores with types too large for the target.

So I think we need to do more work than just checking canMergeStoresTo unfortunately. What we would need is the maximum size for integer, float and vector stores I think, but I am not sure if any existing interface provides that info.

niravd added a comment.May 6 2019, 7:30 AM

The current interfaces definitely do not provide anything like that. As I recall, most of the use cases requiring canMergeStoresTo involve ad hoc ways to locally declare a type/operation invalid localized to the specific target, so I suggest the just marginally expanding hte interface to capture the one issue we've seen so far (no-implict-float) and defer any deeper analysis for when we run into it.