I have a question about affine maps list in MemRef type in general. Just this is the second place on my memory, where only the first element of the list is used. Which brings a question, why MemRef uses list of affine maps and not a single affine map? I was under impression, that the list allows to store a composition of affine maps to make it simpler for some cases. Is this true or its wrong interpretation?
There is a minor semantic difference between the last map (though several places in the code use the first map) that is intended specifically for the virtual->physical layout and the remaining maps that are just reindexings of the virtual index space. Originally, the non-composed form was intended to preserve the information about data transformations for future analyses to remove the need for decomposing an affine map into a sequence of some primitives. Such decomposition is not unique and its computation non-trivial. I haven't seen any practical uses of this form, which certainly does not mean they don't exist. One can, for example, move some reindexings from the data space (memref) to the iteration space (have loops with different bounds) but not others to save on address computation.
Most pipelines in the main repository use the strided memref format, which is always encoded as a single map and is significantly easier to reason about than compositions of arbitrary maps. The subview operation was designed for this subset of memrefs, and you seem to be the first to hit the need of using it on map compositions.
@ftynse Thanks for clarification! Actually in our internal project we use decomposed list of affine maps exactly as in your example (separate virtual indexing permutation and strided form to get physical indexing). At least this approach doesn't contradict original MemRef design.
One more question. What about introducing getComposedAffineMap() method to MemRefType, which will incapsulate this composition logic? Just this new code was copy pasted from other place. The new method will allow to avoid this duplication. If you are OK with it, I will add it as a part of this patch.
I am afraid this is going in a direction that we do not want to support (at least not in subview).
I see composition of layout maps in the memref type as premature complexity introduction, similarly to the early days when affine.apply used to be multi-result.
I'd need to see strong evidence that this is useful and cannot be achieved with other means to be convinced of its relevance.
Since this has been around for almost 3y and it only materializes as checks and assertions that this is either empty or single element, I'd love to see the MemRefType cleaned up.
It would be straightforward to add back in a more specialized type later if it turns out to be eventually needed.
The thing is - MemRefType API allows to use lists of affine maps. But in various places there are assumption, that it contains single item (or it is empty). And sometimes those assumption is not even checked and just the first/last item is used without any check/assertion. IMHO, this makes the code base fragile.
From my point of view, I see two ways to overcome this:
- Add getComposedAffineMap method to MemRefType and use it in places, where single affine map is processed. This will allow to preserve the current MemRefType behavior.
- Forbid affine map lists and use only single affine map in MemRefType.
@nicolasvasilache I've started a discussion on discourse with this topic - https://llvm.discourse.group/t/rfc-memreftype-affine-maps-list-vs-single-item/3968
Just I'd like to hear wider audience opinion on that question, since the 2nd option (list removal) breaks backward compatibility and potentially might affect 3rd party projects.
Actually it is shown in the test) We decompose virtual index permutation and strided form:
Instead of single
#map = affine_map<(n, c, h, w) -> (n * 64 + c + h * 16 + w * 2)>
we store 2 maps:
#map1 = affine_map<(n, c, h, w) -> (n, h, w, c)> #map2 = affine_map<(n, h, w, c) -> (n * 64 + h * 16 + w * 2 + c)>
This patch itself looks good to me as this is really fixing the memref.subview op -- because the memref type does have a list of maps and everything other than the last map in the list was silently being ignored even though the utilities to handle these do exist in a simple and clean way. I'm surprised there is discussion on this patch itself and this is being mixed with the design choice of "single affine map" + "list of affine maps" for the layout map.
Use braces for then block if using braces for the else block.
which is what the semantics of this op is defined for.
But this isn't reflected anywhere in the described semantics AFAICS:
Besides there isn't a need to restrict it to a single map since one can check in practically constant time whether a list of affine maps conforms to a stride specification -- just compose the maps (as shown in this revision) + check if the composed map is in stride + offset form (utility already exists as you know) and have the verifier emit error if it doesn't if one wishes to restrict this op to strided memrefs. In fact, like reinterpret_cast, this is another op (and perhaps the one that introduced this pattern) that is storing data that's already carried or can be carried in the type (even with a single affine map): the three trailing attributes here don't need to be attributes - you just need three accessors on the op if they are stored in the type (part of the layout map) - you even get an accurate dump/print of the memref IR Type
let arguments = (ins AnyMemRef:$source, Variadic<Index>:$offsets, Variadic<Index>:$sizes, Variadic<Index>:$strides, I64ArrayAttr:$static_offsets, I64ArrayAttr:$static_sizes, I64ArrayAttr:$static_strides );
Storing them in the op would just be redundancy, potential inconsistency risk, extra memory footprint throughout the lifetime of the op (FWIW), and more importantly creating the need to look at the op definition or track the origin of the op to recover this information (which would be complex if it's across functions or even intra-function with "isolated-from-above" ops or ops that take explicit arguments like gpu.launch).
The verifier failing is an issue to address separately. However, note that composing maps can lead to an identity map as well (for eg. permutations canceling each other!) and that would still be valid for memref.subview even if the semantics of this op are to always expect an identity map! Let me make it clear: this PR *itself* is fixing things and making things *strictly* better - the previous code was just wrong in simply picking the last map. If the verifier has to be updated, that should be addressed in a separate revision. One can still compose the maps so long as there is a list and that is the right thing to do. To come back, @nicolasvasilache, since you've marked this "Requested changes", could you say what exactly is wrong with this revision itself?