This patch add __builtin_matrix_column_major_store to Clang,
as described in clang/docs/MatrixTypes.rst. In the initial version,
the stride is not optional yet.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Ping.
Applied feedback from D72778 to this patch, improved tests, support conversions/placeholders.
One thing I am not sure is how to properly handle template substitutions for the pointer expression for code like the one below, where we need to apply substitutions to get the actual pointer type. Currently the patch looks through SubstTemplateTypeParmType types in Sema to construct the result type. Should we look through SubstTemplateTypeParmType in IRGen too to decide whether to call EmitPointerWithAlignment or EmitArrayToPointerDecay? Or is there a place in sema that should get rid of the substitution (perhaps in SemaChecking.cpp)?
template <typename T> struct remove_pointer { typedef T type; }; template <typename T> struct remove_pointer<T *>{ typedef typename remove_pointer<T>::type type; }; // Same as column_major_load_with_stride, but with the PtrT argument itself begin a pointer type. template <typename PtrT, unsigned R, unsigned C, unsigned S> matrix_t<typename remove_pointer<PtrT>::type, R, C> column_major_load_with_stride2(PtrT Ptr) { return __builtin_matrix_column_major_load(Ptr, R, C, S); } void call_column_major_load_with_stride2(float *Ptr) { matrix_t<float, 2, 2> m = column_major_load_with_stride2<float *, 2, 2, 2>(Ptr); }
SubstTemplateTypeParmType is a "sugar" type node, like a typedef, and code should generally be looking through it automatically by using getAs rather than isa / dyn_cast.
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
15107 | You should be doing DefaultFunctionArrayLvalueConversion here, which will eliminate all the special cases for arrays, both below and in IRGen. It would've been fine to do that for your other builtin, too, it just wasn't necessary because it can never allow pointers. | |
15110 | Probably best to write it back into the call immediately at this point. | |
15116 | Thinking that you need to do this is a huge indicator that you're doing something wrong later. You should not have to remove type sugar. | |
15121 | You need to allow this expression to be dependently-typed. There's a generic DependentTy that you can use as the result type of the call in this case. | |
15128 | getAs<PointerType>() is the right way to do this. (You won't need getAsArrayType if you decay arrays properly above.) | |
15130 | It's almost never correct to do "local" qualifier manipulation in Sema. You want to remove the const qualifier, which means removing it through however much type sugar might be wrapping it. In reality, though, you actually want to remove *all* qualifiers, not just const. So you should just use getUnqualifiedType(). (And then you need to make sure that the IRGen code works on arbitrarily-qualified pointers. It should already handle address spaces unless you're doing something very strange. To handle volatile, you just need to be able to pass down a volatile flag to your helper function. The other qualifiers should all either not require special handling or not be allowed on integer/float types.) | |
15138 | Value dependence implies type dependence. Butt you can't do these checks until after you've at least lowered placeholders. It's not really necessary to build a DependentSizedMatrixType here rather than just using DependentTy. It's not a bad thing to do — it *could* enable better type-checking of templates, like if you did this load and then had code trying to do a non-matrix operation on the result you could maybe reject that immediately instead of waiting for instantiation — but it's not really necessary, either. | |
15178 | It'd be nice to have comments for these magic values, like /*stride*/ 2. |
Updated to
- use DefaultFunctionArrayLvalueConversion for pointer conversion, use getAs<PointerType> subsequently
- return Context.DependentTy if any part of the result matrix type is still type-dependent
- add assertion & todo for volatile
- pass string arguments to diagnostics instead of magic integers
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
15107 | Great thanks! That & together with DependentTy seems to solve the issue related to pointer type template expressions. | |
15110 | Updated to update the call immediately after conversions here and below. | |
15116 | Not needed anymore, as mentioned above. Now the remove_pointer test also works :) | |
15121 | I've updated the code to use DependentTy if any of the parts of the result matrix type is still dependently-typed. | |
15130 |
Updated. Currently volatile cannot be specified for the @llvm.matrix.columnwise.load/store builtins. I'll put up an update for the intrinsics, for now I added an assertion in IRGen. I recently put up a patch that allows adding nuw/nsw info to the multiply builtin using operand bundles (D81166). For volatile, we could add another bundle or a boolean argument like we have for memcpy intrinsic I think. I am leaning towards an operand bundle version for this optional argument. Do you have any preference? | |
15138 |
I moved the conversion earlier.
Thanks, I opted to use DependenTy. getDependentSizedMatrixType asserts that the element type is a valid matrix element type (that's the reason for initially peeking through the template substitution expressions). We could still return a DependentSizedMatrixType if only the row or column expressions are dependently-typed, but from your comment I think it probably won't be worth it. | |
15178 | On second thought, the benefits of the magic numbers is rather small. I updated the diagnostics to take strings with the names of the arguments directly. |
clang/include/clang/Basic/DiagnosticSemaKinds.td | ||
---|---|---|
10789 | These are the same now. | |
clang/lib/CodeGen/CGBuiltin.cpp | ||
2403 | This can be simplified now. | |
2409 | You should honor the alignment of Src. If you emit a bunch of scattered loads, e.g. if the stride is not a constant, it might just be a cap on the alignment of the individual loads rather than a general optimization; but still, you should honor it. | |
clang/lib/Sema/SemaChecking.cpp | ||
15094 | You do need to check whether your extension is enabled in this builtin. | |
15112 | You can just bail out early here (set a dependent type on the expression and return) if PtrExpr is type-dependent. | |
15178 | Actually, isn't this diagnostic redundant with the conversion you do in ApplyArgumentConversions? | |
15185 | Might as well hoist the MaybeRows check up so that we skip this whole thing if we don't have a row count. |
clang/lib/Sema/SemaChecking.cpp | ||
---|---|---|
15130 | The only thing I really care about is that it can't be dropped implicitly. That's not legal with a bundle, but it's maybe a little more likely as an error of omission. On the other hand, you do need to pass down an alignment, and that really shouldn't be an optional argument, so maybe that's an opportunity to add a volatile argument as well. |
Simplified code as suggested, check if matrix type extensions is enabled (and add test) and set align attribute for pointer argument.
clang/include/clang/Basic/DiagnosticSemaKinds.td | ||
---|---|---|
10789 | Ah yes, it has been unused actually. Dropped. | |
clang/lib/CodeGen/CGBuiltin.cpp | ||
2403 | Folded the 2 statements. | |
2409 | Pass the alignment through to the builder. It sets the align attribute for the pointer argument now | |
clang/lib/Sema/SemaChecking.cpp | ||
15094 | Done and also added a test. | |
15112 | added early exit. | |
15130 | I think for alignment we can use the align call attribute, which is what I am using in the latest update. | |
15185 | Moved to the outer if. |
clang/include/clang/Sema/Sema.h | ||
---|---|---|
4708 | Please spell out "type" in the method name. | |
12126 | I don't think the word "overload" is doing anything in either of these method names. | |
clang/lib/Sema/SemaChecking.cpp | ||
15130 | Is there a reason this intrinsic can't be changed? You don't need to do it in this patch, but using the "align" attribute as call-site attribute that's only meaningful on certain initrinsics seems like a really poor representation choice, especially for something as semantically important as an alignment assumption. |
clang/include/clang/Sema/Sema.h | ||
---|---|---|
12126 | Removed Overload here and for SemaBuiltinMatrixTranspose | |
clang/lib/Sema/SemaChecking.cpp | ||
15130 | I think we should be able to change them. I put up D81472 to update the load/store intrinsics to update the name, types of stride/rows/columns and add a IsVolatile flag. We could also pass the alignment as an extra parameter, but it seems like the align attribute already provides a way to specify alignment on a per-argument basis. Using it would mean we don't have to teach various passes that use/propagate alignment info about the new intrinsics. |
Thank you very much again John! This patch is pending on a few smallish improvements to the load/store intrinsics (D81472) and I'll land once that one is wrapped up.
These are the same now.