Page MenuHomePhabricator

[Matrix] Add __builtin_matrix_column_load to Clang.
ClosedPublic

Authored by fhahn on Jan 15 2020, 9:25 AM.

Details

Summary

This patch add __builtin_matrix_column_major_store to Clang,
as described in clang/docs/MatrixTypes.rst. In the initial version,
the stride is not optional yet.

Diff Detail

Event Timeline

fhahn created this revision.Jan 15 2020, 9:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 15 2020, 9:25 AM
Herald added a subscriber: tschuett. · View Herald Transcript

Unit tests: unknown.

clang-tidy: unknown.

clang-format: unknown.

Build artifacts: diff.json, console-log.txt

fhahn retitled this revision from [Matrix] Add __builtin_matrix_column_load to Clang (WIP). to [Matrix] Add __builtin_matrix_column_load to Clang..May 13 2020, 4:05 PM
fhahn edited the summary of this revision. (Show Details)
fhahn added reviewers: rjmccall, rsmith, jfb, Bigcheese.
fhahn updated this revision to Diff 263879.May 13 2020, 4:07 PM
fhahn edited the summary of this revision. (Show Details)

ping. Simplify code, extend tests. This should now be ready for review.

fhahn updated this revision to Diff 269249.Jun 8 2020, 8:25 AM

Ping.

Applied feedback from D72778 to this patch, improved tests, support conversions/placeholders.

One thing I am not sure is how to properly handle template substitutions for the pointer expression for code like the one below, where we need to apply substitutions to get the actual pointer type. Currently the patch looks through SubstTemplateTypeParmType types in Sema to construct the result type. Should we look through SubstTemplateTypeParmType in IRGen too to decide whether to call EmitPointerWithAlignment or EmitArrayToPointerDecay? Or is there a place in sema that should get rid of the substitution (perhaps in SemaChecking.cpp)?

template <typename T> struct remove_pointer {
typedef T type;
};

template <typename T> struct remove_pointer<T *>{
typedef typename remove_pointer<T>::type type;
};

// Same as column_major_load_with_stride, but with the PtrT argument itself begin a pointer type.
template <typename PtrT, unsigned R, unsigned C, unsigned S>
matrix_t<typename remove_pointer<PtrT>::type, R, C> column_major_load_with_stride2(PtrT Ptr) {
return __builtin_matrix_column_major_load(Ptr, R, C, S);
}

void call_column_major_load_with_stride2(float *Ptr) {
matrix_t<float, 2, 2> m = column_major_load_with_stride2<float *, 2, 2, 2>(Ptr);
}

SubstTemplateTypeParmType is a "sugar" type node, like a typedef, and code should generally be looking through it automatically by using getAs rather than isa / dyn_cast.

clang/lib/Sema/SemaChecking.cpp
15134

You should be doing DefaultFunctionArrayLvalueConversion here, which will eliminate all the special cases for arrays, both below and in IRGen.

It would've been fine to do that for your other builtin, too, it just wasn't necessary because it can never allow pointers.

15137

Probably best to write it back into the call immediately at this point.

15143

Thinking that you need to do this is a huge indicator that you're doing something wrong later. You should not have to remove type sugar.

15148

You need to allow this expression to be dependently-typed. There's a generic DependentTy that you can use as the result type of the call in this case.

15155

getAs<PointerType>() is the right way to do this. (You won't need getAsArrayType if you decay arrays properly above.)

15157

It's almost never correct to do "local" qualifier manipulation in Sema. You want to remove the const qualifier, which means removing it through however much type sugar might be wrapping it.

In reality, though, you actually want to remove *all* qualifiers, not just const. So you should just use getUnqualifiedType(). (And then you need to make sure that the IRGen code works on arbitrarily-qualified pointers. It should already handle address spaces unless you're doing something very strange. To handle volatile, you just need to be able to pass down a volatile flag to your helper function. The other qualifiers should all either not require special handling or not be allowed on integer/float types.)

15165

Value dependence implies type dependence. Butt you can't do these checks until after you've at least lowered placeholders.

It's not really necessary to build a DependentSizedMatrixType here rather than just using DependentTy. It's not a bad thing to do — it *could* enable better type-checking of templates, like if you did this load and then had code trying to do a non-matrix operation on the result you could maybe reject that immediately instead of waiting for instantiation — but it's not really necessary, either.

15205

It'd be nice to have comments for these magic values, like /*stride*/ 2.

fhahn updated this revision to Diff 269363.Jun 8 2020, 2:55 PM

Updated to

  • use DefaultFunctionArrayLvalueConversion for pointer conversion, use getAs<PointerType> subsequently
  • return Context.DependentTy if any part of the result matrix type is still type-dependent
  • add assertion & todo for volatile
  • pass string arguments to diagnostics instead of magic integers
fhahn marked 12 inline comments as done.Jun 8 2020, 3:07 PM
fhahn added inline comments.
clang/lib/Sema/SemaChecking.cpp
15134

Great thanks! That & together with DependentTy seems to solve the issue related to pointer type template expressions.

15137

Updated to update the call immediately after conversions here and below.

15143

Not needed anymore, as mentioned above. Now the remove_pointer test also works :)

15148

I've updated the code to use DependentTy if any of the parts of the result matrix type is still dependently-typed.

15157

In reality, though, you actually want to remove *all* qualifiers, not just const. So you should just use getUnqualifiedType(). (And then you need to make sure that the IRGen code works on arbitrarily-qualified pointers. It should already handle address spaces unless you're doing something very strange. To handle volatile, you just need to be able to pass down a volatile flag to your helper function. The other qualifiers should all either not require special handling or not be allowed on integer/float types.)

Updated. Currently volatile cannot be specified for the @llvm.matrix.columnwise.load/store builtins. I'll put up an update for the intrinsics, for now I added an assertion in IRGen. I recently put up a patch that allows adding nuw/nsw info to the multiply builtin using operand bundles (D81166). For volatile, we could add another bundle or a boolean argument like we have for memcpy intrinsic I think. I am leaning towards an operand bundle version for this optional argument. Do you have any preference?

15165

Value dependence implies type dependence. Butt you can't do these checks until after you've at least lowered placeholders.

I moved the conversion earlier.

It's not really necessary to build a DependentSizedMatrixType here rather than just using DependentTy. It's not a bad thing to do — it *could* enable better type-checking of templates, like if you did this load and then had code trying to do a non-matrix operation on the result you could maybe reject that immediately instead of waiting for instantiation — but it's not really necessary, either.

Thanks, I opted to use DependenTy. getDependentSizedMatrixType asserts that the element type is a valid matrix element type (that's the reason for initially peeking through the template substitution expressions). We could still return a DependentSizedMatrixType if only the row or column expressions are dependently-typed, but from your comment I think it probably won't be worth it.

15205

On second thought, the benefits of the magic numbers is rather small. I updated the diagnostics to take strings with the names of the arguments directly.

rjmccall added inline comments.Jun 8 2020, 3:15 PM
clang/include/clang/Basic/DiagnosticSemaKinds.td
10793

These are the same now.

clang/lib/CodeGen/CGBuiltin.cpp
2405

This can be simplified now.

2411

You should honor the alignment of Src. If you emit a bunch of scattered loads, e.g. if the stride is not a constant, it might just be a cap on the alignment of the individual loads rather than a general optimization; but still, you should honor it.

clang/lib/Sema/SemaChecking.cpp
15121

You do need to check whether your extension is enabled in this builtin.

15139

You can just bail out early here (set a dependent type on the expression and return) if PtrExpr is type-dependent.

15205

Actually, isn't this diagnostic redundant with the conversion you do in ApplyArgumentConversions?

15212

Might as well hoist the MaybeRows check up so that we skip this whole thing if we don't have a row count.

rjmccall added inline comments.Jun 8 2020, 3:44 PM
clang/lib/Sema/SemaChecking.cpp
15157

The only thing I really care about is that it can't be dropped implicitly. That's not legal with a bundle, but it's maybe a little more likely as an error of omission. On the other hand, you do need to pass down an alignment, and that really shouldn't be an optional argument, so maybe that's an opportunity to add a volatile argument as well.

fhahn updated this revision to Diff 269378.Jun 8 2020, 4:37 PM
fhahn marked 5 inline comments as done.

Simplified code as suggested, check if matrix type extensions is enabled (and add test) and set align attribute for pointer argument.

Herald added a project: Restricted Project. · View Herald TranscriptJun 8 2020, 4:37 PM
fhahn marked 9 inline comments as done.Jun 8 2020, 4:41 PM
fhahn added inline comments.
clang/include/clang/Basic/DiagnosticSemaKinds.td
10793

Ah yes, it has been unused actually. Dropped.

clang/lib/CodeGen/CGBuiltin.cpp
2405

Folded the 2 statements.

2411

Pass the alignment through to the builder. It sets the align attribute for the pointer argument now

clang/lib/Sema/SemaChecking.cpp
15121

Done and also added a test.

15139

added early exit.

15157

I think for alignment we can use the align call attribute, which is what I am using in the latest update.

15212

Moved to the outer if.

rjmccall added inline comments.Jun 8 2020, 7:29 PM
clang/include/clang/Sema/Sema.h
4707

Please spell out "type" in the method name.

12130

I don't think the word "overload" is doing anything in either of these method names.

clang/lib/Sema/SemaChecking.cpp
15157

Is there a reason this intrinsic can't be changed? You don't need to do it in this patch, but using the "align" attribute as call-site attribute that's only meaningful on certain initrinsics seems like a really poor representation choice, especially for something as semantically important as an alignment assumption.

fhahn updated this revision to Diff 269556.Jun 9 2020, 8:38 AM
fhahn marked 2 inline comments as done.

Adjust naming as suggested, pass through volatile flag.

fhahn marked 3 inline comments as done.Jun 9 2020, 8:45 AM
fhahn added inline comments.
clang/include/clang/Sema/Sema.h
12130

Removed Overload here and for SemaBuiltinMatrixTranspose

clang/lib/Sema/SemaChecking.cpp
15157

I think we should be able to change them. I put up D81472 to update the load/store intrinsics to update the name, types of stride/rows/columns and add a IsVolatile flag. We could also pass the alignment as an extra parameter, but it seems like the align attribute already provides a way to specify alignment on a per-argument basis. Using it would mean we don't have to teach various passes that use/propagate alignment info about the new intrinsics.

rjmccall accepted this revision.Jun 9 2020, 10:00 PM

LGTM.

clang/include/clang/Sema/Sema.h
12130

Thanks.

clang/lib/Sema/SemaChecking.cpp
15157

Ah, I'd forgotten that when we updated memcpy etc., we made it specify an alignment with an alignment parameter attribute instead of a separate argument. Yes, that's fine to imitate.

This revision is now accepted and ready to land.Jun 9 2020, 10:00 PM

LGTM.

Thank you very much again John! This patch is pending on a few smallish improvements to the load/store intrinsics (D81472) and I'll land once that one is wrapped up.

fhahn updated this revision to Diff 271613.Jun 18 2020, 2:03 AM

Rebased after recent parent patches landed. Will commit shortly.

This revision was automatically updated to reflect the committed changes.