This is an archive of the discontinued LLVM Phabricator instance.

[Polly] A new algorithm for identification of a SCoP statement that implement a matrix multiplication
ClosedPublic

Authored by gareevroman on Jan 5 2017, 8:18 AM.

Details

Summary

The current identification of a SCoP statement that implement a matrix multiplication does not help to identify different permutations of loops that contain it and check for dependencies, which can prevent it from being optimized. It also requires external determination of the operands of the matrix multiplication. This patch contains the implementation of a new algorithm that helps to avoid these issues.

Diff Detail

Repository
rL LLVM

Event Timeline

gareevroman updated this revision to Diff 83239.Jan 5 2017, 8:18 AM
gareevroman retitled this revision from to [Polly] A new algorithm for identification of a SCoP statement that implement a matrix multiplication.
gareevroman updated this object.
gareevroman added a subscriber: pollydev.

I've found out that test/ScheduleOptimizer/pattern-matching-based-opts.ll, test/ScheduleOptimizer/pattern-matching-based-opts_2.ll, test/ScheduleOptimizer/pattern-matching-based-opts_3.ll contain code that produced array accesses, which should be delinearized to be identified by the algorithm. Since, as far as I understand, they are delinearized in common case, these test cases were modified. Should we commit it in a separate patch?

Meinersbur edited edge metadata.Jan 11 2017, 4:22 AM

I did not review the soundness of the detection algorithm.

include/polly/ScheduleOptimizer.h
24–67 ↗(On Diff #83239)

Consider moving these into the polly namespace to not pollute the global namespace in file that include this header. This should also make the polly:: part unnecessary.

lib/Transform/ScheduleOptimizer.cpp
489 ↗(On Diff #83239)

Consider making DimInPos an unsigned or i an int. I'd prefer the latter.

504 ↗(On Diff #83239)

This is the same \brief as for isMatMulOperandConstraint. Can you make them more specific?

518 ↗(On Diff #83239)

Please consider making i an unsigned or store the result of isl_local_space_dim in a temporary int.

560 ↗(On Diff #83239)

As a general rule, please consider using int (or another signed type if you need more bits) for loop induction variables. This allows the compiler to do more optimizations, static analyzers warn you that a looparound could occur nad ubsan to warn you if a looparound does oocur.

610 ↗(On Diff #83239)
Map = isl_map_set_tuple_id(Map, DimType, DimId);
642 ↗(On Diff #83239)

Please add braces

643–647 ↗(On Diff #83239)

Maybe store (*MemA) in a variable?

661–664 ↗(On Diff #83239)

k must also be at position Pos.

668 ↗(On Diff #83239)

I don't understand this description of Pos

686 ↗(On Diff #83239)

Using isl_set_plain_get_val_if_fixed makes the function a best-effort algorithm. isl_set_plain_get_val_if_fixed can return NULL and I don't see code handling this case.

715–716 ↗(On Diff #83239)

This could use foreachEltWithBreak and with isMatMulOperandBasicMap made a lambda.

723 ↗(On Diff #83239)

Consider using something more specific than check.

737 ↗(On Diff #83239)

Consider storing (*MemA) to a variable.

777 ↗(On Diff #83239)

Not sure whether using a unicode symbol (→) in source code is a good idea

778–780 ↗(On Diff #83239)

There are ... expect ...

Did you mean

There are no ... except ...

?

796 ↗(On Diff #83239)

The last access could also be an arbitrary scalar access. Not sure whether on relying on the access' order is such a good idea. Could you use that last non-scalar access instead? These should be in-order.

1098–1099 ↗(On Diff #83239)

Consider declaring DimOutNum as int.

grosser edited edge metadata.Jan 17 2017, 11:56 AM

Hi Roman,

here some more comments.

lib/Transform/ScheduleOptimizer.cpp
618 ↗(On Diff #83239)

doesn -> does

663 ↗(On Diff #83239)

correspondS to the dependenCY produced by

775 ↗(On Diff #83239)

relations: MA2, MA3, and MD4 represent

test/ScheduleOptimizer/pattern-matching-based-opts.ll
65 ↗(On Diff #83239)

I don't think these attributes are needed, are they?

test/ScheduleOptimizer/pattern-matching-based-opts_2.ll
67 ↗(On Diff #83239)

Drop these.

Thanks for comments! I've updated the patch according to them.

P.S.: I've also tried to fix possible issues related to usage of ScheduleTreeOptimizer::createMacroKernel with band nodes that have more than three dimensions.

This is the same \brief as for isMatMulOperandConstraint. Can you make them more specific?

Right. Probably there is a duplication of code. I've tried to factor out the checking of coefficients of input and output dimensions.

This could use foreachEltWithBreak and with isMatMulOperandBasicMap made a lambda.

Wouldn't this require to capture variable to pass DimInPos and also rewrite isMatMulOperandBasicMap, using IslPtrs?

It seems that it wouldn't make the code more readable. Furthermore, according to my discussion with Tobias, probably it wouldn't good for the style of ScheduleOptimizer.cpp, if some of its functions used raw pointers and others used something else.

The last access could also be an arbitrary scalar access. Not sure whether on relying on the access' order is such a good idea. Could you use that last non-scalar access instead? These should be in-order.

I think we could do it.

This could use foreachEltWithBreak and with isMatMulOperandBasicMap made a lambda.

Wouldn't this require to capture variable to pass DimInPos and also rewrite isMatMulOperandBasicMap, using IslPtrs?

It seems that it wouldn't make the code more readable. Furthermore, according to my discussion with Tobias, probably it wouldn't good for the style of ScheduleOptimizer.cpp, if some of its functions used raw pointers and others used something else.

It would make capturing FirstPost and SecondPos a lot easier as they don't need to be aggregated into a DimInPos and casted around. However, I agree that we may not want to introduce an IslPtr style here only because of this.

Michael

include/polly/ScheduleOptimizer.h
75 ↗(On Diff #84945)

I don't know why this is not also in the polly namespace. However, this is an unrelated change.

157 ↗(On Diff #84945)

multiplication

lib/Transform/ScheduleOptimizer.cpp
659 ↗(On Diff #84945)

Another unicode character

Hi Michael,

Thanks for comments! I've tried to address them, fix issues related to permuteDimensions and also simplify the identification.

It would make capturing FirstPost and SecondPos a lot easier as they don't need to be aggregated into a DimInPos and casted around. However, I agree that we may not want to introduce an IslPtr style here only because of this.

OK. I'll fix it when everybody agrees to introduce an IslPtr style in ScheduleTreeOptimizer.

I don't know why this is not also in the polly namespace. However, this is an unrelated change.

OK. I'll try to fix it soon.

Meinersbur accepted this revision.Jan 31 2017, 6:39 AM

I don't really know how this new algorithm is better than the current one. Unless Tobias has a remark regarding that, I'd trust you and accept the change.

test/ScheduleOptimizer/pattern-matching-based-opts_3.ll
193 ↗(On Diff #85866)

Probably Tobias meant to drop this one as well.

This revision is now accepted and ready to land.Jan 31 2017, 6:39 AM
grosser accepted this revision.Feb 1 2017, 10:44 PM

LGTM

lib/Transform/ScheduleOptimizer.cpp
473 ↗(On Diff #85866)

has only (drop THE)

This revision was automatically updated to reflect the committed changes.