This is an archive of the discontinued LLVM Phabricator instance.

add pragmas to control Software Pipelining optimisation
ClosedPublic

Authored by avl on Dec 14 2018, 9:41 AM.

Details

Summary
[PIPELINER] add two pragmas to control Software Pipelining optimisation.
    
  #pragma clang loop pipeline(disable)
    
      Disable SWP optimization for the next loop.
      “disable” is the only possible value.
    
  #pragma clang loop pipeline_initiation_interval(number)
    
      Set value of initiation interval for SWP
      optimization to specified number value for
      the next loop. Number is the positive value
      greater than 0.
    
  These pragmas could be used for debugging or reducing
  compile time purposes. It is possible to disable SWP for
  concrete loops to save compilation time or to find bugs
  by not doing SWP to certain loops. It is possible to set
  value of initiation interval to concrete number to save
  compilation time by not doing extra pipeliner passes or
  to check created schedule for specific initiation interval.

  That is clang part of the fix

Diff Detail

Event Timeline

avl created this revision.Dec 14 2018, 9:41 AM
avl updated this revision to Diff 178264.Dec 14 2018, 12:22 PM

deleted small typo in CGLoopInfo.cpp

avl added subscribers: marksl, bcahoon, kparzysz, yakush.
aaron.ballman added inline comments.Dec 17 2018, 6:15 AM
include/clang/Basic/Attr.td
2871–2872

Missing full stops at the end of the comments. Also, having read "for only 'Value' value", I'm still not certain what's happening. I think this is a count of some kind, so perhaps "Tries to pipeline 'Values' times." but I don't know what the verb "pipeline" means in this context.

Are users going to understand what the ii means in the user-facing name?

include/clang/Basic/AttrDocs.td
2582

Vector width, interleave count, unrolling count, and the initiation interval for pipelining can be explicitly specified.

2654

The disable state can only be specified for `#pragma clang loop pipeline`.

2663

Spell out ii

instructs software -> instructs the software

2664

to use only specified -> to use the specified

Remove the space before the colon as well.

Having read the docs, I would have no idea how to pick a value for the initiation interval. I'm still not even certain what it controls or does. Can you expand the documentation for that (feel free to link to other sources if there are authoritative places we can point the user to)?

2676

details including -> details, including

include/clang/Basic/DiagnosticParseKinds.td
1187–1188

80-col

1202

Can you roll this into the above diagnostic with another %select, or does that make it too unreadable?

lib/CodeGen/CGLoopInfo.cpp
29

!Attrs.PipelineDisabled

126

Formatting is incorrect here (and below) -- you should run the patch through clang-format.

lib/CodeGen/CGLoopInfo.h
70

Missing full-stop. Here and elsewhere -- can you look at all the comments and make sure they're properly punctuated?

71

One good refactoring (don't feel obligated to do this) would be to use in-class initializers here.

test/Parser/pragma-pipeline.cpp
25

Can you add a few tests:

#pragma clang loop pipeline_ii_count(1 to show that we also diagnose the missing closing paren
#pragma clang loop pipeline_ii_count(1.0) to show that we diagnose a non-integer literal count
#pragma clang loop pipeline enable to show that we diagnose the missing open paren

avl marked 2 inline comments as done.Dec 17 2018, 7:32 AM

Ok, I will address all issues.

include/clang/Basic/Attr.td
2871–2872

As to ii - yes that should be understood by users, because it is important property of software pipelining optimization. Software Pipelining optimization tries to reorder instructions of original loop(from different iterations) so that resulting loop body takes less cycles. It started from some minimal value of ii and stopped with some maximal value. i.e. it tries to built schedule for min_ii, then min_ii+1, ... until schedule is found or until max_ii reached. If resulting value of ii already known then instead of searching in range min_ii, max_ii - it is possible to create schedule for already known ii.

probably following spelling would be better :

pipeline_ii_count: create loop schedule with initiation interval equals 'Value'

include/clang/Basic/DiagnosticParseKinds.td
1202

yes, it makes things too complicated. Though I could do it if necessary.

aaron.ballman added inline comments.Dec 17 2018, 10:29 AM
include/clang/Basic/Attr.td
2871–2872

because it is important property of software pipelining optimization.

My point is that I have no idea what "ii" means and I doubt I'll be alone -- does the "ii" differentiate this from other kinds of pipeline loop pragmas we plan to support, or is expected that "pipeline_ii_count" be the only pipeline count option? Can we drop the "ii" and not lose anything?

pipeline_ii_count: create loop schedule with initiation interval equals 'Value'

equals 'Value' -> equal to 'Value'

include/clang/Basic/DiagnosticParseKinds.td
1202

Not required, but I also didn't think the duplication here and below was a good approach either. But yeah, I'm not certain there's a better way to do this even if the list were rearranged.

avl marked an inline comment as done.Dec 17 2018, 10:33 AM
avl added inline comments.
include/clang/Basic/Attr.td
2871–2872

Do you think spelling out ii would help ?

pipeline_initiation_interval(number)

aaron.ballman added inline comments.Dec 17 2018, 10:38 AM
include/clang/Basic/Attr.td
2871–2872

I think it would help. I'm less concerned about the internal names of things than I am for the user-facing identifiers and whether it will be understandable to people other than compiler and optimizer writers.

hfinkel added inline comments.
include/clang/Basic/Attr.td
2871–2872

Yes, I also think that it would help to spell this out.

avl added a comment.Dec 17 2018, 12:39 PM

Ok, So thank you for the suggestions. I will implement that way.

avl updated this revision to Diff 178956.Dec 19 2018, 2:29 PM
avl edited the summary of this revision. (Show Details)

Please consider this change:

  1. addressed style issues and formatted patch with clang-format
  2. renamed pragma pipeline_ii_count to pipeline_initiation_interval
  3. left LoopAttributes inclass initializers refactoring for some another fix :-)
  4. added more tests
aaron.ballman added inline comments.Dec 20 2018, 5:28 AM
include/clang/Basic/AttrDocs.td
2673

"Check" -- check where? Perhaps "Search for" but I still don't think that's very satisfying for our documentation as there's no telling what users will find or how it relates (if at all) to this feature.

on optimisation -> on this optimization

I find this sentence doesn't really direct me to much of value (research presentations and class lectures, mostly), so I'd rather see the important details spelled out in our docs so that users don't have to guess as much.

2677–2678

There is nothing in the linked documentation that describes pipeline initiation intervals, so I'm still left wondering how to pick a value for the attribute argument. I think the docs need a bit more detail for what that value means and how it impacts the user's code.

avl marked an inline comment as done.Dec 20 2018, 5:36 AM
avl added inline comments.
include/clang/Basic/AttrDocs.td
2673

I am just unsure which references to include here ... links to wikipedia or some concrete resources do not look good.

So you are suggesting to spell things out here... Ok, I will prepare short explanation...

aaron.ballman added inline comments.Dec 20 2018, 5:42 AM
include/clang/Basic/AttrDocs.td
2673

I am just unsure which references to include here ... links to wikipedia or some concrete resources do not look good.

If there's a definitive source for the information, like a research paper or a Wikipedia entry that does a great job explaining it, then feel free to link to it. We can always update dead links when they arise.

So you are suggesting to spell things out here... Ok, I will prepare short explanation...

If there's not a good place to link to that has the information, then yes, spelling it out would be great. Basically, someone reading these docs should have enough information available (directly or through links) to help them understand the feature even if it doesn't make them an expert in the topic. My concern with the docs right now is that I have no idea what value to put there (Does it have to be positive? Nonzero? Is there an upper bound?) because I can't find an explanation about what that value actually controls.

avl marked an inline comment as done.Dec 20 2018, 8:12 AM
avl added inline comments.
include/clang/Basic/AttrDocs.td
2677–2678

How about this variant :

Software Pipelining optimization is a technique used to optimize loops by
utilizing instructions level parallelism. It reorders loop instructions to
overlap iterations. As the result new iteration started before previous
have finished. The Modulo Scheduling technique creates schedule for one
iteration such that when repeating at regular interval no inter-iteration
dependence violated. This constant interval(in cycles) between the start
of iterations called initiation interval. Cycles number of one iteration
of newly generated loop matches with Initiation Interval. For further
details see https://en.wikipedia.org/wiki/Software_pipelining and
"Swing Modulo Scheduling: A Lifetime-Sensitive Approach", by J. Llosa,
A. Gonzalez, E. Ayguade, and M. Valero.

`#pragma clang loop pipeline and #pragma loop pipeline_initiation_interval``
could be used as hints for Software Pipelining optimization.

Using `#pragma clang loop pipeline(disable)` avoids software pipelining optimization. The disable state can only be specified:

.. code-block:: c++

#pragma clang loop pipeline(disable)
for (...) {
  ...
}

Using `#pragma loop pipeline_initiation_interval`
instructs the software pipeliner to check the specified initiation interval.
If schedule found the result loop iteration would have specified
cycle count:

.. code-block:: c++

#pragma loop pipeline_initiation_interval(10)
for (...) {
  ...
}
avl updated this revision to Diff 179145.Dec 20 2018, 1:47 PM

update documentation section.

avl updated this revision to Diff 179248.Dec 21 2018, 1:10 AM

put a better link to doc

aaron.ballman added inline comments.Dec 21 2018, 11:40 AM
include/clang/Basic/AttrDocs.td
2650

instructions level -> instruction-level

2651

As a result, the next iteration starts before the previous iteration has finished.

2652–2654

The module scheduling technique creates a schedule for one iteration such that when repeating at regular intervals, no inter-iteration dependencies are violated.

2655

is called the initiation interval.

I can't quite parse the second sentence, though. Is it trying to say that the initiation interval is the number of cycles for a single iteration of the optimized loop?

2676

I'm can't quite parse this sentence either. Is it trying to say that the scheduler will try to find a loop iteration cycle count that most-closely matches the specified initiation interval?

2677

The initiation interval must be a positive number greater than zero.

avl marked 2 inline comments as done.Dec 21 2018, 1:30 PM

will correct all mistakes. please check explanations for the questions.

include/clang/Basic/AttrDocs.td
2655

Not quite right : initiation interval is not a number of cycles for a single iteration of the optimized loop. But initiation interval matches with number of cycles for a single iteration of the optimized loop. Initiation interval is the number of cycles between next and previous iteration of original loop. New loop created so that single iteration of the optimized loop has the same number cycles as initiation interval( thus every new iteration of original loop started each time when new iteration of optimized loop started - difference between iterations is initiation interval).

trying to rephrase : number of cycles for a single iteration of the optimized loop matches with number of cycles of initiation interval.

2676

I wanted to say that pipeliner will try to find a schedule that exactly matches the specified initiation interval. And if schedule would be found then single iteration of the optimized loop would have exactly the same amount of cycles as initiation interval. trying to rephrase :

If schedule would be found then single iteration of the optimized loop would have exactly the same amount of cycles as initiation interval has.

aaron.ballman added a subscriber: tonic.

Adding @tonic as a reviewer, who may have suggestions on how to improve the documentation wording.

include/clang/Basic/AttrDocs.td
2655

I am still a bit fuzzy on the details (optimizations are not my area of expertise). Is this an accurate what to rephrase it?

"The initiation interval is the number of cycles between two iterations of an unoptimized loop. These pragmas cause a new. optimized loop to be created such that a single iteration of the loop executes in the same number of cycles as the initiation interval."

2676

Ah, okay! So what happens if an exact match cannot be found? Does it behave as though the pragma was not specified at all (as in, the pragma is ignored)?

avl marked 2 inline comments as done.Dec 24 2018, 12:09 AM
avl added inline comments.
include/clang/Basic/AttrDocs.td
2655

I will take it with small modifications.

2676

yes. if schedule was not found then the loop would stay unchanged.

avl updated this revision to Diff 179459.Dec 24 2018, 12:20 AM

please consider changes in AttrDocs.td

The docs are getting much closer -- just a few grammatical things left to fix, I believe.

include/clang/Basic/AttrDocs.td
2581–2584

Can you combine the new sentence with the previous one? The directive allows vectorization, interleaving, pipelining, and unrolling to be enabled or disabled.

2657

in newly created schedule -> in the newly created schedule

New optimized loop -> A new, optimized loop

2662

for Software Pipelining optimization. -> for the software pipelining optimization.

2663

or c++11 range-based for loop. -> or a C++11 range-based for loop.

2666

avoids software -> avoids the software

2678

If schedule was found -> If a schedule was found

2679

would have specified cycle count. -> would have the specified cycle count.

If schedule was not -> If a schedule was not
would stay -> remains

avl marked an inline comment as done.Jan 2 2019, 6:30 AM
avl added inline comments.
include/clang/Basic/AttrDocs.td
2581–2584

In that case it would change a meaning and become incorrect. Pipelining could only be disabled. It could not be enabled.

avl updated this revision to Diff 179843.Jan 2 2019, 6:34 AM

Thank you. addressed all grammar things. please check it once more.

aaron.ballman added inline comments.Jan 2 2019, 8:01 AM
include/clang/Basic/AttrDocs.td
2581–2584

Ah, yes, that would change the meaning then. I'd go with: The directive allows pipelining to be disabled, or vectorization, interleaving, and unrolling to be enabled or disabled.

2663

I'd like to see a Sema test where the pragma is put at global scope to see if the diagnostic is reasonable or not. e.g.,

#pragma clang loop pipeline(disable)

int main() {
  for (int i = 0; i < 10; ++i)
    ;
}
2680–2681

Please add a Sema test with a negative number as the initiation interval.

include/clang/Basic/DiagnosticParseKinds.td
1188

pipeline_initiation_interval or distribute -> pipeline_initiation_interval, or distribute

(Keeps the Oxford comma as in the original diagnostic.)

test/Parser/pragma-pipeline.cpp
29

This isn't really a parser test -- it should probably be in Sema instead.

36–48

These should also move to Sema.

avl updated this revision to Diff 179933.Jan 2 2019, 1:37 PM

Thank you. I updated doc, splitted Parser test, added Sema tests. Please check it once more.

aaron.ballman added inline comments.Jan 3 2019, 9:47 AM
test/Sema/pragma-pipeline.cpp
3

I think this error is pretty unfortunate -- it doesn't really help the user to understand what's wrong with their code. Can it be improved?

avl marked an inline comment as done.Jan 3 2019, 11:42 AM
avl added inline comments.
test/Sema/pragma-pipeline.cpp
3

Agreed, this does not look very informative. It surely can be improved. Though it looks like not related to my fix. My fix is to add two additional loop hints. That kind of diagnostic related to all loop hints "clang loop". I.e. all pragmas "clang loop" can be checked for various cases and for better diagnostic. It looks like separate task. But OK, I will check that case.

aaron.ballman accepted this revision.Jan 3 2019, 12:18 PM

LGTM!

test/Sema/pragma-pipeline.cpp
3

That's a fair point -- I'm fine with doing that work in a separate patch. I don't think we need to hold this patch up further for it, anyway.

This revision is now accepted and ready to land.Jan 3 2019, 12:18 PM
avl added a comment.Jan 3 2019, 1:18 PM

Thank you! Aaron, Could you integrate this patch, please? I do not have commit access yet.

aaron.ballman closed this revision.Jan 4 2019, 9:28 AM

I commit this in r350414, thank you for the patch!