Page MenuHomePhabricator

[NewPassManager] Adding pass tuning options: loop vectorize.

Authored by asbirlea on Mar 22 2019, 3:29 PM.



Trying to add the plumbing necessary to add tuning options to the new pass manager.
Testing with the flags for loop vectorize.

Diff Detail


Event Timeline

asbirlea created this revision.Mar 22 2019, 3:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2019, 3:29 PM
chandlerc requested changes to this revision.Mar 29 2019, 12:24 AM
chandlerc added inline comments.
69 ↗(On Diff #191964)

Maybe: "Tunable parameters for passes in the default pipelines."

70 ↗(On Diff #191964)

Especially in the type name I'd like to emphasize that this isn't really for fundamental options, but for tuning.

Based on my comment below about how to use this, I would suggest: PipelineTuningOptions.

72 ↗(On Diff #191964)

Make this a doxygen comment and add more details? Assume the reader has just started trying to use this from a library like Halide. What do they need to know?

74–75 ↗(On Diff #191964)

For consistency and readability, let's use positive bools and just change the defaults?

74–75 ↗(On Diff #191964)

Vertical whitespace and doxygen comments on each option since these are expected to be directly controlled by users?

86 ↗(On Diff #191964)

Maybe PTO?

203 ↗(On Diff #191964)

Given that the "none" behavior is to just use the default constructor, I'd rather just accept by value and have the default argument be the default constructed value. That will in turn let us trivially initialize things.

219 ↗(On Diff #191964)

Why not support a CLI here?

859 ↗(On Diff #191964)

If I understand correctly, the first parameter isn't about *unrolling* but about interleaving. We should be careful to use that terminology as disabling unrolling is a potentially reasonable thing to add.

201 ↗(On Diff #191964)

I wouldn't use the tuning options here. If we want to support explicitly controlled behavior of the vectorization pass here, we should do so by *parsing* values out of the textual pipeline description much as we do for a few other passes. The tuning struct seems more appropriate for the pipeline above exclusively (and I would name it accordingly).

This revision now requires changes to proceed.Mar 29 2019, 12:24 AM
hsaito added a subscriber: hsaito.Mar 29 2019, 12:33 PM
hsaito added inline comments.
859 ↗(On Diff #191964)

@chandlerc, you are right about the terminology mess. It is the vectorizer that is "incorrectly" piggybacking on the unroll flag. If we are to start fixing the terminology here, we'd need to create DisableInterleaveLoops. If we are going to that direction, I suggest starting from setting the same value as DisableUnrollLoops so that fixing the terminology mess is NFC.

Potentially bigger mess is programmer education, to teach that interleaving and unrolling are different, as they often tend to result in identical ASM after instruction reordering --- but I fully agree that we still need to educate them (including compiler writers) to use the appropriate terminology.

asbirlea updated this revision to Diff 192904.Mar 29 2019, 1:53 PM
asbirlea marked 11 inline comments as done.

Address comments and asking for clarifications.

asbirlea added inline comments.Mar 29 2019, 1:53 PM
859 ↗(On Diff #191964)

Yes, that's true. I used the same naming and values as the ones in the old pass manager (including not having a cl option).

Updated now (also means the old and new pass managers are now inconsistent as far as namings, defaults etc, but it's probably preferable to update the old one to be consistent with this?).

201 ↗(On Diff #191964)

I removed the options for now, but I'm still confused how things work TBH.

What if the defaults in the cl::opts are different from the constructor defaults?

Regarding parsing, I tried to understand how this is done for LoopUnroll and I only got more confused.
There's an -unroll-runtime "global" cl::opt in LoopUnrollPass.cpp that feeds into a TargetTransformInfo::UnrollingPreferences. In the new pass manager if you add in the pipeline -passes='unroll<runtime>', that's parsed into an LoopUnrollOptions and passed to the pass in PassRegistry.def.
It looks like the two managers intersect in tryToUnrollLoop, where (I assume based on the tests that) the explicit values that exist only for the NPM can overwrite the "global" ones. So there's two ways of setting defaults, and one can overwrite the other?

The tuning I had in mind to add next was -max-acc-licm-promotion in LICM. Ideally, I'd like that the value of this flag be used always as the default. When this is changed with the flag, use that new default. When it's changed in the PassBuilder, again, use that new default. Whereas if I add the parsing approach here, there's a "second source of a default".

I'd also prefer to not have the defaults embedded in the constructor, because, again, that's a "second source of defaults". If we later decide to change the default in the cl::opt option, then we also need to change the default in the constructor, no? (I am thinking of unsigned values here, caps, not bools that enable/disable things that are less likely to change often)

I still haven't understood how everything here works, so any kind of clarifications are very much welcome!

asbirlea updated this revision to Diff 194188.Apr 8 2019, 12:10 PM

Attempting to merge defaults to a single cl::opt.
Please note the comments on defaults in lib/Transforms/Vectorize/LoopVectorize.cpp. Feedback there very welcome.

The default for LoopVectorize in the PassManagerBuilder is false. (opt checks for this, and sets LoopVectorize to true based on Olevel/size)
The default when creating the LoopVectorize with no arguments, is true (VectorizeOnlyWhenForced = false).

I think the approach here looks really good. Comments below.

77 ↗(On Diff #194188)

Comment needs updating

77–81 ↗(On Diff #194188)

Comment needs updating.

78–82 ↗(On Diff #194188)

I think we should name these consistently. Two options that seem equally good to me:

InterleaveLoops and VectorizeLoops


LoopInterleaving and LoopVectorization

I think I very mildly lean toward the second, but it is a very mild preference.

278 ↗(On Diff #194188)

You should be able to just define this as llvm::SetLoopsInterleaved?

280–283 ↗(On Diff #194188)

Following whatever pattern you pick above for the PassTuningOptions, I'd call these EnableFooLoops or EnableLoopFoo for both interleaving and vectorization.

281 ↗(On Diff #194188)

Update description

288–293 ↗(On Diff #194188)

Not really sure what this commenting structure is trying to convey, but I don't think you want to use commented out code. I'd just write prose in a comment on the method that describes what's going on here.

chandlerc added inline comments.Apr 8 2019, 2:24 PM
859 ↗(On Diff #191964)

I think Alina's patch does a reasonable job of starting to fix this.

I don't think the education problem is that bad. Unrolling and interleaving produce pretty different results in LLVM.

asbirlea marked an inline comment as done.Apr 8 2019, 2:37 PM
asbirlea added inline comments.
288–293 ↗(On Diff #194188)

I *really* don't want to leave this as is. It was meant as a discussion point. Sorry for not being clearer.

I wanted to use the "expected defaults", where interleaving and vectorization are enabled by default, but vectorization is disabled by default in the old pass manager.
Why is this?
[The interleaving is set to false already in the old pass manager, the DisableUnrollLoops variable.]

I guess I am asking why are having vectorization disabled by default in the old pass manager, but enabled by default when creating the LoopVectorizePass with no params?

We also have a strange (to me) behavior in opt.cpp:351:

  • if an opt-specific option to disable-loop-vectorize is set, then it's disabled.
  • Otherwise, if it's explicitly enabled with -vectorize-loops or -loop-vectorize (there's two?), then it's enabled.
  • Otherwise, it assumes it's disabled (that's the default here...), and enables it if OptLevel > 1 && SizeLevel < 2.

So I can't really change the default to enable vectorization, without some changes here.

asbirlea updated this revision to Diff 194396.Apr 9 2019, 2:21 PM
asbirlea marked 10 inline comments as done.

Address comments.

288–293 ↗(On Diff #194188)

To make progress here, I moved the explanation to a comment block.
This way, this patch can go in as is since it doesn't change any defaults, it's just flag refactoring and plumbing. Any change that *does* change the defaults can be a follow up to this patch.

asbirlea updated this revision to Diff 195777.Apr 18 2019, 10:20 AM

Rebase after change that parses vectorize flags.

Meinersbur added inline comments.
288–293 ↗(On Diff #194188)

I guess I am asking why are having vectorization disabled by default in the old pass manager, but enabled by default when creating the LoopVectorizePass with no params?

Quite confusingly, clang enables loop vectorization in the legacy pipeline builder by default again. That is, opt -O3 disables LoopVectorize, but clang -O3 has it enabled.

chandlerc accepted this revision.Apr 18 2019, 4:28 PM

LGTM with comments below addressed. All pretty minor.

92–100 ↗(On Diff #195777)

I'd move this to be a doxygen comment for the method -- people are likely to want to understand this when using the default constructor.

102 ↗(On Diff #195777)

I'd name the variables exactly the same as the members. It'll also help people use descriptive comments when calling it.

288–293 ↗(On Diff #194188)

FWIW, all of this makes "sense" now to me.

I'm pretty happy w/ you changing the defaults to be more consistent Alina, but like your approach of making that a separate patch.

And yeah, the fact that clang and opt have different defaults I think is just because we didn't wire things together in a way that would make them line up easily the first time around.

287–288 ↗(On Diff #195777)

stray commented out bit left over here...

This revision is now accepted and ready to land.Apr 18 2019, 4:28 PM
asbirlea updated this revision to Diff 195844.Apr 18 2019, 5:09 PM
asbirlea marked 3 inline comments as done.

Address comments. Thank you for the review!

This revision was automatically updated to reflect the committed changes.