This is an archive of the discontinued LLVM Phabricator instance.

[LV][NFC-ish] Allow vector widths over 256 elements
ClosedPublic

Authored by simoll on Nov 16 2020, 12:41 AM.

Details

Summary

The assertion that vector widths are <= 256 elements was hard wired in the LV code. Eg, VE allows for vectors up to 512 elements. Test again the TTI vector register bit width instead - this is an NFC for non-asserting builds.

Diff Detail

Event Timeline

simoll created this revision.Nov 16 2020, 12:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 16 2020, 12:41 AM
simoll requested review of this revision.Nov 16 2020, 12:41 AM
kaz7 added a comment.Nov 17 2020, 5:38 AM

I think removing this is good idea, but I'm not sure why the maximum vector size was limited to 64 and recently jumped up to 256. So, I cannot say LGTM atm. Does anyone know background on this?

Here is that prior commit that changed this - i guess, we can just commit our change, if this looks good to you.

fhahn added inline comments.Nov 18 2020, 12:02 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5352

Could we just use TTI->getRegisterBitWidth as upper bound? That would allow us to keep the assert as a sanity check.

simoll updated this revision to Diff 306330.Nov 19 2020, 12:44 AM
simoll edited the summary of this revision. (Show Details)

Followed @fhahn 's suggestion to test against the widest vector register width instead.

kaz7 added a comment.Nov 19 2020, 12:52 AM

I thought it's a good idea when I hear it from @fhahn, but... I think It's not a good idea since 1) WidestRegister holds bit width, 2) MaxVectorSize is calculated from TTI->getRegisterBitWidth anyway.

fhahn accepted this revision.Nov 19 2020, 1:37 AM

I thought it's a good idea when I hear it from @fhahn, but... I think It's not a good idea since 1) WidestRegister holds bit width, 2) MaxVectorSize is calculated from TTI->getRegisterBitWidth anyway.

I think it somewhat preserves the spirit of the assertion, which IIUC was added to ensure none of the calculations above go rouge on lead to huge vectorization factors. The way the computation is supposed to work, dividing the widest register in bits by the smallest possible type width in bits (1) seems a suitable upper bound to preserve the spirit of the assert (I'd say it's debatable whether the assert itself adds a lot of protection, but it does at least add a little bit).

This LGTM, but happy to discuss this further.

This revision is now accepted and ready to land.Nov 19 2020, 1:37 AM
kaz7 added a comment.Nov 19 2020, 1:41 AM

dividing the widest register in bits by the smallest possible type width in bits (1) seems a suitable upper bound to preserve the spirit of the assert

That's make sense. I understand now. Thank you for explanations.

This revision was landed with ongoing or failed builds.Nov 19 2020, 1:58 AM
This revision was automatically updated to reflect the committed changes.

The current versions seems a good middle ground between computing a tighter bound and abolishing the assertion altogether. Trying to be smart in assertions is usually a bad idea because people might pull the carpet (underlying assumptions) from under you some day.