This is an archive of the discontinued LLVM Phabricator instance.

[SLP]Fix PR43799: Crash on different sizes of GEP indices.
ClosedPublic

Authored by ABataev on Oct 30 2019, 10:09 AM.

Details

Summary

If the GEP instructions are going to be vectorized, the indices in those
GEP instructions must be of the same type. Otherwise, the compiler may
crash when trying to build the vector constant.

Diff Detail

Event Timeline

ABataev created this revision.Oct 30 2019, 10:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 30 2019, 10:09 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
spatel added inline comments.Oct 30 2019, 11:04 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
4170–4171

I think it's already a fuzzer corner case that we have different types in the bug report example, but let's take that 1 step further: what if the index type in IR is larger than getIntPtrType()? We might illegally truncate a large constant value.

Safer to just give up completely if we find mismatched GEP index types?

ABataev marked an inline comment as done.Oct 30 2019, 11:17 AM
ABataev added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
4170–4171

Maybe, just give up if GEP at least one index type is larger than intptr?

ABataev marked an inline comment as not done.Oct 30 2019, 11:20 AM
spatel added inline comments.Oct 30 2019, 12:51 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
4170–4171

That should work, but it seems like we are adding code complexity for no real-world reason. Unless I'm misunderstanding - do we have real code / tests where the GEP indexes are different sizes?

ABataev added inline comments.Oct 30 2019, 12:58 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
4170–4171

It does not add a lot of complexity here, just a few additional lines of code.

xbolva00 added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
4170–4171

@spatel:
Recent memcmp related patch uses CreateGEP_64 (dont know exact name atm; index is i64) - could it be affected with your proposed stronger bail out?. We should be careful to not pessimize the code generated by own passes.

ABataev updated this revision to Diff 227167.Oct 30 2019, 1:09 PM

Added a check for GEP indexes greater than pointer size.

lebedev.ri added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
4170–4171

https://llvm.org/docs/LangRef.html#langref-datalayout

Data Layout
p[n]:<size>:<abi>:<pref>:<idx>
This specifies the size of a pointer and its <abi> and <pref>erred alignments for address space n. The fourth parameter <idx> is a size of index that used for address calculation. If not specified, the default index size is equal to the pointer size. All sizes are in bits. The address space, n, is optional, and if not specified, denotes the default address space 0. The value of n must be in the range [1,2^23).

spatel added inline comments.Oct 31 2019, 8:26 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
4170–4171

For reference, I think the patch that @xbolva00 mentioned is:
D69507

(...unfortunately, expandmemcmp is still a codegen pass - D60318 / rL371507 - so that particular case can't be in play yet for the question/concern raised in this patch)

Given the LangRef specification, I'm surprised that we're encouraging hard-coding the index size via the Builder API rather than deriving it from the DataLayout.

So it seems like we've made a mess, and now we want to deal with that mess here. I still don't see any real-world reason for this patch to try so hard, but if we insist on it, then we should be using:

DL->getIndexTypeSizeInBits()

?

ABataev updated this revision to Diff 227337.Oct 31 2019, 1:52 PM

Address comments.

spatel accepted this revision.Nov 1 2019, 1:10 PM

LGTM

This revision is now accepted and ready to land.Nov 1 2019, 1:10 PM
This revision was automatically updated to reflect the committed changes.