If the unaligned access has a dynamic offset, it may be odd which
would make the adjusted alignment incorrect to use.
Details
- Reviewers
asbirlea • tstellarAMD jlebar escha
Diff Detail
Event Timeline
I didn't know about that one. Switching to that mostly works, except it misses the case (e.g. store_unknown_offset_align1_i32) where it should be OK because the indexing should avoid an unaligned offset because it uses stripPointerCasts instead of GetUnderlyingObject
Switch to using getOrEnforceKnownAlignment, even though this regresses the case. Maybe it should be extended to handle this case
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
829 | I'm sort of confused about what you're doing here... what is StackAdjustedAlignment supposed to be? |
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
829 | It's supposed to be a normal "good" stack alignment value that will generally allow vectorization. It's a hardcoded 4 because the GPUs we care about this always works. This should be something smarter, but that's a separate change |
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
829 | Okay... but if StackAdjustedAlignment is the alignment which allows vectorization, shouldn't you be checking "NewAlign <= StackAdjustedAlignment", as opposed to "NewAlign <= Alignment"? |
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
830 | getOrEnforceKnownAlignment returns 0 on failure, maybe we should check for that instead? (If you want, you could also assert that NewAlign is either 0 or >= StackAdjustedAlignment.) | |
1057–1058 | Why this change? It's equivalent to the old code assuming the alignment is a power of two (which, if we depend on this, we should assert?), but I don't see why we need to make the change at all. |
It looks like you only addressed one of the two comments I left. efriedma's comment is also still outstanding afaict. Was that intentional?
Apparently I never saved the file
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
830 | It doesn't actually return 0 on failure, it returns the known alignment. In the motivating testcase, load_unknown_offset_align1_i8, it returns 1 when it can't find the constant offset |
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
830 | I looked through the code and agree that it is as you describe. It may be worth fixing the comment in Local.h. |
I'm sort of confused about what you're doing here... what is StackAdjustedAlignment supposed to be?