Support accesses with differently sized types to the same array This allows code such as: void multiple_types(char *Short, char *Float, char *Double) { for (long i = 0; i < 100; i++) { Short[i] = *(short *)&Short[2 * i]; Float[i] = *(float *)&Float[4 * i]; Double[i] = *(double *)&Double[8 * i]; } } To model such code we use as canonical element type of the modeled array the smallest element type of all original array accesses, if type allocation sizes are multiples of each other. Otherwise, we use a newly created iN type, where N is the gcd of the allocation size of the types used in the accesses to this array. Accesses with types larger as the canonical element type are modeled as multiple accesses with the smaller type. For example the second load access is modeled as: { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 } To support code-generating these memory accesses, we introduce a new method getAccessAddressFunction that assigns each statement instance a single memory location, the address we load from/store to. Currently we obtain this address by taking the lexmin of the access function. We may consider keeping track of the memory location more explicitly in the future. We currently do _not_ handle multi-dimensional arrays and also keep the restriction of not supporting accesses where the offset expression is not a multiple of the access element type size. This patch adds tests that ensure we correctly invalidate a scop in case these accesses are found. Both types of accesses can be handled using the very same model, but are left to be added in the future. We also move the initialization of the scop-context into the constructor to ensure it is already available when invalidating the scop.
Details
Diff Detail
Event Timeline
I generally like this idea but I still think the distinction between the last two test cases is not justified at all. All char pointers could have an arbitrary initial alignment in the last test case and it is in general not a problem to have unaligned accesses. This patch prevents some unaligned accesses from being represented but allows less obvious ones anyway.
include/polly/Support/ScopHelper.h | ||
---|---|---|
151 | Can we use this somewhere else too? If so we might want to do so and commit it ahead of time. | |
lib/Analysis/ScopInfo.cpp | ||
205 | Maybe you can use: APInt GreatestCommonDivisor (const APInt &Val1, const APInt &Val2) or std::experimental::gcd from here http://en.cppreference.com/w/cpp/experimental/gcd . | |
489 | Shouldn't the structute allow us to look for the lower bound instead of the lexmin? | |
test/ScopInfo/multiple-types-non-power-of-two.ll | ||
25 | I do not get the part about the allocation sizes. Where do you verify them in this test case? | |
test/ScopInfo/multiple-types-unaligned.ll | ||
5 ↗ | (On Diff #46885) | You don't know that. Same as you don't know that they are in the example below. As long as there is no alignment information attached to the pointers it is not clear how the alignment is and how it has to be in order to access the memory through this pointers. Not all machines require aligned accesses... |
Hi Johannes,
thanks for the quick review. I replied inline.
Regarding the alignment test case. I renamed it and made clear that this is not about alignment, but about the offset expression being a multiple of the element size. A restriction, we have today and which is left untouched by this patch.
Best,
Tobias
include/polly/Support/ScopHelper.h | ||
---|---|---|
151 | I just went through the code, but did not find an obvious example. I could add it e.g. in the invariant load hoisting, but it would just add an unnecessary layer of indirection. So I think we can not move this out of this patch. | |
lib/Analysis/ScopInfo.cpp | ||
205 | I now use GreatestCommonDivisor64 from include/llvm/Support/MathExtras.h | |
489 | What do you mean by "lower bound". Is there another isl function I could use instead of lexmin? (for me lexmin is the smallest value, so it seems to be some kind of lowerbound). | |
test/ScopInfo/multiple-types-non-power-of-two.ll | ||
25 | I added the following text: ; The allocation size discussed above defines the number of canonical array ; elements accessed. For example, even though i27 only consists of 3 bytes, ; its allocation size is 4 bytes. Consequently, we model the access to an ; i27 element as an access to four canonical elements resulting in access ; relation constraints '4i0 <= o0 <= 3 + 4i0' instead of '3i0 <= o0 <= 2 + 3i0'. Did this make the test case more clear? | |
test/ScopInfo/multiple-types-unaligned.ll | ||
5 ↗ | (On Diff #46885) | For LLVM some alignment information is always known, it is either given explicitly with an alignment annotation or implicitly through the target data interface. For most architectures this means that types are by-default aligned to a multiple of their type size. So yes, we always get some information about alignment guarantees. However, the term alignment is a little confusing here and the above alignment is not what this is about. We use the term unaligned in Polly to talk about access functions for which the offset from the base pointer can not be evenly divided by the type size, as we otherwise can not generate an access function for it (without falling back to smaller types). This restriction has been introduced by you as a correctness fix in https://llvm.org/svn/llvm-project/polly/trunk@252942. Regarding this patch: we just leave the existing restriction in place (and we just add a test to verify that this is indeed the case). As written in the comment, this could be allowed as a straightforward extension. However, to keep this patch small, I focused on what I see as the more common case and left this for a later extension. Even so this is surely a nice extension, I do not see an inherent need why this needs to be part of an initial implementation. I could add a brief explanation to this test case to make this difference clear. |
include/polly/Support/ScopHelper.h | ||
---|---|---|
151 | getDebugLoc() is declared in llvm::Instruction. Just use const llvm::DebugLoc &getDebugLoc() const { return I->getDebugLoc(); } |
Regarding the "alignment" test cases:
I get your point now. The comment in the test cases confused me and the new comment makes it clear why we do not handle these cases _yet_.
lib/Analysis/ScopInfo.cpp | ||
---|---|---|
205 | cool. I did not find that one. | |
489 | I thought so but I cannot find it now.. let's just stick with lexmin. Sorry for the noise. | |
test/ScopInfo/multiple-types-non-power-of-two.ll | ||
25 | This does not help. Sorry.
|
include/polly/Support/ScopHelper.h | ||
---|---|---|
153 | This method (as well as most of the ones above) is now just a plain wrapper around the same one in llvm::Instruction. I argued this before and will do it again: __ Do not copy all these functions but instead use the original __ MAcc.asInstruction()->getDebugLoc(); is not soo bad is it? |
include/polly/Support/ScopHelper.h | ||
---|---|---|
153 | The model CallSite class also has some of these forwarders. In my original design, MemAccInst was derived from llvm::Instruction st this was no issue. Suggestion: Instruction *operator->() const { return I; } and use like this: MAcc->getDebugLoc(); |
Last comments inline.
include/polly/Support/ScopHelper.h | ||
---|---|---|
153 | I still like the shorter version. Adding new, trivial functions out-of-line seems to be minimal cost whereas adding (even so minimal) complexity in already non-trivial functions is something I am slightly worried about. Clearly this is nothing super important, so if you feel strong about this I could change this. | |
test/ScopInfo/multiple-types-non-power-of-two.ll | ||
27 | Sorry, i27 was a mistake. It should be i24. Then the explanation makes sense. Regarding the allocation size. We derived it using DL.getTypeAllocSize(), so it was already target specific. I now use the function getTypeAllocSizeInBits(), which allows us to also get rid of the magic constant '8'. |
Can we use this somewhere else too? If so we might want to do so and commit it ahead of time.