Page MenuHomePhabricator

[POLLY] Support accesses with differently sized types to the same array

Authored by grosser on Feb 4 2016, 2:09 AM.


Support accesses with differently sized types to the same array

This allows code such as:

void multiple_types(char *Short, char *Float, char *Double) {
  for (long i = 0; i < 100; i++) {
    Short[i] = *(short *)&Short[2 * i];
    Float[i] = *(float *)&Float[4 * i];
    Double[i] = *(double *)&Double[8 * i];

To model such code we use as canonical element type of the modeled array the
smallest element type of all original array accesses, if type allocation sizes
are multiples of each other. Otherwise, we use a newly created iN type, where N
is the gcd of the allocation size of the types used in the accesses to this
array. Accesses with types larger as the canonical element type are modeled as
multiple accesses with the smaller type.

For example the second load access is modeled as:

  { Stmt_bb2[i0] -> MemRef_Float[o0] : 4i0 <= o0 <= 3 + 4i0 }

To support code-generating these memory accesses, we introduce a new method
getAccessAddressFunction that assigns each statement instance a single memory
location, the address we load from/store to. Currently we obtain this address by
taking the lexmin of the access function. We may consider keeping track of the
memory location more explicitly in the future.

We currently do _not_ handle multi-dimensional arrays and also keep the
restriction of not supporting accesses where the offset expression is not a
multiple of the access element type size. This patch adds tests that ensure
we correctly invalidate a scop in case these accesses are found. Both types of
accesses can be handled using the very same model, but are left to be added in
the future.

We also move the initialization of the scop-context into the constructor to
ensure it is already available when invalidating the scop.

Diff Detail


Event Timeline

grosser updated this revision to Diff 46885.Feb 4 2016, 2:09 AM
grosser retitled this revision from to Support accesses with differently sized types to the same array.
grosser updated this object.
grosser added reviewers: jdoerfert, Meinersbur.
grosser added subscribers: llvm-commits, pollydev.
jdoerfert edited edge metadata.Feb 4 2016, 2:49 AM

I generally like this idea but I still think the distinction between the last two test cases is not justified at all. All char pointers could have an arbitrary initial alignment in the last test case and it is in general not a problem to have unaligned accesses. This patch prevents some unaligned accesses from being represented but allows less obvious ones anyway.

151 ↗(On Diff #46885)

Can we use this somewhere else too? If so we might want to do so and commit it ahead of time.

205 ↗(On Diff #46885)

Maybe you can use:

APInt GreatestCommonDivisor (const APInt &Val1, const APInt &Val2)



from here .

496 ↗(On Diff #46885)

Shouldn't the structute allow us to look for the lower bound instead of the lexmin?

24 ↗(On Diff #46885)

I do not get the part about the allocation sizes. Where do you verify them in this test case?

5 ↗(On Diff #46885)

You don't know that. Same as you don't know that they are in the example below. As long as there is no alignment information attached to the pointers it is not clear how the alignment is and how it has to be in order to access the memory through this pointers. Not all machines require aligned accesses...

grosser updated this revision to Diff 46892.Feb 4 2016, 3:57 AM
grosser edited edge metadata.

Addressed Johannes' comments.

grosser retitled this revision from Support accesses with differently sized types to the same array to [POLLY] Support accesses with differently sized types to the same array.Feb 4 2016, 3:57 AM
grosser updated this object.

Hi Johannes,

thanks for the quick review. I replied inline.

Regarding the alignment test case. I renamed it and made clear that this is not about alignment, but about the offset expression being a multiple of the element size. A restriction, we have today and which is left untouched by this patch.


151 ↗(On Diff #46885)

I just went through the code, but did not find an obvious example. I could add it e.g. in the invariant load hoisting, but it would just add an unnecessary layer of indirection. So I think we can not move this out of this patch.

205 ↗(On Diff #46885)

I now use GreatestCommonDivisor64 from include/llvm/Support/MathExtras.h

496 ↗(On Diff #46885)

What do you mean by "lower bound". Is there another isl function I could use instead of lexmin? (for me lexmin is the smallest value, so it seems to be some kind of lowerbound).

24 ↗(On Diff #46885)

I added the following text:

; The allocation size discussed above defines the number of canonical array      
; elements accessed. For example, even though i27 only consists of 3 bytes,      
; its allocation size is 4 bytes. Consequently, we model the access to an        
; i27 element as an access to four canonical elements resulting in access        
; relation constraints '4i0 <= o0 <= 3 + 4i0' instead of '3i0 <= o0 <= 2 + 3i0'.

Did this make the test case more clear?

5 ↗(On Diff #46885)

For LLVM some alignment information is always known, it is either given explicitly with an alignment annotation or implicitly through the target data interface. For most architectures this means that types are by-default aligned to a multiple of their type size. So yes, we always get some information about alignment guarantees.

However, the term alignment is a little confusing here and the above alignment is not what this is about. We use the term unaligned in Polly to talk about access functions for which the offset from the base pointer can not be evenly divided by the type size, as we otherwise can not generate an access function for it (without falling back to smaller types). This restriction has been introduced by you as a correctness fix in

Regarding this patch: we just leave the existing restriction in place (and we just add a test to verify that this is indeed the case). As written in the comment, this could be allowed as a straightforward extension. However, to keep this patch small, I focused on what I see as the more common case and left this for a later extension. Even so this is surely a nice extension, I do not see an inherent need why this needs to be part of an initial implementation.

I could add a brief explanation to this test case to make this difference clear.

Meinersbur added inline comments.Feb 4 2016, 4:14 AM
151 ↗(On Diff #46892)

getDebugLoc() is declared in llvm::Instruction. Just use

const llvm::DebugLoc &getDebugLoc() const {
    return I->getDebugLoc();
grosser updated this revision to Diff 46894.Feb 4 2016, 4:16 AM
grosser updated this object.

Addressed Michael's comment

Regarding the "alignment" test cases:
I get your point now. The comment in the test cases confused me and the new comment makes it clear why we do not handle these cases _yet_.

205 ↗(On Diff #46892)

cool. I did not find that one.

489 ↗(On Diff #46892)

I thought so but I cannot find it now.. let's just stick with lexmin. Sorry for the noise.

25 ↗(On Diff #46892)

This does not help. Sorry.

  1. In my book i27 consists of more than 3 bytes, thus I do not get the comment. It does not access 4 complete bytes, true, but it accesses more than 3. See point 2) for my take on the modeling.
  2. The allocation size is target specific and should not be hard coded in Polly. This means we should not fix the argumentation to byte level. Instead the gcd of the allocation sizes in bit should be used as granularity. This will never make a difference for machines that work on byte level (gcd will just be a multiple of 8) and will prevent problems on machines that do not...
jdoerfert added inline comments.Feb 4 2016, 4:27 AM
151 ↗(On Diff #46894)

This method (as well as most of the ones above) is now just a plain wrapper around the same one in llvm::Instruction. I argued this before and will do it again: __ Do not copy all these functions but instead use the original __


is not soo bad is it?

Meinersbur added inline comments.Feb 4 2016, 4:34 AM
151 ↗(On Diff #46894)

The model CallSite class also has some of these forwarders. In my original design, MemAccInst was derived from llvm::Instruction st this was no issue.


Instruction *operator->() const { return I; }

and use like this:

grosser updated this revision to Diff 46896.Feb 4 2016, 4:39 AM

Address Johannes' comment

Last comments inline.

151 ↗(On Diff #46894)

I still like the shorter version.

Adding new, trivial functions out-of-line seems to be minimal cost whereas adding (even so minimal) complexity in already non-trivial functions is something I am slightly worried about.

Clearly this is nothing super important, so if you feel strong about this I could change this.

26 ↗(On Diff #46894)

Sorry, i27 was a mistake. It should be i24. Then the explanation makes sense.

Regarding the allocation size. We derived it using DL.getTypeAllocSize(), so it was already target specific. I now use the function getTypeAllocSizeInBits(), which allows us to also get rid of the magic constant '8'.

jdoerfert accepted this revision.Feb 4 2016, 4:50 AM
jdoerfert edited edge metadata.


27 ↗(On Diff #46896)

Thanks for changing this.

This revision is now accepted and ready to land.Feb 4 2016, 4:50 AM
This revision was automatically updated to reflect the committed changes.