This change adds a ni:<addrspace> specifier in the datalayout string to
denote pointers in the given address space as "non-integral", and adds some
typing rules for these special pointers.
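As a rough illustration of how such a specifier could be picked out of a datalayout string, here is a minimal standalone sketch. It is not the parser from this patch: `parseNonIntegralAddrSpaces` is a hypothetical helper, and accepting several colon-separated address spaces after `ni:` is an assumption made for the example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not LLVM's parser: collect the address spaces named by
// "ni:" specifiers in a '-'-separated datalayout string.
std::vector<unsigned> parseNonIntegralAddrSpaces(const std::string &Layout) {
  std::vector<unsigned> Result;
  std::istringstream Specs(Layout);
  std::string Spec;
  while (std::getline(Specs, Spec, '-')) {
    if (Spec.rfind("ni:", 0) != 0)
      continue; // Not a non-integral specifier.
    // Assumption: several address spaces may follow, separated by ':'.
    std::istringstream Fields(Spec.substr(3));
    std::string Field;
    while (std::getline(Fields, Field, ':'))
      if (!Field.empty())
        Result.push_back(static_cast<unsigned>(std::stoul(Field)));
  }
  return Result;
}
```

With this sketch, a layout such as `e-p:64:64-ni:1` would yield the single address space 1, and a layout without any `ni:` entry yields an empty list.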
This probably does not need saying, but this is just the very first step in the plan mentioned in http://lists.llvm.org/pipermail/llvm-dev/2016-July/102466.html. Other steps will follow once this lands.
For people seeing this for the first time, the design discussion starts at http://lists.llvm.org/pipermail/llvm-dev/2016-July/102161.html
Also ping!
(I've changed the specification to not be tied to GC semantics as discussed on the llvm-dev thread)
docs/LangRef.rst | ||
---|---|---|
559–561 | What consequence does the changing pointer value have on the optimizer? I'm interested in this for fat pointers which will not change value during runtime |
docs/LangRef.rst | ||
---|---|---|
559–561 | That is the justification for disallowing integer <-> pointer conversions (directly via cast instructions or via memory). If fat pointers have a stable bitwise representation, then why not represent them as normal pointers? |
docs/LangRef.rst | ||
---|---|---|
559–561 | The pointer arithmetic does not behave exactly like a 128-bit integer. Doing pointer arithmetic as a 128-bit add would be incorrect, and we don't want CodeGenPrepare etc. to be allowed to decompose it into integer operations. The high 64 bits are constant "metadata" bits for the memory operation. The low 64 bits behave like a pointer. I think what we need is something where the pointer is only ever manipulated through GEP, with special addressing-mode matching in the backend (since the fat pointer should almost always be treated as a constant with a separate index operand). |
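To make the difference concrete, here is a small sketch of the two behaviours. Everything in it is hypothetical: `FatPtr`, `gepLike` and `add128` are illustrative names, and the assumption that the low half wraps independently of the metadata half is mine, not something stated in the review.

```cpp
#include <cstdint>

// Hypothetical 128-bit fat pointer: Meta holds the constant "metadata" bits,
// Addr behaves like an ordinary 64-bit pointer.
struct FatPtr {
  uint64_t Meta; // high 64 bits
  uint64_t Addr; // low 64 bits
};

// GEP-like offset (assumed semantics): only the low half changes.
FatPtr gepLike(FatPtr P, int64_t Offset) {
  P.Addr += static_cast<uint64_t>(Offset);
  return P;
}

// What a plain 128-bit integer add would do: the sign-extended offset and
// any carry out of the low half both flow into Meta.
FatPtr add128(FatPtr P, int64_t Offset) {
  uint64_t Lo = static_cast<uint64_t>(Offset);
  uint64_t Hi = Offset < 0 ? ~uint64_t{0} : 0; // sign extension to 128 bits
  uint64_t Sum = P.Addr + Lo;
  uint64_t Carry = Sum < P.Addr ? 1 : 0;
  return {P.Meta + Hi + Carry, Sum};
}
```

Under these assumptions, a negative offset that takes `Addr` below zero leaves `Meta` untouched in `gepLike` but changes it in `add128`, which is the kind of decomposition into integer arithmetic the comment wants to rule out.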
docs/LangRef.rst | ||
---|---|---|
559–561 | Okay, that sounds reasonable. We have a similar restriction, in that GEPs have to stay GEPs and can't be decomposed into integer arithmetic. However, we also want to allow things like alignment, which only really make sense if the pointer has an integral representation. What do you think of changing the language to say "does not have a useful integral representation"? "useful" is deliberately vague here, with the understanding that we disallow some integer-like operations / properties, but allow others. |
docs/LangRef.rst | ||
---|---|---|
559–561 | Just to be explicit: I meant we also want to allow specifying things like alignment on these non-integral pointers. |
docs/LangRef.rst | ||
---|---|---|
559–561 | That seems OK. The low bits are pointer-like, so alignment would still make sense for our case. |
docs/LangRef.rst | ||
---|---|---|
559–561 | Maybe a target-dependent representation would be better? Saying "not useful" seems restrictive to backend code that wants to rely on target-specific assumptions. |
include/llvm/IR/DataLayout.h | ||
---|---|---|
207 | If the one restriction remains, it at least needs a mention in the LangRef. |
include/llvm/IR/DataLayout.h | ||
---|---|---|
150 | Seems like it should be a SmallSet? |
include/llvm/IR/DataLayout.h | ||
---|---|---|
150 | Since this isn't going to change after creation, how about a sorted SmallVector? That way we'll avoid the isSmall check in isNonIntegralPointerType and will also be able to use a binary search instead of a linear search. |
include/llvm/IR/DataLayout.h | ||
---|---|---|
150 | I don't foresee this ever being larger than 3 or 4, so the linear search a SmallSet does for the small case is probably better. |
include/llvm/IR/DataLayout.h | ||
---|---|---|
150 | Then why bother with a SmallSet at all? If you're worried about worst-case situations where we have millions of non-integral address spaces, then a sorted vector seems better than a SmallSet. |
include/llvm/IR/DataLayout.h | ||
---|---|---|
150 | I suppose there's not much difference |
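For reference, the two lookup strategies discussed in this thread could be sketched as below. This is a standalone illustration, not the code in the patch: `NonIntegralSpaces` is a hypothetical class, and `std::vector` stands in for `llvm::SmallVector` so the example compiles without LLVM headers.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical stand-in for the DataLayout member under review.
class NonIntegralSpaces {
  std::vector<unsigned> Spaces; // sorted, unique, fixed after construction

public:
  explicit NonIntegralSpaces(std::vector<unsigned> AS) : Spaces(std::move(AS)) {
    std::sort(Spaces.begin(), Spaces.end());
    Spaces.erase(std::unique(Spaces.begin(), Spaces.end()), Spaces.end());
  }

  // Binary search over the sorted vector.
  bool containsSorted(unsigned AS) const {
    return std::binary_search(Spaces.begin(), Spaces.end(), AS);
  }

  // Linear scan, comparable to what a small set does for a handful of entries.
  bool containsLinear(unsigned AS) const {
    return std::find(Spaces.begin(), Spaces.end(), AS) != Spaces.end();
  }
};
```

Both lookups return the same answers. With only three or four entries, as expected here, the choice is unlikely to matter much either way.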