- Replace getAssociatedFunction with getAnchorScope where it's clear we want the parent function.
- Fixes in doc and other small refactorings.
Details
Diff Detail
Event Timeline
Today I started reading through the whole Attributor, or at least its core parts, because I think otherwise I won't
have a good enough understanding to proceed. This is the start of a series of small patches that will mostly be NFC
improvements to docs with which I hope I'll have the opportunity to ask questions about different decisions in the Attributor.
Starting off, I tried multiple times to read IRPosition and in my humble opinion, it seems way too complicated.
Specifically the KindOrArgNo. This coupling of basically 2 kinds of info in one seems unnecessary. Plus, IRP currently
saves a pointer (Value *, the anchor value) and an int. For most users of LLVM, a pointer will take 8 bytes. Because of alignment,
we will use another 8 bytes anyway for KindOrArgNo, not 4. And I think it will be so much simpler to use the wasted 4 bytes
to decouple the values. That is, one int for the kind and one intfor the argument number.
If we all agree in that, of course I'm willing to do the related patches.
This seems fine, as well as the initiative. Few comments though
Starting off, I tried multiple times to read IRPosition and in my humble opinion, it seems way too complicated.
Specifically the KindOrArgNo. This coupling of basically 2 kinds of info in one seems unnecessary.
I kind of like how this is working, but if others agree you can change it.
llvm/include/llvm/Transforms/IPO/Attributor.h | ||
---|---|---|
38 | don't think this is necessary. | |
278 | I think this assert is also redundant, given that it is called again in getAnchorValue() | |
280 | This makes sense. Now that I look at the other parts we seem to mix these two a lot. I think the better solution is to inline getAnchorScope here and replace all the call sites with getAssociatedFunction as it's clearer (at least to me). |
I kind of like how this is working, but if others agree you can change it.
I appreciate it, it may be that I was not involved when it was written. It just seems
to be unclear how it's working. Consider this:
/// Create a position describing the argument \p Arg. static const IRPosition argument(const Argument &Arg) { return IRPosition(const_cast<Argument &>(Arg), Kind(Arg.getArgNo())); }
Arg.getArgNo() can be 1, in which case, the respective Kind is IRP_CALL_SITE_ARGUMENT. At least to me, it makes
no sense why we want that. Again, it might be that only I don't get it but I have a lot of "why this happens?" moments
when reading different parts of the Attributor that have to do with arguments.
llvm/include/llvm/Transforms/IPO/Attributor.h | ||
---|---|---|
278 | Yes, although I don't understand why it's not at the start of this function. | |
280 | Yes, I like that too. |
Great! It's about time we give it a overhaul, a lot of the documentation is still from the very beginning as well. Feel free
to continue issuing design suggestions.
Starting off, I tried multiple times to read IRPosition and in my humble opinion, it seems way too complicated.
Specifically the KindOrArgNo. This coupling of basically 2 kinds of info in one seems unnecessary. Plus, IRP currently
saves a pointer (Value *, the anchor value) and an int. For most users of LLVM, a pointer will take 8 bytes. Because of alignment,
we will use another 8 bytes anyway for KindOrArgNo, not 4. And I think it will be so much simpler to use the wasted 4 bytes
to decouple the values. That is, one int for the kind and one intfor the argument number.If we all agree in that, of course I'm willing to do the related patches.
You can split the two apart but fit them into at most 64bit, maybe even 32.
FWIW, I was hoping to add a position (=Instruction) to this class (or a subclass) at some point so we can do flow sensitive queries.
I'm fine with this. @sstefan1 please accept once you are satisfied.
llvm/include/llvm/Transforms/IPO/Attributor.h | ||
---|---|---|
168 | FWIW. The encoding allows to check for arguments by doing value >= 0. If that is true, the argument number is the value. | |
278 | not redundant but should be at the start. If it's invalid we crash on the dyn_cast otherwise (or worse) |
:)
Starting off, I tried multiple times to read IRPosition and in my humble opinion, it seems way too complicated.
Specifically the KindOrArgNo. This coupling of basically 2 kinds of info in one seems unnecessary. Plus, IRP currently
saves a pointer (Value *, the anchor value) and an int. For most users of LLVM, a pointer will take 8 bytes. Because of alignment,
we will use another 8 bytes anyway for KindOrArgNo, not 4. And I think it will be so much simpler to use the wasted 4 bytes
to decouple the values. That is, one int for the kind and one intfor the argument number.If we all agree in that, of course I'm willing to do the related patches.
You can split the two apart but fit them into at most 64bit, maybe even 32.
Um, right now 64 bit is only the pointer. How would one fit a pointer and 2 ints in 64 (or 32 bits).
Right now, if I understand correctly, the pointer and the int take about 12 bytes (96 bits). And probably 16 bytes because of the alignment.
FWIW, I was hoping to add a position (=Instruction) to this class (or a subclass) at some point so we can do flow sensitive queries.
Ok, good. I don't see right now how flow-sensitivity helps in IRP but I'll leave it for when time comes to introduce it.
I'm fine with this. @sstefan1 please accept once you are satisfied.
llvm/include/llvm/Transforms/IPO/Attributor.h | ||
---|---|---|
168 | But how do you know if it's a call site argument or a formal parameter (i.e. IRP_ARGUMENT) ? | |
278 | Yes, ok, that's why I mentioned that its position seemed weird. I'll change it. |
llvm/include/llvm/Transforms/IPO/Attributor.h | ||
---|---|---|
168 | If the value is smaller than 0, it is of the type specified by the value, thus the value is the "kind". | |
280 | These are different things for call site [arguments] |
llvm/include/llvm/Transforms/IPO/Attributor.h | ||
---|---|---|
280 | I agree, my bad. I think that there are at least few places that we use getAssociatedFunction instead of getAnchorScope. Not necessary in this patch. |
sizeof(Kind + ArgNo) <= 64
Oh bytes, yes. With the change they will take 16 bytes (as now).
@sstefan1
Thanks for accepting, I'll upload another diff and I'll leave other bigger changes (like the kind thing or the getAnchorScope() - if it gets agreed upon - for followups).
llvm/include/llvm/Transforms/IPO/Attributor.h | ||
---|---|---|
168 | Oh ok, using dyn_cast or sth on the AnchorVal. Thanks, I thought it was somehow done only with KindOrArgNo and I missed it. | |
280 | So, yesterday I saw quite late this getCalledFunction() which should create problems and it did. I left it for today to try to combine them (in getAssociatedFunction()) better but it seems we can't. Anytime that we do CGSCC pass for example, for call instructions is different (or when we get scope for AAValueRange).
to me this is a problem because getAssociatedFunction() does not have a coherent behavior. It's like "I'll give you the parent function most of the time, except if the anchor value is a CallBase". So, when you see it used in the code, it's unclear whether the one who wrote it thought:
I think:
Plus, if that happens, we should change getAssociatedFunction() to only return the callee and otherwise null (i.e. remove the ambiguity). See follow-up diff. |
- Replacement of getAssociatedFunction() with getAnchorScope() where I think it's almost certain we want the parent function (we can revert that if you don't like it).
- Other small refactorings.
Thanks to both!
@uenoku It fixes some typos but with the discussion, I also did some refactoring. I'll change the description.
don't think this is necessary.