This is an archive of the discontinued LLVM Phabricator instance.

[Attributor][NFC] Refactorings and typos in doc
ClosedPublic

Authored by baziotis on Mar 14 2020, 7:24 AM.

Details

Summary
  • Replace getAssociatedFunction with getAnchorScope where it's clear we want the parent function.
  • Fixes in doc and other small refactorings.

Diff Detail

Event Timeline

baziotis created this revision.Mar 14 2020, 7:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 14 2020, 7:24 AM
baziotis edited the summary of this revision. (Show Details)Mar 14 2020, 7:25 AM

Today I started reading through the whole Attributor, or at least its core parts, because I think otherwise I won't
have a good enough understanding to proceed. This is the start of a series of small patches that will mostly be NFC
improvements to docs with which I hope I'll have the opportunity to ask questions about different decisions in the Attributor.

Starting off, I tried multiple times to read IRPosition and in my humble opinion, it seems way too complicated.
Specifically the KindOrArgNo. This coupling of basically 2 kinds of info in one seems unnecessary. Plus, IRP currently
saves a pointer (Value *, the anchor value) and an int. For most users of LLVM, a pointer will take 8 bytes. Because of alignment,
we will use another 8 bytes anyway for KindOrArgNo, not 4. And I think it will be so much simpler to use the wasted 4 bytes
to decouple the values. That is, one int for the kind and one intfor the argument number.

If we all agree in that, of course I'm willing to do the related patches.

Today I started reading through the whole Attributor, or at least its core parts, because I think otherwise I won't
have a good enough understanding to proceed. This is the start of a series of small patches that will mostly be NFC
improvements to docs with which I hope I'll have the opportunity to ask questions about different decisions in the Attributor.

This seems fine, as well as the initiative. Few comments though

Starting off, I tried multiple times to read IRPosition and in my humble opinion, it seems way too complicated.
Specifically the KindOrArgNo. This coupling of basically 2 kinds of info in one seems unnecessary.

I kind of like how this is working, but if others agree you can change it.

llvm/include/llvm/Transforms/IPO/Attributor.h
38

don't think this is necessary.

278

I think this assert is also redundant, given that it is called again in getAnchorValue()

280

This makes sense. Now that I look at the other parts we seem to mix these two a lot. I think the better solution is to inline getAnchorScope here and replace all the call sites with getAssociatedFunction as it's clearer (at least to me).

baziotis marked 2 inline comments as done.EditedMar 14 2020, 10:25 AM

I kind of like how this is working, but if others agree you can change it.

I appreciate it, it may be that I was not involved when it was written. It just seems
to be unclear how it's working. Consider this:

/// Create a position describing the argument \p Arg.
static const IRPosition argument(const Argument &Arg) {
  return IRPosition(const_cast<Argument &>(Arg), Kind(Arg.getArgNo()));
}

Arg.getArgNo() can be 1, in which case, the respective Kind is IRP_CALL_SITE_ARGUMENT. At least to me, it makes
no sense why we want that. Again, it might be that only I don't get it but I have a lot of "why this happens?" moments
when reading different parts of the Attributor that have to do with arguments.

llvm/include/llvm/Transforms/IPO/Attributor.h
278

Yes, although I don't understand why it's not at the start of this function.

280

Yes, I like that too.

Today I started reading through the whole Attributor, or at least its core parts, because I think otherwise I won't
have a good enough understanding to proceed. This is the start of a series of small patches that will mostly be NFC
improvements to docs with which I hope I'll have the opportunity to ask questions about different decisions in the Attributor.

Great! It's about time we give it a overhaul, a lot of the documentation is still from the very beginning as well. Feel free
to continue issuing design suggestions.

Starting off, I tried multiple times to read IRPosition and in my humble opinion, it seems way too complicated.
Specifically the KindOrArgNo. This coupling of basically 2 kinds of info in one seems unnecessary. Plus, IRP currently
saves a pointer (Value *, the anchor value) and an int. For most users of LLVM, a pointer will take 8 bytes. Because of alignment,
we will use another 8 bytes anyway for KindOrArgNo, not 4. And I think it will be so much simpler to use the wasted 4 bytes
to decouple the values. That is, one int for the kind and one intfor the argument number.

If we all agree in that, of course I'm willing to do the related patches.

You can split the two apart but fit them into at most 64bit, maybe even 32.
FWIW, I was hoping to add a position (=Instruction) to this class (or a subclass) at some point so we can do flow sensitive queries.


I'm fine with this. @sstefan1 please accept once you are satisfied.

llvm/include/llvm/Transforms/IPO/Attributor.h
168

FWIW. The encoding allows to check for arguments by doing value >= 0. If that is true, the argument number is the value.

278

not redundant but should be at the start. If it's invalid we crash on the dyn_cast otherwise (or worse)

baziotis marked 2 inline comments as done.Mar 14 2020, 1:33 PM

Today I started reading through the whole Attributor, or at least its core parts, because I think otherwise I won't
have a good enough understanding to proceed. This is the start of a series of small patches that will mostly be NFC
improvements to docs with which I hope I'll have the opportunity to ask questions about different decisions in the Attributor.

Great! It's about time we give it a overhaul, a lot of the documentation is still from the very beginning as well. Feel free
to continue issuing design suggestions.

:)

Starting off, I tried multiple times to read IRPosition and in my humble opinion, it seems way too complicated.
Specifically the KindOrArgNo. This coupling of basically 2 kinds of info in one seems unnecessary. Plus, IRP currently
saves a pointer (Value *, the anchor value) and an int. For most users of LLVM, a pointer will take 8 bytes. Because of alignment,
we will use another 8 bytes anyway for KindOrArgNo, not 4. And I think it will be so much simpler to use the wasted 4 bytes
to decouple the values. That is, one int for the kind and one intfor the argument number.

If we all agree in that, of course I'm willing to do the related patches.

You can split the two apart but fit them into at most 64bit, maybe even 32.

Um, right now 64 bit is only the pointer. How would one fit a pointer and 2 ints in 64 (or 32 bits).
Right now, if I understand correctly, the pointer and the int take about 12 bytes (96 bits). And probably 16 bytes because of the alignment.

FWIW, I was hoping to add a position (=Instruction) to this class (or a subclass) at some point so we can do flow sensitive queries.

Ok, good. I don't see right now how flow-sensitivity helps in IRP but I'll leave it for when time comes to introduce it.


I'm fine with this. @sstefan1 please accept once you are satisfied.

llvm/include/llvm/Transforms/IPO/Attributor.h
168

But how do you know if it's a call site argument or a formal parameter (i.e. IRP_ARGUMENT) ?
Since argument numbers start from 0, this is problematic (e.g. see the previous comment regarding the argument() constructor).
Plus then there are a lot of switches that if say we have argument number == 3 don't seem to be that clear.
I'm mainly asking in order to be able to understand the current code so that I can change it.

278

Yes, ok, that's why I mentioned that its position seemed weird. I'll change it.

sizeof(Kind + ArgNo) <= 64

jdoerfert added inline comments.Mar 14 2020, 6:56 PM
llvm/include/llvm/Transforms/IPO/Attributor.h
168

If the value is smaller than 0, it is of the type specified by the value, thus the value is the "kind".
If the value is at least 0 and the associated value is an llvm::Argument, it is an argument at position value.
If the value is at least 0 and the associated value is an llvm::CallBase, it is a call site argument at position value.

280

These are different things for call site [arguments]

sstefan1 accepted this revision.Mar 15 2020, 3:23 AM
sstefan1 added inline comments.
llvm/include/llvm/Transforms/IPO/Attributor.h
280

I agree, my bad. I think that there are at least few places that we use getAssociatedFunction instead of getAnchorScope. Not necessary in this patch.

This revision is now accepted and ready to land.Mar 15 2020, 3:23 AM
baziotis marked 2 inline comments as done.Mar 15 2020, 5:51 AM

sizeof(Kind + ArgNo) <= 64

Oh bytes, yes. With the change they will take 16 bytes (as now).

@sstefan1
Thanks for accepting, I'll upload another diff and I'll leave other bigger changes (like the kind thing or the getAnchorScope() - if it gets agreed upon - for followups).

llvm/include/llvm/Transforms/IPO/Attributor.h
168

Oh ok, using dyn_cast or sth on the AnchorVal. Thanks, I thought it was somehow done only with KindOrArgNo and I missed it.

280

So, yesterday I saw quite late this getCalledFunction() which should create problems and it did. I left it for today to try to combine them (in getAssociatedFunction()) better but it seems we can't. Anytime that we do CGSCC pass for example, for call instructions is different (or when we get scope for AAValueRange).
It may actually be a good idea to decouple them further rather than combining them. Specifically:

I think that there are at least few places that we use getAssociatedFunction instead of getAnchorScope

to me this is a problem because getAssociatedFunction() does not have a coherent behavior. It's like "I'll give you the parent function most of the time, except if the anchor value is a CallBase". So, when you see it used in the code, it's unclear whether the one who wrote it thought:

  • I know the anchor value is a CallBase, I expect the callee back.
  • I know it's not, I want the parent function of the AnchorVal.
  • I want a whatever scope that should be a function.

I think:

  • Case 1 calls should be left as they are.
  • Case 2 calls should be replaced with getAnchorScope
  • Case 3 calls should be eliminated because they're unclear.

Plus, if that happens, we should change getAssociatedFunction() to only return the callee and otherwise null (i.e. remove the ambiguity).

See follow-up diff.

baziotis updated this revision to Diff 250416.Mar 15 2020, 5:54 AM
baziotis edited the summary of this revision. (Show Details)
  • Replacement of getAssociatedFunction() with getAnchorScope() where I think it's almost certain we want the parent function (we can revert that if you don't like it).
  • Other small refactorings.
sstefan1 accepted this revision.Mar 23 2020, 8:08 AM

I'll re-accept :)

llvm/lib/Transforms/IPO/Attributor.cpp
8652

typo: cache.

LGTM but it is not fixing the typo rather refactoring :)

Thanks to both!

@uenoku It fixes some typos but with the discussion, I also did some refactoring. I'll change the description.

baziotis retitled this revision from [Attributor][NFC] Typos in doc to [Attributor][NFC] Refactorings and typos in doc.Mar 23 2020, 10:21 AM
This revision was automatically updated to reflect the committed changes.