Page MenuHomePhabricator

[NVPTX] Adds a new address space inference pass.

Authored by jingyue on Mar 8 2016, 11:07 AM.



The old address space inference pass (NVPTXFavorNonGenericAddrSpaces) is unable
to convert the address space of a pointer induction variable. This patch adds a
new pass called NVPTXInferAddressSpaces that overcomes that limitation using a
fixed-point data-flow analysis (see the file header comments for details).

The new pass is experimental and not enabled by default. Users can turn
it on by setting the -nvptx-use-infer-addrspace flag of llc.

Diff Detail

Event Timeline

jingyue updated this revision to Diff 50059.Mar 8 2016, 11:07 AM
jingyue retitled this revision from to [NVPTX] Adds a new address space inference pass..
jingyue updated this object.
jingyue added reviewers: jholewinski, jlebar, tra.
jingyue added a subscriber: llvm-commits.
jlebar edited edge metadata.Mar 9 2016, 8:23 PM

I had a few minutes to read through the comment here, hope you don't mind the drive-by review. I thought the comment was really helpful, btw.

Would it be worth adding something to the pass this one is intended to replace (and maybe this one as well) to indicate the relationship between the two passes? I expect this may be unclear to someone just reading the code.


Nit: s/ so/, so/


I think you want a comma after "PHINodes", since (I think) this is a nonrestrictive clause. Meaning, you're saying that phi nodes are the challenge, not that specifically phi nodes that create loops are the challenge.


Nit: "requires us to know" or "requires the compiler to know".


Nit: List parallelism. "Our algorithm first converts a to b. Then it *converts* b to c. FInally, it *fixes* the undef in c so that"

Slowly making my way through this, here's a checkpoint. So far I'm finding this quite easy to read. Thank you!


Is this change related to the new pass we're adding? Not clear to me how...


As I read the style guide, we're supposed to make anonymous namespaces as small as possible and then use static functions everywhere else?


SmallVector has a constructor taking a range -- does that work here?


Nit: SmallVector has an initializer_list constructor, so would this be clearer if we just said

return {Op.getOperand(0)};


return SmallVector<Value *, 2>{{Op.getOperand(0)}};

? For my part, I can never remember which way the args in the constructor here go.


Uber nit: I think not having this "//" would be helpful, as the two comments are entirely different, rather than two paragraphs in a single comment. As-is I had to backtrack to understand what was happening.




Is there some reason not to define this one-liner here?


every expression


Does SetVector work for you here?


every expression is in

Almost all of my comments are cosmetic, but I have a few nontrivial questions. This looks quite sane to me.


We don't necessarily clone I -- e.g. if it's an AddrSpaceCast where the pointer types match. Probably worth documenting, since it's significant that this function returns a Value and not an Instruction.


"space, according"


Maybe it's obvious, but I can't figure out when we hit this. Src->getType()'s address space matches NewAddrSpace, but Src->getType() != NewPtrType?


Any reason we can't iterate over I->operands() directly? getOperandUse() seems to do exactly the same thing as getOperand, just with different types.

If we do have to do a non-range-based for loop, style guide says to cache the length (which I find silly, but there it is).


Would push_back be clearer?


Is it worth asserting that the phi's type is a pointer? Otherwise we're in trouble here.

NewIndexList(GEP->idx_begin(), GEP->idx_end());



I may be missing something, but it seems like this would be a lot simpler if we instead cloned the old instruction, then iterated over its relevant operands and converted the pointer ones to the new address space. Then we wouldn't have to worry about tricky things like remembering to call setInBounds.

In fact if we wanted, I think we could write this mostly-generically:

  • for each instruction type, make a list of all of the operand indices that may contain pointers and that we want to modify.
  • then iterate over that list and convert the ones which are pointers to the new address space.

"space, according"


s/and that/and /

(List parallelism -- you should be able to read the list while omitting (1): "because foo and (2) this function is called ...", not "and that".)


when constructing


perhaps document what 'false' is? Unless it's obvious. (Old rant:


value V, with its


modified, in postorder

(Otherwise it looks like "in postorder" modifies "needs to be modified", rather than "is called".)


"Every value in Postorder is", or "All values in Postorder are"


I haven't written any large passes, so if this is wrong, feel free to ignore, but carrying the InferredAddressSpaces map as a member variable seems like it might add more complexity than using it as a local variable here. In particular, we wouldn't need to remember to clear it, and updateAddressSpace could simply return the new address space, instead of updating the global map. (Maybe it would return an Optional<unsigned>, or it could just return the same address space to indicate "no change".)


expressions, which


Is it worth not doing the hash lookup twice?


Although style guide likes declaring variables inside if statements, it also likes reducing nesting, and in this case maybe that wins -- i.e., reverse the if statement and add a continue?


Why can't we iterate over V->uses() directly? (Is it because we're modifying them?)


of a load/store


type, so (joining independent clauses with a conjunction)


"is still valid", maybe?


Maybe it's obvious, but it's not clear to me why we want to go past phi nodes. We're adding a cast to the value V, which is defined at position I->getIterator(). So surely that value is in fact available right after that position -- why skip further down?


Are these incompatible -- i.e., we can't combine these two into one FileCheck call with two use-prefixes?

(I'm done with the whole thing now.)

Thanks a lot for the review! I understand it's a lot of work :) I answered one of your high-level questions. I'm still OOO and will get to other comments later.


I tried that initially. The problem was you can't just "convert" the address space of a pointer operand of a Value without changing the type of the Value. e.g.,

float* a = GEP float, float* b, 0, 1

if we convert b's type to float addrspace(3)*, the GEP instruction becomes invalid because a's type is still float*. We would need an extra step of a->mutateType() which is discouraged and doesn't work for constant expressions (footnote: the lookup table for constant expressions uses type + operands as keys, so mutating the types brutally would mess it up.)

So, I chose to construct new instructions and constant expressions instead of directly converting types, as it is safer and does not depend too much on the internals.

jingyue updated this revision to Diff 51094.Mar 18 2016, 5:31 PM
jingyue marked 35 inline comments as done.
jingyue edited edge metadata.

jlebar@'s comments


NVPTXInferAddressSpaces does a better job in preserving the variable names than the old version of NVPTXFavorNonGenericAddrSpaces. After this change, the two passes can share the same @rauw test.

In the old version, NewASC is likely going to be removed, so it makes more sense to assign the name to NewGEP instead. This makes this pass generate more named variables instead of something like %1. See @rauw in access-non-generic.ll.


PHINode::incoming_values returns a range of Use* instead of a range of Value*. Use* can be type-converted to Value*, so my code works.


addrspacecast can convert the pointee type as well. See @ld_int_from_float.


I don't get what you mean. What's the one-liner?


I like your proposal. PTAL.


That's a great idea.


Yes. U->set(...) invalidates use iterators of U->getUser().


In LLVM IR, PHINodes have to be in the front of a basic block.


IR-WITH-LOOP only works with nvptx-infer-addrspace and not nvptx-favor-non-generic, so I split the tests into

IR: check both favor and infer
IR-WITH-LOOP: check only favor

jlebar accepted this revision.Mar 20 2016, 2:05 AM
jlebar edited edge metadata.

Thank you very much for explaining the bits of this that were confusing to me -- it's very helpful!


I think you meant to remove the line above?


You'll probably want to go over this once more with clang-format to catch "&" and "*" placement. I read too much google3 to notice this consistently, myself. :)


I just mean moving the definition of this function into this declaration. Not a big deal either way.


Looks good to me!


I just meant

FileCheck %s --check-prefix IR --check-prefix IR-WITH-LOOP

Not a big deal -- in fact maybe we'll get clearer errors as-is.

This revision is now accepted and ready to land.Mar 20 2016, 2:05 AM
jingyue updated this revision to Diff 51134.Mar 20 2016, 1:45 PM
jingyue marked 2 inline comments as done.
jingyue edited edge metadata.

Some more minor changes.

All comments addressed. Submitting...


The one-liner definition below expands to llvm::initializeNVPTXInferAddressSpacesPass, so it has to be out of namespace llvm.


I didn't know I could chain multiple --check-prefix together. Thanks for pointing this out.

jingyue closed this revision.Mar 20 2016, 2:04 PM