User Details
- User Since
- Nov 11 2012, 12:12 PM (427 w, 21 h)
Nov 30 2020
Few questions below. Please bear with me as I try to grok the proposal and the long stream of comments...
Jun 27 2016
Add test for signext case
Jun 17 2016
Jun 9 2016
Looks reasonable to me.
Looks good to me!
Looks good to me!
May 19 2016
May 2 2016
Apr 29 2016
Address review comments
Apr 28 2016
Refactor opcode selection into utility function and add test case
Apr 27 2016
Just putting this up for Tobias to try... test case incoming.
Apr 5 2016
Mar 31 2016
Looks good! Thanks!
That's for... hmm... I don't know. I don't see how that block would be executed. The call lowering code has always been a mess, unfortunately.
Mar 14 2016
Thanks for the comments. I'll try to get a new version of this up soon. As for http://reviews.llvm.org/D17872, it seems unlikely to be related. This bug is very specific to the selection of LDG. Though it may be generally related in so far as both are due to i8 handling. I really wish we had a better way of handling that.
Mar 10 2016
Mar 9 2016
This is an issue with LDG handling. canLowerToLDG() is returning true for an i8 load zero-extended to i32, but SelectLDGLDU cannot handle this case.
Mar 1 2016
Cool! Glad this was fixed. Thanks!
At the hardware level, there is no 'param' address space. It's a PTX abstraction to handle parameter passing. There may be memory associated with it for complex calls, but it should never alias another address space. There's really no such thing as a pointer to a param space pointer.
Sorry, I know I've been slow to respond recently...
I'm not sure I understand the relation to non-param loads/stores. Param loads/stores will never alias another address space, so arbitrary reordering with other loads/stores should be legal. Can you show an example of what you're seeing?
Feb 29 2016
Looks reasonable to me. Thanks!
Sorry, missed the original notifications.
Feb 17 2016
What is the motivation for this? Is it reformatted with a tool?
Feb 16 2016
Looks reasonable to me.
Feb 10 2016
We do not have any tests for debug support, since it was never properly implemented. We have plenty of tests for other parts of the NVPTX AsmPrinter. Or are you referring to something else?
The documentation is actually correct here. PTXLdStInstCode is an internal enumeration that is independent of the public address space mapping. It's unfortunate that there is disagreement between the two mappings, but this should be fixed by changing the PTXLdStInstCode enum. See the AddressSpace enum in lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h.
Jan 29 2016
LGTM. Thanks!
Jan 22 2016
LGTM! Thanks for fixing this; I just saw the PR go by. :)
Jan 14 2016
LGTM
Oct 26 2015
Aug 18 2015
Aug 17 2015
Aug 10 2015
Aug 5 2015
Aug 1 2015
This looks reasonable for now. Longer term, I'd like to experiment with doing away with the typed registers completely and just use .bXX for all registers. Typing is nice for readability, but code gen can be cleaner without it. They all compile down to the same set of registers at the sass level, and I'd like to match the hardware as much as possible (given that we can't just generate sass).
Jul 31 2015
Looks reasonable to me, with the minor comment.
Jul 17 2015
Looks reasonable to me.
Jul 15 2015
The algorithm looks good. Thanks for working on this!
Jul 9 2015
The short answer is that ptxas doesn't handle vector registers very well. It may be good to revisit this, but ptxas currently prefers scalar ops.
Jul 1 2015
Looks reasonable to me
Jun 30 2015
LTGM
Sounds reasonable to me.
Jun 25 2015
Good catch; that was a legacy workaround that I missed.
Jun 24 2015
What is the future expectation for NVPTXPeephole? Are you planning on adding additional transforms? If not, perhaps a more specific name is warranted. Otherwise, LGTM! Thanks!
Jun 16 2015
Other than the issues raised by Jingyue, this looks good to me! Thanks!
Jun 9 2015
Looks reasonable to me. Thanks!
Jun 8 2015
Looks reasonable to me.
What is the purpose to using a static relocation model for NVPTX? I would prefer to just enforce a default here, especially since this is essentially unused for NVPTX.
Sorry for the delay. I was out of town when this was posted and completely missed it.
LGTM
LGTM
Jun 4 2015
This looks good to me now. I agree that we should merge these two passes to prevent any ordering issues.
Jun 2 2015
Change itself looks good, but these intrinsics are deprecated in favor of addrspacecast. A note to that effect should be added.
Comments inlined.
Apr 28 2015
I committed a fix for this in r236000
Apr 23 2015
LGTM
Apr 22 2015
Apr 20 2015
Sure, go for it! It may be a few days before I get around to looking at this.
Handling this aspect of PTX has always been a PITA. Internally, we customize lowerConstant to handle this case and create an MCExpr subclass (NVPTXGenericMCSymbolRefExpr) that handles the printing of "generic(<symbolref>)". Something like this will be needed upstream. Our internal version is a bit of a hack, so I'll want to come up with something cleaner for upstream. If you don't have time to tackle that, I'll try to put something together soon.
Apr 16 2015
Apr 9 2015
Looks good to me! Thanks for working on this!
Mar 30 2015
Mar 11 2015
LGTM. Thanks!
Feb 3 2015
LGTM! Thanks for working on this!
Feb 2 2015
Comments inline.
Jan 31 2015
Looks good to me! Sorry for the delay, missed the original email notification.
Jan 26 2015
Dec 17 2014
NVPTX changes look reasonable to me.
Dec 1 2014
This looks fine for now, considering the same approach is taken for the .globl directive. That said, we really need a better way to handle this. Ptxas supports just enough GNU-style syntax to let us get by without a custom AsmStreamer in most cases, but not enough to avoid these annoying issues.
Nov 10 2014
This looks like a good start to me, but I'm not an expert on IndVarSimplify.
LGTM. Thanks!
Oct 29 2014
LGTM. Thanks!
Oct 24 2014
LGTM.
Aug 29 2014
No issues internally, please commit. Thanks!
Aug 28 2014
LGTM, but let me check with some folks internally.
Jul 17 2014
This LGTM. Thanks for implementing this!
Closed by commit rL213253 (authored by @jholewinski).
Jul 16 2014
Remove erroneous merge
Add test case