jholewinski (Justin Holewinski)
User

Projects

User does not belong to any projects.

User Details

User Since
Nov 11 2012, 12:12 PM (253 w, 6 d)

Recent Activity

Jun 27 2016

jholewinski committed rL273922: Only emit extension for zeroext/signext arguments if type is < 32 bits.
Jun 27 2016, 1:29 PM
jholewinski closed D21756: Only emit extension for zeroext/signext arguments if type is < 32 bits by committing rL273922: Only emit extension for zeroext/signext arguments if type is < 32 bits.
Jun 27 2016, 1:29 PM
jholewinski updated the diff for D21756: Only emit extension for zeroext/signext arguments if type is < 32 bits.

Add test for signext case

Jun 27 2016, 12:06 PM
jholewinski retitled D21756 from an empty title to Only emit extension for zeroext/signext arguments if type is < 32 bits.
Jun 27 2016, 11:25 AM

Jun 17 2016

jholewinski accepted D20389: NVPTX: Add supported CL features.

Looks good to me

Jun 17 2016, 6:49 AM

Jun 9 2016

jholewinski added a comment to D21162: [CUDA] Implement __shfl* intrinsics in clang headers..

Looks reasonable to me.

Jun 9 2016, 5:04 AM
jholewinski added a comment to D21161: [NVPTX] Mark bar.sync intrinsic as convergent..

Looks good to me!

Jun 9 2016, 5:00 AM
jholewinski added a comment to D21160: [NVPTX] Add intrinsics for shfl instructions..

Looks good to me!

Jun 9 2016, 5:00 AM

May 19 2016

jholewinski added inline comments to D20389: NVPTX: Add supported CL features.
May 19 2016, 7:34 AM

May 2 2016

jholewinski committed rL268272: [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection.
May 2 2016, 11:18 AM
jholewinski closed D19615: [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection by committing rL268272: [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection.
May 2 2016, 11:18 AM

Apr 29 2016

jholewinski updated the diff for D19615: [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection.

Address review comments

Apr 29 2016, 5:56 AM

Apr 28 2016

jholewinski updated the diff for D19615: [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection.

Refactor opcode selection into utility function and add test case

Apr 28 2016, 1:06 PM
jholewinski added inline comments to D19615: [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection.
Apr 28 2016, 9:29 AM

Apr 27 2016

jholewinski added a comment to D19615: [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection.

Just putting this up for Tobias to try... test case incoming.

Apr 27 2016, 1:41 PM
jholewinski retitled D19615 from an empty title to [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection.
Apr 27 2016, 1:40 PM

Apr 5 2016

jholewinski committed rL265389: [NVPTX] Handle ldg created from sign-/zero-extended load.
Apr 5 2016, 5:43 AM
jholewinski closed D18053: [NVPTX] Handle ldg created from sign-/zero-extended load by committing rL265389: [NVPTX] Handle ldg created from sign-/zero-extended load.
Apr 5 2016, 5:43 AM

Mar 31 2016

jholewinski accepted D17472: [NVPTX] Annotate some instructions as hasSideEffects = 0..

Looks good! Thanks!

Mar 31 2016, 5:37 AM
jholewinski added a comment to D17872: [NVPTX] Add a truncate DAG node to some calls..

That's for... hmm... I don't know. I don't see how that block would be executed. The call lowering code has always been a mess, unfortunately.

Mar 31 2016, 5:36 AM

Mar 14 2016

jholewinski added a comment to D18053: [NVPTX] Handle ldg created from sign-/zero-extended load.

Thanks for the comments. I'll try to get a new version of this up soon. As for http://reviews.llvm.org/D17872, it seems unlikely to be related. This bug is very specific to the selection of LDG, though it may be generally related insofar as both are due to i8 handling. I really wish we had a better way of handling that.

Mar 14 2016, 4:27 AM

Mar 10 2016

jholewinski retitled D18053 from an empty title to [NVPTX] Handle ldg created from sign-/zero-extended load.
Mar 10 2016, 10:55 AM

Mar 9 2016

jholewinski requested changes to D12093: [NVPTX] Support register copy from i16 to i32 register types.

This is an issue with LDG handling. canLowerToLDG() is returning true for an i8 load zero-extended to i32, but SelectLDGLDU cannot handle this case.

Mar 9 2016, 10:22 AM

Mar 1 2016

jholewinski added a comment to D17357: [NVPTX] Remove workaround for tablegen crash in NVPTXInstrInfo.td..

Cool! Glad this was fixed. Thanks!

Mar 1 2016, 11:49 AM
jholewinski added a comment to D17471: [NVPTX] Annotate param loads/stores as mayLoad/mayStore..

At the hardware level, there is no 'param' address space. It's a PTX abstraction to handle parameter passing. There may be memory associated with it for complex calls, but it should never alias another address space. There's really no such thing as a pointer to a param space pointer.

Mar 1 2016, 11:46 AM
jholewinski accepted D17354: Reformat NVPTXInstrInfo.td, and add additional comments..

Sorry, I know I've been slow to respond recently...

Mar 1 2016, 4:54 AM
jholewinski added a comment to D17471: [NVPTX] Annotate param loads/stores as mayLoad/mayStore..

I'm not sure I understand the relation to non-param loads/stores. Param loads/stores will never alias another address space, so arbitrary reordering with other loads/stores should be legal. Can you show an example of what you're seeing?

Mar 1 2016, 4:52 AM

Feb 29 2016

jholewinski accepted D17423: [NVPTX] Use different, convergent MIs for convergent calls..

Looks reasonable to me. Thanks!

Feb 29 2016, 5:56 PM
jholewinski accepted D17471: [NVPTX] Annotate param loads/stores as mayLoad/mayStore..

Sorry, missed the original notifications.

Feb 29 2016, 5:53 PM

Feb 17 2016

jholewinski added a comment to D17354: Reformat NVPTXInstrInfo.td, and add additional comments..

What is the motivation for this? Is it reformatted with a tool?

Feb 17 2016, 4:01 PM

Feb 16 2016

jholewinski accepted D17315: [NVPTX] Annotate call machine instructions as calls..

Looks reasonable to me.

Feb 16 2016, 5:33 PM

Feb 10 2016

jholewinski added a comment to D17086: [NVPTX] emit .file directives for files referenced by subprograms..

We do not have any tests for debug support, since it was never properly implemented. We have plenty of tests for other parts of the NVPTX AsmPrinter. Or are you referring to something else?

Feb 10 2016, 11:32 AM
jholewinski added a comment to D17071: [NVPTX] Update address space mapping documentation.

The documentation is actually correct here. PTXLdStInstCode is an internal enumeration that is independent of the public address space mapping. It's unfortunate that there is disagreement between the two mappings, but this should be fixed by changing the PTXLdStInstCode enum. See the AddressSpace enum in lib/Target/NVPTX/MCTargetDesc/NVPTXBaseInfo.h.

Feb 10 2016, 7:14 AM

Jan 29 2016

jholewinski accepted D16739: [CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor..

LGTM. Thanks!

Jan 29 2016, 3:54 PM

Jan 22 2016

jholewinski accepted D16479: [NVPTX] expand mul_lohi to mul_lo and mul_hi.

LGTM! Thanks for fixing this; I just saw the PR go by. :)

Jan 22 2016, 11:37 AM

Jan 14 2016

jholewinski accepted D16202: [NVPTX] Do not emit .hidden or .protected directives as they are not allowed by PTX..

LGTM

Jan 14 2016, 11:28 AM

Oct 26 2015

jholewinski added inline comments to D14079: [FPEnv Core 14/14] Introduce F*_W_CHAIN instrs to prevent reordering.
Oct 26 2015, 8:54 AM

Aug 18 2015

jholewinski added inline comments to D12093: [NVPTX] Support register copy from i16 to i32 register types.
Aug 18 2015, 11:51 AM

Aug 17 2015

jholewinski added inline comments to D12093: [NVPTX] Support register copy from i16 to i32 register types.
Aug 17 2015, 5:39 PM

Aug 10 2015

jholewinski added inline comments to D11855: SelectionDAG: Prefer to combine multiplication with less uses for fma.
Aug 10 2015, 11:35 AM

Aug 5 2015

jholewinski added inline comments to D11774: [NVPTX] Use LDG for pointer induction variables.
Aug 5 2015, 3:44 PM

Aug 1 2015

jholewinski added a comment to D11622: [NVPTX] allow register copy between float and int.

This looks reasonable for now. Longer term, I'd like to experiment with doing away with the typed registers completely and just using .bXX for all registers. Typing is nice for readability, but code gen can be cleaner without it. They all compile down to the same set of registers at the SASS level, and I'd like to match the hardware as much as possible (given that we can't just generate SASS).

It's a shame we can't generate SASS ;-) Working around limitations in ptxas makes LLVM's job somewhat miserable here.

Aug 1 2015, 9:32 AM
jholewinski accepted D11622: [NVPTX] allow register copy between float and int.

This looks reasonable for now. Longer term, I'd like to experiment with doing away with the typed registers completely and just using .bXX for all registers. Typing is nice for readability, but code gen can be cleaner without it. They all compile down to the same set of registers at the SASS level, and I'd like to match the hardware as much as possible (given that we can't just generate SASS).

Aug 1 2015, 3:46 AM

Jul 31 2015

jholewinski accepted D11505: [NVPTX] convert pointers in byval kernel arguments to global.

Looks reasonable to me, with the minor comment.

Jul 31 2015, 1:03 PM

Jul 17 2015

jholewinski added a comment to D11304: [NVPTX] run LSR before straight-line optimizations.

Looks reasonable to me.

Jul 17 2015, 12:22 PM

Jul 15 2015

jholewinski accepted D11220: Correct lowering of memmove in NVPTX.

Fair enough.

Jul 15 2015, 8:31 AM
jholewinski added a comment to D11220: Correct lowering of memmove in NVPTX.

The algorithm looks good. Thanks for working on this!

Jul 15 2015, 8:06 AM

Jul 9 2015

jholewinski added a comment to D11089: [NVPTX] declare no vector registers.

The short answer is that ptxas doesn't handle vector registers very well. It may be good to revisit this, but ptxas currently prefers scalar ops.

Jul 9 2015, 4:33 PM
jholewinski added a comment to D11089: [NVPTX] declare no vector registers.

LGTM

Jul 9 2015, 4:26 PM

Jul 1 2015

jholewinski accepted D10855: Enable partial and runtime loop unrolling for NVPTX.

Looks reasonable to me

Jul 1 2015, 5:05 AM

Jun 30 2015

jholewinski added a comment to D10844: [NVPTX] Fix issue introduced in D10321.

LGTM

Jun 30 2015, 11:04 AM
jholewinski accepted D9940: Consider offset in global variables during lowering in NVPTX..

Sounds reasonable to me.

Jun 30 2015, 9:17 AM

Jun 25 2015

jholewinski accepted D10664: Fixed a typo in __nvvm_atom_min_gen_l() type string..

Good catch; that was a legacy workaround that I missed.

Jun 25 2015, 5:38 AM
jholewinski accepted D10666: [CUDA] Implemented __nvvm_atom_*_gen_* builtins..
Jun 25 2015, 5:37 AM

Jun 24 2015

jholewinski added a comment to D10549: Add NVPTXPeephole pass to reduce unnecessary address cast.

What is the future expectation for NVPTXPeephole? Are you planning on adding additional transforms? If not, perhaps a more specific name is warranted. Otherwise, LGTM! Thanks!

Jun 24 2015, 4:48 AM

Jun 16 2015

jholewinski added a comment to D10483: Add NVPTXLowerAlloca pass to convert alloca'ed memory to local address.

Other than the issues raised by Jingyue, this looks good to me! Thanks!

Jun 16 2015, 1:01 PM

Jun 9 2015

jholewinski accepted D10345: [NVPTX] fix a crash bug in NVPTXFavorNonGenericAddrSpaces.

Looks reasonable to me. Thanks!

Jun 9 2015, 1:48 PM

Jun 8 2015

jholewinski accepted D8209: [nvptx] Only support the 'm' inline assembly memory constraint. NFC..

Looks reasonable to me.

Jun 8 2015, 3:52 PM
jholewinski added a comment to D9940: Consider offset in global variables during lowering in NVPTX..

What is the purpose of using a static relocation model for NVPTX? I would prefer to just enforce a default here, especially since this is essentially unused for NVPTX.

Jun 8 2015, 3:50 PM
jholewinski accepted D9946: Fix endianess issue in global variable constant initialization in NVPTX..

Sorry for the delay. I was out of town when this was posted and completely missed it.

Jun 8 2015, 3:47 PM
jholewinski accepted D9983: Refactor: Simplify boolean conditional return statements in lib/Target/NVPTX.

LGTM

Jun 8 2015, 3:45 PM
jholewinski accepted D10322: [NVPTX] run SROA after NVPTXFavorNonGenericAddrSpaces.

LGTM

Jun 8 2015, 3:43 PM

Jun 4 2015

jholewinski accepted D10154: [NVPTX] kernel pointer arguments point to the global address space.

This looks good to me now. I agree that we should merge these two passes to prevent any ordering issues.

Jun 4 2015, 11:36 AM

Jun 2 2015

jholewinski added a comment to D10132: [docs] fix the declarations of the llvm.nvvm.ptr.gen.to.* intrinsics.

Change itself looks good, but these intrinsics are deprecated in favor of addrspacecast. A note to that effect should be added.

Jun 2 2015, 7:29 AM
jholewinski added a comment to D10154: [NVPTX] kernel pointer arguments point to the global address space.

Comments inlined.

Jun 2 2015, 5:45 AM

Apr 28 2015

jholewinski added a comment to D9063: [AsmPrinter] handles addrspacecast in lowerConstant.

I committed a fix for this in r236000

Apr 28 2015, 10:23 AM
jholewinski committed rL236000: [NVPTX] Handle addrspacecast constant expressions in aggregate initializers.
Apr 28 2015, 10:21 AM

Apr 23 2015

jholewinski accepted D9130: [NVPTX] Emits "generic()" depending on the original address space.

LGTM

Apr 23 2015, 5:57 PM

Apr 22 2015

jholewinski added inline comments to D9202: [NVPTX] aggregates of addrspacecast as initializer.
Apr 22 2015, 11:12 AM

Apr 20 2015

jholewinski added a comment to D9063: [AsmPrinter] handles addrspacecast in lowerConstant.

Sure, go for it! It may be a few days before I get around to looking at this.

Apr 20 2015, 10:14 AM
jholewinski added a comment to D9063: [AsmPrinter] handles addrspacecast in lowerConstant.

Handling this aspect of PTX has always been a PITA. Internally, we customize lowerConstant to handle this case and create an MCExpr subclass (NVPTXGenericMCSymbolRefExpr) that handles the printing of "generic(<symbolref>)". Something like this will be needed upstream. Our internal version is a bit of a hack, so I'll want to come up with something cleaner for upstream. If you don't have time to tackle that, I'll try to put something together soon.

Apr 20 2015, 5:35 AM

Apr 16 2015

jholewinski added inline comments to D9063: [AsmPrinter] handles addrspacecast in lowerConstant.
Apr 16 2015, 4:31 PM

Apr 9 2015

jholewinski added a comment to D8576: Divergence analysis for GPU programs.

Looks good to me! Thanks for working on this!

Apr 9 2015, 12:46 PM

Mar 30 2015

jholewinski committed rL233583: [NVPTX] Associate a minimum PTX version for each SM architecture.
Mar 30 2015, 12:33 PM
jholewinski committed rL233575: [NVPTX] Add options for PTX 4.1/4.2 and SM 3.2/3.7/5.2/5.3.
Mar 30 2015, 11:15 AM

Mar 11 2015

jholewinski accepted D8274: [NVPTXAsmPrinter] do not print .align on function headers.

LGTM. Thanks!

Mar 11 2015, 6:25 PM

Feb 3 2015

jholewinski accepted D7310: Add straight-line strength reduction to LLVM.

LGTM! Thanks for working on this!

Feb 3 2015, 6:57 AM

Feb 2 2015

jholewinski committed rL227861: When generating llvm.used, we may need an addrspacecast instead of a bitcast..
Feb 2 2015, 1:07 PM
jholewinski closed D7345: When generating llvm.used, we may need an addrspacecast instead of a bitcast. by committing rL227861: When generating llvm.used, we may need an addrspacecast instead of a bitcast..
Feb 2 2015, 1:07 PM
jholewinski added a comment to D7345: When generating llvm.used, we may need an addrspacecast instead of a bitcast..

Comments inline.

Feb 2 2015, 11:36 AM
jholewinski retitled D7345 from an empty title to When generating llvm.used, we may need an addrspacecast instead of a bitcast.
Feb 2 2015, 6:59 AM

Jan 31 2015

jholewinski accepted D7041: [NVPTX] Emit .pragma "nounroll" for loops marked with nounroll.

Looks good to me! Sorry for the delay, missed the original email notification.

Jan 31 2015, 6:30 AM

Jan 26 2015

jholewinski committed rL227123: [NVPTX] Generate a more optimal sequence for select of i1.
Jan 26 2015, 11:54 AM
jholewinski committed rL227117: [NVPTX] Handle floating-point conversion patterns that are not explicitly….
Jan 26 2015, 11:13 AM

Dec 17 2014

jholewinski added a comment to D6701: Encode native floating point types in data layout string..

NVPTX changes look reasonable to me.

Dec 17 2014, 9:32 AM
jholewinski accepted D6573: [NVPTX] Fix bugs related to isSingleValueType.

LGTM.

Dec 17 2014, 9:30 AM

Dec 1 2014

jholewinski accepted D6455: [NVPTX] Do not emit .weak symbols for NVPTX.

This looks fine for now, considering the same approach is taken for the .globl directive. That said, we really need a better way to handle this. Ptxas supports just enough GNU-style syntax to let us get by without a custom AsmStreamer in most cases, but not enough to avoid these annoying issues.

Dec 1 2014, 12:33 PM

Nov 10 2014

jholewinski added a comment to D6188: Disable indvar widening if arithmetics on the wider type are more expensive.

This looks like a good start to me, but I'm not an expert on IndVarSimplify.

Nov 10 2014, 10:10 AM
jholewinski accepted D6195: [NVPTX] Add an NVPTX-specific TargetTransformInfo.

LGTM. Thanks!

Nov 10 2014, 10:01 AM

Oct 29 2014

jholewinski accepted D6020: Pass aggregate arguments by value in NVPTX.

LGTM. Thanks!

Oct 29 2014, 5:49 AM

Oct 24 2014

jholewinski accepted D5612: [NVPTX] aligned byte-buffers for vector return types.

LGTM.

Oct 24 2014, 1:06 PM

Aug 29 2014

jholewinski accepted D5093: [NVPTX] Make the alignment an explicit argument to ldu/ldg.

No issues internally, please commit. Thanks!

Aug 29 2014, 7:39 AM

Aug 28 2014

jholewinski added a comment to D5093: [NVPTX] Make the alignment an explicit argument to ldu/ldg.

LGTM, but let me check with some folks internally.

Aug 28 2014, 5:04 AM

Jul 17 2014

jholewinski accepted D4558: NVPTX: support f64 <-> f16 intrinsics.

This LGTM. Thanks for implementing this!

Jul 17 2014, 6:59 AM
jholewinski closed D4537: [TABLEGEN] Do not crash on intrinsics with names longer than 40 characters.

Closed by commit rL213253 (authored by @jholewinski).

Jul 17 2014, 4:32 AM

Jul 16 2014

jholewinski updated the diff for D4537: [TABLEGEN] Do not crash on intrinsics with names longer than 40 characters.

Remove erroneous merge

Jul 16 2014, 7:26 AM
jholewinski updated the diff for D4537: [TABLEGEN] Do not crash on intrinsics with names longer than 40 characters.

Add test case

Jul 16 2014, 7:25 AM
jholewinski updated subscribers of D4537: [TABLEGEN] Do not crash on intrinsics with names longer than 40 characters.
Jul 16 2014, 7:16 AM
jholewinski retitled D4537 from an empty title to [TABLEGEN] Do not crash on intrinsics with names longer than 40 characters.
Jul 16 2014, 7:05 AM