
jvesely (Jan Vesely)
User

Projects

User does not belong to any projects.

User Details

User Since
Sep 24 2014, 5:35 PM (220 w, 5 d)

Recent Activity

Tue, Nov 27

jvesely committed rL347668: travis: Add cmake build.
travis: Add cmake build
Tue, Nov 27, 8:10 AM
jvesely committed rL347667: Add cmake build system.
Add cmake build system
Tue, Nov 27, 8:10 AM
jvesely committed rL347666: r600: Remove empty OVERRIDES file.
r600: Remove empty OVERRIDES file
Tue, Nov 27, 8:04 AM
jvesely committed rL347665: amdgcn: Consolidate atomic minmax helpers.
amdgcn: Consolidate atomic minmax helpers
Tue, Nov 27, 8:04 AM
jvesely committed rL347664: configure: Add target specific asm rule..
configure: Add target specific asm rule.
Tue, Nov 27, 8:04 AM
jvesely committed rL347663: configure: provide llvm_as helper variable.
configure: provide llvm_as helper variable
Tue, Nov 27, 8:04 AM

Nov 10 2018

jvesely committed rL346597: r600: Add datalayout to image builtin implementation.
r600: Add datalayout to image builtin implementation
Nov 10 2018, 1:46 PM

Nov 3 2018

jvesely committed rL346086: Remove redundant OVERRIDES file.
Remove redundant OVERRIDES file
Nov 3 2018, 5:58 PM
jvesely committed rL346085: configure: Provide symlink for amdgcn-mesa3d instead of configure hack.
configure: Provide symlink for amdgcn-mesa3d instead of configure hack
Nov 3 2018, 5:58 PM
jvesely committed rL346084: travis: Check tahiti-amdgcn-mesa-mesa3d.bc.
travis: Check tahiti-amdgcn-mesa-mesa3d.bc
Nov 3 2018, 5:58 PM
jvesely committed rL346083: amdgcn-amdhsa: Convert get_{global,local}_size to clc for all llvm versions.
amdgcn-amdhsa: Convert get_{global,local}_size to clc for all llvm versions
Nov 3 2018, 5:43 PM
jvesely committed rL346082: amdgcn: Move __clc_amdgcn_s_waitcnt definition to clc file.
amdgcn: Move __clc_amdgcn_s_waitcnt definition to clc file
Nov 3 2018, 5:43 PM
jvesely committed rL346081: amdgcn: Convert get_num_groups to clc.
amdgcn: Convert get_num_groups to clc
Nov 3 2018, 5:43 PM
jvesely committed rL346080: amdgcn: Convert get_global_size to clc.
amdgcn: Convert get_global_size to clc
Nov 3 2018, 5:43 PM
jvesely committed rL346079: amdgcn: Convert get_local_size to clc.
amdgcn: Convert get_local_size to clc
Nov 3 2018, 5:43 PM
jvesely committed rL346078: r600: Convert barrier to clc.
r600: Convert barrier to clc
Nov 3 2018, 5:37 PM
jvesely committed rL346077: r600: Convert get_num_groups to clc.
r600: Convert get_num_groups to clc
Nov 3 2018, 5:37 PM
jvesely committed rL346076: r600: Convert get_global_size to clc.
r600: Convert get_global_size to clc
Nov 3 2018, 5:37 PM
jvesely committed rL346075: r600: Convert get_local_size to clc.
r600: Convert get_local_size to clc
Nov 3 2018, 5:37 PM

Sep 29 2018

jvesely updated subscribers of D52548: Stop instcombining propagating wider shufflevector arguments to predecessors..
Sep 29 2018, 6:53 AM

Sep 15 2018

jvesely committed rL342341: configure: Rework support for gfx9+ devices that were added post LLVM 3.9.
configure: Rework support for gfx9+ devices that were added post LLVM 3.9
Sep 15 2018, 3:03 PM
jvesely committed rL342338: .travis: Add llvm-7 build.
.travis: Add llvm-7 build
Sep 15 2018, 1:03 PM
jvesely committed rL342337: .travis: Use source whitelist alias for llvm-6 repository.
.travis: Use source whitelist alias for llvm-6 repository
Sep 15 2018, 1:03 PM

Aug 21 2018

jvesely accepted D47261: AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space.

v5: rename MAX_COMMON_ADDRESS to MAX_AMDGPU_ADDRESS

Aug 21 2018, 9:46 AM
jvesely added inline comments to D47261: AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space.
Aug 21 2018, 7:39 AM

Aug 20 2018

jvesely added a comment to D47261: AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space.

Please add a reference to llvm bug https://bugs.llvm.org/show_bug.cgi?id=38113
as well as the correct "Differential Revision" tag when committing.

Aug 20 2018, 2:12 PM
jvesely added inline comments to D47261: AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space.
Aug 20 2018, 2:08 PM
jvesely added inline comments to D50974: AMDGPU: fix updating the alias rules since r340171.
Aug 20 2018, 8:36 AM
jvesely abandoned D23923: AMDGPU/R600: Use KCache selection in DAGCombiner.

This patch no longer applies

Aug 20 2018, 7:40 AM
jvesely requested changes to D47261: AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space.

NACK. This patch is clearly wrong.
MAX_COMMON_ADDRESS is used in AMDGPUAAResult::ASAliasRulesTy::getAliasResult to filter indices to the ASAliasRules table which is 6x6. Allowing address space 6 leads to out of bounds access to the array.
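
The out-of-bounds concern can be illustrated with a minimal sketch. The names below (`kTableDim`, `AliasTable`, `passesFilter`, `inTable`, `lookupAlias`) are hypothetical and are not the actual LLVM code; the sketch only assumes a 6x6 table indexed by address space and a "reject anything above the maximum" filter, as described in the comment.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the alias rules table described above:
// 6 rows x 6 columns, so the only valid indices are 0..5.
constexpr std::size_t kTableDim = 6;
using AliasTable = std::array<std::array<bool, kTableDim>, kTableDim>;

// The filter: address spaces above the maximum are rejected before lookup.
constexpr bool passesFilter(unsigned addrSpace, unsigned maxAddress) {
  return addrSpace <= maxAddress;
}

// Whether an address space is a valid row/column index of the table.
constexpr bool inTable(unsigned addrSpace) {
  return addrSpace < kTableDim;
}

// With a maximum of 5, everything that passes the filter is a valid index.
static_assert(passesFilter(5, 5) && inTable(5), "5 is the last valid index");
// With a maximum of 6, address space 6 passes the filter but is not a valid
// index, so table[6][x] would read past the end of the array.
static_assert(passesFilter(6, 6) && !inTable(6), "6 would be out of bounds");

// Guarded lookup: returns false (and leaves result untouched) rather than
// indexing outside the table.
inline bool lookupAlias(const AliasTable &table, unsigned as1, unsigned as2,
                        unsigned maxAddress, bool &result) {
  if (!passesFilter(as1, maxAddress) || !passesFilter(as2, maxAddress))
    return false;
  if (!inTable(as1) || !inTable(as2))
    return false;
  result = table[as1][as2];
  return true;
}
```

In this sketch, raising the maximum from 5 to 6 without also growing the table is only safe because of the second guard; a lookup that relies on the filter alone would index row 6 of a 6-row array.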

Aug 20 2018, 6:50 AM

Aug 7 2018

jvesely committed rL339190: AMDGPU: Remove broken i16 ternary patterns.
AMDGPU: Remove broken i16 ternary patterns
Aug 7 2018, 2:55 PM
jvesely closed D49836: AMDGPU: Remove broken ternary i16 patterns.
Aug 7 2018, 2:55 PM
jvesely updated the diff for D49836: AMDGPU: Remove broken ternary i16 patterns.

rename numbered operations

Aug 7 2018, 2:14 PM

Aug 3 2018

jvesely added a comment to D49836: AMDGPU: Remove broken ternary i16 patterns.

ping.
Can we just have the fix in, and worry about optimizing i16 extends later?

Aug 3 2018, 10:48 AM
jvesely committed rL338898: amdgcn: Use __constant AS for amdgcn builtins..
amdgcn: Use __constant AS for amdgcn builtins.
Aug 3 2018, 8:14 AM

Aug 1 2018

jvesely closed D49962: AMDGPU/R600: Convert kernel param loads to use PARAM_I_ADDRESS.

Merged as r338610

Aug 1 2018, 11:45 AM
jvesely committed rL338610: AMDGPU/R600: Convert kernel param loads to use PARAM_I_ADDRESS.
AMDGPU/R600: Convert kernel param loads to use PARAM_I_ADDRESS
Aug 1 2018, 11:36 AM
jvesely added a comment to D49934: AMDGPU: Allow fp32-denormals feature for r600 targets.

Merged without the test. Thanks.

Aug 1 2018, 8:06 AM
jvesely committed rL338569: AMDGPU: Allow fp32-denormals feature for r600 targets.
AMDGPU: Allow fp32-denormals feature for r600 targets
Aug 1 2018, 8:05 AM
jvesely closed D49934: AMDGPU: Allow fp32-denormals feature for r600 targets.
Aug 1 2018, 8:05 AM

Jul 28 2018

jvesely created D49962: AMDGPU/R600: Convert kernel param loads to use PARAM_I_ADDRESS.
Jul 28 2018, 3:46 PM

Jul 27 2018

jvesely abandoned D49649: AMDGPU/R600: Don't set fp32-denormals feature for r600.

D49934

Jul 27 2018, 2:04 PM
jvesely abandoned D49650: Targets/AMDGPU: Don't set fp32-denormals feature for r600.

According to cayman manual, these registers do exist so we should probably just make the feature accepted on r600 as well

sure, that's the way it was before r335942. I assumed the removal was intentional.

Probably accidental because nothing in r600 was actually using it

given the number of warnings it outputs, I find that unlikely.
@tstellar what was your intention? It's not like someone is going to work on EG/CM denormals any time soon.

I don't mind either way. I just want to avoid another round of bikeshedding.

Jul 27 2018, 2:04 PM
jvesely created D49934: AMDGPU: Allow fp32-denormals feature for r600 targets.
Jul 27 2018, 1:38 PM
jvesely accepted D49907: AMDGPU: Stop trying to extend arguments for clover.

I've been using a version of this locally and it fixes most, but not all tests with char/uchar/short/ushort kernel arguments.
I thought that fixing the hardcoded alignment=4 would help, but it's not enough.
It'll need to be handled separately.

Jul 27 2018, 12:14 PM
jvesely committed rL338127: AMDGPU/R600: Add MOV instructions to BFE patterns.
AMDGPU/R600: Add MOV instructions to BFE patterns
Jul 27 2018, 8:00 AM
jvesely closed D49641: AMDGPU/R600: Add MOV instructions to BFE patterns.
Jul 27 2018, 8:00 AM

Jul 26 2018

jvesely added a comment to D49650: Targets/AMDGPU: Don't set fp32-denormals feature for r600.

According to cayman manual, these registers do exist so we should probably just make the feature accepted on r600 as well

sure, that's the way it was before r335942. I assumed the removal was intentional.

Probably accidental because nothing in r600 was actually using it

Jul 26 2018, 1:45 PM
jvesely added inline comments to D49836: AMDGPU: Remove broken ternary i16 patterns.
Jul 26 2018, 12:10 PM
jvesely added inline comments to D49836: AMDGPU: Remove broken ternary i16 patterns.
Jul 26 2018, 10:00 AM

Jul 25 2018

jvesely created D49836: AMDGPU: Remove broken ternary i16 patterns.
Jul 25 2018, 11:55 PM
jvesely added a comment to D49650: Targets/AMDGPU: Don't set fp32-denormals feature for r600.

According to cayman manual, these registers do exist so we should probably just make the feature accepted on r600 as well

Jul 25 2018, 11:10 AM

Jul 23 2018

jvesely abandoned D49642: AMDGPU: Rework extract-lowbits test.

I'd rather stop trying to share tests with r600 at all. I would like to split out most of the shared tests as-i

Any reason for that? Both bfe instructions use the same patterns so it'd be just a copy paste.

A lot of tests have too many run lines as is, and adding more for r600 increases the mess. In this case you are actually changing the tested content. The original used VGPR inputs for everything, and this changes everything to be SGPR inputs. Both would be useful as separate tests, but we don't try particularly hard to match scalar BFEs currently. Also, I want to stop artificially sharing some of the intrinsics.

Jul 23 2018, 8:11 PM
jvesely updated the diff for D49641: AMDGPU/R600: Add MOV instructions to BFE patterns.

v2: copy tests to a new file

Jul 23 2018, 8:07 PM

Jul 22 2018

jvesely added a comment to D49642: AMDGPU: Rework extract-lowbits test.

I'd rather stop trying to share tests with r600 at all. I would like to split out most of the shared tests as-i

Jul 22 2018, 12:54 PM
jvesely created D49650: Targets/AMDGPU: Don't set fp32-denormals feature for r600.
Jul 22 2018, 12:40 PM
jvesely created D49649: AMDGPU/R600: Don't set fp32-denormals feature for r600.
Jul 22 2018, 12:32 PM

Jul 21 2018

jvesely added a child revision for D49642: AMDGPU: Rework extract-lowbits test: D49641: AMDGPU/R600: Add MOV instructions to BFE patterns.
Jul 21 2018, 9:35 PM
jvesely added a parent revision for D49641: AMDGPU/R600: Add MOV instructions to BFE patterns: D49642: AMDGPU: Rework extract-lowbits test.
Jul 21 2018, 9:35 PM
jvesely created D49642: AMDGPU: Rework extract-lowbits test.
Jul 21 2018, 9:35 PM
jvesely created D49641: AMDGPU/R600: Add MOV instructions to BFE patterns.
Jul 21 2018, 7:59 PM

Jul 6 2018

jvesely accepted D49037: AMDGPU: Refactor Subtarget classes.

Other than the few nits mentioned in the text, LGTM.

Jul 6 2018, 12:37 PM

Jun 27 2018

jvesely added a comment to D46365: AMDGPU: Separate R600 and GCN TableGen files.

running llc through valgrind produced flood of 'Conditional jump or move depends on uninitialised value(s)'
269 errors from 24 contexts. Initializing just CaymanISA in R600Subtarget gets rid of most of them.

These should be fixed now, can you re-test?

Fails to build:
llvm-tblgen: Unknown command line argument '-gen-tgt-intrinsic'. Try: '../../../bin/llvm-tblgen -help'
llvm-tblgen: Did you mean '-gen-tgt-intrinsic-impl'?
make[2]: *** [lib/Target/AMDGPU/CMakeFiles/AMDGPUCommonTableGen.dir/build.make:1730: lib/Target/AMDGPU/R600GenIntrinsics.inc.tmp] Error 1

Jun 27 2018, 8:30 PM
jvesely added a comment to D46365: AMDGPU: Separate R600 and GCN TableGen files.

Now, there are tests for MEMRAT_CACHELESS stores, and they pass, so I guess there is another untested store path that got mixed up between TS2 and TS3.
I can paste the .ll file if you're interested.

Yes, that would be helpful.

Jun 27 2018, 8:02 PM
jvesely added a comment to D46365: AMDGPU: Separate R600 and GCN TableGen files.

running llc through valgrind produced flood of 'Conditional jump or move depends on uninitialised value(s)'
269 errors from 24 contexts. Initializing just CaymanISA in R600Subtarget gets rid of most of them.

These should be fixed now, can you re-test?

Jun 27 2018, 7:43 PM

Jun 21 2018

jvesely added a comment to D46365: AMDGPU: Separate R600 and GCN TableGen files.

I added the below snippet to check whether the caymanISA feature gets initialized correctly:

Jun 21 2018, 6:04 PM
jvesely committed rL335280: atom: Use volatile pointers for cl_khr_{global,local}_int32_{base….
atom: Use volatile pointers for cl_khr_{global,local}_int32_{base…
Jun 21 2018, 12:32 PM
jvesely committed rL335279: atom: Consolidate cl_khr_{local,global}_int32_{base,extended}_atomics….
atom: Consolidate cl_khr_{local,global}_int32_{base,extended}_atomics…
Jun 21 2018, 12:32 PM
jvesely committed rL335278: atomic: Provide function implementation of atomic_{dec,inc}.
atomic: Provide function implementation of atomic_{dec,inc}
Jun 21 2018, 12:32 PM
jvesely committed rL335277: atom: Consolidate cl_khr_int64_{base,extended}_atomics declarations.
atom: Consolidate cl_khr_int64_{base,extended}_atomics declarations
Jun 21 2018, 12:32 PM
jvesely committed rL335276: atom: Consolidate cl_khr_{local,global}_int32_{base,extended}_atomics….
atom: Consolidate cl_khr_{local,global}_int32_{base,extended}_atomics…
Jun 21 2018, 12:32 PM
jvesely committed rL335275: atomic: Cleanup atomic_cmpxchg header.
atomic: Cleanup atomic_cmpxchg header
Jun 21 2018, 12:32 PM
jvesely committed rL335274: atomic: Move define cleanup to shared include.
atomic: Move define cleanup to shared include
Jun 21 2018, 12:31 PM

Jun 15 2018

jvesely added a comment to D46365: AMDGPU: Separate R600 and GCN TableGen files.

A quick update: running llc manually on the kernel .ll (dumped using CLOVER_DEBUG=llvm) produces correct assembly. Running it in clover generates incorrect code (dumped using CLOVER_DEBUG=native) and hangs the GPU.

Jun 15 2018, 12:43 PM

Jun 14 2018

jvesely added a comment to D46365: AMDGPU: Separate R600 and GCN TableGen files.

I assume that there is no change in generated code intended for r600 (EG/CM).
These are the changes in piglit tests I noticed:

< 	MEM_RAT_CACHELESS STORE_RAW T0.X, T1.X, 1
---
> 	MEM_RAT_CACHELESS STORE_DWORD T0.X, T1.X

There are other changes in register allocation and the packetizer, but this one looks the most suspicious. My turks is TS2, and STORE_DWORD is not defined in its ISA (STORE_RAW is the only allowed opcode for the CACHELESS target). In the cayman ISA, STORE_DWORD is opcode 20 (vs. opcode 2 for STORE_RAW), which is reserved on TS2. The instruction also lost the offset.
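
The opcode mismatch described above can be written down as a small validity check. This is only a sketch: the opcode values 2 and 20 come from the comment, while the `Generation` enum, the constant names, and `isValidCachelessStore` are hypothetical. It also assumes that STORE_RAW remains valid on cayman, which the comment does not state.

```cpp
#include <cstdint>

// Hypothetical generation tag; only the two chips mentioned above.
enum class Generation { Turks, Cayman };

// Opcode values as quoted in the comment above.
constexpr std::uint8_t kStoreRaw = 2;
constexpr std::uint8_t kStoreDword = 20;

// STORE_RAW is accepted everywhere (assumption for Cayman); STORE_DWORD is
// accepted only on Cayman, since opcode 20 is reserved on the older part.
constexpr bool isValidCachelessStore(Generation gen, std::uint8_t opcode) {
  return opcode == kStoreRaw ||
         (gen == Generation::Cayman && opcode == kStoreDword);
}

static_assert(isValidCachelessStore(Generation::Turks, kStoreRaw),
              "STORE_RAW is the only allowed opcode on turks");
static_assert(!isValidCachelessStore(Generation::Turks, kStoreDword),
              "opcode 20 is reserved on turks");
```

Under these assumptions, emitting STORE_DWORD for a turks target fails the check, which matches the suspicion that the wrong store pattern was selected.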

Jun 14 2018, 11:23 PM
jvesely added a comment to D46365: AMDGPU: Separate R600 and GCN TableGen files.

I've tried the updated version of the patch, although it did not apply cleanly. It also causes GPU hangs on my turks in piglit tests.

Jun 14 2018, 12:23 AM

Jun 7 2018

jvesely abandoned D40514: AMDGPU: Restrict ieee_mode to HSA..
Jun 7 2018, 2:23 PM
jvesely committed rL334228: r600/fmin: Flush denormals before calling builtin..
r600/fmin: Flush denormals before calling builtin.
Jun 7 2018, 1:32 PM
jvesely committed rL334226: math/fma: Add fp32 software implementation.
math/fma: Add fp32 software implementation
Jun 7 2018, 1:32 PM
jvesely committed rL334227: r600/fmax: Flush denormals before calling builtin..
r600/fmax: Flush denormals before calling builtin.
Jun 7 2018, 1:32 PM

May 30 2018

jvesely committed rL333622: AMDGPU/R600: Make sure functions are cacheline aligned.
AMDGPU/R600: Make sure functions are cacheline aligned
May 30 2018, 9:12 PM
jvesely closed D47516: AMDGPU/R600: Make sure functions are cache line aligned.
May 30 2018, 9:12 PM
jvesely added inline comments to D47516: AMDGPU/R600: Make sure functions are cache line aligned.
May 30 2018, 11:52 AM
jvesely updated the diff for D47516: AMDGPU/R600: Make sure functions are cache line aligned.

Change explanation to cache line alignment (p2align 3 still hangs the GPU).
Use ensure alignment
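
For context, a `.p2align N` directive requests alignment to 2^N bytes, so `p2align 3` means 8-byte alignment. The sketch below shows the arithmetic only; the helper names are hypothetical, and the value 8 used for a 256-byte boundary in the last check is an illustrative assumption, not the value taken from the patch.

```cpp
#include <cstdint>

// Number of bytes requested by ".p2align log2Align".
constexpr std::uint64_t p2alignBytes(unsigned log2Align) {
  return std::uint64_t{1} << log2Align;
}

// Rounds offset up to the next multiple of 2^log2Align.
constexpr std::uint64_t alignUp(std::uint64_t offset, unsigned log2Align) {
  return (offset + p2alignBytes(log2Align) - 1) &
         ~(p2alignBytes(log2Align) - 1);
}

static_assert(p2alignBytes(3) == 8, "p2align 3 is 8-byte alignment");
static_assert(alignUp(9, 3) == 16, "9 rounds up to the next 8-byte boundary");
// Illustrative only: a 256-byte boundary corresponds to log2Align == 8.
static_assert(alignUp(300, 8) == 512, "300 rounds up to 512");
```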

May 30 2018, 10:01 AM
jvesely created D47516: AMDGPU/R600: Make sure functions are cache line aligned.
May 30 2018, 12:56 AM

May 17 2018

jvesely committed rL332677: Add initial support for half precision builtins.
Add initial support for half precision builtins
May 17 2018, 3:59 PM

May 14 2018

jvesely committed rL332324: rootn: Use denormal path only.
rootn: Use denormal path only
May 14 2018, 9:26 PM

May 2 2018

jvesely committed rL331434: remquo: Port from amd builtins.
remquo: Port from amd builtins
May 2 2018, 10:48 PM
jvesely committed rL331435: remquo: Flush denormals if not supported.
remquo: Flush denormals if not supported
May 2 2018, 10:48 PM
jvesely committed rL331433: math: Add helper function to flush denormals if not supported..
math: Add helper function to flush denormals if not supported.
May 2 2018, 10:48 PM
jvesely accepted D46346: AMDGPU: rename OpenCL lowering pass to be R600 specific..

LGTM

May 2 2018, 4:48 PM · Restricted Project
jvesely committed rL331366: clc_sqrt: Reuse unary_decl.inc.
clc_sqrt: Reuse unary_decl.inc
May 2 2018, 9:10 AM

Apr 25 2018

jvesely committed rL330851: relational/select: Condition types for half are short/ushort, not char/uchar.
relational/select: Condition types for half are short/ushort, not char/uchar
Apr 25 2018, 10:40 AM

Apr 24 2018

jvesely accepted D45989: AMDGPU/R600: Move int_r600_store_stream_output to the public intrinsic file.

I thought libclc was using this, but it is not. Maybe we can just delete this instead?

Apr 24 2018, 4:29 PM

Apr 23 2018

jvesely committed rL330649: log10: Use sw implementation from amd builtins.
log10: Use sw implementation from amd builtins
Apr 23 2018, 2:14 PM

Apr 17 2018

jvesely committed rL330207: powr: Use denormal path only.
powr: Use denormal path only
Apr 17 2018, 12:39 PM
jvesely committed rL330206: pown: Use denormal path only.
pown: Use denormal path only
Apr 17 2018, 12:39 PM
jvesely committed rL330205: pow: Use denormal path only.
pow: Use denormal path only
Apr 17 2018, 12:39 PM
jvesely committed rL330198: amdgcn/fmin: Fix typos that reduced precision.
amdgcn/fmin: Fix typos that reduced precision
Apr 17 2018, 11:14 AM
jvesely committed rL330197: exp10: Port from amd builtins.
exp10: Port from amd builtins
Apr 17 2018, 11:11 AM