t-tye (Tony Tye)
User

Projects

User does not belong to any projects.

User Details

User Since
Mar 22 2017, 6:01 PM (116 w, 4 d)

Recent Activity

Mar 25 2019

t-tye added a comment to D59008: [AMDGPU] Switch default dwarf version to 5.

LGTM

Do we know the state of split DWARF and DWARF compression for DWARF 5 (compared to DWARF 2)?

State of them in what sense? Compression is pretty orthogonal to any DWARF version - it's more about the container (ELF, etc) you use. Split DWARF is non-standardly supported in pre-v5, and I think it's functioning in the standards conformant v5 mode too.

Mar 25 2019, 2:00 PM · Restricted Project
t-tye accepted D59008: [AMDGPU] Switch default dwarf version to 5.

Do we know the state of split DWARF and DWARF compression for DWARF 5 (compared to DWARF 2)?

Mar 25 2019, 11:10 AM · Restricted Project

Mar 7 2019

t-tye accepted D59057: AMDHSA: Code object v3 updates.

LGTM with comment update.

Mar 7 2019, 11:22 AM · Restricted Project
t-tye requested changes to D59057: AMDHSA: Code object v3 updates.
Mar 7 2019, 12:33 AM · Restricted Project

Feb 28 2019

t-tye accepted D58802: [AMDGPU] Mark ds instructions as meybeAtomic.

LGTM

Feb 28 2019, 8:14 PM · Restricted Project

Feb 21 2019

t-tye added a comment to D58518: [HIP] change kernel stub name.

To clarify, I am saying that the stub does have a different name, since it is conceptually part of the implementation of the call to the device function and is not the device function being called itself. However, when we generate code for a function that is present on both the host and the device, both copies of the code are for the same source-level function and so can have the same symbol name (which was a question that was asked).

Feb 21 2019, 12:11 PM · Restricted Project, Restricted Project
t-tye added a comment to D58518: [HIP] change kernel stub name.

Yes this relates to supporting the debugger.

Feb 21 2019, 11:38 AM · Restricted Project, Restricted Project
t-tye accepted D58159: AMDGPU: Remove debugger related subtarget features.

LGTM

Feb 21 2019, 11:26 AM

Nov 19 2018

t-tye added a comment to D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking.

Ping?

I think the remarks by @t-tye point to a potentially useful optimization, but that should not be part of this patch.

Nov 19 2018, 8:27 AM

Nov 15 2018

t-tye accepted D53445: [AMDGPU] Update code object metadata format documentation.

LGTM

Nov 15 2018, 12:27 PM

Nov 9 2018

t-tye added a comment to D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking.

This is sufficient, because whenever only one event of a count type is pending, its last time point is naturally the upper bound of all time points of this count type, and when multiple event types are pending, the count type has gone out of order and an s_waitcnt to 0 is required to clear any pending event type (and will then clear all pending event types for that count type).

Just wondering whether we can do better than using 0. Could the lowest count be used instead, since that should be sufficient to ensure all out-of-order events for this count type have happened? I had discussed this with Bob at one time.

Hmm, how would that work? What lowest count are you referring to? For example, if lgkm has both in-flight SMEM read, and in-flight LDS, we could either have all SMEM read finish first or all LDS finish first.

Something that we could do is a more finely-grained tracking of in-order events. For example, if we have both in-flight SMEM and in-flight LDS, and we need to wait for the second-to-last LDS, then in fact we could do an lgkmcnt(1) wait -- because if the counter reaches 1 or less, the second-to-last LDS must have returned. After the lgkmcnt(1), we still need to conservatively assume that any event type that was previously in-flight may still be in-flight, so this patch here is compatible with such a more finely-grained tracking.

I think the finer-grained tracking could be achieved by introducing separate timelines for each event type: currently we only have timelines by counter. Anyway, it'd be a separate change, mainly for the benefit of mixing LDS and SMEM I think.
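The difference between the two strategies discussed above can be sketched in a few lines. This is only an illustrative model, not the InsertWaitcnts implementation: the function names and the list-of-strings representation of pending events are hypothetical, and it assumes that events of the same type complete in issue order while events of different types sharing a counter may complete in any order relative to each other.

```python
def conservative_waitcnt(pending, target):
    """Wait count when only per-counter ordering is tracked.

    pending: event types sharing one counter, in issue order (hypothetical model).
    target:  index of the event that must have completed.
    """
    if len(set(pending)) > 1:
        # Multiple event types pending: the counter is out of order, wait to 0.
        return 0
    # Single event type: in order, so allow every later event to remain pending.
    return len(pending) - 1 - target


def fine_grained_waitcnt(pending, target):
    """Wait count when each event type has its own in-order timeline.

    If the target were still outstanding, every later event of the same type
    would be outstanding too, so the counter would exceed that number.
    """
    kind = pending[target]
    return sum(1 for k in pending[target + 1:] if k == kind)


# In-flight SMEM and LDS on the same counter; wait for the second-to-last LDS.
pending = ["LDS", "SMEM", "LDS", "SMEM"]
print(conservative_waitcnt(pending, 0))
print(fine_grained_waitcnt(pending, 0))
```

With mixed SMEM and LDS in flight, the conservative model waits to 0, while the per-type model yields a count of 1 for the second-to-last LDS, matching the lgkmcnt(1) case described in the comment.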

Nov 9 2018, 9:25 AM

Nov 7 2018

t-tye accepted D54186: AMDGPU: Enable code object v3 for AMDHSA only.

LGTM

Nov 7 2018, 5:37 PM
t-tye added a comment to D54228: AMDGPU/InsertWaitcnts: Simplify pending events tracking.

This is sufficient, because whenever only one event of a count type is pending, its last time point is naturally the upper bound of all time points of this count type, and when multiple event types are pending, the count type has gone out of order and an s_waitcnt to 0 is required to clear any pending event type (and will then clear all pending event types for that count type).

Nov 7 2018, 3:56 PM

Nov 6 2018

t-tye accepted D54178: AMDGPU/Docs: Add product names for Vega20.

LGTM

Nov 6 2018, 2:50 PM

Nov 5 2018

t-tye added a comment to D53153: [OpenCL] Mark kernel functions with default visibility.

But do you want to support *dynamically* linking object files? Because that's what visibility is about.

To be specific, if you don't have multiple levels of linking — doing a slower and relatively more expensive link to form a self-contained unit of code distribution, then doing a faster link to form a runnable program from multiple independently-distributed such units — then visibility isn't really doing anything for you.

Nov 5 2018, 8:16 PM
t-tye accepted D53153: [OpenCL] Mark kernel functions with default visibility.

The summary needs updating, as this is now only being done for kernels and not for namespace-scope variables.

Nov 5 2018, 1:22 PM
t-tye added a comment to D53445: [AMDGPU] Update code object metadata format documentation.

Should the v2 version of the metadata be moved to a separate section rather than be deleted since it is still generated when the v2 code object format is requested?

Nov 5 2018, 1:05 PM
t-tye accepted D53222: AMDGPU: Add sram-ecc feature.

LGTM (scheduler can be done in separate change list)

Nov 5 2018, 11:20 AM
t-tye added a comment to D53222: AMDGPU: Add sram-ecc feature.

Does the assembler need any support for this? For example, does the .amdgcn_target directive need to accept the +sramecc and ensure the e_flag is set accordingly?

Nov 5 2018, 9:29 AM
t-tye accepted D53223: AMDGPU: Add sram-ecc feature options.

LGTM

Nov 5 2018, 9:17 AM

Oct 23 2018

t-tye accepted D53526: AMDGPU: Switch some lld tests to v2.

LGTM

Oct 23 2018, 4:18 PM
t-tye accepted D53525: AMDGPU: Enable code object v3 by default.

LGTM

Oct 23 2018, 2:32 PM

Oct 22 2018

t-tye added inline comments to D53222: AMDGPU: Add sram-ecc feature.
Oct 22 2018, 1:24 PM

Oct 17 2018

t-tye accepted D53386: AMDGPU: Add options to enable/disable code object v3.

LGTM

Oct 17 2018, 2:27 PM

Oct 16 2018

t-tye added inline comments to D53222: AMDGPU: Add sram-ecc feature.
Oct 16 2018, 9:46 AM

Oct 7 2018

t-tye added a comment to D52891: [AMDGPU] Add -fvisibility-amdgpu-non-kernel-functions.

Another word commonly used across languages is "offload".

Oct 7 2018, 8:41 PM

Jul 20 2018

t-tye accepted D49613: AMDGPU: Switch default dwarf version to 2.

LGTM

Jul 20 2018, 1:29 PM

Jul 17 2018

t-tye added inline comments to D49448: [AMDGPU] Fix VGPR spills where offset doesn't fit in 12 bits.
Jul 17 2018, 5:36 PM

Jul 10 2018

t-tye added inline comments to D49096: AMDGPU: Make hidden argument metadata consistent with amdgpu-implicitarg-num-bytes attribute.
Jul 10 2018, 9:31 AM

Jul 9 2018

t-tye added inline comments to D49096: AMDGPU: Make hidden argument metadata consistent with amdgpu-implicitarg-num-bytes attribute.
Jul 9 2018, 6:41 PM

Jun 21 2018

t-tye added a comment to D48435: AMDGPU: Add patterns for i32/i64 local atomic load/store.

Now that this is implemented, it may be worth converting the memorylegalizer tests from MIR to IR tests.

Jun 21 2018, 11:45 AM

Jun 14 2018

t-tye committed rL334733: [AMDGPU] Document the AMDGPU LLVM attributes.
[AMDGPU] Document the AMDGPU LLVM attributes
Jun 14 2018, 9:44 AM
t-tye closed D48101: [AMDGPU] Document the AMDGPU LLVM attributes.
Jun 14 2018, 9:44 AM

Jun 13 2018

t-tye abandoned D48103: [AMDGPU] Update code object metadata format documentation.

Abandon as duplicate of D47549.

Jun 13 2018, 6:39 PM
t-tye updated the diff for D47549: [AMDGPU] Update code object metadata format documentation.

Rename note record enumerator from NT_AMD_AMDGPU_* to NT_AMDGPU_* to match the vendor name change from "AMD" to "AMDGPU".

Jun 13 2018, 11:26 AM

Jun 12 2018

t-tye created D48103: [AMDGPU] Update code object metadata format documentation.
Jun 12 2018, 3:48 PM
t-tye created D48101: [AMDGPU] Document the AMDGPU LLVM attributes.
Jun 12 2018, 3:27 PM

Jun 8 2018

t-tye added inline comments to D47900: AMDGPU: Error on LDS global address in functions.
Jun 8 2018, 1:09 AM

Jun 7 2018

t-tye committed rL334257: [AMDGPU] Simplify memory legalizer (add missing virtual descructor).
[AMDGPU] Simplify memory legalizer (add missing virtual descructor)
Jun 7 2018, 6:04 PM
t-tye committed rL334241: [AMDGPU] Simplify memory legalizer.
[AMDGPU] Simplify memory legalizer
Jun 7 2018, 3:33 PM
t-tye closed D47504: [AMDGPU] Simplify memory legalizer.
Jun 7 2018, 3:32 PM
t-tye accepted D47601: AMDGPU: Add 64-bit relative variant kind.

LGTM

Jun 7 2018, 2:53 PM
t-tye accepted D47566: AMDHSA: Code object v3 updates.

LGTM

Jun 7 2018, 2:51 PM
t-tye added inline comments to D47900: AMDGPU: Error on LDS global address in functions.
Jun 7 2018, 12:45 PM
t-tye added a comment to D47504: [AMDGPU] Simplify memory legalizer.

Should add some tests for this GDS support. Using a pure MIR test you can bypass the fact that we don't codegen it yet

Would it be ok to do that as a separate patch? Would like to add code selection patterns to handle lds and gds atomics and then use that to make an llvm ir test.

Jun 7 2018, 11:51 AM
t-tye updated the diff for D47504: [AMDGPU] Simplify memory legalizer.

Add MIR tests for local and region address spaces.

Jun 7 2018, 11:47 AM

Jun 4 2018

t-tye updated subscribers of D47566: AMDHSA: Code object v3 updates.
Jun 4 2018, 10:56 PM
t-tye added inline comments to D47566: AMDHSA: Code object v3 updates.
Jun 4 2018, 10:52 PM
t-tye added inline comments to D47566: AMDHSA: Code object v3 updates.
Jun 4 2018, 12:20 PM
t-tye added inline comments to D47566: AMDHSA: Code object v3 updates.
Jun 4 2018, 11:46 AM
t-tye added a comment to D47601: AMDGPU: Add 64-bit relative variant kind.

What is this actually needed for? Having a relative relocation in a data segment doesn't seem that useful?

Never mind, I see it in the other commit. Could you please update the table of relocation types in AMDGPUUsage.rst? It has an R_AMDGPU_RELATIVE64, I don't know what that's about, but it doesn't mention R_AMDGPU_REL64.

Jun 4 2018, 10:01 AM

Jun 1 2018

t-tye updated the diff for D47504: [AMDGPU] Simplify memory legalizer.

Further minimize MIR tests (thanks @rampitec ).

Jun 1 2018, 3:40 PM

May 31 2018

t-tye added a comment to D47504: [AMDGPU] Simplify memory legalizer.

Should add some tests for this GDS support. Using a pure MIR test you can bypass the fact that we don't codegen it yet

May 31 2018, 5:54 PM
t-tye added inline comments to D47504: [AMDGPU] Simplify memory legalizer.
May 31 2018, 5:39 PM
t-tye updated the diff for D47504: [AMDGPU] Simplify memory legalizer.

Reduced size of mir tests.

May 31 2018, 5:39 PM
t-tye added inline comments to D47504: [AMDGPU] Simplify memory legalizer.
May 31 2018, 2:05 AM
t-tye added inline comments to D47504: [AMDGPU] Simplify memory legalizer.
May 31 2018, 1:56 AM
t-tye added inline comments to D47504: [AMDGPU] Simplify memory legalizer.
May 31 2018, 12:50 AM
t-tye updated the diff for D47504: [AMDGPU] Simplify memory legalizer.

Update for @rampitec review comments.

May 31 2018, 12:38 AM

May 30 2018

t-tye created D47549: [AMDGPU] Update code object metadata format documentation.
May 30 2018, 11:50 AM

May 29 2018

t-tye created D47504: [AMDGPU] Simplify memory legalizer.
May 29 2018, 4:30 PM

May 25 2018

t-tye accepted D47392: AMDGPU: Always set COMPUTE_PGM_RSRC2.ENABLE_TRAP_HANDLER to zero for AMDHSA as it is set by CP.

LGTM

May 25 2018, 2:24 PM
t-tye added inline comments to D47370: AMDGPU: Round up kernel argument allocation size.
May 25 2018, 1:58 PM
t-tye added a comment to D47261: AMDGPU: bump AS.MAX_COMMON_ADDRESS to 6 since 32-bit addr space.

Needs a test, preferably the full set of AA checks with 32 bit constant

Sure, but can you give me more details about what you want? Is there already an example that I could start from?

Unfortunately it looks like the commit that added this didn't actually add a proper test for this although I thought there was one that would be easy to add to. I would like there to be a test that purely tests the results of alias queries, like those found in test/Analysis/BasicAA or test/Analysis/ScopedNoAliasAA/basic.ll

May 25 2018, 12:18 PM
t-tye accepted D47370: AMDGPU: Round up kernel argument allocation size.

LGTM

May 25 2018, 12:13 PM
t-tye added a comment to D47370: AMDGPU: Round up kernel argument allocation size.

Are we sure that is what RT(s) do?

It doesn't really matter if it does or not, since we're now requesting the larger allocation

May 25 2018, 12:08 PM
t-tye added inline comments to D47370: AMDGPU: Round up kernel argument allocation size.
May 25 2018, 12:05 PM
t-tye accepted D47378: [AMDGPU][Waitcnt] Remove obsolete waitcnt option.

LGTM Thanks:-)

May 25 2018, 11:41 AM · Restricted Project

May 18 2018

t-tye added inline comments to D46769: [AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume execution..
May 18 2018, 11:49 AM

May 17 2018

t-tye accepted D46472: [HIP] Support offloading by linker script.

LGTM except for minor suggestions.

May 17 2018, 1:17 PM

May 16 2018

t-tye added inline comments to D46992: [AMDGPU] Add perf hints to functions.
May 16 2018, 9:31 PM
t-tye committed rL332485: [AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume….
[AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume…
May 16 2018, 9:23 AM
t-tye closed D46769: [AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume execution..
May 16 2018, 9:23 AM

May 15 2018

t-tye accepted D29911: AMDGPU : Recalculate SGPRs when trap handler is supported.

LGTM

May 15 2018, 3:16 PM

May 11 2018

t-tye created D46769: [AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume execution..
May 11 2018, 12:56 PM

May 8 2018

t-tye added inline comments to D46616: [AMDGPU][Waitcnt] Fix handling of flat instrs.
May 8 2018, 6:40 PM · Restricted Project

Apr 25 2018

t-tye accepted D46067: [AMDGPU][Waitcnt] Take ISA target into account for s_waitcnt expcnt instr generation.

LGTM

Apr 25 2018, 4:07 PM · Restricted Project

Apr 13 2018

t-tye committed rL330081: [AMDGPU] Add gfx902 product names.
[AMDGPU] Add gfx902 product names
Apr 13 2018, 7:01 PM
t-tye closed D45609: [AMDGPU] Add gfx902 product names.
Apr 13 2018, 7:01 PM

Apr 12 2018

t-tye created D45609: [AMDGPU] Add gfx902 product names.
Apr 12 2018, 10:04 PM
t-tye requested changes to D45246: Add AMDPAL Code Conventions section to AMD docs.

@timcorringham ping

Apr 12 2018, 10:02 PM
t-tye committed rL329981: [AMDGPU] Update relocation record description.
[AMDGPU] Update relocation record description
Apr 12 2018, 6:05 PM
t-tye closed D45587: [AMDGPU] Update relocation record description..
Apr 12 2018, 6:05 PM
t-tye added a comment to D45503: [AMDGPU] Ensure there are enough registers for wave dispatch.

I believe the metadata does have the actual register count (i.e. not the granulated, rounded-up value), so the runtime could access the information from there.

Apr 12 2018, 2:40 PM
t-tye created D45587: [AMDGPU] Update relocation record description..
Apr 12 2018, 1:15 PM

Apr 4 2018

t-tye reopened D45246: Add AMDPAL Code Conventions section to AMD docs.
Apr 4 2018, 5:31 PM

Apr 3 2018

t-tye accepted D45129: AMDGPU/Metadata: Always report a fixed number of hidden arguments.

LGTM

Apr 3 2018, 1:11 PM

Mar 27 2018

t-tye committed rL328669: [AMDGPU] Define code object identification string used in AMDHSA runtimes..
[AMDGPU] Define code object identification string used in AMDHSA runtimes.
Mar 27 2018, 2:25 PM
t-tye closed D44718: [AMDGPU] Define code object identification string used in AMDHSA runtimes..
Mar 27 2018, 2:25 PM

Mar 23 2018

t-tye added a comment to D44718: [AMDGPU] Define code object identification string used in AMDHSA runtimes..

LGTM, but I'd rather use '+' instead of ',' for the features.

Mar 23 2018, 1:04 PM
t-tye updated the diff for D44718: [AMDGPU] Define code object identification string used in AMDHSA runtimes..

Change syntax for target features to be concatenated to Processor using a plus prefix for each target feature.
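A minimal sketch of the syntax described here, for illustration only: hyphens delimit the elements, and each target feature is appended to the processor with a plus prefix. The function name, its parameters, and the example element values are assumptions made for this sketch and are not taken from the patch itself.

```python
def build_target_id(elements, processor, features):
    """Join elements with hyphens; append each feature to the processor with '+'.

    elements:  leading elements of the identification string (hypothetical values).
    processor: processor name, the last hyphen-delimited element.
    features:  target feature names, each emitted with a plus prefix.
    """
    processor_with_features = processor + "".join("+" + f for f in features)
    return "-".join(list(elements) + [processor_with_features])


# Example values are assumptions; an empty element produces a double hyphen.
print(build_target_id(["amdgcn", "amd", "amdhsa", ""], "gfx900", ["xnack"]))
```

Because features use a plus prefix rather than a hyphen or comma, splitting the string on hyphens recovers the elements without any ambiguity about where the feature list starts.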

Mar 23 2018, 1:03 PM
t-tye updated the diff for D44718: [AMDGPU] Define code object identification string used in AMDHSA runtimes..

Change target feature element of the target identification string to be comma separated so hyphens only delimit the elements.

Mar 23 2018, 12:14 PM
t-tye committed rL328351: [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU.
[AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU
Mar 23 2018, 12:02 PM
t-tye closed D44697: [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU.
Mar 23 2018, 12:02 PM
t-tye committed rC328350: [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU (CLANG).
[AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU (CLANG)
Mar 23 2018, 11:54 AM
t-tye committed rL328350: [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU (CLANG).
[AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU (CLANG)
Mar 23 2018, 11:54 AM
t-tye closed D44696: [AMDGPU] Update OpenCL to use 48 bytes of implicit arguments for AMDGPU (CLANG).
Mar 23 2018, 11:54 AM
t-tye committed rL328349: [AMDGPU] Remove use of OpenCL triple environment and replace with function….
[AMDGPU] Remove use of OpenCL triple environment and replace with function…
Mar 23 2018, 11:48 AM
t-tye closed D43736: [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU.
Mar 23 2018, 11:48 AM
t-tye committed rL328347: [AMDGPU] Remove use of OpenCL triple environment and replace with function….
[AMDGPU] Remove use of OpenCL triple environment and replace with function…
Mar 23 2018, 11:46 AM