foad (Jay Foad)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 29 2014, 9:58 AM (437 w, 5 d)

Recent Activity

Today

foad added a comment to D146031: [AMDGPU] Add MMOs for GFX11 Streamout Instructions.

I wonder if you need to update getAliasResult in AMDGPUAliasAnalysis for the new address space.

Mon, Mar 20, 6:39 AM · Restricted Project, Restricted Project

Fri, Mar 17

foad committed rGac62e7b6cb45: [AMDGPU] Remove some unneeded curly braces (authored by foad).
[AMDGPU] Remove some unneeded curly braces
Fri, Mar 17, 11:09 AM · Restricted Project, Restricted Project
foad added inline comments to D146313: [AMDGPU] Simplify SMEM Real instruction definitions. NFC..
Fri, Mar 17, 11:09 AM · Restricted Project, Restricted Project
foad committed rG53076d34e3ea: [AMDGPU] Simplify SMEM Real instruction definitions. NFC. (authored by foad).
[AMDGPU] Simplify SMEM Real instruction definitions. NFC.
Fri, Mar 17, 11:05 AM · Restricted Project, Restricted Project
foad closed D146313: [AMDGPU] Simplify SMEM Real instruction definitions. NFC..
Fri, Mar 17, 11:05 AM · Restricted Project, Restricted Project
foad requested review of D146313: [AMDGPU] Simplify SMEM Real instruction definitions. NFC..
Fri, Mar 17, 11:02 AM · Restricted Project, Restricted Project
foad added inline comments to D146287: [AMDGPU][GISel] Add inverse ballot intrinsic.
Fri, Mar 17, 10:17 AM · Restricted Project, Restricted Project, Restricted Project
foad accepted D146170: InstCombine: Fold is.fpclass for single infinity to fcmp.
Fri, Mar 17, 7:42 AM · Restricted Project, Restricted Project
foad accepted D146142: InstCombine: Fold is.fpclass nan|zero to fcmp ueq 0.
Fri, Mar 17, 7:37 AM · Restricted Project, Restricted Project
foad added inline comments to D146170: InstCombine: Fold is.fpclass for single infinity to fcmp.
Fri, Mar 17, 2:41 AM · Restricted Project, Restricted Project
foad added a comment to D146170: InstCombine: Fold is.fpclass for single infinity to fcmp.

From the description:

llvm.is.fpclass(x, fcPosInf) -> fcmp oeq x, +inf
llvm.is.fpclass(x, fcNegInf) -> fcmp oeq x, -inf
llvm.is.fpclass(x, ~fcPosInf) -> fcmp one x, +inf
llvm.is.fpclass(x, ~fcNegInf) -> fcmp one x, -inf

The last two should be "une".
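The reason for "une" can be checked with a small model. The sketch below is not LLVM code: fcmpONE, fcmpUNE, isAnythingButPosInf and matchesNotPosInf are illustrative names, and the predicates are modelled with ordinary C++ comparisons. It assumes the usual fcmp rule that ordered predicates are false on NaN and unordered ones are true.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Plain C++ models of two LLVM fcmp predicates (illustrative, not LLVM APIs).
// Ordered predicates are false when either operand is NaN; unordered are true.
inline bool fcmpONE(double A, double B) {
  return !std::isnan(A) && !std::isnan(B) && A != B;
}
inline bool fcmpUNE(double A, double B) {
  return std::isnan(A) || std::isnan(B) || A != B;
}

// Reference model of llvm.is.fpclass(x, ~fcPosInf): every class except +inf,
// which includes NaN.
inline bool isAnythingButPosInf(double X) {
  return !(std::isinf(X) && X > 0);
}

// Returns true if Pred(x, +inf) agrees with the reference model on a few
// sample inputs, including NaN.
inline bool matchesNotPosInf(bool (*Pred)(double, double)) {
  const double Inf = std::numeric_limits<double>::infinity();
  const double Samples[] = {0.0, -1.5, Inf, -Inf,
                            std::numeric_limits<double>::quiet_NaN()};
  for (double X : Samples)
    if (Pred(X, Inf) != isAnythingButPosInf(X))
      return false;
  return true;
}
```

With these definitions, matchesNotPosInf(fcmpUNE) holds while matchesNotPosInf(fcmpONE) does not, because the two predicates differ only on the NaN sample.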

Fri, Mar 17, 2:39 AM · Restricted Project, Restricted Project
foad added inline comments to D146142: InstCombine: Fold is.fpclass nan|zero to fcmp ueq 0.
Fri, Mar 17, 2:38 AM · Restricted Project, Restricted Project
foad added a comment to D146136: [llvm][CycleInfo] Quick look-up for block in cycle..

Patch seems fine to me, but I wonder if we need to get more eyes on your SetVector changes. I don't think there is a code owner for ADT. @dblaikie?

One good thing is that this does not change the way SetVector is currently used. If all the user wants is a vector and a set of the same type, then nothing changed for them. But it does allow a new use-case by giving enough rope. It's difficult to specify static constraints. Can a user have a vector of float and a set of float? Sure they can, as long as they understand the lossy hashing produced by silently converting the float to int when inserting in the set!
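The hazard with mismatched vector and set element types can be sketched with a toy container. This is not llvm::SetVector and does not use its interface. ToySetVector is a hypothetical stand-in that keeps floats in the vector but records membership as int.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-in for a set-vector whose vector and set element types differ.
// NOT llvm::SetVector; it only mimics the "insert if not already seen" rule.
struct ToySetVector {
  std::vector<float> Vec; // insertion-ordered elements
  std::set<int> Seen;     // membership test, keyed on the converted value

  // Returns true if X was inserted, false if it was treated as a duplicate.
  bool insert(float X) {
    // float -> int conversion truncates toward zero, so 1.2f and 1.9f both
    // map to the key 1 and the second insert is silently dropped.
    if (!Seen.insert(static_cast<int>(X)).second)
      return false;
    Vec.push_back(X);
    return true;
  }
};
```

Inserting 1.2f and then 1.9f leaves only 1.2f in Vec, which is the kind of silent loss the comment warns about.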

Fri, Mar 17, 2:29 AM · Restricted Project, Restricted Project
foad updated subscribers of D146136: [llvm][CycleInfo] Quick look-up for block in cycle..

Patch seems fine to me, but I wonder if we need to get more eyes on your SetVector changes. I don't think there is a code owner for ADT. @dblaikie?

Fri, Mar 17, 1:14 AM · Restricted Project, Restricted Project

Thu, Mar 16

foad added inline comments to D146136: [llvm][CycleInfo] Quick look-up for block in cycle..
Thu, Mar 16, 9:12 AM · Restricted Project, Restricted Project
foad added inline comments to D146031: [AMDGPU] Add MMOs for GFX11 Streamout Instructions.
Thu, Mar 16, 3:48 AM · Restricted Project, Restricted Project

Wed, Mar 15

foad added inline comments to D146136: [llvm][CycleInfo] Quick look-up for block in cycle..
Wed, Mar 15, 6:50 AM · Restricted Project, Restricted Project
foad accepted D145990: [AMDGPU] Update mul.ll with auto-generated checks.

Thanks!

Wed, Mar 15, 3:47 AM · Restricted Project, Restricted Project
foad added inline comments to D146031: [AMDGPU] Add MMOs for GFX11 Streamout Instructions.
Wed, Mar 15, 3:44 AM · Restricted Project, Restricted Project
foad accepted D146018: [AMDGPU] Use UniformityAnalysis in AtomicOptimizer.
Wed, Mar 15, 12:09 AM · Restricted Project, Restricted Project

Tue, Mar 14

foad added inline comments to D146031: [AMDGPU] Add MMOs for GFX11 Streamout Instructions.
Tue, Mar 14, 8:18 AM · Restricted Project, Restricted Project
foad added inline comments to D146031: [AMDGPU] Add MMOs for GFX11 Streamout Instructions.
Tue, Mar 14, 8:00 AM · Restricted Project, Restricted Project
foad accepted D144729: [AMDGPU] Select v_sat_pk_u8_i16.

LGTM.

Tue, Mar 14, 6:38 AM · Restricted Project, Restricted Project
foad updated subscribers of D146031: [AMDGPU] Add MMOs for GFX11 Streamout Instructions.
Tue, Mar 14, 4:49 AM · Restricted Project, Restricted Project
foad added inline comments to D146018: [AMDGPU] Use UniformityAnalysis in AtomicOptimizer.
Tue, Mar 14, 4:31 AM · Restricted Project, Restricted Project
foad added a comment to D146032: [DAG] Fold (bitcast (logicop (bitcast x), (c))) -> (logicop x, (bitcast c)) iff the current logicop type is illegal.

Fold (bitcast (logicop (bitcast x), (c))) -> (and x, (bitcast c)) iff the current logicop type is illegal

Tue, Mar 14, 4:23 AM · Restricted Project, Restricted Project
foad added a comment to D145990: [AMDGPU] Update mul.ll with auto-generated checks.

Looks good with nits.

Tue, Mar 14, 2:57 AM · Restricted Project, Restricted Project
foad added a comment to D141355: [AMDGPUUnifyDivergentExitNodes] Add NewPM support.

The commit message should make it clear that you're also changing it to use UniformityAnalysis instead of (Legacy?)DivergenceAnalysis.

Tue, Mar 14, 2:53 AM · Restricted Project, Restricted Project
foad added a comment to D145918: [DAG/AMDGPU] Use UniformityAnalysis in DAGISel.

LGTM but I'd like @sameerds to have the final say.

Tue, Mar 14, 2:38 AM · Restricted Project, Restricted Project
foad added a comment to D146018: [AMDGPU] Use UniformityAnalysis in AtomicOptimizer.

Looks good overall.

Tue, Mar 14, 2:34 AM · Restricted Project, Restricted Project
foad added inline comments to D145918: [DAG/AMDGPU] Use UniformityAnalysis in DAGISel.
Tue, Mar 14, 2:17 AM · Restricted Project, Restricted Project
foad committed rGdc3882eace54: [AMDGPU] Fix .amdhsa_shared_vgpr_count error checking for GFX11 (authored by foad).
[AMDGPU] Fix .amdhsa_shared_vgpr_count error checking for GFX11
Tue, Mar 14, 2:06 AM · Restricted Project, Restricted Project
foad closed D145936: [AMDGPU] Fix .amdhsa_shared_vgpr_count error checking for GFX11.
Tue, Mar 14, 2:05 AM · Restricted Project, Restricted Project

Mon, Mar 13

foad requested review of D145936: [AMDGPU] Fix .amdhsa_shared_vgpr_count error checking for GFX11.
Mon, Mar 13, 7:07 AM · Restricted Project, Restricted Project
foad committed rG23b0df72d272: [AMDGPU] Remove BoolToList class (authored by foad).
[AMDGPU] Remove BoolToList class
Mon, Mar 13, 2:23 AM · Restricted Project, Restricted Project

Fri, Mar 10

foad committed rG5fc5c7ebe251: [AMDGPU] Make use of defvar in defining SMEM Real instructions (authored by foad).
[AMDGPU] Make use of defvar in defining SMEM Real instructions
Fri, Mar 10, 6:34 AM · Restricted Project, Restricted Project
foad accepted D145785: [AMDGPU][MachineVerifier] Fix vdata reg count for MIMG d16.
Fri, Mar 10, 5:32 AM · Restricted Project, Restricted Project
foad accepted D145711: [llvm-tblgen] Support conditional definitions using !casts clauses.

I'm not competent to review the implementation but it seems to work well for me. This will let us replace lots of instances of foreach _ = BoolToList<cond>.ret in ... with if cond then ... in AMDGPU backend tablegen files.

Fri, Mar 10, 4:35 AM · Restricted Project, Restricted Project
foad accepted D145688: [StructurizeCFG] Use UniformityAnalysis instead of DivergenceAnalysis.
Fri, Mar 10, 2:06 AM · Restricted Project, Restricted Project

Thu, Mar 9

foad added reviewers for D145711: [llvm-tblgen] Support conditional definitions using !casts clauses: Joe_Nash, Paul-C-Anagnostopoulos.
Thu, Mar 9, 12:54 PM · Restricted Project, Restricted Project
foad updated subscribers of D145711: [llvm-tblgen] Support conditional definitions using !casts clauses.

but I think this also ends up fixing the original reported issue (https://github.com/llvm/llvm-project/issues/49830) as well

Thu, Mar 9, 12:54 PM · Restricted Project, Restricted Project
foad updated subscribers of D145688: [StructurizeCFG] Use UniformityAnalysis instead of DivergenceAnalysis.
Thu, Mar 9, 6:06 AM · Restricted Project, Restricted Project

Wed, Mar 8

foad accepted D144955: Fix SGPR + offset Scratch offset folding.
Wed, Mar 8, 9:06 AM · Restricted Project, Restricted Project
foad accepted D144957: Fix SGPR + VGPR + offset Scratch offset folding.
Wed, Mar 8, 8:58 AM · Restricted Project, Restricted Project
foad added inline comments to D144955: Fix SGPR + offset Scratch offset folding.
Wed, Mar 8, 8:51 AM · Restricted Project, Restricted Project
foad added inline comments to D144957: Fix SGPR + VGPR + offset Scratch offset folding.
Wed, Mar 8, 6:47 AM · Restricted Project, Restricted Project
foad accepted D144956: Fix VGPR + offset Scratch offset folding.

LGTM, thanks.

Wed, Mar 8, 6:26 AM · Restricted Project, Restricted Project

Tue, Mar 7

foad added inline comments to D144956: Fix VGPR + offset Scratch offset folding.
Tue, Mar 7, 8:46 AM · Restricted Project, Restricted Project
foad added inline comments to D145108: [RISCV][llvm-tblgen] Support conditional definitions using !exists clauses.
Tue, Mar 7, 5:08 AM · Restricted Project, Restricted Project
foad added a comment to D145108: [RISCV][llvm-tblgen] Support conditional definitions using !exists clauses.

Hi, this patch makes tablegen crash instead of error on cases like this:

if !cast<A>("").x then
  def x;

(Based on the test case for https://github.com/llvm/llvm-project/issues/49830)

Tue, Mar 7, 5:08 AM · Restricted Project, Restricted Project
foad committed rG0265dd992580: Fix "compatiable" typos (authored by foad).
Fix "compatiable" typos
Tue, Mar 7, 4:58 AM · Restricted Project, Restricted Project
foad accepted D144711: [DAGCombiner] don't reuse the pointer info for merged store.

LGTM.

Tue, Mar 7, 4:20 AM · Restricted Project, Restricted Project
foad added inline comments to D144711: [DAGCombiner] don't reuse the pointer info for merged store.
Tue, Mar 7, 2:41 AM · Restricted Project, Restricted Project
foad accepted D145045: [AMDGPU] Extend WorkGroupID* codegen for compute shaders.

Looks OK to me, thanks.

Tue, Mar 7, 2:36 AM · Restricted Project, Restricted Project
foad added a comment to D145441: [AMDGPU] Define data layout entries for buffers.

Just my 2p: it feels a bit premature to commit patches for this. It feels more like something you could prototype on a branch somewhere and come back when you have more experience with how it all works out in practice.

Tue, Mar 7, 1:51 AM · Restricted Project, Restricted Project, Restricted Project
foad added inline comments to D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting.
Tue, Mar 7, 1:33 AM · Restricted Project, Restricted Project, Restricted Project

Mon, Mar 6

foad committed rG5281f5c1e6da: [AMDGPU] Add GFX9,GFX10,GFX11 checks for llvm.amdgcn.s.buffer.load (authored by foad).
[AMDGPU] Add GFX9,GFX10,GFX11 checks for llvm.amdgcn.s.buffer.load
Mon, Mar 6, 10:23 AM · Restricted Project, Restricted Project
foad committed rGe73d3150b18a: [AMDGPU] Generate checks for llvm.amdgcn.s.buffer.load (authored by foad).
[AMDGPU] Generate checks for llvm.amdgcn.s.buffer.load
Mon, Mar 6, 10:23 AM · Restricted Project, Restricted Project
foad updated the diff for D145396: [AMDGPU] Fix AddedComplexity for s_buffer_load patterns. NFCI..

Update

Mon, Mar 6, 9:33 AM · Restricted Project, Restricted Project
foad requested review of D145396: [AMDGPU] Fix AddedComplexity for s_buffer_load patterns. NFCI..
Mon, Mar 6, 8:56 AM · Restricted Project, Restricted Project
foad added a comment to D145373: Fix SafeIntIterator reference type.

Thanks for the fix! Unfortunately I am not competent to review it.

Mon, Mar 6, 6:27 AM · Restricted Project, Restricted Project
foad committed rG271010bf50a5: [AMDGPU] Restore temporal divergence in test (authored by foad).
[AMDGPU] Restore temporal divergence in test
Mon, Mar 6, 4:12 AM · Restricted Project, Restricted Project
foad accepted D145366: [AMDGPU] Use UniformityAnalysis in LateCodeGenPrepare.
Mon, Mar 6, 4:00 AM · Restricted Project, Restricted Project
foad accepted D145359: [AMDGPU] Use UniformityAnalysis in RewriteUndefsForPHI.

LGTM, thanks!

Mon, Mar 6, 12:28 AM · Restricted Project, Restricted Project
foad added inline comments to D145358: [AMDGPU] Use UniformityAnalysis in CodeGenPrepare.
Mon, Mar 6, 12:27 AM · Restricted Project, Restricted Project

Sun, Mar 5

foad committed rG7ba61eaf34c3: [AMDGPU] More precise limit on SALU cycles in s_delay_alu instructions (authored by foad).
[AMDGPU] More precise limit on SALU cycles in s_delay_alu instructions
Sun, Mar 5, 12:17 AM · Restricted Project, Restricted Project

Fri, Mar 3

foad committed rG7442f8635b4d: [AMDGPU] Fix invalid instid value in s_delay_alu instruction (authored by foad).
[AMDGPU] Fix invalid instid value in s_delay_alu instruction
Fri, Mar 3, 1:12 PM · Restricted Project, Restricted Project
foad closed D145232: [AMDGPU] Fix invalid instid value in s_delay_alu instruction.
Fri, Mar 3, 1:11 PM · Restricted Project, Restricted Project
foad requested review of D145232: [AMDGPU] Fix invalid instid value in s_delay_alu instruction.
Fri, Mar 3, 4:59 AM · Restricted Project, Restricted Project
foad added inline comments to D31353: AMDGPU: Diagnose illegal SGPR to VGPR copies.
Fri, Mar 3, 3:49 AM · Restricted Project
foad committed rG08bdff862ca6: [AMDGPU] Fix error message for illegal copy (authored by foad).
[AMDGPU] Fix error message for illegal copy
Fri, Mar 3, 3:48 AM · Restricted Project, Restricted Project
foad committed rGf5ab447cf6dc: [AMDGPU] Add test case for AMDGPUInsertDelayAlu bug (authored by foad).
[AMDGPU] Add test case for AMDGPUInsertDelayAlu bug
Fri, Mar 3, 3:09 AM · Restricted Project, Restricted Project
foad accepted D145216: [llvm][Uniformity] correctly use a vector as a set by uniqifying elements.

Works for me, thanks! I have a larger test case that goes from about 50 s to less than 1 ms with this patch.

Fri, Mar 3, 1:42 AM · Restricted Project, Restricted Project

Thu, Mar 2

foad added a comment to D144420: [STLExtras] Allow for non-member `begin`/`end` in `append_range`.

Thanks!

Thu, Mar 2, 8:36 AM · Restricted Project, Restricted Project
foad added inline comments to D113816: [llvm-reduce] Assert that the number of chunks does not change with reductions.
Thu, Mar 2, 7:39 AM · Restricted Project, Restricted Project
Herald added a project to D113816: [llvm-reduce] Assert that the number of chunks does not change with reductions: Restricted Project.
Thu, Mar 2, 7:35 AM · Restricted Project, Restricted Project
foad added a comment to D145159: [AMDGPU] Match med3 for (max (min ..)).

Is there a reason this is implemented in C++ instead of instruction selection patterns?

Thu, Mar 2, 7:02 AM · Restricted Project, Restricted Project
foad added inline comments to D145159: [AMDGPU] Match med3 for (max (min ..)).
Thu, Mar 2, 7:00 AM · Restricted Project, Restricted Project
foad added a comment to D144420: [STLExtras] Allow for non-member `begin`/`end` in `append_range`.

Hi, this patch is provoking a build failure in an LLVM_ENABLE_EXPENSIVE_CHECKS build on my Ubuntu 22.04.2 machine using Clang 14 as the host compiler:

[189/189] Building CXX object unittests/ADT/CMakeFiles/ADTTests.dir/STLExtrasTest.cpp.o
FAILED: unittests/ADT/CMakeFiles/ADTTests.dir/STLExtrasTest.cpp.o 
/usr/lib/ccache/clang++ -DEXPENSIVE_CHECKS -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GLIBCXX_DEBUG -D_GNU_SOURCE -D_LIBCPP_ENABLE_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/jayfoad2/llvm-expensive/unittests/ADT -I/home/jayfoad2/git/llvm-project/llvm/unittests/ADT -I/home/jayfoad2/llvm-expensive/include -I/home/jayfoad2/git/llvm-project/llvm/include -I/home/jayfoad2/git/llvm-project/third-party/unittest/googletest/include -I/home/jayfoad2/git/llvm-project/third-party/unittest/googlemock/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g  -Wno-variadic-macros -Wno-gnu-zero-variadic-macro-arguments -fno-exceptions -fno-rtti -Wno-suggest-override -std=c++17 -MD -MT unittests/ADT/CMakeFiles/ADTTests.dir/STLExtrasTest.cpp.o -MF unittests/ADT/CMakeFiles/ADTTests.dir/STLExtrasTest.cpp.o.d -o unittests/ADT/CMakeFiles/ADTTests.dir/STLExtrasTest.cpp.o -c /home/jayfoad2/git/llvm-project/llvm/unittests/ADT/STLExtrasTest.cpp
In file included from /home/jayfoad2/git/llvm-project/llvm/unittests/ADT/STLExtrasTest.cpp:9:
In file included from /home/jayfoad2/git/llvm-project/llvm/include/llvm/ADT/STLExtras.h:20:
In file included from /home/jayfoad2/git/llvm-project/llvm/include/llvm/ADT/Hashing.h:51:
In file included from /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/algorithm:60:
In file included from /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_algobase.h:69:
In file included from /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/debug/debug.h:133:
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/debug/functions.h:110:44: error: no matching function for call to '__addressof'
      return __foreign_iterator_aux4(__it, std::__addressof(*__other));
                                           ^~~~~~~~~~~~~~~~
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/debug/functions.h:164:14: note: in instantiation of function template specialization '__gnu_debug::__foreign_iterator_aux3<__gnu_cxx::__normal_iterator<const int *, std::__cxx1998::vector<int>>, std::vector<int>, std::random_access_iterator_tag, llvm::detail::SafeIntIterator<int, false>>' requested here
      return __foreign_iterator_aux3(__it, __other, __other_end, __tag());
             ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/debug/functions.h:186:5: note: in instantiation of function template specialization '__gnu_debug::__foreign_iterator_aux2<__gnu_cxx::__normal_iterator<const int *, std::__cxx1998::vector<int>>, std::vector<int>, std::random_access_iterator_tag, llvm::detail::SafeIntIterator<int, false>>' requested here
        || __foreign_iterator_aux2(__it, std::__miter_base(__other),
           ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/debug/functions.h:198:14: note: in instantiation of function template specialization '__gnu_debug::__foreign_iterator_aux<__gnu_cxx::__normal_iterator<const int *, std::__cxx1998::vector<int>>, std::vector<int>, std::random_access_iterator_tag, llvm::detail::SafeIntIterator<int, false>>' requested here
      return __foreign_iterator_aux(__it, __other, __other_end, _Integral());
             ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/debug/vector:626:4: note: in instantiation of function template specialization '__gnu_debug::__foreign_iterator<__gnu_cxx::__normal_iterator<const int *, std::__cxx1998::vector<int>>, std::vector<int>, std::random_access_iterator_tag, llvm::detail::SafeIntIterator<int, false>>' requested here
          __glibcxx_check_insert_range(__position, __first, __last, __dist);
          ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/debug/macros.h:183:36: note: expanded from macro '__glibcxx_check_insert_range'
_GLIBCXX_DEBUG_VERIFY(__gnu_debug::__foreign_iterator(_Position,_First,_Last),\
                                   ^
/home/jayfoad2/git/llvm-project/llvm/include/llvm/ADT/STLExtras.h:2017:5: note: in instantiation of function template specialization 'std::vector<int>::insert<llvm::detail::SafeIntIterator<int, false>, void>' requested here
  C.insert(C.end(), adl_begin(R), adl_end(R));
    ^
/home/jayfoad2/git/llvm-project/llvm/unittests/ADT/STLExtrasTest.cpp:367:3: note: in instantiation of function template specialization 'llvm::append_range<std::vector<int>, llvm::iota_range<int>>' requested here
  append_range(V, llvm::seq(6, 8));
  ^
/usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/move.h:49:5: note: candidate function [with _Tp = int] not viable: expects an lvalue for 1st argument
    __addressof(_Tp& __r) _GLIBCXX_NOEXCEPT
    ^
1 error generated.
ninja: build stopped: subcommand failed.

Do you have any idea what is wrong here? Thanks!
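One reading of the log, offered as a sketch rather than a diagnosis: the final note says __addressof "expects an lvalue for 1st argument" with _Tp = int, which suggests that dereferencing the iterator yields a temporary int rather than a reference. The example below uses a hypothetical ByValueIter (not llvm::detail::SafeIntIterator) and an illustrative DerefIsLvalue trait to show the difference at compile time.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical iterator-like type whose operator* returns by value, so the
// result of *it is a temporary and its address cannot be taken.
struct ByValueIter {
  int Value;
  int operator*() const { return Value; }
};

// True when dereferencing Iter yields an lvalue reference, which is what an
// expression of the form std::addressof(*it) needs in order to compile.
template <typename Iter>
constexpr bool DerefIsLvalue =
    std::is_lvalue_reference<decltype(*std::declval<Iter &>())>::value;

static_assert(DerefIsLvalue<std::vector<int>::const_iterator>,
              "vector iterators dereference to a reference");
static_assert(!DerefIsLvalue<ByValueIter>,
              "a by-value operator* yields a temporary");
```

Under that reading, a debug-mode check that takes the address of the dereferenced iterator compiles for the vector iterator but not for one that returns by value.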

Thu, Mar 2, 5:34 AM · Restricted Project, Restricted Project
foad added inline comments to D131587: [CodeGen] Deduplicate restore blocks in branch relaxation.
Thu, Mar 2, 3:21 AM · Restricted Project, Restricted Project
foad accepted D145081: AMDGPU: Add more flat scratch load and store tests for 8 and 16-bit types.

Looks good, thanks!

Thu, Mar 2, 3:18 AM · Restricted Project, Restricted Project
Herald added a project to D31353: AMDGPU: Diagnose illegal SGPR to VGPR copies: Restricted Project.
Thu, Mar 2, 1:07 AM · Restricted Project

Wed, Mar 1

foad accepted D144070: [llvm][GenericUniformity] Prevent assert while calculating temporal divergence.

LGTM.

Wed, Mar 1, 5:53 AM · Restricted Project, Restricted Project
foad accepted D144562: [NFC] Refine tests by adding `:` to checks.

LGTM.

Wed, Mar 1, 3:33 AM · Restricted Project, Restricted Project
foad accepted D145018: [AMDGPUUnifyDivergentExitNodes] Use Uniformity Analysis.

LGTM.

Wed, Mar 1, 12:18 AM · Restricted Project, Restricted Project, Restricted Project
foad accepted D145013: [SIAnnotateControlFlow] Use Uniformity analysis.

LGTM, thanks.

Wed, Mar 1, 12:16 AM · Restricted Project, Restricted Project, Restricted Project

Tue, Feb 28

foad added inline comments to D144956: Fix VGPR + offset Scratch offset folding.
Tue, Feb 28, 9:23 AM · Restricted Project, Restricted Project
foad added inline comments to D144956: Fix VGPR + offset Scratch offset folding.
Tue, Feb 28, 9:19 AM · Restricted Project, Restricted Project
foad accepted D144955: Fix SGPR + offset Scratch offset folding.
Tue, Feb 28, 9:12 AM · Restricted Project, Restricted Project
foad added inline comments to D144162: [AMDGPU] Replace LegacyDA with Uniformity Analysis in AnnotateUniformValues.
Tue, Feb 28, 9:04 AM · Restricted Project, Restricted Project, Restricted Project
foad accepted D144954: [AMDGPU][AsmParser][NFC] Simplify parsing cache policies..

Seems obvious

Tue, Feb 28, 5:46 AM · Restricted Project, Restricted Project
foad accepted D144902: [AMDGPU][AsmParser] Distinguish literal and modifier SMEM offsets..

I don't quite understand the bug here - can you explain how "s_load_dword s1, s[2:3], s0 0x1" would have been mishandled before your patch?

The 0x1 bit is currently indistinguishable from offset:0x1 and therefore accepted as a valid operand.

Tue, Feb 28, 4:19 AM · Restricted Project, Restricted Project
foad added a comment to D143731: [AMDGPU] Break-up large PHIs for DAGISel.

This is because it introduces a need to have a build_vector before copying the PHI value, and that build_vector may have many undef elements. This can cause very high register pressure and abnormal stack usage in some cases.

Do you have any insight into *why* undef elements in build_vector would cause high register pressure? Is there any chance of fixing this elsewhere, by teaching other parts of the compiler to handle undef elements better?

In the particular case I was looking at, we had a 32xi32 vector (from 11xF64 legalized) with 10 undef elements. As the final operation of the BB was a BUILD_VECTOR + CopyToReg, it forced 22 values to be alive until that point and did a wasteful 32x32 copy for this 11xf64 vector. With this change, the undef values can be optimized out and the intermediate values (elements) no longer need to be all alive at the same point. This means they can be reordered and scratch usage goes back down to zero.

Before this patch, I had looked at optimizing this at the DAG level but it's too much effort to do there. As Matt put it earlier, PHIs/Copies larger than 32 bits are "unnatural" on AMDGPU anyway so there's no real benefit in keeping large PHIs around.
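The lane counts quoted above (22 live values, a 32-element vector, 10 undef elements) can be reproduced with a little arithmetic. This sketch assumes the vector is split into 32-bit lanes and then widened to the next power of two. That rule is an assumption made for illustration, not a statement of what the legalizer does, and LaneUsage, nextPowerOfTwo and widenedLanes are hypothetical names.

```cpp
#include <cassert>

// Lane counts after splitting into 32-bit lanes and widening.
struct LaneUsage {
  unsigned Live;  // lanes that carry real data
  unsigned Total; // lanes in the widened vector
  unsigned Undef; // padding lanes
};

// Smallest power of two that is >= N (for N >= 1).
constexpr unsigned nextPowerOfTwo(unsigned N) {
  unsigned P = 1;
  while (P < N)
    P <<= 1;
  return P;
}

// Assumed rule: each element becomes EltBits/32 lanes, then the lane count
// is rounded up to a power of two.
constexpr LaneUsage widenedLanes(unsigned NumElts, unsigned EltBits) {
  unsigned Live = NumElts * (EltBits / 32);
  unsigned Total = nextPowerOfTwo(Live);
  return LaneUsage{Live, Total, Total - Live};
}
```

For widenedLanes(11, 64) this gives 22 live lanes in a 32-lane vector with 10 undef lanes, matching the figures in the comment.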

Tue, Feb 28, 2:52 AM · Restricted Project, Restricted Project
foad added inline comments to D144162: [AMDGPU] Replace LegacyDA with Uniformity Analysis in AnnotateUniformValues.
Tue, Feb 28, 2:33 AM · Restricted Project, Restricted Project, Restricted Project
foad added a comment to D144699: [llvm][Uniformity] provide overloads for Instruction* and Value*.

I'm fine with the patch but I don't particularly like the fact that DA has an isDivergent(Instruction *) in the first place, because it's really unclear what it means - for non-void instructions I guess it's the same as isDivergent(Value *), and for conditional branches it means something specific, and for other instructions...? I'd much prefer to have a more specific isDivergentBranch(BranchInst *).

Tue, Feb 28, 2:27 AM · Restricted Project, Restricted Project
foad added inline comments to D131587: [CodeGen] Deduplicate restore blocks in branch relaxation.
Tue, Feb 28, 2:23 AM · Restricted Project, Restricted Project
foad added inline comments to D144711: [DAGCombiner] don't reuse the pointer info for merged store.
Tue, Feb 28, 2:21 AM · Restricted Project, Restricted Project
foad added a comment to D144902: [AMDGPU][AsmParser] Distinguish literal and modifier SMEM offsets..

I don't quite understand the bug here - can you explain how "s_load_dword s1, s[2:3], s0 0x1" would have been mishandled before your patch?

Tue, Feb 28, 2:17 AM · Restricted Project, Restricted Project
foad added a comment to D143731: [AMDGPU] Break-up large PHIs for DAGISel.

This is because it introduces a need to have a build_vector before copying the PHI value, and that build_vector may have many undef elements. This can cause very high register pressure and abnormal stack usage in some cases.

Tue, Feb 28, 2:12 AM · Restricted Project, Restricted Project
foad added inline comments to D144687: [GlobalISel][NFC] Add MachineInstr::getFirst[N]{Regs,LLTs}() helpers to extract regs & types..
Tue, Feb 28, 2:10 AM · Restricted Project, Restricted Project