All things performance at AMD: current focus - Vulkan!
- User Since: Dec 6 2013, 2:54 AM (271 w, 3 d)
Mon, Feb 11
Fix comment that should now say exclusive scan instead of inclusive scan.
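For readers unfamiliar with the distinction the comment fix refers to, here is a minimal, self-contained sketch of the two scan flavours. It is not the code from the patch: the function names `inclusiveScan` and `exclusiveScan` are hypothetical, and plain integer addition over a `std::vector` stands in for whatever per-lane operation the real change uses.

```cpp
#include <cstddef>
#include <vector>

// Inclusive scan: element i holds the sum of inputs 0..i (own value included).
std::vector<int> inclusiveScan(const std::vector<int> &in) {
  std::vector<int> out(in.size());
  int running = 0;
  for (std::size_t i = 0; i < in.size(); ++i) {
    running += in[i];
    out[i] = running;
  }
  return out;
}

// Exclusive scan: element i holds the sum of inputs 0..i-1 (own value excluded),
// so the first element is always the identity (0 for addition).
std::vector<int> exclusiveScan(const std::vector<int> &in) {
  std::vector<int> out(in.size());
  int running = 0;
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = running;
    running += in[i];
  }
  return out;
}
```

With the input `{1, 2, 3, 4}`, the inclusive scan yields `{1, 3, 6, 10}` and the exclusive scan yields `{0, 1, 3, 6}`. The exclusive form gives each element the total contributed by everything before it.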
Fri, Feb 8
Make the final readlane be in the WWM section too as per the review comments.
Thu, Feb 7
Wed, Feb 6
Updated to bring in an additional fix to remove read_register exec and replace it with a ballot.
Tue, Feb 5
Tue, Jan 29
Mon, Jan 28
Fixed review comments.
Jan 18 2019
Fix review comments.
Jan 17 2019
Also tag s.getpc as scalar, and add tests in the Analysis folder as suggested.
@arsenm agreed, but I'm not quite sure how to write tests for this. I started to look at how we test alias analysis: there is a cl::opt to force-print the debug info for that pass, and that is how our amdgpu-alias-analysis.ll tests the aliasing rules. How exactly would I test that divergence analysis did something here?
Jan 10 2019
@cwabbott I planned to do a followup once this DPP change had landed to add the missing dpp/codegen patterns to the atomic optimizer - so watch this space!
Jan 9 2019
Fix review comment.
Jan 8 2019
Fix review comments.
Jan 7 2019
@nhaehnle I've removed canReadVGPR as it only had the single callsite - but given @arsenm's comment I did not change getOpRegClass. I also don't feel super comfortable doing the change you suggested to addUsersToMoveToVALUWorklist as I'm worried that there will be non-LLVM-tested paths that I could trip up on with ease. I'd rather ship the commit as is if y'all are ok with it.
Removed canReadVGPR as it had only a single callsite.
Dec 21 2018
Dec 12 2018
Dec 10 2018
Made the L1 flush change happen for MESA3D too (as Nicolai asked for).
Dec 6 2018
Dec 4 2018
Fixed review comments:
Nov 8 2018
Nov 6 2018
We discussed this in an internal AMD meeting on Monday 5th November 2018, and came to the conclusion that even though I do want the scalar load combining to be brought upstream, it would be better as a separate change so that we can get broader testing across the users of our AMDGPU backend.
Nov 5 2018
Fix review comments made by @nhaehnle
Nov 2 2018
Nov 1 2018
Oct 31 2018
Oct 29 2018
Added an additional lit test for the dim variants (including the 2darraymsaa requested by @nhaehnle), and changed the naming of the a16.d16 tests to vNf16.
Oct 26 2018
Oct 19 2018
Maybe a dumb question - but why can't we just use the tbuffer load/store instead of these? It already upcasts for you (the zext/sext is built in depending on the nfmt I believe).
Oct 10 2018
Oct 8 2018
Oct 5 2018
Sep 28 2018
Fix latest review comments.
Move the shuffle widens check into Instructions.h, and call it increasesLength to match the existing changesLength call that was already on ShuffleVectorInst.
Changed the check to not ever push wider shufflevector stuff back onto predecessor instructions as per spatel's suggestion!
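As a rough illustration of the `increasesLength` check mentioned above, the sketch below compares a shuffle's source vector length against its mask length. It is a standalone approximation, not the LLVM implementation: the `ShuffleShape` struct and its field names are hypothetical stand-ins for the information a `ShuffleVectorInst` carries, and the two free functions only mirror the naming of `changesLength` and `increasesLength`.

```cpp
// Hypothetical stand-in for the two lengths a shufflevector exposes.
struct ShuffleShape {
  unsigned numSourceElts; // number of elements in each source vector
  unsigned numMaskElts;   // number of elements in the shuffle mask (the result length)
};

// True when the result length differs from the source length at all.
bool changesLength(const ShuffleShape &s) {
  return s.numSourceElts != s.numMaskElts;
}

// True only when the result is wider than the source, i.e. the shuffle widens.
bool increasesLength(const ShuffleShape &s) {
  return s.numSourceElts < s.numMaskElts;
}
```

Under these assumptions, a shuffle from 2 to 4 elements both changes and increases length, while a shuffle from 4 to 2 elements changes length without increasing it.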
Sep 27 2018
Incorporate a related test case that my approach also fixes from https://bugs.llvm.org/show_bug.cgi?id=28911
Rebased on tip trunk.
Sep 26 2018
- Added a comment explaining each of the 3 variants for each of sdiv/srem/udiv/urem
- Fixed the whitespace issue
- Removed the FMF flags that were a legacy from the original shader this was reduced from
Removed the fdiv/frem as they weren't currently inst simplifying the bad behaviour, and used the utils script to update the test case.
See https://reviews.llvm.org/D52556 for just the test case with the bad output expected.
Just to be clear, you want me to commit the tests in a separate commit with the bad output first?
Sep 25 2018
Sep 24 2018
Removed the no-args overload as per Nicolai's suggestion.
Sep 14 2018
Added an interaction with the DominatorTree, so that if it's present in the PassManager we can preserve + update it.
Sep 13 2018
Sep 12 2018
Update to include tip LLVM changes (DivergenceAnalysis -> LegacyDivergenceAnalysis, AMDGPUAS is now just an enum).
Sep 9 2016
This is something we (Codeplay) would like to see upstream; it is a much cleaner solution than the metadata workarounds everyone (including us) has been using to fix this.
Apr 27 2016
Apr 26 2016
So we build a bunch of internal libraries in a mix of OpenCL and C++, and then link them all together to create SPIR libraries that can be fed to calls to clLinkProgram and linked against user kernels.
Apr 25 2016
Jan 29 2015
Ran clang-format on the file, and there was a TON of line changes outside the patch. Extracted only the formatted lines from my original patch, and updated the patch.