mlir_runner_utils can still be built
Today
Yesterday
Tue, Jan 24
Mon, Jan 23
Thanks for the review!
Thu, Jan 19
Wed, Jan 18
Thu, Jan 12
Oct 17 2022
In D134477#3860460, @RKSimon wrote:
LGTM - cheers
Oct 14 2022
Add back AVX2 check for v8f32. Add TODO.
Oct 13 2022
Remove AVX2 check on v8f32.
Oct 12 2022
Generalize to other 256-bit vector types; Limit change to AVX2 only.
Oct 10 2022
ShuffleVectorSDNode -> auto
Oct 6 2022
Friendly ping for a review of this.
Sep 30 2022
Hi @RKSimon, any thought on the new version? Thanks.
Sep 24 2022
Pre-commit test. Use NumElts instead of hard-coding.
Sep 22 2022
Feb 2 2022
I verified that this patch, together with https://reviews.llvm.org/D116130, fixes the issue https://github.com/llvm/llvm-project/issues/49689. I'm not an expert in this area. Could someone review this patch so we can move forward with the fix?
Jan 21 2022
Thanks for the quick fix.
Sounds good. Thank you.
Jan 20 2022
This is causing a failure in LLVM :: tools/gold/X86/cache.ll on our side. Do you see the same failure?
... ls: cannot access '/home/zhuhan/server-llvm-build/auto-rebase/test/tools/gold/X86/Output/cache.ll.tmp.cache': No such file or directory Expected 1 lines, got 0.
Jan 7 2022
What I meant is, the number of vector instructions decreased after adding -march=native, compared with not adding -march=native, with or without this patch. In terms of this patch, the number increased significantly in our test which didn't add -march=native. See the above table for numbers.
In D114799#3226181, @ABataev wrote:
In D114799#3226154, @zhuhan0 wrote:
Hi, we observed ~9% increase in SLP.NumVectorInstructions on SPEC's 508.namd_r with this change, using llvm-test-suite. I noticed the number is not reported here. Curious did you also see the same result? We tested on an Intel Skylake.
I tried with -march=native on Skylake, but did not see such results.
Jan 6 2022
Hi, we observed ~9% increase in SLP.NumVectorInstructions on SPEC's 508.namd_r with this change, using llvm-test-suite. I noticed the number is not reported here. Curious did you also see the same result? We tested on an Intel Skylake.
Jan 4 2022
I think this change gives a 6% performance win on SPEC CPU 2017 508.namd_r. The run used --size train; the numbers below are running times.
| With this change | Without this change | Diff |
| 52.439 | 55.842 | -6.094% |
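The Diff column above is consistent with a relative change measured against the unpatched baseline. A minimal sketch of that arithmetic follows; the function name `percentChange` is hypothetical and is used here only for illustration.

```cpp
#include <cassert>

// Relative change of the patched running time against the unpatched
// baseline, in percent. A negative result means the patched build ran faster.
// `percentChange` is a hypothetical helper, not part of any test suite.
double percentChange(double withPatch, double withoutPatch) {
  assert(withoutPatch != 0.0 && "baseline must be non-zero");
  return (withPatch - withoutPatch) / withoutPatch * 100.0;
}
```

Calling `percentChange(52.439, 55.842)` gives roughly -6.094, matching the table.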
Dec 2 2021
https://reviews.llvm.org/D107075 landed.
Oct 7 2021
LGTM.
Oct 4 2021
Sorry for the delay. I've been occupied with other projects.
Aug 20 2021
In D108353#2954114, @yurai007 wrote:
@zhuhan0: Thanks for following up on my memmove change! Actually, there is an analogous follow-up, but it didn't catch a lot of attention: https://reviews.llvm.org/D107075.
Maybe you can reuse some parts of that change. I feel you could at least copy most unit tests from there for better coverage.
And last but not least, there is one ongoing small but important fix for https://reviews.llvm.org/D104464, which can be found here: https://reviews.llvm.org/D107964
It would be great if you and @hoy could take a look, since your changes depend on it.
Aug 19 2021
Aug 18 2021
Address comment.
May 13 2021
LGTM.
May 12 2021
May 4 2021
In D97667#2735870, @lebedev.ri wrote:
Alright.
Please ensure that you have added all the necessary tests.
It looks about reasonable to me now, i think you can try re-landing it.
May 3 2021
Apr 30 2021
In D97667#2725626, @dstenb wrote:
With this patch we got the following assertion:
bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed.
in LoopIdiomRecognize::processLoopMemCpy() at the following comparison:
// Check if the load stride matches the store stride.
if (StrIntStride != LoadIntStride && StrIntStride != -LoadIntStride)
  return false;
for a memcpy done between two address spaces with different pointer sizes.
I don't have an upstream reproducer ready for this, but I'll see if I can create one.
Compare LoadStride and StoreStride after sign extension.
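To illustrate why sign extension avoids the bit-width assertion, here is a minimal standalone sketch. It does not use llvm::APInt; `FixedInt`, `signExtend`, and `stridesMatch` are hypothetical names, and the actual patch may differ. The idea is to widen both strides to a common width before comparing them, so strides taken from address spaces with different pointer sizes become comparable.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a fixed-width integer such as llvm::APInt:
// `raw` holds the value in its low `bits` bits.
struct FixedInt {
  unsigned bits;
  std::uint64_t raw;
};

// Sign-extend a FixedInt to 64 bits so that values of different widths
// can be compared directly.
std::int64_t signExtend(FixedInt v) {
  assert(v.bits >= 1 && v.bits <= 64);
  if (v.bits == 64)
    return static_cast<std::int64_t>(v.raw);
  std::uint64_t mask = (std::uint64_t{1} << v.bits) - 1;
  std::uint64_t low = v.raw & mask;
  std::uint64_t signBit = std::uint64_t{1} << (v.bits - 1);
  // (low ^ signBit) - signBit copies the sign bit into the upper bits.
  return static_cast<std::int64_t>((low ^ signBit) - signBit);
}

// True when the store stride equals the load stride or its negation,
// compared at a common 64-bit width.
bool stridesMatch(FixedInt store, FixedInt load) {
  std::int64_t s = signExtend(store);
  std::int64_t l = signExtend(load);
  if (s == l)
    return true;
  // Guard against overflow when negating the minimum value.
  return l != std::numeric_limits<std::int64_t>::min() && s == -l;
}
```

With this shape, a 32-bit stride of -4 and a 64-bit stride of 4 compare as matching instead of tripping a width check.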
In D97667#2724992, @tpopp wrote:
Describing what the code is intended to do (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/xla/tests/reverse_test.cc#L146).
A 4D array has its elements reversed across the 0th and 1st dimensions, so for every value previously indexed at [A,B,C,D] in an array of size [W,X,Y,Z], the new index of the value is [W-A-1, X-B-1, C, D].
The original code indexes into proper locations for the first 2 dimensions, and then copies the subdata, while this change results in a single copy after indexing only in dimension 0, which cannot be done as the data in dimension 1 cannot be copied due to the reversal.
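The index mapping described above can be sketched as follows. This is an illustrative standalone version, assuming a row-major flat layout; `Shape`, `flatIndex`, and `reverseDims01` are hypothetical names and this is not the XLA test code.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Shape = std::array<std::size_t, 4>;

// Row-major flat offset of idx within an array of the given shape.
std::size_t flatIndex(const Shape &shape, const Shape &idx) {
  return ((idx[0] * shape[1] + idx[1]) * shape[2] + idx[2]) * shape[3] + idx[3];
}

// Reverse a [W,X,Y,Z] array across dimensions 0 and 1: the value at
// [a,b,c,d] moves to [W-a-1, X-b-1, c, d].
std::vector<float> reverseDims01(const std::vector<float> &in,
                                 const Shape &shape) {
  const std::size_t W = shape[0], X = shape[1], Y = shape[2], Z = shape[3];
  std::vector<float> out(in.size());
  for (std::size_t a = 0; a < W; ++a)
    for (std::size_t b = 0; b < X; ++b)
      for (std::size_t c = 0; c < Y; ++c)
        for (std::size_t d = 0; d < Z; ++d)
          out[flatIndex(shape, Shape{W - a - 1, X - b - 1, c, d})] =
              in[flatIndex(shape, Shape{a, b, c, d})];
  return out;
}
```

Only the innermost [C, D] sub-block keeps its order, so it is the largest unit that could be moved as one contiguous copy; a single copy per dimension-0 index would leave dimension 1 unreversed.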
Apr 28 2021
Rename variables, fix stride check, add two tests and one more remark.
@tpopp I cannot reproduce your test failure with opt -O2 and -O3. My patch only affects memcpy intrinsics in the loop body. Therefore running your test case shouldn't hit my code. Output of opt -O3:
; ModuleID = 'reverse_4d_float_array.ll'
source_filename = "__compute_module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"
Apr 27 2021
typo
Apr 22 2021
In D97667#2710349, @xbolva00 wrote:
This was already accepted and you fixed build break, I think you can try to reland it.
Rebase.
Address comment.
Apr 21 2021
Address @hoy's comments.
Also split the preparatory change away into an NFC patch https://reviews.llvm.org/D100979.
Fix build break. The breakage was a situation where the memcpy source and destination were of different types/sizes. Abort the transformation if that's the case. Also added a test case memcpy-intrinsic-different-types.ll.
Add back comments.
Mar 30 2021
@lebedev.ri @zino is the replacement account for @zinob. He had some issue with the old account and couldn't retrieve it.
Mar 29 2021
Mar 23 2021
In D97667#2645105, @foad wrote:
Typo in description: "perheader".
Mar 17 2021
Friendly ping. I could ask my colleagues to review this, but would appreciate some community feedback. I didn't find a clear code owner for this pass, so I simply added the top contributors to LoopIdiomRecognize.cpp as reviewers. Please let me know if I should add somebody else.
Mar 7 2021
Add function name to optimization remarks.
Mar 1 2021
Fix linter.
Dec 4 2020
Sorry nvm, this is already fixed.
Hello, this seems to break the build. Error here http://lab.llvm.org:8011/#/builders/16/builds/2910/steps/5/logs/stdio. Line 46 of llvm/tools/llvm-profgen/ProfiledBinary.cpp should now take a reference instead of pointer.
Jun 24 2020
Thanks @jdoerfert! I don't yet have commit access. Could you commit this for me?
Jun 23 2020
In D81176#2106382, @yaxunl wrote:
In D81176#2105944, @zhuhan0 wrote:
This broke a test clang/test/Tooling/clang-check-offload.cpp for a critical Linux distro at Facebook. With this change, the test adds a -include __clang_hip_runtime_wrapper argument. The wrapper includes some standard C++ headers, but our distro doesn't have those headers in the default include paths, thus causing a break.
I notice this behavior doesn't happen for CUDA tests, which also rely on a similar __clang_cuda_runtime_wrapper. I think what's causing the difference is the different handling of nogpuinc/nogpulib option. My knowledge on this area is limited, so correct me if I'm wrong. CUDA seems to respect nogpuinc and doesn't include its wrapper if the flag is provided: https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/Cuda.cpp#L255. But based on this change, HIP does things differently: https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/ToolChains/AMDGPU.cpp#L226.
If I modify RocmInstallationDetector::AddHIPIncludeArgs to also respect nogpuinc/nogpulib, the test will pass for us. Is it a mistake for HIP to always include the wrapper file? Could you provide a fix for this issue? Thanks!
Thanks for investigating the issue. It makes sense to respect nogpuinc and nogpulib. fixed by 2580635bd2f3c0527353e4d7823326cd9f92ff7c
Jun 21 2020
This broke a test clang/test/Tooling/clang-check-offload.cpp for a critical Linux distro at Facebook. With this change, the test adds a -include __clang_hip_runtime_wrapper argument. The wrapper includes some standard C++ headers, but our distro doesn't have those headers in the default include paths, thus causing a break.
May 6 2020
@smeenai Yes please. Thanks!