Thu, Feb 13
- Rename and combine some cases together.
- Add some more cases.
Wed, Feb 12
Fix build failure
Correct me if I'm wrong: rint(x) returns x if x is a NaN. However, instructions like XSRDPIC may turn an SNaN into a QNaN. Does it matter?
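To make the question concrete, here is a minimal sketch of how the NaN behavior could be inspected from C++. It assumes IEEE-754 binary64 with the common encoding where the top mantissa bit (bit 51) marks a quiet NaN; the helper names are hypothetical and not part of any patch. It only shows that rint() keeps a NaN a NaN and how quietness can be read from the bit pattern, not what XSRDPIC does.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: true if X is a NaN whose quiet bit (bit 51) is set.
// Assumes IEEE-754 binary64 with the "quiet bit = 1" convention.
bool isQuietNaN(double X) {
  std::uint64_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits));
  return std::isnan(X) && (Bits & (1ULL << 51)) != 0;
}

// Hypothetical helper: true if rint() maps a quiet NaN input to a NaN result.
bool rintKeepsNaN() {
  double QNaN = std::numeric_limits<double>::quiet_NaN();
  return std::isnan(std::rint(QNaN));
}
```

Whether a signaling input comes back quieted is exactly the open question above, so the sketch deliberately makes no claim about it.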
Sun, Feb 9
Thu, Jan 30
Sun, Jan 26
Jan 16 2020
- Wrap memory modified check into a single method (which can help other methods)
Jan 14 2020
Rebase and fix test errors
Fix a test error
- Use dedicated method to check ModRefInfo.
- Add tests about calls.
Jan 11 2020
Jan 10 2020
Jan 8 2020
The merge check bot seems to have trouble resolving parent-child revisions when some of them are already committed. Currently, the patches apply to master without problems and the tests pass.
Address some comments.
- Change some use of auto.
- Update test cases with comments.
- Use getModRefInfo.
Jan 6 2020
This patch is too large to review and 'remaining parts' is confusing. It's better to split this one into several patches (round, extend/trunc, sqrt, fma, etc.) and push them into a review stack.
Rebase and fix conflicts.
Jan 4 2020
Jan 2 2020
Use explicit template instantiation approach.
Can you add more info about compilation errors and compiler version?
bad luck, got a build failure.
Link to the report: http://lab [.] llvm [.] org:8011/builders/clang-ppc64be-linux-lnt/builds/33875
Build Reason: scheduler
Build Source Stamp: [branch master] f9f78cf6ac73d9148be9b626f418bf6770e512f6
Blamelist: Lorenzo Casalino <email@example.com>
BUILD FAILED: failed test-suite
Jan 1 2020
This causes a build failure (using GCC 8.2.1), complaining about an undefined reference to releaseNode, since this patch splits the declaration and definition of the template method.
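A minimal sketch of why splitting a template's declaration from its definition can produce an undefined reference, and how explicit instantiation avoids it. The struct name, the bool template parameter, and the signature of releaseNode are all hypothetical here; only the method name comes from the report.

```cpp
// --- would live in a header (hypothetical) ---
struct Strategy {
  // Declaration only: other translation units see no body to instantiate.
  template <bool IsTop> int releaseNode(int Cycle);
};

// --- would live in a single source file (hypothetical) ---
template <bool IsTop> int Strategy::releaseNode(int Cycle) {
  // Placeholder body, purely for illustration.
  return IsTop ? Cycle + 1 : Cycle - 1;
}

// Explicit instantiation definitions: these force the compiler to emit
// code for both specializations in this translation unit, so callers
// that only saw the declaration can still link.
template int Strategy::releaseNode<true>(int);
template int Strategy::releaseNode<false>(int);
```

Without the last two lines, a caller in another file would compile against the declaration but fail at link time, which matches the kind of error reported here.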
Dec 30 2019
I checked all references to isPredicable in codebase:
- Instr doc generation uses the bit (this is non-functional)
- CodeGen tablegen itself uses it
- Both TargetInstrInfo::isUnpredicatedTerminator and TargetInstrInfo::PredicateInstruction reference the method, but PPC overrides them and doesn't use it.
- MachineInstr::findFirstPredOperandIdx references it, but only ARM and AMDGPU invoke it
- Implicit null check and machine sink reference it. However, this patch won't touch instructions that may load or store, so the result of the expression would never change
- Other references are for other platforms
So, I assume you are saying this is *NFC* for PowerPC, right?
Dec 29 2019
This diff seems out of date, needs rebasing.
- Create inbounds store
- Strip casted pointer before comparison
- Make sure load and store belong to the same BB
- Keep nontemporal metadata of store
Update test case to reflect new points in main patch.
You need to revisit the places that use the isPredicable bit of the MI, which might cause a functionality change.
Dec 27 2019
Extend patch with full contexts.
Dec 25 2019
- Add check for address of load and store.
- Add check for any memory write instructions between load and store.
- Add more test cases for cases above. (Thanks to spatel)
Dec 23 2019
Update test using the auto-generation tool.
https://reviews.llvm.org/D71828 is created for simpler logic at InstCombine.
Dec 12 2019
Dec 4 2019
Thanks for the comments and explanations from everyone. I think there are two key issues to clarify and solve about this revision:
Dec 3 2019
Sorry, I didn't see this revision. I committed the same change in rGc246d6e536c7112019cba6cfc764daeb9088ef29
Nov 28 2019
Address some comments from the community:
Nov 27 2019
Will abandon this revision and add them back to D70223 for easier discussion.
Nov 25 2019
It looks like this is missing some checks on the load. The code needs to check that the load and store target the same address, and that there aren't any operations between the load and the store that could modify the memory.
The profitability check probably needs to weigh the cost of the memory operations a little more carefully in cases where the total number of memory operations increases.
I'm a little worried there could be a performance penalty on certain CPUs if the vector value is loaded soon afterwards, due to the partial overlap. Depends on details of the specific CPU, though, and maybe it's rare enough that it doesn't matter.
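The load checks requested in this comment (same address, nothing in between that could modify memory) can be sketched on a toy model. Everything below is hypothetical: the Inst struct, the integer address ids, and the function name are illustrations, not LLVM APIs. A real implementation would consult alias analysis instead of treating every write as a clobber.

```cpp
#include <cstddef>
#include <vector>

enum class Kind { Load, Store, Call, Other };

// Toy instruction: Addr is an abstract address id (-1 if not a memory
// access); MayWrite says whether the instruction might modify memory.
struct Inst {
  Kind K;
  int Addr;
  bool MayWrite;
};

// Returns true if Block[LoadIdx] loads from the address that
// Block[StoreIdx] stores to, and no instruction strictly between them
// may write memory. Deliberately conservative: any write blocks it.
bool isSafeLoadStorePair(const std::vector<Inst> &Block,
                         std::size_t LoadIdx, std::size_t StoreIdx) {
  if (LoadIdx >= StoreIdx || StoreIdx >= Block.size())
    return false;
  const Inst &L = Block[LoadIdx];
  const Inst &S = Block[StoreIdx];
  if (L.K != Kind::Load || S.K != Kind::Store)
    return false;
  if (L.Addr != S.Addr)
    return false;
  for (std::size_t I = LoadIdx + 1; I < StoreIdx; ++I)
    if (Block[I].MayWrite)
      return false;
  return true;
}
```

Modeling the block as a single vector also implies that the load and store sit in the same basic block, which keeps the scan between them well defined.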
Nov 19 2019
My concern is that store and store volatile are so different in semantics that the change might defeat the original intention of the test.
Nov 17 2019
Remove test case change to swaps-le-5 and swaps-le-6 since they're moved to a single differential D70373.
Nov 15 2019
- Add regression test.
- Check legality before doing costly operations.
Nov 14 2019
Nov 13 2019
Nov 10 2019
Oct 15 2019
Oct 11 2019
Sep 12 2019
Sep 9 2019
Remove a file that was unintentionally changed because of newline characters.
Upload patch with full context.
Update patch to fix check regressions from recent commits.
Sep 5 2019
I updated a test for this new way of estimating division. It's posted at https://github.com/ecnelises/fp-division-test/blob/master/algorithm_test.c so people can run the test on their own. Here are my accuracy results:
Sep 3 2019
I think these two points weren't addressed.
I'd like to see at least some publicly-stated numbers on accuracy,
just so we all know this is going in the right direction for all inputs.
Changing my 'accepted' until this is answered.
The test at:
...seems to do a small random sampling.
The original transform was tested on x86 using brute force for all possible floats (1.0f/x) and is attached here:
I'm not sure how to prove this, but by distributing the multiplication into the last step of the estimate, I think we are always trading better accuracy around the numerator value with potentially overflowing to infinity for extremely different numerator/denominator. That's a good trade-off IMO and within the loosely-defined behavior enabled by 'arcp' in LLVM and '-mrecip' with Clang.
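The idea of distributing the multiplication into the last step can be sketched with a single Newton-Raphson refinement. This is an illustration under assumptions, not the code from the patch: the function names are hypothetical, Est stands for a hardware reciprocal estimate of 1/B, and only one refinement step is shown.

```cpp
#include <cmath>

// Variant 1: refine the reciprocal first, then multiply by the numerator.
float divRefineThenMultiply(float A, float B, float Est) {
  float R = Est + Est * (1.0f - B * Est); // refined estimate of 1/B
  return A * R;
}

// Variant 2: fold the numerator into the last refinement step.
// Algebraically the same value as variant 1, but the correction term
// uses the residual A - B*Q of the quotient itself.
float divMultiplyInLastStep(float A, float B, float Est) {
  float Q = A * Est;            // rough quotient
  return Q + Est * (A - B * Q); // correct it with the residual
}
```

For example, with A = 1, B = 3 and a crude estimate of 0.33, both variants land within about 1e-4 of 1/3. The second form measures the error against the numerator directly, which fits the accuracy argument above; the intermediate product A * Est is also where an overflow could appear for extreme numerator/denominator pairs.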
Sep 2 2019
Update test to reflect changes introduced in rL370071.
Aug 27 2019
Aug 26 2019
Fix typo and rebase.