- User Since
- Jan 23 2015, 9:38 AM (207 w, 4 d)
Mon, Jan 14
Amy, can we get this committed?
Fri, Jan 11
Updated the wrong indices for big endian systems. Added comments for magic numbers for indices/shift amounts.
Fri, Jan 4
Sarvesh, I've tried this on a PPC64LE machine with one version of GLIBC and found no problems. However, I am reluctant to just go ahead with a change without testing with a few other versions of GLIBC.
Mon, Dec 31
Indentation was off in the td file.
Upon closer inspection, this actually almost never fires on PPC so spending any more time on it does not seem useful. Abandoning this patch.
Sun, Dec 30
Updated to remove the patterns for the Altivec versions of vector min/max as they have IEEE semantics wrt. handling NaN. A subsequent patch will legalize the _IEEE versions of the nodes for single precision and provide patterns to match them to vmaxfp/vminfp.
Sat, Dec 29
We have neglected this for a very long time. Just adding a comment to trickle it up to the top of the review queue and I plan to review it very soon.
A couple of questions since I am not all that familiar with clang and am certainly not familiar with this unusual SUSE 32-bit situation:
- We seem to be changing the set of aliases here, but what happens if someone actually explicitly specifies --target=powerpc-suse-linux?
- Do we need to change anything about include paths?
- Can you describe the default triple for clang on SUSE 32-bit PPC? Will it be powerpc-suse-linux? powerpc64-suse-linux?
- Will this change not affect 64-bit PPC SUSE? Namely will the default libraries on actual 64-bit PPC SUSE big endian systems now be 32-bit libraries?
- Can you please add a test case and a patch with full context before this patch can go any further?
Other than the minor nit about the test case, LGTM.
LGTM. We definitely want to go ahead with this.
I assume that we are fixing this specific instance because we have a test case that breaks if we turn on verification by default. However, I would assume that we actually have a lot of other failures hiding in here which would be uncovered by doing a bootstrap build with -O0. I am not suggesting that we put all the fixes in a single patch, just that we want to do more thorough testing before we turn on MachineFunction verification by default.
Mon, Dec 24
Also, I've thought a bit about what Jinsong mentioned and I am wondering if maybe we don't want to take a different approach. Namely, the cost model is meant to return the cost of individual instructions with a given type and this cost is the instructions latency relative to the latency of a simple scalar ALU instruction (I think a comment somewhere uses add as the basis for what 1 means). And the vectorizers use these values to essentially add up the costs of all the instructions in a loop/block and then compute what provides the lowest total cost assuming that the cost of an N-wide vectorization can be computed by simply dividing the total cost by N.
I think this code could be greatly simplified if we implement int PPCTTIImpl::vectorCostAdjustmentFactor(Type *Ty) which would return 1 by default and 2 if we are on a subtarget with vector operations that take up two execution units and Ty is a vector type.
The remaining comments from me are minor. I am OK with this going in if Robert is.
Fri, Dec 21
Wed, Dec 19
Dec 13 2018
Dec 12 2018
Yeah, as we discussed over IRC, all I was suggesting is that there is a difference when we choose one over the other with this patch. But I don't think we really care about endianness here, so we might want to just change this to defined(__ppc__) || defined(__PPC__) so we always define LLVM_THREADING_USE_STD_CALL_ONCE the same way on all PPC platforms.
Since we have discovered this in the PPC back end but the issue also exists in ARM and X86, it'd be nice to notify @t.p.northover (already a reviewer) and @craig.topper about this potentially needing to change in their back ends as well.
This code has been in place for quite a long time as there is some sort of bug in libstdc++. I don't remember what the problem was but I remember that this was required because of an issue with the GNU C++ runtime.
My concern is that the intent with defining LLVM_THREADING_USE_STD_CALL_ONCE to zero when __ppc__ is defined is that we would use the non GNU one on all PPC platforms.
However, gcc does not define __ppc__.
So the behaviour before this change is that anything built with clang on any PPC system will use the clang version of std::call_once and anything built with gcc will use the GNU version. This was the case on all little endian and big endian systems (both 32 and 64 bit). After this change, this is what the situation will be (compiler-rt == clang's libc++, GNU == GNU libstdc+++):
Dec 10 2018
:) It seems these have been around for a long time and nobody bothered to fix it. Thanks for fixing this.
Dec 7 2018
I am definitely on board with this refactoring since it makes the purpose of the various pseudo instructions more clear.
However, I really don't like the name PhonyInst. These will lead to real instructions being emitted during emit time. I think we should name these according to when we expect to expand these (similarly to what you did for the custom inserter ones).
The names should reflect the uses:
- Custom inserter
- Post-RA expansion
- Emit-time expansion
- Asm parser expansion (PPCAsmPseudo)
Dec 6 2018
I suppose since it is in the ABI, it is fine to leave the magic numbers there. And thanks for adding the comment. However, please verify that this mapping applies for ELF v1 as well and if it doesn't please add an appropriate guard.
Nov 30 2018
I'll let @renenkel approve or request changes to this as he understands the math much better than I do, but I do want an investigation into how the INFINITY/QNAN tests work along with a comment explaining that. Particularly since I would have expected that every __int128_t value is a number (i.e. not a NAN) and none of them produce an infinity.
Let's add the #include for the file that defines the pertinent macros and push this change so we start running the test cases on PPC64LE.
Nov 26 2018
Please let me know if there are any further comments from anyone. Otherwise, we can probably proceed with this.
The PPC changes seem to be neutral, so no objection from me. Would you mind adding a comment to https://reviews.llvm.org/D54663 if this gets committed before that one as I'll have to update the patch? I'll do the same if I commit mine first.
Nov 21 2018
Nov 20 2018
Feel free to just commit patches like this. It appears you're just adding a test case ahead of a patch to make the review easier as it shows only the diffs in code gen.
Nov 19 2018
Nov 16 2018
The "recent patch" mentioned in the description is https://reviews.llvm.org/D53346.
Actually, the handling for v2i32 is already optimal due to the DAG combine that handles this. We'll leave it in the DAG combine since that handles extractions even if they didn't come from legalizing v2i32 types (i.e. it handles a superset of the legalization added here).
Removed the actual definitions of the static arrays from the header. Simply provided the macro for the defs in the header and the macro is used in only the two CU's that need it.
This way if we need to add any new ones, they just need to be added to the macro.
Nov 14 2018
Nov 13 2018
LGTM now that we don't bitcast types any longer.
Nov 12 2018
This is how I meant to do it initially but was reluctant to introduce a new header dependency. However, I don't think it's an issue at all for PPCMCTargetDesc.h to have a dependency on MC/MCRegisterInfo.h.
Did the same for the disassembler.
Nov 9 2018
Nov 8 2018
@hubert.reinterpretcast Have your comments been addressed adequately in the latest version of the patch? Do you have an opinion on adding the test case I proposed?
Just for clarification (and please add the text to the commit message), this is actually required by the ABI:
Each element of the result vector is the result of logically right shifting the corresponding element of ARG1 by the number of bits specified by the value of the corresponding element of ARG2, modulo the number of bits in the element. The bits that are shifted out are replaced by zeros.
Nov 2 2018
The test case fix can be done on the commit. Approving this now.
Since this fixes the sanitizers on PPC for the newer kernels and the fix seems perfectly reasonable and there are no concerns brought up by anyone, let's go ahead with this fix.
Oct 26 2018
Oct 25 2018
Please revert this. The way this was committed is incorrect. You've created directories such as llvm/trunk/llvm/ which we do not want. I suppose this has something to do with the fact that the patch was created using the monorepo and committed in some different way.
Oct 23 2018
Since the logic in the switch statement is difficult to follow, I ended up having to test this patch with a script that tries all combinations of comparisons, results and operand order in order to convince myself that it doesn't emit a setb when it is not valid to do so. Everything checks out there, but I don't really have an easy way of testing whether any opportunities are missed.
I am really hoping that you can simplify the logic in that switch to make it easier to follow.
I think the changes that are needed are clear enough that this doesn't require another review cycle. Approving this with the assumption that the required change will be made on the commit.
Oct 22 2018
Updated the regular expression to just eat any text that is not relevant between the function directive and its text. This seems to work for both ELFv1 and ELFv2 asm output.
Implemented all the patterns for codegen of extract->store.
Oct 19 2018
Modified the changeset to not reduce the capabilities of the pass (select the correct opcode based on the register for post-ra).