- User Since
- Apr 14 2013, 11:48 AM (253 w, 3 d)
Fri, Feb 16
I've run into an internal compiler error that seems caused by this behavior of getMergedLocation: https://bugs.llvm.org/show_bug.cgi?id=36410
Wed, Feb 14
Ah, so this is a problem when some processor models use itineraries and others don't? In that case, the change looks good to me ...
Fri, Feb 9
Checked in as rev. 324726. Thanks!
Thu, Feb 8
Hi Arnold, I only now noticed this commit. I wasn't aware there were any SystemZ back-end issues relating to swifterror support -- can you elaborate? Is there anything I should be doing in the back-end to fix this?
Wed, Jan 31
Yes, that makes sense to me. It also agrees with the layout of bitfields on (all currently supported) big-endian platforms, so that's another argument for it.
Tue, Jan 30
I've been talking to Jonas about the big-endian question, and I now think that there may actually be open questions about what the layout should be.
Jan 22 2018
Jan 19 2018
OK, I've now checked in both changes: enable long tests in the clang-s390x-linux builder, and run this large test only when long tests are enabled.
I see that the build bot switched off long tests some time ago. I'll try to switch this on again for at least one of the SystemZ builders.
We could also move the test to the Large directory with the other large-branch tests; this is not run by default but only when explicitly requested.
Jan 18 2018
Superseded by https://reviews.llvm.org/D41798
Jan 16 2018
Jan 15 2018
Jan 8 2018
This patch seems to be identical to the one I proposed here:
Dec 6 2017
This has caused build bot failures on SystemZ (and others):
Dec 5 2017
The SystemZ changes LGTM. Given that Quentin approved the common part, this should be fine.
Dec 4 2017
For consistency, the other variants of the tbegin intrinsic (int_s390_tbegin and int_s390_tbegin_nofloat) should likewise get the IntrWriteMem flag.
The TwoAddress fix has been committed, but the intrinsics flags fixing seem to require some work so I am not sure if we really want to wait with these changes until that is done?
Dec 1 2017
These changes (at the code level) look reasonable to me now. (There is one mayStore on a STOC instruction that is probably still superfluous now.)
Nov 30 2017
Thanks for providing the diff. I noticed two problems, also annotated in the detailed comment:
- You added mayLoad to a number of load-conditional instructions that actually don't access memory (reg-reg or reg-immed instructions)
- The guarded-storage instructions should really have the side effects flag (I overlooked this in my previous review).
Nov 29 2017
This LGTM now, thanks!
Nov 28 2017
Actually, I take it back, sorry ...
Yes, this looks good to me. I agree that lowerBITCAST probably needs to be fixed as well.
Nov 24 2017
Thanks for looking at and reviewing those flags!
Nov 14 2017
Nov 9 2017
See two minor inline comments. Otherwise, this now LGTM. Thanks!
Oct 25 2017
Right. Always coalescing with the copy source of course extends the live range of the source, and thus increases overall register pressure. So it's not a good idea to do this unconditionally.
Oct 24 2017
Well, the special handling of i128 arguments as such is certainly needed on SystemZ, this is why I added it in the first place in rev. 261325.
Oct 17 2017
The SystemZ parts LGTM. I agree that it's much preferable to have this handled in common code than in the back end ...
Oct 6 2017
OK, this now looks all good to me, thanks!
Oct 4 2017
Have you looked into what's going on with the vec-cmp-cmp-logic-select.ll and vec-cmpsel.ll test cases? I see several of the tests now adding spill/reload code when they didn't before -- ideally this shouldn't happen if the misched does proper register pressure analysis ...
Sep 29 2017
- Quentin approved the interface change
- The SystemZ parts now look good to me
- We have performance runs confirming this does not introduce regressions
Note that as of r314428 I've switched the SystemZ back-end to use a custom expander for ATOMIC_CMP_SWAP_WITH_SUCCESS, which makes use of the condition code value set by the compare-and-swap hardware instruction to completely omit any extra comparison.
Sep 28 2017
Sorry - with the update this fails again, so no update...
Should we make a point to try to handle this somehow?
Sep 27 2017
Sep 25 2017
Sep 22 2017
I looked into making i128 legal when I recently added the 128-bit atomics, but in the end decided against it. Not only would it be a lot of work (since you basically would have to reimplement all the operations on i128 that are currently handled by common code), but it is not clear that we actually always want it.
Sep 21 2017
Looks like this broke the s390x build bots:
This doesn't look in any way target-specific, so I'm wondering why common code doesn't already do that. It seems there is already support for extracting hints from copies in calculateSpillWeightAndHint, why is this not working here?
Sep 20 2017
Looks basically good to me, just a couple of cosmetic comments inline.
The code change looks good to me, but please try to simplify the .mir test case following the suggestions outlined here: https://llvm.org/docs/MIRLangRef.html#simplifying-mir-files
Sep 19 2017
Sep 18 2017
Hmm. Some of those changes look like regressions, we should try to at least understand why this happens. Maybe there's a simple way to fix those again.
Sep 17 2017
Sep 15 2017
As far as SystemZ is concerned, this change should be safe. In our ABI only vector arguments are affected by the IsFixed flags, and there shouldn't be any libcalls with vector arguments.
Thanks for trying this suggestion! See a couple of inline comments.
Aug 21 2017
Aug 18 2017
Aug 17 2017
Oh, and just comment on this:
The getRegAllocationHints implementation makes sense to me. However, I'm wondering if we shouldn't also check for the *destination* register -- you only force one source register to the same class as the other source register, but I think we should check whether *any* of the three registers is already allocated, and then always force the other two to the same class.
Aug 4 2017
Aug 2 2017
Yes, this benefits a real workload. This happens when the improved SimplifyAndInst now simplifies a partial result, which then allows *further* simplification in InstSimplify, in my case in ThreadBinOpOverPHI. This latter simplification doesn't seem to be done anywhere else.
The testcase is extracted from a real-word program. On that program, the transformation (moving some of those operations out of a hot loop) is a significant overall win (about 10% improved performance of the whole program). I agree that this application is a quite special case -- this patch doesn't make much difference to the overall performance of the platform in general.
Sure, it would be possible to add the corresponding transforms to SimplifyOrInst. However, I have been unable to trigger this at all, with any workload I've tried. Should I still add the transforms anyway?
Well, I'd be fine with NewGVN doing this transform. But at least right now, it doesn't appear to be doing so -- running the new test case (@phi_multi in phi.ll) through "opt -O2 --enable-newgvn" does not move the zext/or out of the loop. Is this supposed to happen?
OK, the SystemZ parts look good to me now. Thanks!
Aug 1 2017
Thanks! The SystemZ part now looks very reasonable. Just a couple of final, really just cosmetic comments inline.
Jul 31 2017
A couple more comments on the SystemZ parts, see inline comments.
Jul 28 2017
Jul 27 2017
The SystemZ part looks correct to me, see the inline comment for a style/readability suggestion.
Jul 24 2017
It does indeed seem to be necessary to get an explicit guarantee from common code that scheduling happens in a particular order, otherwise our back-end code may silently break if common code logic happens to change in the future.