- User Since
- Jul 7 2012, 2:54 PM (293 w, 1 d)
Fri, Feb 16
Sorry for showing up late, but I was looking at this code because it turns out it is miscompiling code for us. We're working on a test case, but there are really a number of basic LLVM coding convention problems here.
Tue, Feb 13
So, if I understand correctly, this prevents the use of -mregparm=3, 32-bit x86, and a caller-save-all calling convention. And the reason this is reliable is because LLVM doesn't have such a calling convention (except maybe AnyReg) and if someone even managed to craft such IR they would just hit the assert and have to extend this? And we believe Clang definitively cannot create such a calling convention?
Wed, Feb 7
Ping, love to get a review on the actual functionality here....
Update to tidy up MIR and reflect a syntax change.
Tue, Feb 6
Code review suggestion.
Sat, Feb 3
Some quick comments on the code etc., want to think a bit harder about the loop access checks just to make sure I'm not missing anything. Ben may also have thoughts there.
Fri, Feb 2
Wed, Jan 31
(Also, thanks for review on my first ever usage of MIR! I have so little idea of what I'm doing with it....)
Update with fixes from review.
Thanks, applied fixes. I can update the patch if useful.
Tue, Jan 30
Sorry for the delay, took me a bit to internalize what you were proposing.
Wed, Jan 24
Adding a few folks on my team who will hopefully take over helping here (my time has vanished). Note that they aren't on llvm-commits, so please reply-to-all etc.
Tue, Jan 23
Mon, Jan 22
Thanks Matthias and Eric!
Jan 19 2018
Ping! Would really like to get this landed, but as it has changed somewhat, looking for a final round of review....
Jan 18 2018
Add the lfence instruction to all of the speculation capture loops, both
those generated by LLVM and those generated by LLD for the PLT.
Jan 16 2018
Jan 15 2018
Ok, I think this is all done. Rafael, I think I've implemented your suggestion as well and it still passes all my tests (including the test-suite) and a bunch of internal code I have.
Couple of other minor fixes.
Re-implement the indirectbr removal pass based on Rafael's suggestion. It now
creates a single switch block and threads all the indirictbr's in the function
through that block. This should give substantially smaller code especially in
the case of using retpoline where the switch block expands to a search tree
that we probably don't want to duplicate 100s of times.
Jan 12 2018
Jan 5 2018
Teach the thunk emission to put them in comdats and enhance tests to verify
(Er sorry, have two updates from Rafael I need to actually include, will be done momentarily...)
Any more comments? I'd love to land this and start the backporting and merging process. =D
Just as an FYI, we have been experimenting with similar APIs ourselves. We developed two candidate alternative APIs that, IMO, seem substantially better than this.
Add explicit checks for various lowerings that would need direct retpoline
support that are not yet implemented. These are constructs that don't show up
in typical C, C++, and other static languages today: state points, patch
points, stack probing and morestack.
Rebase and fold in the latest batch of changes from Rui based on the review
Jan 4 2018
All comments addressed now I think, thanks everyone! See some thoughts about a few of these below though.
Add the new test file for external thunks.
Add support for externally provided thunks. This is an independent feature;
when combined with the overall retpoline feature it suppresses the thunk
emission and rotates the names to be distinct names that an external build
system for the kernel (for example) can provide.
Add another minor suggestion that I forgot previously.
Fix several bugs caught in code review, as well as implementing suggestions.
Thanks for the review!
Fix suggested in code review and a couple of indentation fixes via clang-format
that I missed previously.
Comment for Rui about the 32-bit PLT sequence...
Adding a bunch of subscribers who likely will care about this, and our release managers.
Dec 29 2017
There was a comment by @nemanjai on the submitted revision in phab that I think got lost:
Dec 27 2017
I don't think this patch's approach is the right one, and we should probably pursue Eli's suggestion.... It also causes very significant regressions in benchmarks, so I think we should revert it for now.
Dec 26 2017
Dec 22 2017
Dec 21 2017
Thanks all! Going to land this after a touch more testing so that my test runs start being faster. =D
Dec 13 2017
LGTM. It'd be great to add an instcombine test that uses new PM and requires AA to optimize as well. Can be in a follow-up commit if needed.
Dec 8 2017
Thanks for all the feedback!
Update based on code review comments.
If MSan doesn't complain about these, I'm inclined to leave them uninitialized. I'm not sure how valuable it is to workaround valgrind false-positives.
Dec 7 2017
Nov 28 2017
Thanks for all the feedback, landing with suggested fixes.
Nov 20 2017
Update to address the issue highlighted by David and add more test cases.
Nov 17 2017
Nov 15 2017
Thanks Davide, I think I've got these covered either by adjustments or future work. Give a shout about anything you'd still like addresed in follow-up patches.