- User Since
- Feb 4 2016, 3:03 PM (141 w, 3 d)
Fri, Oct 19
Remove conflict line.
@joerg Sorry, but I'm not sure I understand your question. This doesn't pretend to honor source code order; it makes the linker place "hot" functions under the .text.hot section (there's no guarantee of ordering between functions inside the .text.hot section) and "cold" functions under the .text.unlikely section. This is purely for performance.
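As a rough illustration of the intent (not the patch itself), the sketch below marks one function hot and one cold using the GCC/Clang `hot` and `cold` attributes, which stand in for profile data here. The function names are hypothetical, and whether the two functions actually end up in .text.hot and .text.unlikely depends on the compiler, target, and flags.

```cpp
#include <cstdio>

// Hypothetical hot path: called on every loop iteration.
__attribute__((hot)) int accumulate(int total, int value) {
  return total + value;
}

// Hypothetical cold path: only reached on bad input.
__attribute__((cold)) int reportError(const char *msg) {
  std::fprintf(stderr, "error: %s\n", msg);
  return -1;
}

// Sums the values, bailing out through the cold path on a negative entry.
int sumPositive(const int *values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i) {
    if (values[i] < 0)
      return reportError("negative input");
    total = accumulate(total, values[i]);
  }
  return total;
}
```

Grouping functions this way is meant to keep frequently executed code close together; the order of functions within each section is still unspecified.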
Thu, Oct 18
Rebase. Tests are provided in the clang counterpart (D34796).
Rebase. Sorry, I somehow missed the recent comments. I addressed @davidxl's comment on documentation. Thanks!
Sep 19 2018
Sep 7 2018
Addressing comments from @echristo. Reverted the option name change and added a test case. Sorry, I haven't worked on this code for a while, so it took time to come up with a test case.
Sep 5 2018
Hello, I observed a case where an __atomic builtin generates a libcall while the corresponding __sync builtin generates an atomic instruction (https://bugs.llvm.org/show_bug.cgi?id=38846). It seems that the alignment check for __atomic builtins (line 759 of this patch) causes the difference, and I wonder if the check is actually necessary. Could anyone please shed some light on this? Thanks!
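For reference, the two builtin families being compared look like this. This is only a minimal sketch with hypothetical wrapper names; it uses a naturally aligned int, so it does not reproduce the libcall from the bug report, and it only shows the pair of calls whose generated code differed.

```cpp
// Legacy __sync builtin: returns the old value, with full-barrier semantics.
int addWithSync(int *p, int v) {
  return __sync_fetch_and_add(p, v);
}

// __atomic builtin: same operation with an explicit memory order.
int addWithAtomic(int *p, int v) {
  return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}
```

Comparing the assembly for the two wrappers (for example with -S) on the type and alignment from the bug report is one way to see whether a libcall is emitted.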
Aug 16 2018
Hello, I wonder if we need to keep linkonce_odr symbols live here as well. I observed a case where the vtable for an instantiated class template has linkonce_odr linkage and is marked dead here, which results in a compiler crash in WholeProgramDevirt because the global variable for the vtable doesn't have an initializer (https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/IPO/WholeProgramDevirt.cpp#L676 assumes that the GV has an initializer). Thanks!
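The kind of source that leads to such a vtable can be sketched as follows. The class names are hypothetical and this is not the failing test case; it only assumes that the vtable of an implicitly instantiated class template is emitted with linkonce_odr linkage, since every translation unit that instantiates it may emit its own copy.

```cpp
// Class template with virtual functions; each instantiation gets its own vtable.
template <typename T>
struct Base {
  virtual ~Base() = default;
  virtual T get() const { return T(); }
};

template <typename T>
struct Derived : Base<T> {
  T get() const override { return T(1); }
};

// A virtual call through the base class, the kind of call site that
// whole-program devirtualization tries to resolve.
int callThroughBase(const Base<int> &b) {
  return b.get();
}
```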
Jul 16 2018
Addressing comments from @Hahnfeld. Thanks!
Apr 25 2018
Addressing comment from @mssimpso. Thanks!
Apr 24 2018
Hello @jlpeyton @tlwilmar, is there a chance that this breaks runtime/test/ompt/misc/api_calls_from_other_thread.cpp test? ompt_get_num_places() is expected to return 0 from the test but with this patch it returns 1, and I think it is because kmp_affinity_num_masks is set to 1 from kmp_create_affinity_none_places().
Apr 4 2018
Apr 3 2018
Apr 2 2018
Thanks @fhahn for the comments! I added comments about why we don't perform splitting for the callsites inside the landing pad. Please let me know if you think it is too verbose.
Addressing @junbuml's comments. Thanks!
Mar 31 2018
Mar 12 2018
Update test to check f2.llvm.0 as well.
@tejohnson Thanks for the clarification. Regarding hotness, I'm not sure that providing "some" hotness is better than leaving it as unknown when profile data is not provided (if profile data is given, as you said, VP metadata will be attached to the callsite). I'm afraid that synthesized hotness may confuse optimizers, but please let me know if you have a different idea.
@tejohnson I think you're right. What I meant was that when the metadata is imported to bar.o, it references f1 and f2 by their promoted names, which causes declarations with the promoted names to be added. Did I get it right, or am I still missing something?
Dec 6 2017
Dec 5 2017
@jkorous-apple Got it. I agree that it would be better to move the comments to the header. Will land it soon. Thanks!
Dec 4 2017
Thanks @jkorous-apple for the comment. I think your suggestion is a more
precise description of the implementation, and I adjusted the comments
Nov 29 2017
@jkorous-apple Thanks for the comments! Yeah, I was thinking of an O(length_of_string) approach, but considering the complexity of the implementation (I guess the real implementation would be a bit more complex than your pseudo implementation in order to handle quotes and the '\n\r' and '\r\n' cases), I decided to stay with O(length_of_string * number_of_endlines_in_string) while optimizing the number of move operations.
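For comparison, an O(length_of_string) variant could look roughly like the sketch below. It is not the code in the patch and not the pseudo implementation mentioned above; the function name and the exact escaping rules (backslash-escape quotes and backslashes, collapse '\r\n' and '\n\r' pairs into one escaped newline) are assumptions made for illustration. It builds a new string in a single pass instead of inserting into the original in place.

```cpp
#include <cstddef>
#include <string>

// Hypothetical single-pass escaper: every input character is visited once,
// and output is appended to a separate buffer, so no characters are shifted.
std::string escapeSinglePass(const std::string &in) {
  std::string out;
  out.reserve(in.size() * 2); // worst case: every character is escaped
  for (std::size_t i = 0; i < in.size(); ++i) {
    char c = in[i];
    if (c == '\\' || c == '"') {
      out.push_back('\\');
      out.push_back(c);
    } else if (c == '\n' || c == '\r') {
      // Treat "\r\n" and "\n\r" as a single line break.
      if (i + 1 < in.size() && (in[i + 1] == '\n' || in[i + 1] == '\r') &&
          in[i + 1] != c)
        ++i;
      out += "\\n";
    } else {
      out.push_back(c);
    }
  }
  return out;
}
```

The trade-off is an extra buffer in exchange for avoiding repeated moves of the string's tail.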
Nov 27 2017
Thanks @jkorous-apple for your comments. I modified the type for the variables and replaced unnecessary inserts and erases with updates.
Nov 20 2017
Addressing @vsapsai's comments. Thank you for the suggestion! The added test case actually found an off-by-one error in the original patch. I improved the comments as well.
Nov 1 2017
Oct 25 2017
Oct 24 2017
Sep 29 2017
Sep 27 2017
Sep 25 2017
Sep 19 2017
Aug 28 2017
Fix a test.
Aug 25 2017
Aug 16 2017
Friendly ping. @davidxl, I think there's no harm in making clang consistent with gcc for compiler options, and I wonder if you have any concerns that I may have missed. Thanks!
Aug 10 2017
Aug 8 2017
Addressing dblaikie's comments. Thanks!
Aug 7 2017
Aug 5 2017
I think it is generally good to match what GCC does so as not to confuse people.
Aug 3 2017
Jul 31 2017
Update documentation. Please let me know if I need to update other documents as well. Thanks!
@davidxl I think it is theoretically possible, if the if branch is not taken on line 294. Did I miss something? Thanks!
Jul 28 2017
Delete unnecessary line from the test.
Jun 29 2017
@wmi Good call! I fixed the code per your suggestion. Thanks!
Addressing comments from @wmi. Thank you for the suggestion!
Jun 28 2017
https://reviews.llvm.org/D34796 is clang side change.
Jun 25 2017
Jun 20 2017
Addressing @wmi's concern by limiting the targets to recurrence cycles in which only the last instruction of the recurrence (the one that feeds the PHI instruction) can have uses outside of the recurrence. This is not an ideal solution yet, and a more fundamental solution (such as having recurrence optimization as a separate pass and/or using live range analysis for it) should follow. But I still think it is worth having it here.
Jun 19 2017
I think this is the right approach, but I am concerned that the experimental results I shared on D32451 show that it is generally better not to vectorize low trip count loops. @Ayal, I wonder if you have any results showing that this patch actually improves performance. Thanks!
Addressing comments from @tejohnson. Thanks!
Jun 18 2017
@tejohnson Sorry for the late reply. I was without internet access for days. I have no objection to setting the default value to true, but I would prefer to have it as a separate patch so that we can track the performance impact of each change more easily.
Jun 12 2017
Jun 10 2017
@wmi Not at all! Thanks for your comments.
Jun 7 2017
Jun 4 2017
Jun 2 2017
May 31 2017
FYI, below is the compile difference in percentage for spec2006: