Updating to use @MaskRay's suggestion.
- Queries
- All Stories
- Search
- Advanced Search
- Transactions
- Transaction Logs
All Stories
Today
Perhaps wait a day or so in case others have opinions.
LGTM. Please wait for @fhahn LGTM as well.
Updated after rebasing and updating earlier patches, no functional change since last update.
No longer need to disable PostRAScheduler, and fixed tail merging by setting the MachineInst::NoMerge flag.
Added checks to make sure that prologs and epilogs are properly terminated, and epilog start/end are paired correctly. This gives clearer indication if there were issues in code generation (like epilogs being mangled by tail merging).
Hello, this change was approved. Could one of the reviewers help commit this, since I don't have permissions? The failing tests look like unrelated flakes, as they are timeouts and seem to fail differently after each rebase.
LGTM.
Basic idea makes a lot of sense. I'd be tempted to take some cost here, just for the robustness.
this makes sense
Fix func.func
In D125648#3531768, @efriedma wrote:For scheduling, the AArch64 backend overrides "isSchedulingBoundary()" for SEH instructions.
I thought about it and now I don't think embedding html and javascript code into BOLT binary is a good idea.
A better approach would be to embed dot file into html template using an external tool: D126218.
missed llvm-dwp
I'm going to try to summarize an offline discussion w/ @mcgrathr about this here:
IIRC there is no built-in way supporting multiple (but fixed number of) values for an option (e.g. -Xoffload-linker-<triple> <arg>). In D105330 (llvm-nm option refactoring) I used a hack to support -s __DATA __data.
The multiple-value support for OptTable does not allow positional arguments after the option.
One small nit in the code you can choose to fix or not. LGTM. Thanks for the improvements!
David's comments.
Update patch to actually use 100 as limit
Address feedback.
Thank you for the review.
Since this "optimized" value isn't part of the actual JSON schema for the stack frames, and since we aren't using it in the IDE, best to remove this until it is either in the protocol definition or used by the IDE.
A 1 to 2 percent improvement is probably going to go away if the asserts are enabled when it has to do the check each time. If would still hesitate to add this kind of API as it will encourage others to use the APIs. I would be ok with it if the methods are protected or private and if we can add the DWARFLinker as a friend class to allow access to these performant but dangerous APIs only in certain situations that really require it. I really want to avoid most people from being able to use these APIs if possible to make sure we don't end up with bugs. We had many many bugs that stemmed from mismatching DWARFUnit + DWARFDebugIntoEntry before the DWARFDie class was created, mostly when doing cross CU references or in Fission cases where you have the skeleton CU vs the actual DWO CU + a DWARFDebugInfoEntry that comes from the actual DWO file. So there is a real danger to the safety and correctness of the code here.
In D126226#3532257, @MaskRay wrote:It's better to avoid JoinedAndSeparate for new options. It is for --xxx val and --xxxval but not intended for the option this patch will add.
It's better to avoid JoinedAndSeparate for new options. It is for --xxx val and --xxxval but not intended for the option this patch will add.
Perform rebase.
In D124892#3529768, @dblaikie wrote:With ThinLTO enabled we can have a function in a CU that doesn't have split dwarf inlining enabled that gets inlined into CU that has it enabled.
When we try to lookup Abstract Scope Die for the inlined scope in constructInlinedScopeDIE, it's not present and we assert.So it seems there's probably some disconnect between the logic that adds abstract scope DIEs and the logic that looks them up? (one's using the source CU, one's using the destination CU?) - can you confirm that/which one's doing which?
& maybe them consistent would be good/suitable - I guess the right behavior is probably to treat the non-split-dwarf-inlining function as having no debug info for the purposes of the place it's inlined into?
Addressed review comments:
- Added options for sizing of specific instruction groups
- Renamed the DAG Mutation -- renaming suggests welcomed.
- Minor details
Maybe also worth to check gcc's issues with this transformation:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91734
ping
Changing the -Xoffload-linker= to -Xoffload-linker-.
In D126226#3532216, @tra wrote:You do not need to hardcode it. The idea of JoinedAndSeparate is that an option foo assepts two argumants, one glued to it and another following after a whitespace.
So, when you define an option -Xoffload-linker-, and then pass -Xoffload-linker-nvptx64=foo, you will get OPT_offload-linker__ with two arguments. As an example see implementation of plugin_arg which deals with the same kind of problem of passing arguments to an open-ended set of plugins.
Fix wrongly moved clang-format disabling comment when trying to fix clang-format
In D114171#3530859, @alexfh wrote:And finally a reduced test case:
$ cat q.cc struct S { template<int N> bool f() const; }; int f(const S& s) { int d = 0; if (s.f<1>()) ++d; // 4998 lines skipped if (s.f<5000>()) ++d; if (d == 0) { return s.f<-1>(); } return 0; } $ ./clang-11004 --target=x86_64--linux-gnu -O1 -fslp-vectorize -c -xc++ q.cc -o q.o -ftime-report ===-------------------------------------------------------------------------=== ... Pass execution timing report ... ===-------------------------------------------------------------------------=== Total Execution Time: 8.7398 seconds (8.8367 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 7.9827 ( 91.6%) 0.0081 ( 29.3%) 7.9908 ( 91.4%) 8.0861 ( 91.5%) SLPVectorizerPass 0.1950 ( 2.2%) 0.0002 ( 0.6%) 0.1952 ( 2.2%) 0.1962 ( 2.2%) InstCombinePass ...
In D126226#3532147, @jhuber6 wrote:We already use this approach for the -Xopenmp-target=<triple> <arg> option that forwards arguments to the given toolchain, so that's what I was using as a basis.
Remove vector.fma generation
In D126059#3529762, @dblaikie wrote:You mention performance concerns - even in an optimized build?
Nice, thanks for this, Thomas!
In D125979#3529745, @dblaikie wrote:(any chance we can avoid duplicating this, and instead refactor the lld one into LLVM and have lld use that one instead of using its own local copy?)
LGTM (I'd like to see SANTIZER_OSX become SANITIZER_MACOS at some point for consistency.)
In D126226#3532127, @tra wrote:-Xoffload-linker=<triple> <arg>
The syntax is confusing. Normally only triple would be the argument for -Xoffload-linker option.
Having both -Xoffload-linker and -Xoffload-linker= variants also looks odd to me.In effect you're making -Xoffload-linker=foo a full option (as opposed to it being an option -Xoffload-linker= + argument foo) with a separate argument that follows. I guess that might work, but it's a rather unconventional use of command line parser, IMO.
I think the main issue with this approach is that it makes the command line hard to understand. When one sees -Xsomething=a -b it's impossible to tell whether -b is a regular option or something to be passed to -Xsomething=a. My assumption would be the former as -Xsomething= already got its argument a and should have no business grabbing the next one.
I think it would work better if the option could use a - or`_` for the variant that passes the triple. E.g. -Xoffload-linker-nvptx64=-foo or -Xoffload-linker-nvptx64 -foo would be easily interpretable.
Not needed anymore.
Note: design discussion continuing on discourse: https://discourse.llvm.org/t/rfc-introduce-ml-program-dialect-and-top-level-ops-proposal-v2/60907/51
The syntax is confusing. Normally only triple would be the argument for -Xoffload-linker option.
Having both -Xoffload-linker and -Xoffload-linker= variants also looks odd to me.
I tested SPEC2006 int on my skylake desktop.
In D126215#3531780, @martong wrote:We should support deprecated analyzer flags for at least one release. In this case I'm planning to drop this flag in clang-17
Should we emit a warning for the user about the deprecation?
This has been superseded by the above set of patches.
The name is not descriptive. It misses the important words about "hash map". ConcurrentHashMap may be a better name.