vsk (Vedant Kumar)
User

Projects

User Details

User Since
Jul 8 2015, 10:26 AM (174 w, 6 d)

Recent Activity

Yesterday

vsk added inline comments to D54175: [PGO] context sensitive PGO.
Tue, Nov 13, 10:22 PM

Sat, Nov 10

vsk accepted D54385: Remove comments after header includes..

Thanks, lgtm.

Sat, Nov 10, 4:22 PM · Restricted Project

Fri, Nov 9

vsk updated the diff for D54244: [HotColdSplitting] Refine definition of unlikelyExecuted.
  • Split out the change to mark llvm.trap cold, per review feedback.
Fri, Nov 9, 10:46 AM
vsk created D54329: Mark @llvm.trap cold.
Fri, Nov 9, 9:46 AM

Thu, Nov 8

vsk added a comment to D54217: Add total function byte size and inline function byte size to "llvm-dwarfdump --statistics".

Why not check in a .o? The numbers won't change, so they can meaningfully be compared.

Thu, Nov 8, 5:04 PM
vsk accepted D54266: [llvm-cov] Add lcov tracefile export format..

Thanks.

Thu, Nov 8, 3:38 PM
vsk added a comment to D54217: Add total function byte size and inline function byte size to "llvm-dwarfdump --statistics".

This looks good to me. I think it'd help to have a +1 from Adrian, as he's worked on this file much more than I have.

Thu, Nov 8, 2:56 PM
vsk added inline comments to D54266: [llvm-cov] Add lcov tracefile export format..
Thu, Nov 8, 12:58 PM
vsk added a comment to D54266: [llvm-cov] Add lcov tracefile export format..

Oh, I just realized this also needs an entry in CommandGuide (I think the file is called llvm-cov.rst).

Thu, Nov 8, 12:56 PM
vsk added a comment to D54195: Fix linker option for -fprofile-arcs -ftest-coverage.

Thanks for the patch. Would you mind uploading a diff with additional context (e.g. git diff -U10000)? Phab doesn't render it.

Thu, Nov 8, 12:01 PM
vsk added reviewers for D54195: Fix linker option for -fprofile-arcs -ftest-coverage: calixte, marco-c, sylvestre.ledru.
Thu, Nov 8, 11:59 AM
vsk updated subscribers of D54175: [PGO] context sensitive PGO.

Hi Rong, at a high-level, I like what this patch is doing. I'll try to leave in-depth comments by next week -- please ping the review otherwise.

Thu, Nov 8, 11:16 AM
vsk accepted D54266: [llvm-cov] Add lcov tracefile export format..

I suggest adding a note in docs/ReleaseNotes.rst about this.

Thu, Nov 8, 11:02 AM

Wed, Nov 7

vsk added a comment to D54244: [HotColdSplitting] Refine definition of unlikelyExecuted.

This is motivated by functions like exit() and longjmp(), which are not beneficial to outline.

exit() seems like a weird example: it executes at most once per process. What's the downside of outlining it?

Wed, Nov 7, 5:39 PM
vsk updated the summary of D54244: [HotColdSplitting] Refine definition of unlikelyExecuted.
Wed, Nov 7, 5:14 PM
vsk created D54244: [HotColdSplitting] Refine definition of unlikelyExecuted.
Wed, Nov 7, 5:14 PM
vsk added a comment to D54217: Add total function byte size and inline function byte size to "llvm-dwarfdump --statistics".

Looks reasonable to me, but it'd help to check in a .o for a test.

Wed, Nov 7, 5:14 PM
vsk accepted D52034: [Clang] Add options -fprofile-filter-files and -fprofile-exclude-files to filter the files to instrument with gcov.
Wed, Nov 7, 3:23 PM
vsk updated the diff for D54189: [HotColdSplitting] Ensure PHIs have unique incoming values.
  • Do not add duplicate blocks to the outlining region. The test for this is 'forward-dfs-reaches-marked-block.ll'.
Wed, Nov 7, 2:48 PM
vsk added a comment to D53887: [HotColdSplitting] Outline more than once per function.
In D53887#1290587, @vsk wrote:

I guess it would be good to give some stress test on this pass to see if there is any hidden bug by relaxing conditions in unlikelyExecuted(). As we treat more blocks as cold without being limited on unlikelyExecuted(), we maybe able to expose hidden issues with it.

That's a great idea. I'll do that now with the test-suite + externals. FWIW, I also built all of iOS with this pass enabled, and the only compiler crash I found was https://bugs.llvm.org/show_bug.cgi?id=39564. I'm still investigating possible miscompiles.

Wed, Nov 7, 2:36 PM
vsk added a comment to D54217: Add total function byte size and inline function byte size to "llvm-dwarfdump --statistics".
In D54217#1290504, @vsk wrote:

Thanks. There seems to be some overlap with

[llvm-dev] RFC: Adding a code size analysis tool
https://lists.llvm.org/pipermail/llvm-dev/2018-September/126501.html

how was that discussion resolved?

I think the response was generally positive but I haven't had the time to push things into tree. @clayborg, have you tried out the tool here: https://github.com/vedantk/llvm-project ? The RFC has some usage instructions.

If you just want the ratio of inlined to not-inlined bytes, this patch is a simpler way to do it. But if you need something more fine-grained (i.e. *which* functions were inlined more compared to a baseline, by how much, etc.), it might be worth checking out. Happy to have someone else run with the code.

This current patch is indeed for an overview of inlined to not-inlined bytes. I have other patches to --statistics that will display more output coming. Breaking down exact sizes of inlined functions by count and size will be part of this future output. But this current one is just for a quick breakdown. Like you said in the URL provided, it is nice to know that say 30% of your code is inlined. If you are expecting less inlining and try a new option, this can be a quick way to see results in a very concise kind of way.

Wed, Nov 7, 1:29 PM
vsk added a comment to D53887: [HotColdSplitting] Outline more than once per function.

I guess it would be good to give some stress test on this pass to see if there is any hidden bug by relaxing conditions in unlikelyExecuted(). As we treat more blocks as cold without being limited on unlikelyExecuted(), we maybe able to expose hidden issues with it.

Wed, Nov 7, 1:00 PM
vsk added a comment to D53887: [HotColdSplitting] Outline more than once per function.

IMHO the right fix is to not treat noreturn calls as cold (also as a follow-up).

I'm not sure if handling noreturn is a right fix. A block containing exit(0) will have "unreachable", so it must be still considered as a cold block even after removing noreturn from unlikelyExecuted().

Wed, Nov 7, 12:55 PM
vsk updated subscribers of D54217: Add total function byte size and inline function byte size to "llvm-dwarfdump --statistics".

Thanks. There seems to be some overlap with

[llvm-dev] RFC: Adding a code size analysis tool
https://lists.llvm.org/pipermail/llvm-dev/2018-September/126501.html

how was that discussion resolved?

Wed, Nov 7, 11:59 AM
vsk added inline comments to D54189: [HotColdSplitting] Ensure PHIs have unique incoming values.
Wed, Nov 7, 11:52 AM
vsk updated the diff for D54189: [HotColdSplitting] Ensure PHIs have unique incoming values.
  • Address review feedback from Aditya.
Wed, Nov 7, 11:52 AM
vsk accepted D53390: [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG..

LGTM, although I don't think the 'pr39187-gline-tables-only.ll' adds any additional test coverage, so I'd suggest deleting it. The other test already tests the interesting case (dbg intrinsic prior to terminator with a different location).

Wed, Nov 7, 9:42 AM · debug-info

Tue, Nov 6

vsk added inline comments to D54189: [HotColdSplitting] Ensure PHIs have unique incoming values.
Tue, Nov 6, 5:20 PM
vsk added inline comments to D54189: [HotColdSplitting] Ensure PHIs have unique incoming values.
Tue, Nov 6, 5:14 PM
vsk created D54189: [HotColdSplitting] Ensure PHIs have unique incoming values.
Tue, Nov 6, 4:35 PM
vsk updated the diff for D53887: [HotColdSplitting] Outline more than once per function.

Removing the WIP label, as I'm confident in the results now.

Tue, Nov 6, 3:29 PM
vsk added inline comments to D53887: [HotColdSplitting] Outline more than once per function.
Tue, Nov 6, 12:02 PM
vsk added inline comments to D53887: [HotColdSplitting] Outline more than once per function.
Tue, Nov 6, 11:58 AM
vsk added inline comments to D53390: [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG..
Tue, Nov 6, 7:28 AM · debug-info

Mon, Nov 5

vsk updated subscribers of D53596: [ThinLTO] Fix a crash in lazy loading of Metadata.
Mon, Nov 5, 10:58 AM
vsk added inline comments to D53390: [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG..
Mon, Nov 5, 8:43 AM · debug-info

Sun, Nov 4

vsk added a comment to D53887: [HotColdSplitting] Outline more than once per function.

Can we push this patch? This is not enabled by default so we can continue development in subsequent patches.

Sun, Nov 4, 3:07 PM

Fri, Nov 2

vsk updated the summary of D53887: [HotColdSplitting] Outline more than once per function.
Fri, Nov 2, 4:38 PM
vsk added inline comments to D53887: [HotColdSplitting] Outline more than once per function.
Fri, Nov 2, 4:16 PM
vsk updated the diff for D53887: [HotColdSplitting] Outline more than once per function.
  • While doing performance testing I found a miscompile in ./SingleSource/Regression/C++/EH/Regression-C++-class_hierarchy. I'll file a PR with more details by next week. It looks like it could be an existing bug that surfaces due to more aggressive outlining.
  • Added tests (including the one from @sebpop's earlier patch).
Fri, Nov 2, 4:16 PM
vsk accepted D54019: [DebugInfo][InstMerge] Fix -debugify for phi node created by -mldst-motion.

Thanks, lgtm!

Fri, Nov 2, 8:08 AM · debug-info

Wed, Oct 31

vsk added inline comments to D53390: [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG..
Wed, Oct 31, 1:49 PM · debug-info
vsk added a comment to D53596: [ThinLTO] Fix a crash in lazy loading of Metadata.
In D53596#1282167, @vsk wrote:

Hi Teresa, with this patch applied, I do still see a (possibly unrelated) crash when building an internal framework with ThinLTO + hot/cold splitting. I'm not sure what the best way is to prepare a reproducer (please let me know), but I've tried to collect some output from the debugger:

Doesn't look like this one has anything to do with the issue I'm fixing here. The stack trace shows it is coming from the call to materializeMetadata() from FunctionImport.cpp, and that is called before we invoke the IRMover (which is where we encountered the code with the bug, and the function changed by my patch). The code changed in this patch is not reached via materializeMetadata.

Can you file a new bug for the problem you encountered, along with some kind of reproducer? Looks like it might be hitting infinite, or at least very deep, recursion and running out of stack.

But what version are you on? The stack trace says the call to materializeMetadata is on line 765 of FunctionImport.cpp, and it is currently nowhere near that location and hasn't been in awhile.

Wed, Oct 31, 1:19 PM
vsk added inline comments to D53887: [HotColdSplitting] Outline more than once per function.
Wed, Oct 31, 9:03 AM
vsk added a comment to D53596: [ThinLTO] Fix a crash in lazy loading of Metadata.

Hi Teresa, with this patch applied, I do still see a (possibly unrelated) crash when building an internal framework with ThinLTO + hot/cold splitting. I'm not sure what the best way is to prepare a reproducer (please let me know), but I've tried to collect some output from the debugger:

Wed, Oct 31, 7:09 AM

Tue, Oct 30

vsk added a comment to D53729: [llvm-cov] Don't remap existing paths.

Marking the tests are 'unsupported' would be a good way to surface the warning, as the number (and possibly the list?) of unsupported tests is printed out at the end of each test run.

Tue, Oct 30, 2:41 PM
vsk created D53887: [HotColdSplitting] Outline more than once per function.
Tue, Oct 30, 1:26 PM
vsk added a comment to D53588: [hot-cold-split] split more than a cold region per function.

For splitting more than one cold region, maintaining a DT maybe expensive but we don't have to do that. All we need is to 'color/mark' the blocks which we want to outline. In the next iteration the colored blocks need not be considered. It may be slightly non-trivial in a general case where coloring SEME and reasoning about DT/PDT of blocks which are non-colored. What we can do is in the subsequent iterations we can take a sub-graph which do not intersect anywhere except at entry or exit. This way DT/PDT will still be preserved in the non-intersecting regions. I think jump-threading also works(should work) on similar lines.

Tue, Oct 30, 10:00 AM

Mon, Oct 29

vsk updated the diff for D53835: [HotColdSplitting] Use TTI to inform outlining threshold.
  • Move a test that's particularly X86-specific into an X86 directory.
Mon, Oct 29, 2:57 PM
vsk created D53835: [HotColdSplitting] Use TTI to inform outlining threshold.
Mon, Oct 29, 2:42 PM
vsk added inline comments to D53824: [HotColdSplitting] Allow outlining single-block cold regions.
Mon, Oct 29, 2:01 PM
vsk updated subscribers of D53779: [CodeExtractor] Allow extracting allocas within simple stack{save,restore} pairs.

Eli points out that D53514 will improve the codegen for the use case I'm most interested in, but in a way that makes this patch a bit less relevant. While it should still be useful to deal with actual VLAs, I'll need some other solution to allow outlining calls to os_log.

Mon, Oct 29, 1:47 PM
vsk created D53824: [HotColdSplitting] Allow outlining single-block cold regions.
Mon, Oct 29, 11:53 AM
vsk added a comment to D53729: [llvm-cov] Don't remap existing paths.

Maybe we could just add 'REQUIRES: build-dir-not-slash-tmp' or something to the affected lit tests.

It would be unfortunate to disable tests just because of this issue. I understand your point, though, and remapping always certainly has simpler semantics.

How difficult is it to add new REQUIRES clauses? I've never done it before.

Mon, Oct 29, 9:24 AM

Fri, Oct 26

vsk added a comment to D53779: [CodeExtractor] Allow extracting allocas within simple stack{save,restore} pairs.

__builtin_os_log_format_buffer_size is a compiler builtin which always returns a constant integer.

Fri, Oct 26, 5:19 PM
vsk updated the diff for D53779: [CodeExtractor] Allow extracting allocas within simple stack{save,restore} pairs.
  • Fix the issues Eli pointed out.
Fri, Oct 26, 4:12 PM
vsk created D53779: [CodeExtractor] Allow extracting allocas within simple stack{save,restore} pairs.
Fri, Oct 26, 3:00 PM
vsk added a comment to D53593: [GCOV] Flush counters before to avoid counting the execution before fork twice and for exec** functions we must flush before the call.

Don't you just want to reset the counters in the child process?

Fri, Oct 26, 2:39 PM
vsk added a comment to D53729: [llvm-cov] Don't remap existing paths.

Even if there's a file at the path we're supposed to remap, shouldn't we still do the remap? Couldn't the file we're intended to read be at the remapped path?

Fri, Oct 26, 9:59 AM

Thu, Oct 25

vsk added a comment to D53740: [globalisel][irtranslator] Verify that DILocations aren't lost in translation.

The verifier bits look great, I'll let someone else comment on the functionality changes. Thanks!

Thu, Oct 25, 4:48 PM
vsk added a comment to D53731: [NativePDB] Add the ability to display global variables.

Could you also use Vedant's new FileCheck dotest test class? That should allow you to write the tests exactly as you are, but use the dotest mechanism to build and run the example.

Adding Vedant here. My understanding is that the work he did there is like the inverse of what I'm doing. It allows you to use FileCheck from inside of dotest tests, but I was trying to bypass dotest here. I don't necessarily think "dotest for all things" should be an explicit goal (i actually think long term we should move away from it, but that's for another day). Aside from that though, I don't think this particular test needs any of the functionality that dotest offers. It offers building in multiple configurations, but we can get that from this test by just specifying mulitple run lines (I'm testing this out locally and it seems like I can get it to work). Eventually, using some kind of configuration / builder type system like I brainstormed in the review thread I linked previously, I think we could have all the advantages of dotest's builder even with this style of test.

FWIW, I was also under the impression that Vedant's solution with FileCheck in dotest was only intended as sort of an immediate solution to a problem, but not necessary the long term desired end-state. (I could be wrong about this though).

Thu, Oct 25, 4:19 PM
vsk added a comment to D53588: [hot-cold-split] split more than a cold region per function.

@sebpop are you interested in rebasing this on the new cold block propagation code? If not, I'd be happy to give it a try. Is the main challenge in teaching CodeExtractor to update the DT and PDT?

Thu, Oct 25, 3:45 PM
vsk resigned from D51974: [GCOV] Handle correctly multiple CUs when profiling.

Thank you for working on this. I don't have any experience or familiarity with gcov to give this a proper review. @uweigand or @sylvestre.ledru may be able to give a better review.

Thu, Oct 25, 10:06 AM
vsk added a comment to D52034: [Clang] Add options -fprofile-filter-files and -fprofile-exclude-files to filter the files to instrument with gcov.
Thu, Oct 25, 10:01 AM
vsk added a comment to D52034: [Clang] Add options -fprofile-filter-files and -fprofile-exclude-files to filter the files to instrument with gcov.

@vsk, gcc guys are ok for -fprofile-filter-files and -fprofile-exclude-files, are you ok with that ?

Thu, Oct 25, 10:00 AM

Wed, Oct 24

vsk updated the diff for D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.
Wed, Oct 24, 1:04 PM
vsk updated the diff for D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.
  • Disable a problematic CodeExtractor unit test. CodeExtractor miscompiles the code in the test. Any time there are live output values from an outlined region, there's a chance CodeExtractor will leave around a phi node with incoming values from a different function.
Wed, Oct 24, 12:33 PM
Herald updated subscribers of D36109: [CodeGen] Provide an advanced shrink-wrapping interface.
Wed, Oct 24, 12:26 PM
vsk updated the diff for D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.

Fixed a bug in CodeExtractor found by building the test suite with an asserts-enabled compiler.

Wed, Oct 24, 12:02 PM
vsk planned changes to D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.

I've been fuzzing+testing this patch further and found a verification failure in CodeExtractor. As we now support outlining regions with multiple exits, CodeExtractor cannot leave around duplicate incoming values from the codeRepl block. I think this should be simple to fix.

Wed, Oct 24, 11:21 AM
vsk added a comment to D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.

Thanks for the review.

Wed, Oct 24, 10:44 AM
vsk added a comment to D53534: [hot-cold-split] Name split functions with ".cold" suffix.

Does the pass groups hot code together? is it in your plans?

@sebpop, @vsk or @hiraditya can probably answer this. I'm not working on the core hot cold splitting algorithm, just some issues I'm finding as I try it. But presumably the linker should be grouping the split hot functions together and the split cold functions together, when building with -ffunction-sections and PGO which should put them in the .text.hot and .text.unlikely sections, respectively.

Wed, Oct 24, 9:46 AM
vsk accepted D53287: [DebugInfo][Dexter] Unreachable line stepped onto after SimplifyCFG..

LGTM, thanks. I'm sorry for the delay in reviewing this.

Wed, Oct 24, 8:04 AM · debug-info

Tue, Oct 23

vsk updated the diff for D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.
  • Before this patch, 47KB of text was outlined in a test-suite run (X86, -Os, LNT + externals). After this patch, 134KB was outlined, so roughly a ~3x improvement.
  • Fix a typo in a comment.
Tue, Oct 23, 6:15 PM
vsk created D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.
Tue, Oct 23, 5:44 PM
vsk accepted D53611: [hot-cold-split] Only perform splitting in ThinLTO backend post-link.

Thank you! For completeness, it might help to have a test for the case 'Phase = PostLink'. Happy to leave it to you to decide.

Tue, Oct 23, 2:15 PM
vsk added a comment to D53437: Schedule Hot Cold Splitting pass after most optimization passes.
In D53437#1271395, @vsk wrote:

Thanks for your notes @tejohnson.

Note that this addresses in the old PM one issue I found when enabling splitting with ThinLTO, but exposes some related issues. In the old location, which was before the early return when PrepareForThinLTO==true around line 590, we were doing two rounds of splitting: one during the compile step, and a second later in the ThinLTO backends. I saw that we split an already split function (see below for more info). But I notice that there are still a couple issues:

  1. In the old PM, we will still do splitting early (during the compile step) for regular LTO, which doesn't return early from this routine. You should presumably guard the adding of the hot cold split pass by "if (!PrepareForLTO)"
  2. Once you've done 1), regular LTO will no longer perform any splitting, since its backend doesn't invoke populateModulePassManager. You can fix this by adding your hot cold splitting pass to an equivalent location in addLTOOptimizationPasses(), which is invoked only in the regular LTO backend.
  3. In the new PM, the move of the hot cold splitting pass does not fix the issue for ThinLTO. The reason is that there is no early exit from buildModuleSimplificationPipeline. To fix, you can guard the hot cold split pass by "Phase != ThinLTOPhase::PreLink".
  4. In the new PM, regular LTO invokes buildModuleSimplificationPipeline in the compile step, but without any extra info to indicate that it is an LTO compile. So something else will need to be done to suppress the hot cold splitting from happening early for regular LTO (e.g. pass down a new flag?).
  5. Once 4) is done, regular LTO will no longer perform any splitting in the new PM since its backend doesn't invoke buildModuleSimplificationPipeline. To fix, you will want to add the hot cold split pass to an appropriate place in buildLTODefaultPipeline().

    This all assumes that you want to only do splitting in the *LTO backends (i.e. after cross module inlining). What are the tradeoffs of doing the splitting before/after inlining?

One advantage of doing splitting before inlining is, as you pointed out, that more inlining opportunities are created because hot functions become smaller. The downside is that outlining may be a barrier for intra-function optimizations which benefit from seeing both sides of a branch (hoisting/sinking, const prop, maybe jump threading?).

I think the right thing to do is to conservatively avoid introducing barriers by delaying splitting as much as possible. I'll work on gathering some pre/post-patch performance numbers on arm64.

One advantage of doing the splitting earlier for ThinLTO, in the compile step, is that it would reduce the size of the functions during the import analysis, and we might import more hot (split) functions. But then the splitting would be happening before the cross-module inlining in the backends.

Regarding what I am currently seeing with two rounds of splitting with ThinLTO: I noticed that the second round of splitting in the ThinLTO backends was resplitting an already split function - specifically, the cold outlined function was getting split again, which doesn't make a lot of sense since presumably the edge weights were all cold. Note this is after r344558 (the fix for the issue of outlining the whole function). Why would that happen?

Tue, Oct 23, 11:56 AM
vsk added a comment to D53596: [ThinLTO] Fix a crash in lazy loading of Metadata.

Thanks for the patch! Your explanation makes sense to me, but I'm not familiar enough with this code to give a +1. @steven_wu, any thoughts on this?

Tue, Oct 23, 11:52 AM
vsk added a comment to D53588: [hot-cold-split] split more than a cold region per function.
In D53588#1272789, @vsk wrote:

@sebpop thanks for this patch! I don't see any problems with it (although I would prefer that the test explicitly check that outlined functions contain the correct instructions).

At a higher-level, I'm seeing some problems with the forward/back cold propagation done in getHotBlocks on internal projects. The propagation seems to stop when it encounters simple control flow, like an if-then-else or a for loop, after which cold code is unconditionally executed.

I have a prototype of a different propagation scheme which overcomes some of these limitations. The idea is to mark blocks which are post-dominated by a cold block, or are dominated by a cold block, as cold. This is able to handle the control flow I described, and isn't limited to requiring a single exit block. Could you give me a day to evaluate it further, run benchmarks etc. and report back? If it turns out to be promising, istm that it'd make sense to rebase this patch on top of it.

I was discussing with Aditya over the llvm dev meeting that it would be good to move the static analysis of hot/cold blocks to an analysis pass instead of carrying it in the hot/cold split pass.
That way we will have a smaller transform pass and other passes could use the static analysis of hot/cold blocks.

Tue, Oct 23, 11:22 AM
vsk added a comment to D53588: [hot-cold-split] split more than a cold region per function.
In D53588#1272789, @vsk wrote:

@sebpop thanks for this patch! I don't see any problems with it (although I would prefer that the test explicitly check that outlined functions contain the correct instructions).

At a higher-level, I'm seeing some problems with the forward/back cold propagation done in getHotBlocks on internal projects. The propagation seems to stop when it encounters simple control flow, like an if-then-else or a for loop, after which cold code is unconditionally executed.

Tue, Oct 23, 11:22 AM
vsk added a comment to D53588: [hot-cold-split] split more than a cold region per function.

@sebpop thanks for this patch! I don't see any problems with it (although I would prefer that the test explicitly check that outlined functions contain the correct instructions).

Tue, Oct 23, 10:39 AM

Mon, Oct 22

vsk added a comment to D53437: Schedule Hot Cold Splitting pass after most optimization passes.

Thanks for your notes @tejohnson.

Mon, Oct 22, 1:58 PM
vsk created D53518: [HotColdSplitting] Attach MinSize to outlined code.
Mon, Oct 22, 11:39 AM
vsk added a comment to D53437: Schedule Hot Cold Splitting pass after most optimization passes.

It looks like the test is failing on some of the Arm buildbots. In particular some of them don't have the X86 backend: http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/4611 so it looks like it is most likely missing a REQUIRES: x86. Can you take a look?

Mon, Oct 22, 9:52 AM
vsk accepted D53459: Ensure sanitizer check function calls have a !dbg location.

Thanks!

Mon, Oct 22, 9:11 AM

Sat, Oct 20

vsk updated subscribers of D53243: [DeadArgElim][ArgumentPromotion] Preserve MD_Associated metadata when re-creating function.

Why not use RAUW to fix up any metadata which refers to a function?

Sat, Oct 20, 3:31 PM
vsk accepted D53437: Schedule Hot Cold Splitting pass after most optimization passes.

Looks great, thanks!

Sat, Oct 20, 11:11 AM

Fri, Oct 19

vsk created D53469: [DWARF] Use a function-local offset for AT_call_return_pc.
Fri, Oct 19, 6:56 PM
vsk added inline comments to D53459: Ensure sanitizer check function calls have a !dbg location.
Fri, Oct 19, 5:21 PM
vsk added inline comments to D53459: Ensure sanitizer check function calls have a !dbg location.
Fri, Oct 19, 5:12 PM
vsk added a comment to D53437: Schedule Hot Cold Splitting pass after most optimization passes.

Could you add a test for this similar to 'test/Other/opt-Os-pipeline.ll'?

Fri, Oct 19, 8:53 AM
vsk accepted D53415: [lldbsuite, windows] Disable two tail call frames tests that fail on Windows.

Thanks, lgtm.

Fri, Oct 19, 8:49 AM

Mon, Oct 15

vsk added a comment to D41474: Fix a crash in lazy loading of Metadata in ThinLTO.

Hi @tejohnson @Sunil_Srivastava, just a heads-up that with https://reviews.llvm.org/rL344545 applied, I'm hitting "Invalid function metadata: outgoing forward refs" in an internal project with ThinLTO and hot/cold splitting enabled. There's some sort of problem caused by having an outlined function with debug locations in it but no DISubprogram. I'll try to dig a bit deeper.

Mon, Oct 15, 1:50 PM
Herald updated subscribers of D41474: Fix a crash in lazy loading of Metadata in ThinLTO.
Mon, Oct 15, 1:39 PM
vsk added a comment to D53267: [CodeExtractor] Erase debug intrinsics in outlined thunks.
In D53267#1265681, @vsk wrote:
In D53267#1265669, @vsk wrote:

Does this patch fix https://bugs.llvm.org/show_bug.cgi?id=22900
In which case you may want to add the testcases from that bug.

It does. I'm not sure that there's extra value in including the attached test case -- it uses an old textual IR format that's hard to update, and reduces to something similar to the test in this patch.

Fwiw I've tested this patch by building several internal frameworks and running Csmith over the weekend. I found no regressions. With Csmith, the testing method was to run -Os -g -mllvm -hot-cold-split={true, false} and check that the two CRCs were the same.

Thanks for the fix. I didn't see this patch before I updated that bug. This should fix my issue, which related to some llvm.dbg.value intrinsics in outlined code that weren't updated properly.

I should add that in the one case I looked at, the first argument to the outlined llvm.dbg.value was not updated correctly, which seems a little different than the failure mode here. So there might be multiple issues with these outlined intrinsics that need fixing. Not sure if you want to update the comment.

Hm, I think that's the same failure mode I saw. A dbg.value from the original function was moved to the outlined function, but its ValueAsMetadata operand (the "function-local metadata") wasn't updated to point to an Argument in the outlined function.

Yeah, sounds like it.

The error I'm referring to comes from Verifier::visitValueAsMetadata.

Yep, I saw that one when building with hot cold splitting and -g. Originally the failure I saw was different, because it manifests differently when building with ThinLTO, where essentially the bitcode gets corrupted because the operand types don't match.

Mon, Oct 15, 11:42 AM
vsk added a comment to D53267: [CodeExtractor] Erase debug intrinsics in outlined thunks.
In D53267#1265669, @vsk wrote:

Does this patch fix https://bugs.llvm.org/show_bug.cgi?id=22900
In which case you may want to add the testcases from that bug.

It does. I'm not sure that there's extra value in including the attached test case -- it uses an old textual IR format that's hard to update, and reduces to something similar to the test in this patch.

Fwiw I've tested this patch by building several internal frameworks and running Csmith over the weekend. I found no regressions. With Csmith, the testing method was to run -Os -g -mllvm -hot-cold-split={true, false} and check that the two CRCs were the same.

Thanks for the fix. I didn't see this patch before I updated that bug. This should fix my issue, which related to some llvm.dbg.value intrinsics in outlined code that weren't updated properly.

I should add that in the one case I looked at, the first argument to the outlined llvm.dbg.value was not updated correctly, which seems a little different than the failure mode here. So there might be multiple issues with these outlined intrinsics that need fixing. Not sure if you want to update the comment.

Mon, Oct 15, 11:24 AM
vsk added a comment to D53267: [CodeExtractor] Erase debug intrinsics in outlined thunks.

Does this patch fix https://bugs.llvm.org/show_bug.cgi?id=22900
In which case you may want to add the testcases from that bug.

Mon, Oct 15, 11:15 AM
vsk added a comment to D53267: [CodeExtractor] Erase debug intrinsics in outlined thunks.

What kind of DISubprogram do we assign to the outlined function now?

Mon, Oct 15, 9:16 AM

Oct 14 2018

vsk created D53267: [CodeExtractor] Erase debug intrinsics in outlined thunks.
Oct 14 2018, 11:20 PM
vsk accepted D53244: [Coverage] Fix PR39258: support coverage regions that start deeper than they end.

Thanks, LGTM.

Oct 14 2018, 6:17 PM