Page MenuHomePhabricator

vsk (Vedant Kumar)
User

Projects

User Details

User Since
Jul 8 2015, 10:26 AM (309 w, 3 d)

Recent Activity

Fri, Jun 11

vsk added a comment to D103588: [trace] Create a top level ThreadTrace class.

I think this is moving in the right direction! However, the way ThreadTrace is specified still raises the same scalability concerns (see inline). The "instruction index" concept doesn't seem sufficient. Decoding needs to be done in two stages: a separate "trace index" concept would help with that. In the first stage, lldb identifies which subset of the trace it needs decoded by specifying start/stop trace indices. This should be fast, as counting the number of available trace bytes/"chunks" is O(1). Nothing needs to be decoded at this point. The second stage starts after a trace subset is decoded, and is where instruction indices are useful.

Fri, Jun 11, 2:30 PM · Restricted Project

Thu, Jun 10

vsk added a comment to D104060: Machine IR Profile.

Thanks for sharing this! Specifically re: the -fprofile-generate v. MIP comparison for code coverage, perhaps it would be more fair to compare against -finstrument-function-entry-bare. The latter only adds one call per function, whereas -fprofile-generate can add instrument a function in more than one place.

Thu, Jun 10, 4:42 PM · Restricted Project
vsk added a comment to D25456: [InstrProf] Add support for dead_strip+live_support functionality.
In D25456#2794689, @vsk wrote:

If llvm stopped appending __profd_foo to llvm.compiler.used, there might not be anything left to prevent the optimizer from discarding it. The current scheme of having the linker ignore N_NO_DEAD_STRIP if S_ATTR_LIVE_SUPPORT is in effect seems like a neat way to solve the issue. Is there a tidy alternative?

Currently __profd_foo is in llvm.used instead of llvm.compiler.used on Mach-O.

D103372 My thought (but I don't have a device to test) is that Mach-O can use appendToCompilerUsed(*M, CompilerUsedVars); as well.

Thu, Jun 10, 2:25 PM · Restricted Project
vsk added a comment to D96854: [CodeExtractor] Enable partial aggregate arguments.

@ggeorgakoudis apologies again for the delay here.

Thu, Jun 10, 2:21 PM · Restricted Project

Wed, Jun 9

vsk added a comment to D103588: [trace] Create a top level ThreadTrace class.

I've been thinking about what you said and I'm having second thoughts on my implementation. I'll share more context:

  • I want to work in the short term on reverse debugging and reconstruction of stack traces, for that i'll need to know the instruction type of each instruction in the trace, which will be used as part of some heuristics to identify calls and returns between functions.
  • A future application that we plan to work on is adding profiling information to the instructions
  • Right now the intel-pt plugin is storing the decoded instructions in a vector, which works for small traces but wont' for gigantic traces. I imagine that I'll end up removing that vector and make the TraverseInstruction API decode instructions in place keeping one instruction in memory at a time within the intel pt plugin for a given traversal. For that I'll need accessors that can provide information of the current Instruction. As there could be two or more concurrent traversals happening at the same time (I'm trying to be generic here), it might make sense to create an abstract class TraceInstruction that can be extended by each plug-in and implement its getters.

I'm thinking about something like this

class TraceInstruction {
  virtual lldb::addr_t GetLoadAddress() = 0;
  virtual TraceInstructionType() GetInstructionType() = 0;
  virtual uint64_t GetTimestamp() = 0;
  ... anything that can appear in the future 
};

and have no members, leaving to each plug-in the decision of which of those methods to implement and how.

What do you think of this? I think this incorporates your feedback.

Wed, Jun 9, 3:34 PM · Restricted Project

Tue, Jun 8

vsk added a comment to D103588: [trace] Create a top level ThreadTrace class.

This approach - of prototyping trace analyses on a vector<TraceInstruction> representation and then rewriting them later when scaling issues arise - doesn't sound good to me. Even for something simple, like finding a backtrace, the window of instructions needed for analysis can easily grow to a point where the full vector<TraceInstruction> can't be maintained in memory.

It's worth noticing that the vector<TraceInstruction> is internal to the Intel PT plugin. It's not exposed at the Trace.h interface level, so any consumers can't make any assumptions on how the data is stored. We will improve the intel pt plugin itself as we make progress on this, but we are ensuring that the interfaces are correct for the future.

Tue, Jun 8, 5:07 PM · Restricted Project
vsk added a comment to D103588: [trace] Create a top level ThreadTrace class.

Right now we haven't implemented lazy decoding; we are decoding the entire trace. But that's just because so far we have been working with small traces. As we progress in this work, we'll start working with larger traces and we'll have to do implement the lazy decoding, for which the TraverseInstructions API will come handy.

Tue, Jun 8, 2:52 PM · Restricted Project
vsk requested changes to D103588: [trace] Create a top level ThreadTrace class.

I'm quite concerned about the design decision to represent a trace as a vector of TraceInstruction. This bakes considerations that appear to be specific to Facebook's usage of IntelPT way too deep into lldb's APIs, and isn't workable for our use cases.

Tue, Jun 8, 11:09 AM · Restricted Project
vsk requested changes to D103500: [trace][intel-pt] Create basic SB API.

Thanks, excited to see this work progress!

Tue, Jun 8, 10:38 AM · Restricted Project

Wed, Jun 2

vsk added a comment to D25456: [InstrProf] Add support for dead_strip+live_support functionality.

If llvm stopped appending __profd_foo to llvm.compiler.used, there might not be anything left to prevent the optimizer from discarding it. The current scheme of having the linker ignore N_NO_DEAD_STRIP if S_ATTR_LIVE_SUPPORT is in effect seems like a neat way to solve the issue. Is there a tidy alternative?

Wed, Jun 2, 2:24 PM · Restricted Project
vsk added a comment to D103372: [InstrProfiling] If no value profiling, make data variable private and (for Windows) use one comdat.

While I haven't spotted any issues with this patch, I feel it'd be better for someone with more familiarity with COFF linking to offer a second +1.

Wed, Jun 2, 2:11 PM · Restricted Project, Restricted Project, Restricted Project
vsk added a comment to D103355: [InstrProfiling] Delete linkage/visibility toggling for Windows.

Thanks for adding these excellent tests. Lgtm as well.

Wed, Jun 2, 2:01 PM · Restricted Project, Restricted Project
vsk added a comment to D103220: Support stripping indirectly referenced DILocations from !llvm.loop metadata in stripDebugInfo().

Thanks Adrian. + 1 from me as well, FTR.

Wed, Jun 2, 7:53 AM · Restricted Project

Tue, May 25

vsk added a comment to D102428: [StopInfoMachException] Summarize arm64e BLRAx/LDRAx auth failures.

Friendly ping.

Tue, May 25, 11:12 AM · Restricted Project

Mon, May 24

vsk accepted D103020: [lldb] Add missing mutex guards to TargetList::CreateTarget.

Thanks, lgtm.

Mon, May 24, 10:10 AM · Restricted Project

Thu, May 20

vsk added a comment to D20401: [Lexer] Don't merge macro args from different macro files.

I know this was sped up slightly in 3339c568c43e4644f02289e5edfc78c860f19c9f, but this change makes updateConsecutiveMacroArgTokens the hottest function in clang in a bottom up profile of an entire build of the Linux kernel. It thrashes the one entry LastFileIDLookup cache, and we end up looking up the same FileID again and again and again for each token when we expand nested function like macros.

Is there anything we can do to speed this up? Is there any way to record which FileID corresponds to a given Token so that we don't have to keep rematerializing that? Is it possible to find whether two SourceLocations correspond to the same FileID without having to figure out which FileID in particular each belongs to?

Thu, May 20, 10:19 AM
vsk accepted D102818: [PGO] Don't reference functions unless value profiling is enabled.

Thanks, lgtm as well. On Darwin, the __llvm_prf_data section is marked with S_ATTR_LIVE_SUPPORT to allow the linker to dead strip functions even if they are pointed-to by a profd global. Removing the reference altogether should yield even better size benefits.

Thu, May 20, 9:53 AM · Restricted Project, Restricted Project, Restricted Project

Wed, May 19

vsk accepted D101780: [CoverageMapping] Handle gaps in counter IDs for source-based coverage.

Thanks, lgtm.

Wed, May 19, 9:56 AM · Restricted Project
vsk committed rG7014a1016143: [profile] Skip mmap() if there are no counters (authored by vsk).
[profile] Skip mmap() if there are no counters
Wed, May 19, 9:32 AM
vsk closed D102735: [profile] Skip mmap() if there are no counters.
Wed, May 19, 9:32 AM · Restricted Project

Tue, May 18

vsk requested review of D102735: [profile] Skip mmap() if there are no counters.
Tue, May 18, 4:37 PM · Restricted Project

Mon, May 17

vsk added inline comments to D102428: [StopInfoMachException] Summarize arm64e BLRAx/LDRAx auth failures.
Mon, May 17, 10:54 AM · Restricted Project
vsk updated the diff for D102428: [StopInfoMachException] Summarize arm64e BLRAx/LDRAx auth failures.

Address review feedback.

Mon, May 17, 10:54 AM · Restricted Project
vsk added inline comments to D102624: [lldb] Optimize expressions.
Mon, May 17, 8:53 AM · Restricted Project

Fri, May 14

vsk added a comment to D102428: [StopInfoMachException] Summarize arm64e BLRAx/LDRAx auth failures.

Thanks for the review!

Fri, May 14, 11:37 AM · Restricted Project
vsk updated the diff for D102428: [StopInfoMachException] Summarize arm64e BLRAx/LDRAx auth failures.

Address code review feedback.

Fri, May 14, 11:37 AM · Restricted Project
vsk accepted D102448: Remove (unneeded) '-asan-use-after-return' from hoist-argument-init-insts.ll..

Great - I just wanted to understand the motivation. Thanks for the cleanup!

Fri, May 14, 10:53 AM · Restricted Project

May 13 2021

vsk added a comment to D102448: Remove (unneeded) '-asan-use-after-return' from hoist-argument-init-insts.ll..

Is there a planned change to the use-after-return instrumentation that would regress this test as-is?

May 13 2021, 5:35 PM · Restricted Project
vsk requested review of D102428: [StopInfoMachException] Summarize arm64e BLRAx/LDRAx auth failures.
May 13 2021, 11:05 AM · Restricted Project

May 12 2021

vsk added inline comments to D101780: [CoverageMapping] Handle gaps in counter IDs for source-based coverage.
May 12 2021, 9:23 AM · Restricted Project

May 11 2021

vsk updated subscribers of D101780: [CoverageMapping] Handle gaps in counter IDs for source-based coverage.

Just so I'm clear, are you saying that every Counter and Expression generated is guaranteed to be added to the coverage map (so there are no index gaps), even if the counter is part of an unreachable (and therefore eliminated) branch/block?

May 11 2021, 4:45 PM · Restricted Project
vsk added a comment to D101780: [CoverageMapping] Handle gaps in counter IDs for source-based coverage.

Thanks for the context. I haven't looked at the Rust FE's coverage implementation, so I'm not familiar with any tradeoffs inherent to the implementation approach. One thing you wrote does concern me:

May 11 2021, 9:26 AM · Restricted Project

May 7 2021

vsk added a comment to D101748: Add inclusion filename filters to llvm-cov.

Please take a look at this patch by Michael Daniels that implements a similar feature:
bfed824b57d14e2ba98ddbaf1a1410cf04a3e279

May 7 2021, 10:19 AM · Restricted Project
vsk added a comment to D101780: [CoverageMapping] Handle gaps in counter IDs for source-based coverage.

Thanks for digging into this. Could you explain why it's necessary to use more coverage counters than there are regions? If there's no longer a reserved counter (counter #0), then are counters N >= Regions.size() somehow redundant (and if so, are they worth deduplicating in the Rust FE)?

May 7 2021, 10:16 AM · Restricted Project

Apr 29 2021

vsk added a comment to D101505: [Driver][sanitizers] Warn about ignored -fsanitize-trap or -fsanitize-recover.

Since this is a fairly impactful driver change, I'd suggest sending a heads-up about it to llvm-dev and adding a release note for extra visibility.

Apr 29 2021, 10:14 AM

Apr 26 2021

vsk committed rZORG2c899820e1ec: [jenkins] Add groovy script for the llvm-cov job (authored by vsk).
[jenkins] Add groovy script for the llvm-cov job
Apr 26 2021, 11:00 AM

Apr 22 2021

vsk resigned from D100446: [Transforms][Debugify] Fix "Missing line" false alarm on PHI nodes.

Context: In D59417 (see also llvm.org/PR37964), there was some debate about whether it's worthwhile to assign a line 0 location to a phi instruction which would otherwise not have a location. The consensus seemed to be: no, it's not worth it. So D75242 was spun off to suppress the false positive in debugify. It looks like D75242 had the unexpected/unfortunate side-effect of introducing a new FP: any unique, preserved DebugLoc attached to a phi instruction because "missing".

Apr 22 2021, 1:57 PM · Restricted Project
vsk accepted D100232: [Coverage] Support overriding compilation directory.

Thanks, lgtm.

Apr 22 2021, 12:13 PM · Restricted Project
vsk accepted D100535: [llvm-cov] Support for v4 format in convert-for-testing.

Sgtm, thanks.

Apr 22 2021, 12:02 PM · Restricted Project
vsk added a comment to D96854: [CodeExtractor] Enable partial aggregate arguments.
In D96854#2695426, @vsk wrote:

for example it makes possible to extract the same region with different exclusions without creating another CodeExtractor instance,

As extractCodeRegion mutates the original function, I assumed it was not possible to reuse a CodeExtractor instance in this way. Is there an in-tree example of CE instance reuse I can take a look at?

Unfortunately, there isn't an example so far. The CodeExtractor instance can be re-used because the analysis of CE stays the same. The exclusion from aggregates affects only the extraction of the outlined function when calling extractCodeRegion.

Apr 22 2021, 11:46 AM · Restricted Project
vsk added a comment to D100535: [llvm-cov] Support for v4 format in convert-for-testing.

@phosek I apologize for the delay in reviewing this.

Apr 22 2021, 11:25 AM · Restricted Project
vsk accepted D101000: Coverage: Document how to collect a profile without a filesystem.

This looks great, thanks.

Apr 22 2021, 10:27 AM · Restricted Project

Apr 21 2021

vsk updated subscribers of D101000: Coverage: Document how to collect a profile without a filesystem.

Thanks for doing this!

Apr 21 2021, 3:33 PM · Restricted Project

Apr 20 2021

vsk accepted D100886: Make sure PHIElimination doesn't copy debug locations across basic blocks..

Thanks, lgtm!

Apr 20 2021, 2:37 PM · Restricted Project, debug-info

Apr 16 2021

vsk added inline comments to rG8359511c62b7: [CodeExtractor] Remove stale llvm.assume calls from extracted region.
Apr 16 2021, 11:44 AM
vsk added a comment to D96854: [CodeExtractor] Enable partial aggregate arguments.

for example it makes possible to extract the same region with different exclusions without creating another CodeExtractor instance,

Apr 16 2021, 11:30 AM · Restricted Project

Apr 9 2021

vsk accepted D100212: [lldb] Delete dead StackFrameList::Merge.

+1

Apr 9 2021, 11:05 AM · Restricted Project

Apr 8 2021

vsk added a comment to D100137: [JumpThreading] merge debug info when merging select+br.

If you'd like, you can pass -debugify-level=locations to elide those dbg.value calls.

Apr 8 2021, 2:03 PM · debug-info, Restricted Project

Apr 1 2021

vsk added a comment to D99695: [llvm-cov] Use -path-equivalence to support relative path..

I think then you would somehow need multiple -compilation-dir options if e.g. you have paths with different levels of ../ nesting, e.g. ../../foo.c and ../../../../bar.c. That sounds like it could become hard to manage.

Apr 1 2021, 11:11 AM · Restricted Project
vsk added a comment to D99695: [llvm-cov] Use -path-equivalence to support relative path..

This patch let llvm-cov construct output directory with the to part of -path-equivalence=from,to as prefix

Apr 1 2021, 10:18 AM · Restricted Project
vsk committed rG7d15fb577945: [lldb/test] Respect --apple-sdk path when querying SDK info (authored by vsk).
[lldb/test] Respect --apple-sdk path when querying SDK info
Apr 1 2021, 10:15 AM
vsk closed D99746: [lldb/test] Respect --apple-sdk path when querying SDK info.
Apr 1 2021, 10:15 AM · Restricted Project
vsk added inline comments to D99746: [lldb/test] Respect --apple-sdk path when querying SDK info.
Apr 1 2021, 10:11 AM · Restricted Project
vsk requested review of D99746: [lldb/test] Respect --apple-sdk path when querying SDK info.
Apr 1 2021, 10:06 AM · Restricted Project
vsk added a comment to D40477: Enable Partial Inlining by default.

I was referring to CodeExtractor. The concerns I had about using it were addressed in D53267, D72801, D72795, and some other follow ups.

Apr 1 2021, 9:47 AM

Mar 29 2021

vsk accepted D99457: [llvm-profdata] Make sure to consume Error on the error path of setIsIRLevelProfile.

Thanks, lgtm!

Mar 29 2021, 4:09 PM · Restricted Project
vsk added a comment to D99457: [llvm-profdata] Make sure to consume Error on the error path of setIsIRLevelProfile.

Please add a test. I think I've reproduced the bug with this:

Mar 29 2021, 9:31 AM · Restricted Project

Mar 25 2021

vsk committed rG414412d3dcbc: [lldb/Commands] Fix spelling of target.move-to-nearest-code in helptext (authored by vsk).
[lldb/Commands] Fix spelling of target.move-to-nearest-code in helptext
Mar 25 2021, 2:25 PM

Mar 23 2021

vsk committed rG772e1dd1ddc0: [Coverage] Load records immediately (authored by tunz).
[Coverage] Load records immediately
Mar 23 2021, 4:26 PM
vsk closed D99110: [Coverage] Load records immediately.
Mar 23 2021, 4:25 PM · Restricted Project
vsk accepted D99110: [Coverage] Load records immediately.

Lgtm, thanks.

Mar 23 2021, 11:43 AM · Restricted Project
vsk added a comment to D96854: [CodeExtractor] Enable partial aggregate arguments.

Thanks for explaining. I'd suggest making ExcludedAggArgs part of a CodeExtractor instances internal state: e.g. the client may call CE.addArgExludedFromAggregate(Value *) some number of times before CE.extractCodeRegion(). This way, the client doesn't need to maintain a SetVector, and the rest of the interface isn't polluted with an option that's specific to the AggregateArg case.

Mar 23 2021, 11:13 AM · Restricted Project
vsk added a comment to D99110: [Coverage] Load records immediately.

Thanks for bearing with me :). The memory profile is really helpful.

Mar 23 2021, 10:48 AM · Restricted Project

Mar 22 2021

vsk added a comment to D99110: [Coverage] Load records immediately.

This should prevent the MemoryBuffer API from allocating a buffer for the object file, just to add a '\0' at the end. If the OS can just mmap() the buffer readonly, that should be essentially free.

Yes, I have tried disabling the RequiresNullTerminator bit, but it didn't help. Actually, the readonly buffer itself was fine. There are some places reading and storing data from them.

Mar 22 2021, 3:54 PM · Restricted Project
vsk added inline comments to D99110: [Coverage] Load records immediately.
Mar 22 2021, 2:36 PM · Restricted Project
vsk added a comment to D99110: [Coverage] Load records immediately.

Have you determined why opening a readonly binary costs memory?

Mar 22 2021, 2:32 PM · Restricted Project
vsk added a comment to D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.

Sorry about that. The file was removed in 8248dd91d7f042893d4a605e98d19cb1b89a44d4.

Mar 22 2021, 9:05 AM · Restricted Project

Mar 19 2021

vsk committed rG4bd2bfb6ec09: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY (reapply) (authored by vsk).
[lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY (reapply)
Mar 19 2021, 4:06 PM
vsk updated the diff for D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.

Add asserts checking that the library paths exist, and restrict the test to Darwin. See https://bugs.llvm.org/show_bug.cgi?id=49656

Mar 19 2021, 4:05 PM · Restricted Project
vsk added a reverting change for rGcb8c1ee269da: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY: rGd8d5ef2e9d84: Revert "[lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY".
Mar 19 2021, 3:27 PM
vsk committed rGd8d5ef2e9d84: Revert "[lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY" (authored by vsk).
Revert "[lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY"
Mar 19 2021, 3:27 PM
vsk added a reverting change for D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY: rGd8d5ef2e9d84: Revert "[lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY".
Mar 19 2021, 3:26 PM · Restricted Project
vsk committed rGcb8c1ee269da: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY (authored by vsk).
[lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY
Mar 19 2021, 3:14 PM
vsk closed D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.
Mar 19 2021, 3:14 PM · Restricted Project
vsk updated the diff for D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.
  • Delete an unused '#undef DLOPEN_OPTIONS'
Mar 19 2021, 2:08 PM · Restricted Project
vsk updated the diff for D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.
  • Define RTLD_LAZY in the expression as suggested
Mar 19 2021, 2:06 PM · Restricted Project
vsk added a reviewer for D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY: kastiglione.
Mar 19 2021, 8:19 AM · Restricted Project

Mar 18 2021

vsk updated the diff for D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.
  • Make the test generic (not Darwin-specific)
Mar 18 2021, 2:53 PM · Restricted Project
vsk updated the diff for D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.
  • Add a test. It's Darwin-specific as I couldn't work out how to pass the .dylib path to the linker in a platform-agnostic way.
  • Trim the comment.
Mar 18 2021, 2:47 PM · Restricted Project
vsk added a comment to D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.

The best way I can think of to test this is to:

Mar 18 2021, 2:02 PM · Restricted Project
vsk planned changes to D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.

This needs a test.

Mar 18 2021, 11:15 AM · Restricted Project
vsk requested review of D98879: [lldb/PlatformPOSIX] Change LoadImage default to RTLD_LAZY.
Mar 18 2021, 11:13 AM · Restricted Project

Mar 16 2021

vsk updated subscribers of D98668: Support !heapallocsite attachments in StripDebugInfo(). .
Mar 16 2021, 12:38 PM · Restricted Project, debug-info
vsk added inline comments to D98668: Support !heapallocsite attachments in StripDebugInfo(). .
Mar 16 2021, 9:59 AM · Restricted Project, debug-info

Mar 10 2021

vsk committed rGac29c35207a5: [lldb/Platform] Skip very slow xcrun queries for simulator platforms, NFC (authored by vsk).
[lldb/Platform] Skip very slow xcrun queries for simulator platforms, NFC
Mar 10 2021, 1:57 PM
vsk closed D98272: [lldb/Platform] Skip very slow xcrun queries for simulator platforms, NFC.
Mar 10 2021, 1:57 PM · Restricted Project
vsk updated the diff for D98272: [lldb/Platform] Skip very slow xcrun queries for simulator platforms, NFC.

Use a helper function, respect the force argument.

Mar 10 2021, 12:14 PM · Restricted Project
vsk added inline comments to D98272: [lldb/Platform] Skip very slow xcrun queries for simulator platforms, NFC.
Mar 10 2021, 12:13 PM · Restricted Project
vsk accepted D98061: [InstrProfiling] Generate runtime hook for ELF platforms.

Thanks for the detailed explanation of the -fprofile-list workflow; given the difference constraints, this patch lgtm. Please document the divergent behavior re: no .profraw file when #counters == 0 for non-MachO in the clang docs.

Mar 10 2021, 12:11 PM · Restricted Project, Restricted Project
vsk accepted D98325: [InstrProfiling] Don't generate __llvm_profile_runtime_user.

Thanks, lgtm.

Mar 10 2021, 12:02 PM · Restricted Project
vsk added a comment to D98135: [OpenMP][InstrProfiling] Fix a missing instr profiling counter.

LG, but I hope an OpenMP expert can take a look. Perhaps @vsk can comment on where the tests should be improved (currently we hardly have any @llvm.instrprof.increment IR test for clang -fprofile-instr-generate).

Mar 10 2021, 12:00 PM · Restricted Project

Mar 9 2021

vsk added a comment to D98061: [InstrProfiling] Generate runtime hook for ELF platforms.
In D98061#2615334, @vsk wrote:
In D98061#2615239, @vsk wrote:

@ributzka may have stronger thoughts about when -fprofile-instr-generate must imply that a known set of symbols appear with external visibility. Up until now, the answer has been "always", and this is what tapi enforces for MachO. It's awkward to have this be inconsistent between MachO/ELF, but if there's a compelling performance reason then I think it's fine.

From the perspective of Fuchsia, we've seen a noticeable impact of this change when using -fprofile-instr-generate together -fprofile-list to apply instrumentation selectively only to modified files.

What kind of impact do you see? If #counters > 0, is it mostly binary size cost? If #counters == 0, what's the avg. overhead of writing out the full profile?

It depends a bit on the runtime and the platform. In Fuchsia where we always use the continuous mode, there's quite a bit of upfront cost (see https://github.com/llvm/llvm-project/blob/75f3f778052cdcd89e79f7a42a50915ee5ce2281/compiler-rt/lib/profile/InstrProfilingPlatformFuchsia.c#L109), we need to allocate a memory object, map it into address space, publish it, write some additional information into the log.

While it may not be so bad for a single binary, over hundreds of tests it adds up and with this change we saw the total test execution time go down from 30 to 18 minutes. There are other steps we're taking, like eliminating the need for logging, but that's unlikely to eliminate all of the overhead.

Can it be fixed by doing an early-exit in the runtime initializer, writing out an empty .profraw?

I considered that initially, but that's less efficient than the approach implemented here, especially when it comes to binary size.

Mar 9 2021, 4:37 PM · Restricted Project, Restricted Project
vsk added a comment to D98061: [InstrProfiling] Generate runtime hook for ELF platforms.
In D98061#2615239, @vsk wrote:

@ributzka may have stronger thoughts about when -fprofile-instr-generate must imply that a known set of symbols appear with external visibility. Up until now, the answer has been "always", and this is what tapi enforces for MachO. It's awkward to have this be inconsistent between MachO/ELF, but if there's a compelling performance reason then I think it's fine.

From the perspective of Fuchsia, we've seen a noticeable impact of this change when using -fprofile-instr-generate together -fprofile-list to apply instrumentation selectively only to modified files.

Mar 9 2021, 3:28 PM · Restricted Project, Restricted Project
vsk updated subscribers of D98061: [InstrProfiling] Generate runtime hook for ELF platforms.

@ributzka may have stronger thoughts about when -fprofile-instr-generate must imply that a known set of symbols appear with external visibility. Up until now, the answer has been "always", and this is what tapi enforces for MachO. It's awkward to have this be inconsistent between MachO/ELF, but if there's a compelling performance reason then I think it's fine.

Mar 9 2021, 2:57 PM · Restricted Project, Restricted Project
vsk requested review of D98272: [lldb/Platform] Skip very slow xcrun queries for simulator platforms, NFC.
Mar 9 2021, 10:53 AM · Restricted Project
vsk accepted D98224: [cmake] Enable -Werror=return-type.
Mar 9 2021, 9:34 AM · Restricted Project
vsk accepted D98084: Revert "[llvm-cov] reset executation count to 0 after wrapped segment".
Mar 9 2021, 9:14 AM · Restricted Project

Mar 4 2021

vsk updated subscribers of D97966: [compiler-rt][profile][RISCV]: Enable Compiler-RT profile build on RISC-V.

It might help to have a bot online first to avoid shipping a regressed libprofile for RISCV, but I don't want to hold up work that depends on this. I'm not sure what the right call is, cc'ing @ro and @eugenis who have past experience (D40943)

Mar 4 2021, 2:09 PM · Restricted Project
vsk added inline comments to D97746: [ubsan] support print_module_map flag in standalone mode.
Mar 4 2021, 1:36 PM · Restricted Project
vsk added a comment to D97966: [compiler-rt][profile][RISCV]: Enable Compiler-RT profile build on RISC-V.

It's great to see this. Will these tests get picked up by an existing RISCV bot?

Mar 4 2021, 11:36 AM · Restricted Project