This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/test/tools/llvm-profgen/Inputs/noinline-cs-pseudoprobe.perfscript
llvm/test/tools/llvm-profgen/inline-noprobe2.test
llvm/test/tools/llvm-profgen/invalid-range.test
llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test
llvm/tools/llvm-profgen/PerfReader.cpp
llvm/tools/llvm-profgen/ProfileGenerator.h
llvm/tools/llvm-profgen/ProfileGenerator.cpp
llvm/tools/llvm-profgen/ProfiledBinary.h
llvm/tools/llvm-profgen/ProfiledBinary.cpp

Differential D126827

[llvm-profgen] Fix inconsistent loading address issues
Closed · Public

Authored by wlei on Jun 1 2022, 2:16 PM.


Details

Reviewers

hoy
wenlei

Commits

rG467652486f24: [llvm-profgen] Fix inconsistent loading address issues

Summary

This fixes two issues related to the loading address:

When multiple MMAP events occur and their loading addresses differ, only the first MMAP was previously used as the base address, so every perf address after it was resolved against the wrong base address.

For a pseudo probe profile, addresses are always based on the preferred loading address. If the base address is not equal to the preferred loading address, the pseudo probe address query is wrong.

Solution: instead of lazily converting addresses to offsets, all addresses are now converted at parsing time so that they are based on the preferred loading address. The profile generator no longer uses "offsets" at all.
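
As a rough illustration of the conversion described in the summary, the sketch below rebases a runtime address onto the preferred load address. The `LoadInfo` struct and the `canonicalize` helper are hypothetical names used only for this example; they are not the actual `ProfiledBinary` interface. The runtime base `0x301000` comes from the updated perfscript input in this revision, and the preferred base `0x201000` is an assumption inferred from the test expectations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the address bookkeeping of a profiled binary.
struct LoadInfo {
  uint64_t RuntimeBase;   // load address reported by the MMAP event
  uint64_t PreferredBase; // preferred load address (assumed, see above)
};

// Rebase a runtime address so that it is expressed relative to the
// preferred load address. Unsigned arithmetic keeps this correct whether
// the runtime base is above or below the preferred base.
inline uint64_t canonicalize(const LoadInfo &Info, uint64_t RuntimeAddr) {
  return RuntimeAddr - Info.RuntimeBase + Info.PreferredBase;
}

// Sanity check using a sample address from the updated test input.
inline void checkSampleAddress() {
  LoadInfo Info{0x301000, 0x201000};
  assert(canonicalize(Info, 0x30179e) == 0x20179e);
}
```

With this kind of rebasing done once at parse time, later stages can compare sample addresses directly against addresses decoded from the binary, without tracking which base each value was relative to.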

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

wlei created this revision. Jun 1 2022, 2:16 PM

Herald added a project: Restricted Project. Jun 1 2022, 2:16 PM

Herald added subscribers: hoy, wenlei.

wlei requested review of this revision. Jun 1 2022, 2:16 PM

Herald added a project: Restricted Project. Jun 1 2022, 2:16 PM

Herald added a subscriber: llvm-commits.

wlei retitled this revision from [llvm-profgen] fix a loading address bug for pseudo probe profile to [llvm-profgen] Fix a loading address bug for pseudo probe profile. Jun 1 2022, 2:21 PM

wlei edited the summary of this revision.

wlei added reviewers: hoy, wenlei.

Harbormaster completed remote builds in B167366: Diff 433547. Jun 1 2022, 3:07 PM

wenlei added inline comments. Jun 8 2022, 12:08 AM

llvm/tools/llvm-profgen/PerfReader.cpp
1204	Where do we use getBaseAddress in probe queries? Can we use getPreferredBaseAddress directly there? Ideally there should be one definition for base address which is the actual executable segment load address, and changing it here makes it inconsistent.
llvm/tools/llvm-profgen/PerfReader.h
374 ↗	(On Diff #433547)	Perhaps AddrBasedCtxKey as name is just fine? This is in contrast with StringBasedCtxKey, so Addr as a general term isn't a big deal, I don't have strong opinion though..

hoy added inline comments. Jun 8 2022, 10:53 AM

llvm/tools/llvm-profgen/PerfReader.cpp
1204	Agreed that `setBaseAddress` should be called once to be consistent. Pseudo probe decoding is based on the preferred address. I think a reasonable fix could be to either (1) make the decoding offset-based, which would require a change to the BOLT code base; or (2) do an offset-to-preferred-address translation for every probe lookup in llvm-profgen. This is already being done in some places, e.g. `extractPrefixContextStack`. #2 sounds more practical. WDYT?

wlei added inline comments. Jun 8 2022, 2:30 PM

llvm/tools/llvm-profgen/PerfReader.cpp
1204	Hmm, I found this is a very tricky case. On the question "Where do we use getBaseAddress in probe queries?": we call it in `Binary->offsetToVirtualAddr`. However, we call `IP.advance()` to iterate over all the addresses in the range, and `IP` converts the `Address` back to an `offset` using the base address, like `Offset + preferred Address - Mmap loading address`. That's another inconsistency. So we would need to change the code inside `IP`, but `IP` is also used in `PerfReader`.
llvm/tools/llvm-profgen/PerfReader.h
374 ↗	(On Diff #433547)	Sounds good.

wlei added inline comments. Jun 8 2022, 2:39 PM

llvm/tools/llvm-profgen/PerfReader.cpp
1204	Or can we use the virtual address (`Offset + preferred Address`) for all the code (PerfReader and ProfileGenerator)? To do this, we can convert the address at the very beginning of PerfReader, like `Address = PhysicalAddress - Mmap loading address + PreferredAddress`.
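
To make the proposal above concrete, here is a minimal sketch of rebasing both ends of one LBR entry as soon as it is parsed. The function name `parseAndRebaseLBR` and its parameters are hypothetical and use only the standard library; the real reader works on LLVM's `StringRef` and `ProfiledBinary`. The entry text is taken from the updated perfscript input in this revision, and the preferred base `0x201000` is an assumption consistent with the test expectations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Parse an LBR entry of the form "0xSRC/0xDST/..." and return the source and
// target addresses rebased from the MMAP load address onto the preferred
// load address. Hypothetical helper, for illustration only.
inline std::pair<uint64_t, uint64_t>
parseAndRebaseLBR(const std::string &Entry, uint64_t MmapBase,
                  uint64_t PreferredBase) {
  std::size_t FirstSlash = Entry.find('/');
  assert(FirstSlash != std::string::npos && "malformed LBR entry");
  std::size_t SecondSlash = Entry.find('/', FirstSlash + 1);

  // With base 16, std::stoull accepts an optional "0x" prefix.
  uint64_t Src = std::stoull(Entry.substr(0, FirstSlash), nullptr, 16);
  uint64_t Dst = std::stoull(
      Entry.substr(FirstSlash + 1, SecondSlash - FirstSlash - 1), nullptr, 16);

  // Address = PhysicalAddress - Mmap loading address + PreferredAddress
  return {Src - MmapBase + PreferredBase, Dst - MmapBase + PreferredBase};
}
```

For the entry `0x3017cf/0x30179e/P/-/-/0` with an MMAP base of `0x301000`, this yields the pair `0x2017cf` and `0x20179e`, which are the addresses the existing test expectations already use.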

hoy added inline comments. Jun 8 2022, 5:48 PM

llvm/tools/llvm-profgen/PerfReader.cpp
1204	I see. This is complicated. How about we always use the preferred address when doing this offset-to-virtual-address conversion, and rebase all LBR and stack sample addresses onto the preferred load address in the perf reader?

changed all offset usage to preferred address based virtual address

Harbormaster completed remote builds in B172266: Diff 440317. Jun 27 2022, 1:35 PM

In D126827#3613023, @wlei wrote:

changed all offset usage to preferred address based virtual address

Thanks for the work! It should be more reliable and also looks cleaner.

llvm/tools/llvm-profgen/PerfReader.cpp
746	I'm a bit confused here. What are the semantics of using both offset and LoadableSegmentAsBase?

wlei added inline comments. Jul 11 2022, 4:20 PM

llvm/tools/llvm-profgen/PerfReader.cpp
746	I think this is for compatibility with our old internal tool, where some services use the offset and the FirstLoadableAddress in the output of the unsymbolized profile. The original patch is https://reviews.llvm.org/D113727.

hoy added inline comments. Jul 13 2022, 11:33 AM

llvm/tools/llvm-profgen/PerfReader.cpp
746	Do we still need this now? The offset here is computed by subtracting the runtime base address from the preferred-based virtual address. This seems different from what was previously computed, i.e., subtracting the runtime base address from the runtime virtual address.

wlei added inline comments. Jul 13 2022, 4:40 PM

llvm/tools/llvm-profgen/PerfReader.cpp
746	It should be the same behavior as the previous offset computation. This is about how we define the "offset": assuming the MMAP loading address is missing or equal to the preferred base address, `virtual address` = `offset + preferred-based-address`. For the two branches: if `UseLoadableSegmentAsBase` is false, the offset is `virtual address - preferred-based-address` --> `offset + preferred-based-address - preferred-based-address`, which is the original "offset". But when `UseLoadableSegmentAsBase` is true, the offset is `virtual address - FirstLoadableAddress` --> `offset + preferred-based-address - FirstLoadableAddress`, so it is not the `offset` as we defined it before. However, even though it's "wrong", it depends on the source of the input: our internal tool produces exactly this "wrong" offset, so llvm-profgen also needs to use the same computation to recover it. Yeah, but if we won't use the old tool anymore, I think we can remove all the `UseLoadableSegmentAsBase` code.
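
The two branches discussed above can be sketched as follows. This is only an illustration of the arithmetic: `Bases` and `toOutputAddress` are hypothetical names, the preferred base `0x201000` is inferred from the test expectations in this revision, and the first loadable address `0x200000` is a made-up value chosen so that the two modes give different results.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two candidate bases.
struct Bases {
  uint64_t PreferredBase; // preferred base address (assumed 0x201000 here)
  uint64_t FirstLoadable; // first loadable segment address (made-up value)
};

// Convert a preferred-based virtual address into the value written to the
// unsymbolized profile, depending on the two options.
inline uint64_t toOutputAddress(uint64_t VirtualAddr, bool UseOffset,
                                bool UseLoadableSegmentAsBase,
                                const Bases &B) {
  if (!UseOffset)
    return VirtualAddr; // addresses are emitted unchanged
  if (UseLoadableSegmentAsBase)
    return VirtualAddr - B.FirstLoadable; // the legacy-tool "offset"
  return VirtualAddr - B.PreferredBase;   // the original "offset"
}
```

Under these assumptions, address `0x2017f4` is emitted as `0x7f4` with the preferred base, as `0x17f4` with the first loadable segment as base, and as `0x2017f4` when offsets are disabled.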

hoy added inline comments. Jul 13 2022, 5:05 PM

llvm/tools/llvm-profgen/PerfReader.cpp
746	> so the offset is not the offset as we defined before. However, even it's "wrong",
	Could the "wrong" offsets now cause trouble for pseudo probes, which always have the "correct" offset? It should have worked fine previously?
	> Yeah, but if we won't use the old tool anymore, I think we can remove all the UseLoadableSegmentAsBase code.
	Agreed.

hoy added inline comments. Jul 13 2022, 5:24 PM

llvm/tools/llvm-profgen/PerfReader.cpp
746	I looked a bit deeper. The "wrong" offsets should work. `getFirstLoadableAddress` stands for the preferred first load address, not the runtime one, which was the impression I got from the comment on its definition: `// The runtime base address that the first loadabe segment is loaded at.` `uint64_t FirstLoadableAddress = 0;` So we no longer need to remove the feature in this change.

hoy accepted this revision. Jul 14 2022, 9:52 AM

This revision is now accepted and ready to land. Jul 14 2022, 9:52 AM

Agreed that having everything canonicalized to use preferred load address as base is clean and practical. Thanks for making the changes. LGTM with a nit.

llvm/tools/llvm-profgen/ProfiledBinary.h
347	nit: `canonicalizeVirtualAddress` is probably a better name.

rebase and rename convertAddress --> canonicalizeVirtualAddress

wlei retitled this revision from [llvm-profgen] Fix a loading address bug for pseudo probe profile to [llvm-profgen] Fix inconsistent loading address issues. Oct 13 2022, 11:15 PM

wlei edited the summary of this revision.

This revision was landed with ongoing or failed builds. Oct 13 2022, 11:25 PM

Closed by commit rG467652486f24: [llvm-profgen] Fix inconsistent loading address issues (authored by wlei).

This revision was automatically updated to reflect the committed changes.

wlei added a commit: rG467652486f24: [llvm-profgen] Fix inconsistent loading address issues.

Harbormaster completed remote builds in B192120: Diff 467684. Oct 13 2022, 11:34 PM

Revision Contents

Path	Size
llvm/test/tools/llvm-profgen/Inputs/noinline-cs-pseudoprobe.perfscript	20 lines
llvm/test/tools/llvm-profgen/inline-noprobe2.test	3 lines
llvm/test/tools/llvm-profgen/invalid-range.test	4 lines
llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test	34 lines
llvm/tools/llvm-profgen/	147 lines, 2 lines, 88 lines, 165 lines, 119 lines

Diff 467692

llvm/test/tools/llvm-profgen/Inputs/noinline-cs-pseudoprobe.perfscript

	PERF_RECORD_MMAP2 1243676/1243676: [0x201000(0x1000) @ 0 00:1d 224517108 1044165]: r-xp /home/noinline-cs-pseudoprobe.perfbin			PERF_RECORD_MMAP2 1243676/1243676: [0x301000(0x1000) @ 0 00:1d 224517108 1044165]: r-xp /home/noinline-cs-pseudoprobe.perfbin

	20179e			30179e
	2017f9			3017f9
	7f83e84e7793			7f83e84e7793
	5541f689495641d7			5541f689495641d7
	0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0			0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0

	2017c4			3017c4
	2017f9			3017f9
	7f83e84e7793			7f83e84e7793
	5541f689495641d7			5541f689495641d7
	0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0			0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0

	2017c4			3017c4
	2017f9			3017f9
	7f83e84e7793			7f83e84e7793
	5541f689495641d7			5541f689495641d7
	0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0 0x2017bf/0x201760/P/-/-/0 0x2017cf/0x20179e/P/-/-/0 0x20177f/0x2017c4/P/-/-/0			0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0 0x3017bf/0x301760/P/-/-/0 0x3017cf/0x30179e/P/-/-/0 0x30177f/0x3017c4/P/-/-/0

llvm/test/tools/llvm-profgen/inline-noprobe2.test

	; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/artificial-branch.perfscript --binary=%S/Inputs/inline-noprobe2.perfbin --output=%t --skip-symbolization --use-offset=0			; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/artificial-branch.perfscript --binary=%S/Inputs/inline-noprobe2.perfbin --output=%t --skip-symbolization --use-offset=0
	; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-EXT-ADDR			; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-EXT-ADDR
	; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-noprobe2.perfscript --binary=%S/Inputs/inline-noprobe2.perfbin --output=%t --skip-symbolization --use-offset=0			; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/inline-noprobe2.perfscript --binary=%S/Inputs/inline-noprobe2.perfbin --output=%t --skip-symbolization --use-offset=0
	; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-RAW-PROFILE			; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-RAW-PROFILE
	; RUN: llvm-profgen --format=text --unsymbolized-profile=%t --binary=%S/Inputs/inline-noprobe2.perfbin --output=%t1 --use-offset=0			; RUN: llvm-profgen --format=text --unsymbolized-profile=%t --binary=%S/Inputs/inline-noprobe2.perfbin --output=%t1 --use-offset=0
	; RUN: FileCheck %s --input-file %t1 --check-prefix=CHECK			; RUN: FileCheck %s --input-file %t1 --check-prefix=CHECK

	; RUN: llvm-profgen --format=extbinary --perfscript=%S/Inputs/inline-noprobe2.perfscript --binary=%S/Inputs/inline-noprobe2.perfbin --output=%t --populate-profile-symbol-list=1			; RUN: llvm-profgen --format=extbinary --perfscript=%S/Inputs/inline-noprobe2.perfscript --binary=%S/Inputs/inline-noprobe2.perfbin --output=%t --populate-profile-symbol-list=1
	; RUN: llvm-profdata show -show-prof-sym-list -sample %t | FileCheck %s --check-prefix=CHECK-SYM-LIST			; RUN: llvm-profdata show -show-prof-sym-list -sample %t | FileCheck %s --check-prefix=CHECK-SYM-LIST

	; CHECK-EXT-ADDR: 2			; CHECK-EXT-ADDR: 2
	; CHECK-EXT-ADDR-NEXT: 400870-400870:2			; CHECK-EXT-ADDR-NEXT: 400870-400870:2
	; CHECK-EXT-ADDR-NEXT: 400875-4008bf:1			; CHECK-EXT-ADDR-NEXT: 400875-4008bf:1
	; CHECK-EXT-ADDR-NEXT: 2			; CHECK-EXT-ADDR-NEXT: 2
	; CHECK-EXT-ADDR-NEXT: 4008bf->400870:2
	; Value 1 is external address			; Value 1 is external address
	; CHECK-EXT-ADDR-NEXT: 1->400875:1			; CHECK-EXT-ADDR-NEXT: 1->400875:1
				; CHECK-EXT-ADDR-NEXT: 4008bf->400870:2


	; CHECK-SYM-LIST: Dump profile symbol list			; CHECK-SYM-LIST: Dump profile symbol list
	; CHECK-SYM-LIST: main			; CHECK-SYM-LIST: main
	; CHECK-SYM-LIST: partition_pivot_first			; CHECK-SYM-LIST: partition_pivot_first
	; CHECK-SYM-LIST: partition_pivot_last			; CHECK-SYM-LIST: partition_pivot_last
	; CHECK-SYM-LIST: quick_sort			; CHECK-SYM-LIST: quick_sort
	; CHECK-SYM-LIST: swap			; CHECK-SYM-LIST: swap

	(134 lines not shown)

llvm/test/tools/llvm-profgen/invalid-range.test

	(25 lines not shown)
	; CS-NEXT: 20179e-2017bf:5			; CS-NEXT: 20179e-2017bf:5
	; CS-NEXT: 2017c4-2017cf:5			; CS-NEXT: 2017c4-2017cf:5
	; CS-NEXT: 2017c4-2017d8:1			; CS-NEXT: 2017c4-2017d8:1
	; CS-NEXT: 4			; CS-NEXT: 4
	; CS-NEXT: 20177f->2017c4:6			; CS-NEXT: 20177f->2017c4:6
	; CS-NEXT: 2017bf->201760:6			; CS-NEXT: 2017bf->201760:6
	; CS-NEXT: 2017cf->20179e:6			; CS-NEXT: 2017cf->20179e:6
	; CS-NEXT: 2017d8->2017e3:1			; CS-NEXT: 2017d8->2017e3:1
	; CS-NEXT: [0x7f4]			; CS-NEXT: [0x2017f4]
	; CS-NEXT: 1			; CS-NEXT: 1
	; CS-NEXT: 2017c4-2017cf:1			; CS-NEXT: 2017c4-2017cf:1
	; CS-NEXT: 2			; CS-NEXT: 2
	; CS-NEXT: 2017bf->201760:1			; CS-NEXT: 2017bf->201760:1
	; CS-NEXT: 2017cf->20179e:2			; CS-NEXT: 2017cf->20179e:2
	; CS-NEXT: [0x7f4 @ 0x7bf]			; CS-NEXT: [0x2017f4 @ 0x2017bf]
	; CS-NEXT: 1			; CS-NEXT: 1
	; CS-NEXT: 201760-20177f:1			; CS-NEXT: 201760-20177f:1
	; CS-NEXT: 1			; CS-NEXT: 1
	; CS-NEXT: 20177f->2017c4:1			; CS-NEXT: 20177f->2017c4:1

	; clang -O3 -fuse-ld=lld -fpseudo-probe-for-profiling			; clang -O3 -fuse-ld=lld -fpseudo-probe-for-profiling
	; -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Xclang -mdisable-tail-calls			; -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Xclang -mdisable-tail-calls
	; -g test.c -o a.out			; -g test.c -o a.out
	(21 lines not shown)

llvm/test/tools/llvm-profgen/pseudoprobe-decoding.test

	(11 lines not shown)
	; CHECK: Hash: 72617220756			; CHECK: Hash: 72617220756



	; CHECK: <bar>:			; CHECK: <bar>:

	; CHECK: [Probe]: FUNC: bar Index: 1 Type: Block			; CHECK: [Probe]: FUNC: bar Index: 1 Type: Block
	; CHECK-NEXT: [Probe]: FUNC: bar Index: 4 Type: Block			; CHECK-NEXT: [Probe]: FUNC: bar Index: 4 Type: Block
	; CHECK-NEXT: 754: imull $2863311531, %edi, %eax			; CHECK-NEXT: 201754: imull $2863311531, %edi, %eax

	; CHECK: <foo>:			; CHECK: <foo>:
	; CHECK: [Probe]: FUNC: foo Index: 1 Type: Block			; CHECK: [Probe]: FUNC: foo Index: 1 Type: Block
	; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block			; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block
	; CHECK-NEXT: 770: movl $1, %ecx			; CHECK-NEXT: 201770: movl $1, %ecx

	; CHECK: [Probe]: FUNC: foo Index: 5 Type: Block			; CHECK: [Probe]: FUNC: foo Index: 5 Type: Block
	; CHECK-NEXT: 780: addl $30, %esi			; CHECK-NEXT: 201780: addl $30, %esi
	; CHECK: [Probe]: FUNC: foo Index: 6 Type: Block			; CHECK: [Probe]: FUNC: foo Index: 6 Type: Block
	; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block			; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block
	; CHECK-NEXT: 783: addl $1, %ecx			; CHECK-NEXT: 201783: addl $1, %ecx

	; CHECK: [Probe]: FUNC: foo Index: 3 Type: Block			; CHECK: [Probe]: FUNC: foo Index: 3 Type: Block
	; CHECK-NEXT: 78e: movl %ecx, %edx			; CHECK-NEXT: 20178e: movl %ecx, %edx

	; CHECK: [Probe]: FUNC: foo Index: 4 Type: Block			; CHECK: [Probe]: FUNC: foo Index: 4 Type: Block
	; CHECK-NEXT: [Probe]: FUNC: bar Index: 1 Type: Block Inlined: @ foo:8			; CHECK-NEXT: [Probe]: FUNC: bar Index: 1 Type: Block Inlined: @ foo:8
	; CHECK-NEXT: [Probe]: FUNC: bar Index: 4 Type: Block Inlined: @ foo:8			; CHECK-NEXT: [Probe]: FUNC: bar Index: 4 Type: Block Inlined: @ foo:8
	; CHECK-NEXT: 7bf: addl %ecx, %edx			; CHECK-NEXT: 2017bf: addl %ecx, %edx


	; CHECK: [Probe]: FUNC: foo Index: 6 Type: Block			; CHECK: [Probe]: FUNC: foo Index: 6 Type: Block
	; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block			; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block
	; CHECK-NEXT: 7cf: addl $1, %ecx			; CHECK-NEXT: 2017cf: addl $1, %ecx

	; CHECK: [Probe]: FUNC: foo Index: 7 Type: Block			; CHECK: [Probe]: FUNC: foo Index: 7 Type: Block
	; CHECK-NEXT: 7de: movl $2098432, %edi			; CHECK-NEXT: 2017de: movl $2098432, %edi

	; CHECK: [Probe]: FUNC: foo Index: 9 Type: DirectCall			; CHECK: [Probe]: FUNC: foo Index: 9 Type: DirectCall
	; CHECK-NEXT: 7e5: callq 0x930			; CHECK-NEXT: 2017e5: callq 0x201930


	; CHECK: <main>:			; CHECK: <main>:
	; CHECK: [Probe]: FUNC: main Index: 1 Type: Block			; CHECK: [Probe]: FUNC: main Index: 1 Type: Block
	; CHECK-NEXT: [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ main:2			; CHECK-NEXT: [Probe]: FUNC: foo Index: 1 Type: Block Inlined: @ main:2
	; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block Inlined: @ main:2			; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block Inlined: @ main:2
	; CHECK-NEXT: 7f0: movl $1, %ecx			; CHECK-NEXT: 2017f0: movl $1, %ecx

	; CHECK: [Probe]: FUNC: foo Index: 5 Type: Block Inlined: @ main:2			; CHECK: [Probe]: FUNC: foo Index: 5 Type: Block Inlined: @ main:2
	; CHECK-NEXT: 800: addl $30, %esi			; CHECK-NEXT: 201800: addl $30, %esi
	; CHECK: [Probe]: FUNC: foo Index: 6 Type: Block Inlined: @ main:2			; CHECK: [Probe]: FUNC: foo Index: 6 Type: Block Inlined: @ main:2
	; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block Inlined: @ main:2			; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block Inlined: @ main:2
	; CHECK-NEXT: 803: addl $1, %ecx			; CHECK-NEXT: 201803: addl $1, %ecx

	; CHECK: [Probe]: FUNC: foo Index: 3 Type: Block Inlined: @ main:2			; CHECK: [Probe]: FUNC: foo Index: 3 Type: Block Inlined: @ main:2
	; CHECK-NEXT: 80e: movl %ecx, %edx			; CHECK-NEXT: 20180e: movl %ecx, %edx

	; CHECK: [Probe]: FUNC: foo Index: 4 Type: Block Inlined: @ main:2			; CHECK: [Probe]: FUNC: foo Index: 4 Type: Block Inlined: @ main:2
	; CHECK-NEXT: [Probe]: FUNC: bar Index: 1 Type: Block Inlined: @ main:2 @ foo:8			; CHECK-NEXT: [Probe]: FUNC: bar Index: 1 Type: Block Inlined: @ main:2 @ foo:8
	; CHECK-NEXT: [Probe]: FUNC: bar Index: 4 Type: Block Inlined: @ main:2 @ foo:8			; CHECK-NEXT: [Probe]: FUNC: bar Index: 4 Type: Block Inlined: @ main:2 @ foo:8
	; CHECK-NEXT: 83f: addl %ecx, %edx			; CHECK-NEXT: 20183f: addl %ecx, %edx

	; CHECK: [Probe]: FUNC: foo Index: 6 Type: Block Inlined: @ main:2			; CHECK: [Probe]: FUNC: foo Index: 6 Type: Block Inlined: @ main:2
	; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block Inlined: @ main:2			; CHECK-NEXT: [Probe]: FUNC: foo Index: 2 Type: Block Inlined: @ main:2
	; CHECK-NEXT: 84f: addl $1, %ecx			; CHECK-NEXT: 20184f: addl $1, %ecx

	; CHECK: [Probe]: FUNC: foo Index: 7 Type: Block Inlined: @ main:2			; CHECK: [Probe]: FUNC: foo Index: 7 Type: Block Inlined: @ main:2
	; CHECK-NEXT: 85e: movl $2098432, %edi			; CHECK-NEXT: 20185e: movl $2098432, %edi

	; CHECK: [Probe]: FUNC: foo Index: 9 Type: DirectCall Inlined: @ main:2			; CHECK: [Probe]: FUNC: foo Index: 9 Type: DirectCall Inlined: @ main:2
	; CHECK-NEXT: 865: callq 0x930			; CHECK-NEXT: 201865: callq 0x201930

	; SYM-NOT: <bar>:			; SYM-NOT: <bar>:
	; SYM: <foo>:			; SYM: <foo>:
	; SYM: <main>:			; SYM: <main>:



	; clang -O3 -fuse-ld=lld -fpseudo-probe-for-profiling			; clang -O3 -fuse-ld=lld -fpseudo-probe-for-profiling
	(23 lines not shown)

llvm/tools/llvm-profgen/PerfReader.cpp

(90 lines not shown)	if (End == ExternalAddr || Target == ExternalAddr) {
// traces contains a standalone external address failing to pair another		// traces contains a standalone external address failing to pair another
// one, likely due to interrupt jmp or broken perf script. Set the		// one, likely due to interrupt jmp or broken perf script. Set the
// state to invalid.		// state to invalid.
NumUnpairedExtAddr++;		NumUnpairedExtAddr++;
State.setInvalid();		State.setInvalid();
return;		return;
}		}

if (!isValidFallThroughRange(Binary->virtualAddrToOffset(Target),		if (!isValidFallThroughRange(Target, End, Binary)) {
Binary->virtualAddrToOffset(End), Binary)) {
// Skip unwinding the rest of LBR trace when a bogus range is seen.		// Skip unwinding the rest of LBR trace when a bogus range is seen.
State.setInvalid();		State.setInvalid();
return;		return;
}		}

if (Binary->usePseudoProbes()) {		if (Binary->usePseudoProbes()) {
// We don't need to top frame probe since it should be extracted		// We don't need to top frame probe since it should be extracted
// from the range.		// from the range.
(72 lines not shown)	void VirtualUnwinder::collectSamplesFromFrame(UnwindState::ProfiledFrame *Cur,
if (Cur->RangeSamples.empty() && Cur->BranchSamples.empty())		if (Cur->RangeSamples.empty() && Cur->BranchSamples.empty())
return;		return;

std::shared_ptr<ContextKey> Key = Stack.getContextKey();		std::shared_ptr<ContextKey> Key = Stack.getContextKey();
if (Key == nullptr)		if (Key == nullptr)
return;		return;
auto Ret = CtxCounterMap->emplace(Hashable<ContextKey>(Key), SampleCounter());		auto Ret = CtxCounterMap->emplace(Hashable<ContextKey>(Key), SampleCounter());
SampleCounter &SCounter = Ret.first->second;		SampleCounter &SCounter = Ret.first->second;
for (auto &Item : Cur->RangeSamples) {		for (auto &I : Cur->RangeSamples)
uint64_t StartOffset = Binary->virtualAddrToOffset(std::get<0>(Item));		SCounter.recordRangeCount(std::get<0>(I), std::get<1>(I), std::get<2>(I));
uint64_t EndOffset = Binary->virtualAddrToOffset(std::get<1>(Item));
SCounter.recordRangeCount(StartOffset, EndOffset, std::get<2>(Item));
}

for (auto &Item : Cur->BranchSamples) {		for (auto &I : Cur->BranchSamples)
uint64_t SourceOffset = Binary->virtualAddrToOffset(std::get<0>(Item));		SCounter.recordBranchCount(std::get<0>(I), std::get<1>(I), std::get<2>(I));
uint64_t TargetOffset = Binary->virtualAddrToOffset(std::get<1>(Item));
SCounter.recordBranchCount(SourceOffset, TargetOffset, std::get<2>(Item));
}
}		}

template <typename T>		template <typename T>
void VirtualUnwinder::collectSamplesFromFrameTrie(		void VirtualUnwinder::collectSamplesFromFrameTrie(
UnwindState::ProfiledFrame *Cur, T &Stack) {		UnwindState::ProfiledFrame *Cur, T &Stack) {
if (!Cur->isDummyRoot()) {		if (!Cur->isDummyRoot()) {
// Truncate the context for external frame since this isn't a real call		// Truncate the context for external frame since this isn't a real call
// context the compiler will see.		// context the compiler will see.
(254 lines not shown)	static std::string getContextKeyStr(ContextKey *K,
const ProfiledBinary *Binary) {		const ProfiledBinary *Binary) {
if (const auto *CtxKey = dyn_cast<StringBasedCtxKey>(K)) {		if (const auto *CtxKey = dyn_cast<StringBasedCtxKey>(K)) {
return SampleContext::getContextString(CtxKey->Context);		return SampleContext::getContextString(CtxKey->Context);
} else if (const auto *CtxKey = dyn_cast<AddrBasedCtxKey>(K)) {		} else if (const auto *CtxKey = dyn_cast<AddrBasedCtxKey>(K)) {
std::ostringstream OContextStr;		std::ostringstream OContextStr;
for (uint32_t I = 0; I < CtxKey->Context.size(); I++) {		for (uint32_t I = 0; I < CtxKey->Context.size(); I++) {
if (OContextStr.str().size())		if (OContextStr.str().size())
OContextStr << " @ ";		OContextStr << " @ ";
		uint64_t Address = CtxKey->Context[I];
		if (UseOffset) {
		if (UseLoadableSegmentAsBase)
		Address -= Binary->getFirstLoadableAddress();
		else
		Address -= Binary->getPreferredBaseAddress();
		}
OContextStr << "0x"		OContextStr << "0x"
<< utohexstr(		<< utohexstr(Address,
Binary->virtualAddrToOffset(CtxKey->Context[I]),
/*LowerCase=*/true);		/*LowerCase=*/true);
}		}
return OContextStr.str();		return OContextStr.str();
} else {		} else {
llvm_unreachable("unexpected key type");		llvm_unreachable("unexpected key type");
}		}
}		}

void HybridPerfReader::unwindSamples() {		void HybridPerfReader::unwindSamples() {
(83 lines not shown)	while (Index < Records.size()) {

// Stop at broken LBR records.		// Stop at broken LBR records.
if (Addresses.size() < 2 || Addresses[0].substr(2).getAsInteger(16, Src) ||		if (Addresses.size() < 2 || Addresses[0].substr(2).getAsInteger(16, Src) ||
Addresses[1].substr(2).getAsInteger(16, Dst)) {		Addresses[1].substr(2).getAsInteger(16, Dst)) {
WarnInvalidLBR(TraceIt);		WarnInvalidLBR(TraceIt);
break;		break;
}		}

		// Canonicalize to use preferred load address as base address.
		Src = Binary->canonicalizeVirtualAddress(Src);
		Dst = Binary->canonicalizeVirtualAddress(Dst);
bool SrcIsInternal = Binary->addressIsCode(Src);		bool SrcIsInternal = Binary->addressIsCode(Src);
bool DstIsInternal = Binary->addressIsCode(Dst);		bool DstIsInternal = Binary->addressIsCode(Dst);
if (!SrcIsInternal)		if (!SrcIsInternal)
Src = ExternalAddr;		Src = ExternalAddr;
if (!DstIsInternal)		if (!DstIsInternal)
Dst = ExternalAddr;		Dst = ExternalAddr;
// Filter external-to-external case to reduce LBR trace size.		// Filter external-to-external case to reduce LBR trace size.
if (!SrcIsInternal && !DstIsInternal)		if (!SrcIsInternal && !DstIsInternal)
(19 lines not shown)	while (!TraceIt.isAtEoF() && !TraceIt.getCurrentLine().startswith(" 0x")) {
uint64_t FrameAddr = 0;		uint64_t FrameAddr = 0;
if (FrameStr.getAsInteger(16, FrameAddr)) {		if (FrameStr.getAsInteger(16, FrameAddr)) {
// We might parse a non-perf sample line like empty line and comments,		// We might parse a non-perf sample line like empty line and comments,
// skip it		// skip it
TraceIt.advance();		TraceIt.advance();
return false;		return false;
}		}
TraceIt.advance();		TraceIt.advance();

		FrameAddr = Binary->canonicalizeVirtualAddress(FrameAddr);
// Currently intermixed frame from different binaries is not supported.		// Currently intermixed frame from different binaries is not supported.
if (!Binary->addressIsCode(FrameAddr)) {		if (!Binary->addressIsCode(FrameAddr)) {
if (CallStack.empty())		if (CallStack.empty())
NumLeafExternalFrame++;		NumLeafExternalFrame++;
// Push a special value(ExternalAddr) for the external frames so that		// Push a special value(ExternalAddr) for the external frames so that
// unwinder can still work on this with artificial Call/Return branch.		// unwinder can still work on this with artificial Call/Return branch.
// After unwinding, the context will be truncated for external frame.		// After unwinding, the context will be truncated for external frame.
// Also deduplicate the consecutive external addresses.		// Also deduplicate the consecutive external addresses.
(116 lines not shown)	void PerfScriptReader::writeUnsymbolizedProfile(raw_fd_ostream &OS) {
auto SCounterPrinter = [&](RangeSample &Counter, StringRef Separator,		auto SCounterPrinter = [&](RangeSample &Counter, StringRef Separator,
uint32_t Indent) {		uint32_t Indent) {
OS.indent(Indent);		OS.indent(Indent);
OS << Counter.size() << "\n";		OS << Counter.size() << "\n";
for (auto &I : Counter) {		for (auto &I : Counter) {
uint64_t Start = I.first.first;		uint64_t Start = I.first.first;
uint64_t End = I.first.second;		uint64_t End = I.first.second;

if (!UseOffset || (UseOffset && UseLoadableSegmentAsBase)) {		if (UseOffset) {
Start = Binary->offsetToVirtualAddr(Start);		if (UseLoadableSegmentAsBase) {
End = Binary->offsetToVirtualAddr(End);
}

if (UseOffset && UseLoadableSegmentAsBase) {
Start -= Binary->getFirstLoadableAddress();		Start -= Binary->getFirstLoadableAddress();
		hoy: I'm a bit confused here. What are the semantics of using both offset and LoadableSegmentAsBase?
		wlei (author): I think this is for compatibility with our internal old tool, where some services use the offset and the FirstLoadableAddress as the output of the unsymbolized profile. The original patch is https://reviews.llvm.org/D113727.
		hoy: Do we still need this now? The offset here is computed by subtracting the runtime base address from the preferred-based virtual address. This seems different from what was previously computed, i.e., subtracting the runtime base address from the runtime virtual address.
		wlei (author): It should be the same behavior as the previous offset computation. This is about how we define the "offset". Assuming the MMAP loading address is missing or just equal to the preferred base address, then `virtual address` = `offset + preferred-based-address`. For the two branches: if `UseLoadableSegmentAsBase` is false, the offset is `virtual address - preferred-based-address` --> `offset + preferred-based-address - preferred-based-address`, which is the original "offset". But when `UseLoadableSegmentAsBase` is true, the offset is `virtual address - FirstLoadableAddress` --> `offset + preferred-based-address - FirstLoadableAddress`, so it is not the `offset` as we defined it before. However, even if it's "wrong", it depends on the source of the input: our internal tool produced exactly this "wrong" offset, so llvm-profgen also needs to recover it the same way. Yeah, but if we won't use the old tool anymore, I think we can remove all the `UseLoadableSegmentAsBase` code.
		hoy: > so the offset is not the offset as we defined before. However, even it's "wrong",
		Could the "wrong" offsets now cause trouble for pseudo probes, which always have the "correct" offset? It should have worked fine previously?
		> Yeah, but if we won't use the old tool anymore, I think we can remove all the UseLoadableSegmentAsBase code.
		Agreed.
		hoy: I looked a bit deeper. The "wrong" offsets should work. `getFirstLoadableAddress` stands for the preferred first load address, not the runtime one, which was the impression I got from its definition:
		    // The runtime base address that the first loadabe segment is loaded at.
		    uint64_t FirstLoadableAddress = 0;
		Then we no longer need to remove the feature in this change.
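The two offset conventions discussed in this thread can be illustrated with a small standalone sketch. It is not the llvm-profgen code: `PreferredAddrs`, `toProfileOffset`, `fromProfileOffset` and `checkRoundTrip` are hypothetical names, and the only assumption carried over from the discussion is that both base addresses are preferred (link-time) addresses, so the reader recovers the same virtual address as long as it uses the same flag as the writer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two preferred addresses discussed above.
struct PreferredAddrs {
  uint64_t PreferredBase; // preferred load address of the executable segment
  uint64_t FirstLoadable; // preferred address of the first loadable segment
};

// Writer side: preferred-based virtual address -> offset stored in the
// unsymbolized profile.
constexpr uint64_t toProfileOffset(uint64_t Addr, const PreferredAddrs &B,
                                   bool UseLoadableSegmentAsBase) {
  return Addr - (UseLoadableSegmentAsBase ? B.FirstLoadable : B.PreferredBase);
}

// Reader side: offset read from the profile -> preferred-based virtual address.
constexpr uint64_t fromProfileOffset(uint64_t Offset, const PreferredAddrs &B,
                                     bool UseLoadableSegmentAsBase) {
  return Offset + (UseLoadableSegmentAsBase ? B.FirstLoadable : B.PreferredBase);
}

// Both conventions round-trip when writer and reader agree on the flag.
inline void checkRoundTrip(uint64_t Addr, const PreferredAddrs &B) {
  assert(fromProfileOffset(toProfileOffset(Addr, B, false), B, false) == Addr);
  assert(fromProfileOffset(toProfileOffset(Addr, B, true), B, true) == Addr);
}
```

With made-up values `PreferredBase = 0x201000` and `FirstLoadable = 0x200000`, the address `0x201a40` is written as `a40` in the default mode and as `1a40` with `UseLoadableSegmentAsBase`. The numbers differ, but each is consistent with its own reader.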
End -= Binary->getFirstLoadableAddress();		End -= Binary->getFirstLoadableAddress();
		} else {
		Start -= Binary->getPreferredBaseAddress();
		End -= Binary->getPreferredBaseAddress();
		}
}		}

OS.indent(Indent);		OS.indent(Indent);
OS << Twine::utohexstr(Start) << Separator << Twine::utohexstr(End) << ":"		OS << Twine::utohexstr(Start) << Separator << Twine::utohexstr(End) << ":"
<< I.second << "\n";		<< I.second << "\n";
}		}
};		};

[55 lines not shown]	while (Num--) {

uint64_t Source = 0;		uint64_t Source = 0;
uint64_t Target = 0;		uint64_t Target = 0;
auto Range = LineSplit.first.split(Separator);		auto Range = LineSplit.first.split(Separator);
if (Range.second.empty() || Range.first.getAsInteger(16, Source) ||		if (Range.second.empty() || Range.first.getAsInteger(16, Source) ||
Range.second.getAsInteger(16, Target))		Range.second.getAsInteger(16, Target))
exitWithErrorForTraceLine(TraceIt);		exitWithErrorForTraceLine(TraceIt);

if (!UseOffset || (UseOffset && UseLoadableSegmentAsBase)) {		if (UseOffset) {
uint64_t BaseAddr = 0;		if (UseLoadableSegmentAsBase) {
if (UseOffset && UseLoadableSegmentAsBase)		Source += Binary->getFirstLoadableAddress();
BaseAddr = Binary->getFirstLoadableAddress();		Target += Binary->getFirstLoadableAddress();
		} else {
Source = Binary->virtualAddrToOffset(Source + BaseAddr);		Source += Binary->getPreferredBaseAddress();
Target = Binary->virtualAddrToOffset(Target + BaseAddr);		Target += Binary->getPreferredBaseAddress();
		}
}		}

Counter[{Source, Target}] += Count;		Counter[{Source, Target}] += Count;
TraceIt.advance();		TraceIt.advance();
}		}
};		};

ReadCounter(SCounters.RangeCounter, "-");		ReadCounter(SCounters.RangeCounter, "-");
[21 lines not shown]

void UnsymbolizedProfileReader::parsePerfTraces() {		void UnsymbolizedProfileReader::parsePerfTraces() {
readUnsymbolizedProfile(PerfTraceFile);		readUnsymbolizedProfile(PerfTraceFile);
}		}

void PerfScriptReader::computeCounterFromLBR(const PerfSample *Sample,		void PerfScriptReader::computeCounterFromLBR(const PerfSample *Sample,
uint64_t Repeat) {		uint64_t Repeat) {
SampleCounter &Counter = SampleCounters.begin()->second;		SampleCounter &Counter = SampleCounters.begin()->second;
uint64_t EndOffeset = 0;		uint64_t EndAddress = 0;
for (const LBREntry &LBR : Sample->LBRStack) {		for (const LBREntry &LBR : Sample->LBRStack) {
uint64_t SourceOffset = Binary->virtualAddrToOffset(LBR.Source);		uint64_t SourceAddress = LBR.Source;
uint64_t TargetOffset = Binary->virtualAddrToOffset(LBR.Target);		uint64_t TargetAddress = LBR.Target;

// Record the branch if its sourceOffset is external. It can be the case an		// Record the branch if its SourceAddress is external. It can be the case an
// external source call an internal function, later this branch will be used		// external source call an internal function, later this branch will be used
// to generate the function's head sample.		// to generate the function's head sample.
if (Binary->offsetIsCode(TargetOffset)) {		if (Binary->addressIsCode(TargetAddress)) {
Counter.recordBranchCount(SourceOffset, TargetOffset, Repeat);		Counter.recordBranchCount(SourceAddress, TargetAddress, Repeat);
}		}

// If this not the first LBR, update the range count between TO of current		// If this not the first LBR, update the range count between TO of current
// LBR and FROM of next LBR.		// LBR and FROM of next LBR.
uint64_t StartOffset = TargetOffset;		uint64_t StartAddress = TargetAddress;
if (Binary->offsetIsCode(StartOffset) && Binary->offsetIsCode(EndOffeset) &&		if (Binary->addressIsCode(StartAddress) &&
isValidFallThroughRange(StartOffset, EndOffeset, Binary))		Binary->addressIsCode(EndAddress) &&
Counter.recordRangeCount(StartOffset, EndOffeset, Repeat);		isValidFallThroughRange(StartAddress, EndAddress, Binary))
EndOffeset = SourceOffset;		Counter.recordRangeCount(StartAddress, EndAddress, Repeat);
		EndAddress = SourceAddress;
}		}
}		}

void LBRPerfReader::parseSample(TraceStream &TraceIt, uint64_t Count) {		void LBRPerfReader::parseSample(TraceStream &TraceIt, uint64_t Count) {
std::shared_ptr<PerfSample> Sample = std::make_shared<PerfSample>();		std::shared_ptr<PerfSample> Sample = std::make_shared<PerfSample>();
// Parsing LBR stack and populate into PerfSample.LBRStack		// Parsing LBR stack and populate into PerfSample.LBRStack
if (extractLBRStack(TraceIt, Sample->LBRStack)) {		if (extractLBRStack(TraceIt, Sample->LBRStack)) {
warnIfMissingMMap();		warnIfMissingMMap();
[193 lines not shown]
void PerfScriptReader::warnInvalidRange() {		void PerfScriptReader::warnInvalidRange() {
std::unordered_map<std::pair<uint64_t, uint64_t>, uint64_t,		std::unordered_map<std::pair<uint64_t, uint64_t>, uint64_t,
pair_hash<uint64_t, uint64_t>>		pair_hash<uint64_t, uint64_t>>
Ranges;		Ranges;

for (const auto &Item : AggregatedSamples) {		for (const auto &Item : AggregatedSamples) {
const PerfSample *Sample = Item.first.getPtr();		const PerfSample *Sample = Item.first.getPtr();
uint64_t Count = Item.second;		uint64_t Count = Item.second;
uint64_t EndOffeset = 0;		uint64_t EndAddress = 0;
for (const LBREntry &LBR : Sample->LBRStack) {		for (const LBREntry &LBR : Sample->LBRStack) {
uint64_t SourceOffset = Binary->virtualAddrToOffset(LBR.Source);		uint64_t SourceAddress = LBR.Source;
uint64_t StartOffset = Binary->virtualAddrToOffset(LBR.Target);		uint64_t StartAddress = LBR.Target;
if (EndOffeset != 0)		if (EndAddress != 0)
Ranges[{StartOffset, EndOffeset}] += Count;		Ranges[{StartAddress, EndAddress}] += Count;
EndOffeset = SourceOffset;		EndAddress = SourceAddress;
}		}
}		}

if (Ranges.empty()) {		if (Ranges.empty()) {
WithColor::warning() << "No samples in perf script!\n";		WithColor::warning() << "No samples in perf script!\n";
return;		return;
}		}

auto WarnInvalidRange =		auto WarnInvalidRange = [&](uint64_t StartAddress, uint64_t EndAddress,
[&](uint64_t StartOffset, uint64_t EndOffset, StringRef Msg) {		StringRef Msg) {
if (!ShowDetailedWarning)		if (!ShowDetailedWarning)
return;		return;
WithColor::warning()		WithColor::warning() << "[" << format("%8" PRIx64, StartAddress) << ","
<< "["		<< format("%8" PRIx64, EndAddress) << "]: " << Msg
<< format("%8" PRIx64, Binary->offsetToVirtualAddr(StartOffset))		<< "\n";
<< ","
<< format("%8" PRIx64, Binary->offsetToVirtualAddr(EndOffset))
<< "]: " << Msg << "\n";
};		};

const char *EndNotBoundaryMsg = "Range is not on instruction boundary, "		const char *EndNotBoundaryMsg = "Range is not on instruction boundary, "
"likely due to profile and binary mismatch.";		"likely due to profile and binary mismatch.";
const char *DanglingRangeMsg = "Range does not belong to any functions, "		const char *DanglingRangeMsg = "Range does not belong to any functions, "
"likely from PLT, .init or .fini section.";		"likely from PLT, .init or .fini section.";
const char *RangeCrossFuncMsg =		const char *RangeCrossFuncMsg =
"Fall through range should not cross function boundaries, likely due to "		"Fall through range should not cross function boundaries, likely due to "
"profile and binary mismatch.";		"profile and binary mismatch.";
const char *BogusRangeMsg = "Range start is after or too far from range end.";		const char *BogusRangeMsg = "Range start is after or too far from range end.";

uint64_t TotalRangeNum = 0;		uint64_t TotalRangeNum = 0;
uint64_t InstNotBoundary = 0;		uint64_t InstNotBoundary = 0;
uint64_t UnmatchedRange = 0;		uint64_t UnmatchedRange = 0;
uint64_t RangeCrossFunc = 0;		uint64_t RangeCrossFunc = 0;
uint64_t BogusRange = 0;		uint64_t BogusRange = 0;

for (auto &I : Ranges) {		for (auto &I : Ranges) {
uint64_t StartOffset = I.first.first;		uint64_t StartAddress = I.first.first;
uint64_t EndOffset = I.first.second;		uint64_t EndAddress = I.first.second;
TotalRangeNum += I.second;		TotalRangeNum += I.second;

if (!Binary->offsetIsCode(StartOffset) ||		if (!Binary->addressIsCode(StartAddress) &&
!Binary->offsetIsTransfer(EndOffset)) {		!Binary->addressIsCode(EndAddress))
		continue;

		if (!Binary->addressIsCode(StartAddress) ||
		!Binary->addressIsTransfer(EndAddress)) {
InstNotBoundary += I.second;		InstNotBoundary += I.second;
WarnInvalidRange(StartOffset, EndOffset, EndNotBoundaryMsg);		WarnInvalidRange(StartAddress, EndAddress, EndNotBoundaryMsg);
}		}

auto *FRange = Binary->findFuncRangeForOffset(StartOffset);		auto *FRange = Binary->findFuncRange(StartAddress);
if (!FRange) {		if (!FRange) {
UnmatchedRange += I.second;		UnmatchedRange += I.second;
WarnInvalidRange(StartOffset, EndOffset, DanglingRangeMsg);		WarnInvalidRange(StartAddress, EndAddress, DanglingRangeMsg);
continue;		continue;
}		}

if (EndOffset >= FRange->EndOffset) {		if (EndAddress >= FRange->EndAddress) {
RangeCrossFunc += I.second;		RangeCrossFunc += I.second;
WarnInvalidRange(StartOffset, EndOffset, RangeCrossFuncMsg);		WarnInvalidRange(StartAddress, EndAddress, RangeCrossFuncMsg);
}		}

if (!isValidFallThroughRange(StartOffset, EndOffset, Binary)) {		if (Binary->addressIsCode(StartAddress) &&
		Binary->addressIsCode(EndAddress) &&
		!isValidFallThroughRange(StartAddress, EndAddress, Binary)) {
BogusRange += I.second;		BogusRange += I.second;
WarnInvalidRange(StartOffset, EndOffset, BogusRangeMsg);		WarnInvalidRange(StartAddress, EndAddress, BogusRangeMsg);
}		}
}		}

emitWarningSummary(		emitWarningSummary(
InstNotBoundary, TotalRangeNum,		InstNotBoundary, TotalRangeNum,
"of samples are from ranges that are not on instruction boundary.");		"of samples are from ranges that are not on instruction boundary.");
emitWarningSummary(		emitWarningSummary(
UnmatchedRange, TotalRangeNum,		UnmatchedRange, TotalRangeNum,
[20 lines not shown]	void PerfScriptReader::parsePerfTraces() {
warnTruncatedStack();		warnTruncatedStack();
warnInvalidRange();		warnInvalidRange();
generateUnsymbolizedProfile();		generateUnsymbolizedProfile();
AggregatedSamples.clear();		AggregatedSamples.clear();

if (SkipSymbolization)		if (SkipSymbolization)
writeUnsymbolizedProfile(OutputFilename);		writeUnsymbolizedProfile(OutputFilename);
}		}

		wenlei: Where do we use getBaseAddress in probe queries? Can we use getPreferredBaseAddress directly there? Ideally there should be one definition for base address, which is the actual executable segment load address, and changing it here makes it inconsistent.
		hoy: Agreed that `setBaseAddress` should be called once to be consistent. Pseudo probe decoding is based on the preferred address. I think a reasonable fix could be:
		1. make the decoding offset-based. This will require a change to the bolt code base; or
		2. do an offset to preferred addr translation for every probe lookup in llvm-profgen. This is being done in some places, e.g., `extractPrefixContextStack`.
		#2 sounds more practical. WDYT?
		wlei (author): Hmm, I found it's a very tricky case.
		> Where do we use getBaseAddress in probe queries?
		We call it in `Binary->offsetToVirtualAddr`. However, we call `IP.advance()` to iterate over all the addresses in the range, and `IP` converts the `Address` back to an `offset` using the base address, like `Offset + preferred Address - Mmap loading address`. That's another inconsistency. Then we need to change the code inside IP, but IP is also used in `PerfReader`.
		wlei (author): Or can we use the virtual address (`Offset + preferred Address`) for all the code (PerfReader and ProfileGenerator)? To do this, we can convert the address at the very beginning of PerfReader, like `Address = PhysicalAddress - Mmap loading address + PreferredAddress`.
		hoy: I see. This is complicated. How about we always use the preferred address when doing this offset to virtual addr conversion, and we rebase all LBR and stack sample addresses onto the preferred load address in the perf reader?
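The rebasing proposed in this thread, `Address = PhysicalAddress - Mmap loading address + PreferredAddress`, can be sketched in isolation as below. This is only an illustration of the arithmetic under assumed values: `rebaseToPreferred` and `checkTwoMmaps` are hypothetical names, not functions from llvm-profgen, and the sketch does not check that the sampled address actually falls inside the mapping.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: map a runtime address sampled by perf onto the
// binary's preferred load address, using the load address of the MMAP event
// that was in effect when the sample was taken.
constexpr uint64_t rebaseToPreferred(uint64_t RuntimeAddr,
                                     uint64_t MmapLoadAddr,
                                     uint64_t PreferredBase) {
  return RuntimeAddr - MmapLoadAddr + PreferredBase;
}

// The same instruction observed under two different mappings ends up at one
// canonical, preferred-based address. All values here are made up.
inline void checkTwoMmaps() {
  const uint64_t PreferredBase = 0x400000;
  assert(rebaseToPreferred(0x7f0000001234, 0x7f0000000000, PreferredBase) ==
         0x401234);
  assert(rebaseToPreferred(0x560000001234, 0x560000000000, PreferredBase) ==
         0x401234);
}
```

Doing this conversion once at parse time means later lookups, such as pseudo probe queries keyed on preferred addresses, would need no further knowledge of the runtime mapping.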
} // end namespace sampleprof		} // end namespace sampleprof
} // end namespace llvm		} // end namespace llvm

llvm/tools/llvm-profgen/ProfileGenerator.h

[102 lines not shown]	void updateBodySamplesforFunctionProfile(FunctionSamples &FunctionProfile,
uint64_t Count);		uint64_t Count);

void updateFunctionSamples();		void updateFunctionSamples();

void updateTotalSamples();		void updateTotalSamples();

void updateCallsiteSamples();		void updateCallsiteSamples();

StringRef getCalleeNameForOffset(uint64_t TargetOffset);		StringRef getCalleeNameForAddress(uint64_t TargetAddress);

void computeSummaryAndThreshold(SampleProfileMap &ProfileMap);		void computeSummaryAndThreshold(SampleProfileMap &ProfileMap);

void calculateAndShowDensity(const SampleProfileMap &Profiles);		void calculateAndShowDensity(const SampleProfileMap &Profiles);

double calculateDensity(const SampleProfileMap &Profiles,		double calculateDensity(const SampleProfileMap &Profiles,
uint64_t HotCntThreshold);		uint64_t HotCntThreshold);

[264 lines not shown]

llvm/tools/llvm-profgen/ProfileGenerator.cpp

[414 lines not shown]	bool ProfileGeneratorBase::collectFunctionsFromRawProfile(
std::unordered_set<const BinaryFunction *> &ProfiledFunctions) {		std::unordered_set<const BinaryFunction *> &ProfiledFunctions) {
if (!SampleCounters)		if (!SampleCounters)
return false;		return false;
// Go through all the stacks, ranges and branches in sample counters, use		// Go through all the stacks, ranges and branches in sample counters, use
// the start of the range to look up the function it belongs and record the		// the start of the range to look up the function it belongs and record the
// function.		// function.
for (const auto &CI : *SampleCounters) {		for (const auto &CI : *SampleCounters) {
if (const auto *CtxKey = dyn_cast<AddrBasedCtxKey>(CI.first.getPtr())) {		if (const auto *CtxKey = dyn_cast<AddrBasedCtxKey>(CI.first.getPtr())) {
for (auto Addr : CtxKey->Context) {		for (auto StackAddr : CtxKey->Context) {
if (FuncRange *FRange = Binary->findFuncRangeForOffset(		if (FuncRange *FRange = Binary->findFuncRange(StackAddr))
Binary->virtualAddrToOffset(Addr)))
ProfiledFunctions.insert(FRange->Func);		ProfiledFunctions.insert(FRange->Func);
}		}
}		}

for (auto Item : CI.second.RangeCounter) {		for (auto Item : CI.second.RangeCounter) {
uint64_t StartOffset = Item.first.first;		uint64_t StartAddress = Item.first.first;
if (FuncRange *FRange = Binary->findFuncRangeForOffset(StartOffset))		if (FuncRange *FRange = Binary->findFuncRange(StartAddress))
ProfiledFunctions.insert(FRange->Func);		ProfiledFunctions.insert(FRange->Func);
}		}

for (auto Item : CI.second.BranchCounter) {		for (auto Item : CI.second.BranchCounter) {
uint64_t SourceOffset = Item.first.first;		uint64_t SourceAddress = Item.first.first;
uint64_t TargetOffset = Item.first.first;		uint64_t TargetAddress = Item.first.first;
if (FuncRange *FRange = Binary->findFuncRangeForOffset(SourceOffset))		if (FuncRange *FRange = Binary->findFuncRange(SourceAddress))
ProfiledFunctions.insert(FRange->Func);		ProfiledFunctions.insert(FRange->Func);
if (FuncRange *FRange = Binary->findFuncRangeForOffset(TargetOffset))		if (FuncRange *FRange = Binary->findFuncRange(TargetAddress))
ProfiledFunctions.insert(FRange->Func);		ProfiledFunctions.insert(FRange->Func);
}		}
}		}
return true;		return true;
}		}

bool ProfileGenerator::collectFunctionsFromLLVMProfile(		bool ProfileGenerator::collectFunctionsFromLLVMProfile(
std::unordered_set<const BinaryFunction *> &ProfiledFunctions) {		std::unordered_set<const BinaryFunction *> &ProfiledFunctions) {
[110 lines not shown]	for (const auto &PI : ProbeCounter) {
if (Probe->isEntry())		if (Probe->isEntry())
FunctionProfile.addHeadSamples(Count);		FunctionProfile.addHeadSamples(Count);
}		}
}		}

void ProfileGenerator::populateBoundarySamplesWithProbesForAllFunctions(		void ProfileGenerator::populateBoundarySamplesWithProbesForAllFunctions(
const BranchSample &BranchCounters) {		const BranchSample &BranchCounters) {
for (const auto &Entry : BranchCounters) {		for (const auto &Entry : BranchCounters) {
uint64_t SourceOffset = Entry.first.first;		uint64_t SourceAddress = Entry.first.first;
uint64_t TargetOffset = Entry.first.second;		uint64_t TargetAddress = Entry.first.second;
uint64_t Count = Entry.second;		uint64_t Count = Entry.second;
assert(Count != 0 && "Unexpected zero weight branch");		assert(Count != 0 && "Unexpected zero weight branch");

StringRef CalleeName = getCalleeNameForOffset(TargetOffset);		StringRef CalleeName = getCalleeNameForAddress(TargetAddress);
if (CalleeName.size() == 0)		if (CalleeName.size() == 0)
continue;		continue;

uint64_t SourceAddress = Binary->offsetToVirtualAddr(SourceOffset);
const MCDecodedPseudoProbe *CallProbe =		const MCDecodedPseudoProbe *CallProbe =
Binary->getCallProbeForAddr(SourceAddress);		Binary->getCallProbeForAddr(SourceAddress);
if (CallProbe == nullptr)		if (CallProbe == nullptr)
continue;		continue;

// Record called target sample and its count.		// Record called target sample and its count.
SampleContextFrameVector FrameVec;		SampleContextFrameVector FrameVec;
Binary->getInlineContextForProbe(CallProbe, FrameVec, true);		Binary->getInlineContextForProbe(CallProbe, FrameVec, true);
[53 lines not shown]	for (auto &FuncI : Binary->getAllBinaryFunctions()) {
}		}
}		}
} else {		} else {
// For each range, we search for all ranges of the function it belongs to		// For each range, we search for all ranges of the function it belongs to
// and initialize it with zero count, so it remains zero if doesn't hit any		// and initialize it with zero count, so it remains zero if doesn't hit any
// samples. This is to be consistent with compiler that interpret zero count		// samples. This is to be consistent with compiler that interpret zero count
// as unexecuted(cold).		// as unexecuted(cold).
for (const auto &I : RangeCounter) {		for (const auto &I : RangeCounter) {
uint64_t StartOffset = I.first.first;		uint64_t StartAddress = I.first.first;
for (const auto &Range : Binary->getRangesForOffset(StartOffset))		for (const auto &Range : Binary->getRanges(StartAddress))
Ranges[{Range.first, Range.second - 1}] += 0;		Ranges[{Range.first, Range.second - 1}] += 0;
}		}
}		}
RangeSample DisjointRanges;		RangeSample DisjointRanges;
findDisjointRanges(DisjointRanges, Ranges);		findDisjointRanges(DisjointRanges, Ranges);
return DisjointRanges;		return DisjointRanges;
}		}

void ProfileGenerator::populateBodySamplesForAllFunctions(		void ProfileGenerator::populateBodySamplesForAllFunctions(
const RangeSample &RangeCounter) {		const RangeSample &RangeCounter) {
for (const auto &Range : preprocessRangeCounter(RangeCounter)) {		for (const auto &Range : preprocessRangeCounter(RangeCounter)) {
uint64_t RangeBegin = Binary->offsetToVirtualAddr(Range.first.first);		uint64_t RangeBegin = Range.first.first;
uint64_t RangeEnd = Binary->offsetToVirtualAddr(Range.first.second);		uint64_t RangeEnd = Range.first.second;
uint64_t Count = Range.second;		uint64_t Count = Range.second;

InstructionPointer IP(Binary, RangeBegin, true);		InstructionPointer IP(Binary, RangeBegin, true);
// Disjoint ranges may have range in the middle of two instr,		// Disjoint ranges may have range in the middle of two instr,
// e.g. If Instr1 at Addr1, and Instr2 at Addr2, disjoint range		// e.g. If Instr1 at Addr1, and Instr2 at Addr2, disjoint range
// can be Addr1+1 to Addr2-1. We should ignore such range.		// can be Addr1+1 to Addr2-1. We should ignore such range.
if (IP.Address > RangeEnd)		if (IP.Address > RangeEnd)
continue;		continue;

do {		do {
uint64_t Offset = Binary->virtualAddrToOffset(IP.Address);
const SampleContextFrameVector &FrameVec =		const SampleContextFrameVector &FrameVec =
Binary->getFrameLocationStack(Offset);		Binary->getFrameLocationStack(IP.Address);
if (!FrameVec.empty()) {		if (!FrameVec.empty()) {
// FIXME: As accumulating total count per instruction caused some		// FIXME: As accumulating total count per instruction caused some
// regression, we changed to accumulate total count per byte as a		// regression, we changed to accumulate total count per byte as a
// workaround. Tuning hotness threshold on the compiler side might be		// workaround. Tuning hotness threshold on the compiler side might be
// necessary in the future.		// necessary in the future.
FunctionSamples &FunctionProfile = getLeafProfileAndAddTotalSamples(		FunctionSamples &FunctionProfile = getLeafProfileAndAddTotalSamples(
FrameVec, Count * Binary->getInstSize(Offset));		FrameVec, Count * Binary->getInstSize(IP.Address));
updateBodySamplesforFunctionProfile(FunctionProfile, FrameVec.back(),		updateBodySamplesforFunctionProfile(FunctionProfile, FrameVec.back(),
Count);		Count);
}		}
} while (IP.advance() && IP.Address <= RangeEnd);		} while (IP.advance() && IP.Address <= RangeEnd);
}		}
}		}

StringRef ProfileGeneratorBase::getCalleeNameForOffset(uint64_t TargetOffset) {		StringRef
		ProfileGeneratorBase::getCalleeNameForAddress(uint64_t TargetAddress) {
// Get the function range by branch target if it's a call branch.		// Get the function range by branch target if it's a call branch.
auto *FRange = Binary->findFuncRangeForStartOffset(TargetOffset);		auto *FRange = Binary->findFuncRangeForStartAddr(TargetAddress);

// We won't accumulate sample count for a range whose start is not the real		// We won't accumulate sample count for a range whose start is not the real
// function entry such as outlined function or inner labels.		// function entry such as outlined function or inner labels.
if (!FRange || !FRange->IsFuncEntry)		if (!FRange || !FRange->IsFuncEntry)
return StringRef();		return StringRef();

return FunctionSamples::getCanonicalFnName(FRange->getFuncName());		return FunctionSamples::getCanonicalFnName(FRange->getFuncName());
}		}

void ProfileGenerator::populateBoundarySamplesForAllFunctions(		void ProfileGenerator::populateBoundarySamplesForAllFunctions(
const BranchSample &BranchCounters) {		const BranchSample &BranchCounters) {
for (const auto &Entry : BranchCounters) {		for (const auto &Entry : BranchCounters) {
uint64_t SourceOffset = Entry.first.first;		uint64_t SourceAddress = Entry.first.first;
uint64_t TargetOffset = Entry.first.second;		uint64_t TargetAddress = Entry.first.second;
uint64_t Count = Entry.second;		uint64_t Count = Entry.second;
assert(Count != 0 && "Unexpected zero weight branch");		assert(Count != 0 && "Unexpected zero weight branch");

StringRef CalleeName = getCalleeNameForOffset(TargetOffset);		StringRef CalleeName = getCalleeNameForAddress(TargetAddress);
if (CalleeName.size() == 0)		if (CalleeName.size() == 0)
continue;		continue;
// Record called target sample and its count.		// Record called target sample and its count.
const SampleContextFrameVector &FrameVec =		const SampleContextFrameVector &FrameVec =
Binary->getFrameLocationStack(SourceOffset);		Binary->getFrameLocationStack(SourceAddress);
if (!FrameVec.empty()) {		if (!FrameVec.empty()) {
FunctionSamples &FunctionProfile =		FunctionSamples &FunctionProfile =
getLeafProfileAndAddTotalSamples(FrameVec, 0);		getLeafProfileAndAddTotalSamples(FrameVec, 0);
FunctionProfile.addCalledTargetSamples(		FunctionProfile.addCalledTargetSamples(
FrameVec.back().Location.LineOffset,		FrameVec.back().Location.LineOffset,
getBaseDiscriminator(FrameVec.back().Location.Discriminator),		getBaseDiscriminator(FrameVec.back().Location.Discriminator),
CalleeName, Count);		CalleeName, Count);
}		}
[114 lines not shown]

void CSProfileGenerator::populateBodySamplesForFunction(		void CSProfileGenerator::populateBodySamplesForFunction(
FunctionSamples &FunctionProfile, const RangeSample &RangeCounter) {		FunctionSamples &FunctionProfile, const RangeSample &RangeCounter) {
// Compute disjoint ranges first, so we can use MAX		// Compute disjoint ranges first, so we can use MAX
// for calculating count for each location.		// for calculating count for each location.
RangeSample Ranges;		RangeSample Ranges;
findDisjointRanges(Ranges, RangeCounter);		findDisjointRanges(Ranges, RangeCounter);
for (const auto &Range : Ranges) {		for (const auto &Range : Ranges) {
uint64_t RangeBegin = Binary->offsetToVirtualAddr(Range.first.first);		uint64_t RangeBegin = Range.first.first;
uint64_t RangeEnd = Binary->offsetToVirtualAddr(Range.first.second);		uint64_t RangeEnd = Range.first.second;
uint64_t Count = Range.second;		uint64_t Count = Range.second;
// Disjoint ranges have introduce zero-filled gap that		// Disjoint ranges have introduce zero-filled gap that
// doesn't belong to current context, filter them out.		// doesn't belong to current context, filter them out.
if (Count == 0)		if (Count == 0)
continue;		continue;

InstructionPointer IP(Binary, RangeBegin, true);		InstructionPointer IP(Binary, RangeBegin, true);
// Disjoint ranges may have range in the middle of two instr,		// Disjoint ranges may have range in the middle of two instr,
// e.g. If Instr1 at Addr1, and Instr2 at Addr2, disjoint range		// e.g. If Instr1 at Addr1, and Instr2 at Addr2, disjoint range
// can be Addr1+1 to Addr2-1. We should ignore such range.		// can be Addr1+1 to Addr2-1. We should ignore such range.
if (IP.Address > RangeEnd)		if (IP.Address > RangeEnd)
continue;		continue;

do {		do {
uint64_t Offset = Binary->virtualAddrToOffset(IP.Address);		auto LeafLoc = Binary->getInlineLeafFrameLoc(IP.Address);
auto LeafLoc = Binary->getInlineLeafFrameLoc(Offset);
if (LeafLoc) {		if (LeafLoc) {
// Recording body sample for this specific context		// Recording body sample for this specific context
updateBodySamplesforFunctionProfile(FunctionProfile, *LeafLoc, Count);		updateBodySamplesforFunctionProfile(FunctionProfile, *LeafLoc, Count);
FunctionProfile.addTotalSamples(Count);		FunctionProfile.addTotalSamples(Count);
}		}
} while (IP.advance() && IP.Address <= RangeEnd);		} while (IP.advance() && IP.Address <= RangeEnd);
}		}
}		}

void CSProfileGenerator::populateBoundarySamplesForFunction(		void CSProfileGenerator::populateBoundarySamplesForFunction(
ContextTrieNode *Node, const BranchSample &BranchCounters) {		ContextTrieNode *Node, const BranchSample &BranchCounters) {

for (const auto &Entry : BranchCounters) {		for (const auto &Entry : BranchCounters) {
uint64_t SourceOffset = Entry.first.first;		uint64_t SourceAddress = Entry.first.first;
uint64_t TargetOffset = Entry.first.second;		uint64_t TargetAddress = Entry.first.second;
uint64_t Count = Entry.second;		uint64_t Count = Entry.second;
assert(Count != 0 && "Unexpected zero weight branch");		assert(Count != 0 && "Unexpected zero weight branch");

StringRef CalleeName = getCalleeNameForOffset(TargetOffset);		StringRef CalleeName = getCalleeNameForAddress(TargetAddress);
if (CalleeName.size() == 0)		if (CalleeName.size() == 0)
continue;		continue;

ContextTrieNode *CallerNode = Node;		ContextTrieNode *CallerNode = Node;
LineLocation CalleeCallSite(0, 0);		LineLocation CalleeCallSite(0, 0);
if (CallerNode != &getRootContext()) {		if (CallerNode != &getRootContext()) {
// Record called target sample and its count		// Record called target sample and its count
auto LeafLoc = Binary->getInlineLeafFrameLoc(SourceOffset);		auto LeafLoc = Binary->getInlineLeafFrameLoc(SourceAddress);
if (LeafLoc) {		if (LeafLoc) {
CallerNode->getFunctionSamples()->addCalledTargetSamples(		CallerNode->getFunctionSamples()->addCalledTargetSamples(
LeafLoc->Location.LineOffset,		LeafLoc->Location.LineOffset,
getBaseDiscriminator(LeafLoc->Location.Discriminator), CalleeName,		getBaseDiscriminator(LeafLoc->Location.Discriminator), CalleeName,
Count);		Count);
// Record head sample for called target(callee)		// Record head sample for called target(callee)
CalleeCallSite = LeafLoc->Location;		CalleeCallSite = LeafLoc->Location;
}		}
[151 lines not shown]	void ProfileGeneratorBase::extractProbesFromRange(
const RangeSample *PRanges = &RangeCounter;		const RangeSample *PRanges = &RangeCounter;
RangeSample Ranges;		RangeSample Ranges;
if (FindDisjointRanges) {		if (FindDisjointRanges) {
findDisjointRanges(Ranges, RangeCounter);		findDisjointRanges(Ranges, RangeCounter);
PRanges = &Ranges;		PRanges = &Ranges;
}		}

for (const auto &Range : *PRanges) {		for (const auto &Range : *PRanges) {
uint64_t RangeBegin = Binary->offsetToVirtualAddr(Range.first.first);		uint64_t RangeBegin = Range.first.first;
uint64_t RangeEnd = Binary->offsetToVirtualAddr(Range.first.second);		uint64_t RangeEnd = Range.first.second;
uint64_t Count = Range.second;		uint64_t Count = Range.second;

InstructionPointer IP(Binary, RangeBegin, true);		InstructionPointer IP(Binary, RangeBegin, true);
// Disjoint ranges may have range in the middle of two instr,		// Disjoint ranges may have range in the middle of two instr,
// e.g. If Instr1 at Addr1, and Instr2 at Addr2, disjoint range		// e.g. If Instr1 at Addr1, and Instr2 at Addr2, disjoint range
// can be Addr1+1 to Addr2-1. We should ignore such range.		// can be Addr1+1 to Addr2-1. We should ignore such range.
if (IP.Address > RangeEnd)		if (IP.Address > RangeEnd)
continue;		continue;

do {		do {
const AddressProbesMap &Address2ProbesMap =		const AddressProbesMap &Address2ProbesMap =
Binary->getAddress2ProbesMap();		Binary->getAddress2ProbesMap();
auto It = Address2ProbesMap.find(IP.Address);		auto It = Address2ProbesMap.find(IP.Address);
if (It != Address2ProbesMap.end()) {		if (It != Address2ProbesMap.end()) {
for (const auto &Probe : It->second) {		for (const auto &Probe : It->second) {
ProbeCounter[&Probe] += Count;		ProbeCounter[&Probe] += Count;
}		}
}		}
} while (IP.advance() && IP.Address <= RangeEnd);		} while (IP.advance() && IP.Address <= RangeEnd);
}		}
}		}

static void		static void extractPrefixContextStack(SampleContextFrameVector &ContextStack,
extractPrefixContextStack(SampleContextFrameVector &ContextStack,		const SmallVectorImpl<uint64_t> &AddrVec,
const SmallVectorImpl<uint64_t> &Addresses,
ProfiledBinary *Binary) {		ProfiledBinary *Binary) {
SmallVector<const MCDecodedPseudoProbe *, 16> Probes;		SmallVector<const MCDecodedPseudoProbe *, 16> Probes;
for (auto Addr : reverse(Addresses)) {		for (auto Address : reverse(AddrVec)) {
const MCDecodedPseudoProbe *CallProbe = Binary->getCallProbeForAddr(Addr);		const MCDecodedPseudoProbe *CallProbe =
		Binary->getCallProbeForAddr(Address);
// These could be the cases when a probe is not found at a callsite. Cutting		// These could be the cases when a probe is not found at a callsite. Cutting
// off the context from here since the inliner will not know how to consume		// off the context from here since the inliner will not know how to consume
// a context with unknown callsites.		// a context with unknown callsites.
// 1. for functions that are not sampled when		// 1. for functions that are not sampled when
// --decode-probe-for-profiled-functions-only is on.		// --decode-probe-for-profiled-functions-only is on.
// 2. for a merged callsite. Callsite merging may cause the loss of original		// 2. for a merged callsite. Callsite merging may cause the loss of original
// probe IDs.		// probe IDs.
// 3. for an external callsite.		// 3. for an external callsite.
(86 lines not shown)	for (auto *FunctionProfile : I.second) {
}		}
}		}
}		}
}		}

void CSProfileGenerator::populateBoundarySamplesWithProbes(		void CSProfileGenerator::populateBoundarySamplesWithProbes(
const BranchSample &BranchCounter, SampleContextFrames ContextStack) {		const BranchSample &BranchCounter, SampleContextFrames ContextStack) {
for (const auto &BI : BranchCounter) {		for (const auto &BI : BranchCounter) {
uint64_t SourceOffset = BI.first.first;		uint64_t SourceAddress = BI.first.first;
uint64_t TargetOffset = BI.first.second;		uint64_t TargetAddress = BI.first.second;
uint64_t Count = BI.second;		uint64_t Count = BI.second;
uint64_t SourceAddress = Binary->offsetToVirtualAddr(SourceOffset);
const MCDecodedPseudoProbe *CallProbe =		const MCDecodedPseudoProbe *CallProbe =
Binary->getCallProbeForAddr(SourceAddress);		Binary->getCallProbeForAddr(SourceAddress);
if (CallProbe == nullptr)		if (CallProbe == nullptr)
continue;		continue;
FunctionSamples &FunctionProfile =		FunctionSamples &FunctionProfile =
getFunctionProfileForLeafProbe(ContextStack, CallProbe);		getFunctionProfileForLeafProbe(ContextStack, CallProbe);
FunctionProfile.addBodySamples(CallProbe->getIndex(), 0, Count);		FunctionProfile.addBodySamples(CallProbe->getIndex(), 0, Count);
FunctionProfile.addTotalSamples(Count);		FunctionProfile.addTotalSamples(Count);
StringRef CalleeName = getCalleeNameForOffset(TargetOffset);		StringRef CalleeName = getCalleeNameForAddress(TargetAddress);
if (CalleeName.size() == 0)		if (CalleeName.size() == 0)
continue;		continue;
FunctionProfile.addCalledTargetSamples(CallProbe->getIndex(), 0, CalleeName,		FunctionProfile.addCalledTargetSamples(CallProbe->getIndex(), 0, CalleeName,
Count);		Count);
}		}
}		}

ContextTrieNode *CSProfileGenerator::getContextNodeForLeafProbe(		ContextTrieNode *CSProfileGenerator::getContextNodeForLeafProbe(
(33 lines not shown)
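
The range loop near the top of this file can be read in isolation as follows. This is a minimal sketch, not the llvm-profgen code: `accumulateRange`, `SortedCodeAddrs`, and the per-address `Counter` map are hypothetical stand-ins for `InstructionPointer`, the binary's sorted code address array, and `ProbeCounter`. It assumes, as the diff now does, that `RangeBegin` and `RangeEnd` are already addresses relative to the preferred load address, so no offset conversion is needed before the lookup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper: add Count to every instruction address that falls in
// the inclusive range [RangeBegin, RangeEnd]. SortedCodeAddrs must be sorted
// in increasing order.
inline void accumulateRange(const std::vector<uint64_t> &SortedCodeAddrs,
                            uint64_t RangeBegin, uint64_t RangeEnd,
                            uint64_t Count,
                            std::map<uint64_t, uint64_t> &Counter) {
  // Round RangeBegin up to the next valid instruction address.
  auto It = std::lower_bound(SortedCodeAddrs.begin(), SortedCodeAddrs.end(),
                             RangeBegin);
  // If the range lies strictly between two instructions, the rounded address
  // is already past RangeEnd and the loop body never runs.
  for (; It != SortedCodeAddrs.end() && *It <= RangeEnd; ++It)
    Counter[*It] += Count;
}
```

In the real code the count is added to each probe found at the address rather than to the address itself; the sketch only shows the address walk and the skip of ranges that contain no instruction.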

llvm/tools/llvm-profgen/ProfiledBinary.h

(50 lines not shown)

namespace llvm {		namespace llvm {
namespace sampleprof {		namespace sampleprof {

class ProfiledBinary;		class ProfiledBinary;

struct InstructionPointer {		struct InstructionPointer {
const ProfiledBinary *Binary;		const ProfiledBinary *Binary;
union {		// Address of the executable segment of the binary.
// Offset of the executable segment of the binary.
uint64_t Offset = 0;
// Also used as address in unwinder
uint64_t Address;		uint64_t Address;
};
// Index to the sorted code address array of the binary.		// Index to the sorted code address array of the binary.
uint64_t Index = 0;		uint64_t Index = 0;
InstructionPointer(const ProfiledBinary *Binary, uint64_t Address,		InstructionPointer(const ProfiledBinary *Binary, uint64_t Address,
bool RoundToNext = false);		bool RoundToNext = false);
bool advance();		bool advance();
bool backward();		bool backward();
void update(uint64_t Addr);		void update(uint64_t Addr);
};		};
(22 lines not shown)	uint64_t getFuncSize() {
}		}
return Sum;		return Sum;
}		}
};		};

// Info about function range. A function can be split into multiple		// Info about function range. A function can be split into multiple
// non-continuous ranges, each range corresponds to one FuncRange.		// non-continuous ranges, each range corresponds to one FuncRange.
struct FuncRange {		struct FuncRange {
uint64_t StartOffset;		uint64_t StartAddress;
// EndOffset is an exclusive bound.		// EndAddress is an exclusive bound.
uint64_t EndOffset;		uint64_t EndAddress;
// Function the range belongs to		// Function the range belongs to
BinaryFunction *Func;		BinaryFunction *Func;
// Whether the start offset is the real entry of the function.		// Whether the start address is the real entry of the function.
bool IsFuncEntry = false;		bool IsFuncEntry = false;

StringRef getFuncName() { return Func->FuncName; }		StringRef getFuncName() { return Func->FuncName; }
};		};

// PrologEpilog offset tracker, used to filter out broken stack samples		// PrologEpilog address tracker, used to filter out broken stack samples
// Currently we use a heuristic size (two) to infer prolog and epilog		// Currently we use a heuristic size (two) to infer prolog and epilog
// based on the start address and return address. In the future,		// based on the start address and return address. In the future,
// we will switch to Dwarf CFI based tracker		// we will switch to Dwarf CFI based tracker
struct PrologEpilogTracker {		struct PrologEpilogTracker {
// A set of prolog and epilog offsets. Used by virtual unwinding.		// A set of prolog and epilog addresses. Used by virtual unwinding.
std::unordered_set<uint64_t> PrologEpilogSet;		std::unordered_set<uint64_t> PrologEpilogSet;
ProfiledBinary *Binary;		ProfiledBinary *Binary;
PrologEpilogTracker(ProfiledBinary *Bin) : Binary(Bin){};		PrologEpilogTracker(ProfiledBinary *Bin) : Binary(Bin){};

// Take the two addresses from the start of function as prolog		// Take the two addresses from the start of function as prolog
void inferPrologOffsets(std::map<uint64_t, FuncRange> &FuncStartOffsetMap) {		void
for (auto I : FuncStartOffsetMap) {		inferPrologAddresses(std::map<uint64_t, FuncRange> &FuncStartAddressMap) {
		for (auto I : FuncStartAddressMap) {
PrologEpilogSet.insert(I.first);		PrologEpilogSet.insert(I.first);
InstructionPointer IP(Binary, I.first);		InstructionPointer IP(Binary, I.first);
if (!IP.advance())		if (!IP.advance())
break;		break;
PrologEpilogSet.insert(IP.Offset);		PrologEpilogSet.insert(IP.Address);
}		}
}		}

// Take the last two addresses before the return address as epilog		// Take the last two addresses before the return address as epilog
void inferEpilogOffsets(std::unordered_set<uint64_t> &RetAddrs) {		void inferEpilogAddresses(std::unordered_set<uint64_t> &RetAddrs) {
for (auto Addr : RetAddrs) {		for (auto Addr : RetAddrs) {
PrologEpilogSet.insert(Addr);		PrologEpilogSet.insert(Addr);
InstructionPointer IP(Binary, Addr);		InstructionPointer IP(Binary, Addr);
if (!IP.backward())		if (!IP.backward())
break;		break;
PrologEpilogSet.insert(IP.Offset);		PrologEpilogSet.insert(IP.Address);
}		}
}		}
};		};

// Track function byte size under different context (outlined version as well as		// Track function byte size under different context (outlined version as well as
// various inlined versions). It also provides query support to get function		// various inlined versions). It also provides query support to get function
// size with the best matching context, which is used to help pre-inliner use		// size with the best matching context, which is used to help pre-inliner use
// accurate post-optimization size to make decisions.		// accurate post-optimization size to make decisions.
(27 lines not shown)	private:
// Root node for context trie tree, note that this is a reverse context trie		// Root node for context trie tree, note that this is a reverse context trie
// with callee as parent and caller as child. This way we can traverse from		// with callee as parent and caller as child. This way we can traverse from
// root to find the best/longest matching context if an exact match does not		// root to find the best/longest matching context if an exact match does not
// exist. It gives us the best possible estimate for function's post-inline,		// exist. It gives us the best possible estimate for function's post-inline,
// post-optimization byte size.		// post-optimization byte size.
ContextTrieNode RootContext;		ContextTrieNode RootContext;
};		};

using OffsetRange = std::pair<uint64_t, uint64_t>;		using AddressRange = std::pair<uint64_t, uint64_t>;

class ProfiledBinary {		class ProfiledBinary {
// Absolute path of the executable binary.		// Absolute path of the executable binary.
std::string Path;		std::string Path;
// Path of the debug info binary.		// Path of the debug info binary.
std::string DebugBinaryPath;		std::string DebugBinaryPath;
// Path of symbolizer path which should be pointed to binary with debug info.		// Path of symbolizer path which should be pointed to binary with debug info.
StringRef SymbolizerPath;		StringRef SymbolizerPath;
(21 lines not shown)	class ProfiledBinary {
std::set<std::pair<uint64_t, uint64_t>> TextSections;		std::set<std::pair<uint64_t, uint64_t>> TextSections;

// A map of mapping function name to BinaryFunction info.		// A map of mapping function name to BinaryFunction info.
std::unordered_map<std::string, BinaryFunction> BinaryFunctions;		std::unordered_map<std::string, BinaryFunction> BinaryFunctions;

// A list of binary functions that have samples.		// A list of binary functions that have samples.
std::unordered_set<const BinaryFunction *> ProfiledFunctions;		std::unordered_set<const BinaryFunction *> ProfiledFunctions;

// An ordered map of mapping function's start offset to function range		// An ordered map of mapping function's start address to function range
// relevant info. Currently to determine if the offset of ELF is the start of		// relevant info. Currently to determine if the address of ELF is the start of
// a real function, we leverage the function range info from DWARF.		// a real function, we leverage the function range info from DWARF.
std::map<uint64_t, FuncRange> StartOffset2FuncRangeMap;		std::map<uint64_t, FuncRange> StartAddrToFuncRangeMap;

// Offset to context location map. Used to expand the context.		// Address to context location map. Used to expand the context.
std::unordered_map<uint64_t, SampleContextFrameVector> Offset2LocStackMap;		std::unordered_map<uint64_t, SampleContextFrameVector> AddressToLocStackMap;

// Offset to instruction size map. Also used for quick offset lookup.		// Address to instruction size map. Also used for quick Address lookup.
std::unordered_map<uint64_t, uint64_t> Offset2InstSizeMap;		std::unordered_map<uint64_t, uint64_t> AddressToInstSizeMap;

// An array of offsets of all instructions sorted in increasing order. The		// An array of Addresses of all instructions sorted in increasing order. The
// sorting is needed to fast advance to the next forward/backward instruction.		// sorting is needed to fast advance to the next forward/backward instruction.
std::vector<uint64_t> CodeAddrOffsets;		std::vector<uint64_t> CodeAddressVec;
// A set of call instruction offsets. Used by virtual unwinding.		// A set of call instruction addresses. Used by virtual unwinding.
std::unordered_set<uint64_t> CallOffsets;		std::unordered_set<uint64_t> CallAddressSet;
// A set of return instruction offsets. Used by virtual unwinding.		// A set of return instruction addresses. Used by virtual unwinding.
std::unordered_set<uint64_t> RetOffsets;		std::unordered_set<uint64_t> RetAddressSet;
// An ordered set of unconditional branch instruction offsets.		// An ordered set of unconditional branch instruction addresses.
std::set<uint64_t> UncondBranchOffsets;		std::set<uint64_t> UncondBranchAddrSet;
// A set of branch instruction offsets.		// A set of branch instruction addresses.
std::unordered_set<uint64_t> BranchOffsets;		std::unordered_set<uint64_t> BranchAddressSet;

// Estimate and track function prolog and epilog ranges.		// Estimate and track function prolog and epilog ranges.
PrologEpilogTracker ProEpilogTracker;		PrologEpilogTracker ProEpilogTracker;

// Track function sizes under different context		// Track function sizes under different context
BinarySizeContextTracker FuncSizeTracker;		BinarySizeContextTracker FuncSizeTracker;

// The symbolizer used to get inline context for an instruction.		// The symbolizer used to get inline context for an instruction.
(43 lines not shown)	class ProfiledBinary {

// Load debug info of subprograms from DWARF section.		// Load debug info of subprograms from DWARF section.
void loadSymbolsFromDWARF(ObjectFile &Obj);		void loadSymbolsFromDWARF(ObjectFile &Obj);

// Load debug info from DWARF unit.		// Load debug info from DWARF unit.
void loadSymbolsFromDWARFUnit(DWARFUnit &CompilationUnit);		void loadSymbolsFromDWARFUnit(DWARFUnit &CompilationUnit);

// A function may be split into multiple non-continuous address ranges. We use		// A function may be split into multiple non-continuous address ranges. We use
// this to set whether start offset of a function is the real entry of the		// this to set whether start address of a function is the real entry of the
// function and also set false to the non-function label.		// function and also set false to the non-function label.
void setIsFuncEntry(uint64_t Offset, StringRef RangeSymName);		void setIsFuncEntry(uint64_t Address, StringRef RangeSymName);

// Warn if no entry range exists in the function.		// Warn if no entry range exists in the function.
void warnNoFuncEntry();		void warnNoFuncEntry();

/// Disassemble the text section and build various address maps.		/// Disassemble the text section and build various address maps.
void disassemble(const ELFObjectFileBase *O);		void disassemble(const ELFObjectFileBase *O);

/// Helper function to disassemble the symbol and extract info for unwinding		/// Helper function to disassemble the symbol and extract info for unwinding
(20 lines not shown)	ProfiledBinary(const StringRef ExeBinPath, const StringRef DebugBinPath)
// Point to executable binary if debug info binary is not specified.		// Point to executable binary if debug info binary is not specified.
SymbolizerPath = DebugBinPath.empty() ? ExeBinPath : DebugBinPath;		SymbolizerPath = DebugBinPath.empty() ? ExeBinPath : DebugBinPath;
setupSymbolizer();		setupSymbolizer();
load();		load();
}		}

void decodePseudoProbe();		void decodePseudoProbe();

uint64_t virtualAddrToOffset(uint64_t VirtualAddress) const {
return VirtualAddress - BaseAddress;
}
uint64_t offsetToVirtualAddr(uint64_t Offset) const {
return Offset + BaseAddress;
}
StringRef getPath() const { return Path; }		StringRef getPath() const { return Path; }
StringRef getName() const { return llvm::sys::path::filename(Path); }		StringRef getName() const { return llvm::sys::path::filename(Path); }
uint64_t getBaseAddress() const { return BaseAddress; }		uint64_t getBaseAddress() const { return BaseAddress; }
void setBaseAddress(uint64_t Address) { BaseAddress = Address; }		void setBaseAddress(uint64_t Address) { BaseAddress = Address; }

		// Canonicalize to use preferred load address as base address.
		uint64_t canonicalizeVirtualAddress(uint64_t Address) {
		wenlei (inline comment): nit: `canonicalizeVirtualAddress` is probably a better name.
		return Address - BaseAddress + getPreferredBaseAddress();
		}
// Return the preferred load address for the first executable segment.		// Return the preferred load address for the first executable segment.
uint64_t getPreferredBaseAddress() const { return PreferredTextSegmentAddresses[0]; }		uint64_t getPreferredBaseAddress() const { return PreferredTextSegmentAddresses[0]; }
// Return the preferred load address for the first loadable segment.		// Return the preferred load address for the first loadable segment.
uint64_t getFirstLoadableAddress() const { return FirstLoadableAddress; }		uint64_t getFirstLoadableAddress() const { return FirstLoadableAddress; }
// Return the file offset for the first executable segment.		// Return the file offset for the first executable segment.
uint64_t getTextSegmentOffset() const { return TextSegmentOffsets[0]; }		uint64_t getTextSegmentOffset() const { return TextSegmentOffsets[0]; }
const std::vector<uint64_t> &getPreferredTextSegmentAddresses() const {		const std::vector<uint64_t> &getPreferredTextSegmentAddresses() const {
return PreferredTextSegmentAddresses;		return PreferredTextSegmentAddresses;
}		}
const std::vector<uint64_t> &getTextSegmentOffsets() const {		const std::vector<uint64_t> &getTextSegmentOffsets() const {
return TextSegmentOffsets;		return TextSegmentOffsets;
}		}

uint64_t getInstSize(uint64_t Offset) const {		uint64_t getInstSize(uint64_t Address) const {
auto I = Offset2InstSizeMap.find(Offset);		auto I = AddressToInstSizeMap.find(Address);
if (I == Offset2InstSizeMap.end())		if (I == AddressToInstSizeMap.end())
return 0;		return 0;
return I->second;		return I->second;
}		}

bool offsetIsCode(uint64_t Offset) const {
return Offset2InstSizeMap.find(Offset) != Offset2InstSizeMap.end();
}
bool addressIsCode(uint64_t Address) const {		bool addressIsCode(uint64_t Address) const {
uint64_t Offset = virtualAddrToOffset(Address);		return AddressToInstSizeMap.find(Address) != AddressToInstSizeMap.end();
return offsetIsCode(Offset);
}		}

bool addressIsCall(uint64_t Address) const {		bool addressIsCall(uint64_t Address) const {
uint64_t Offset = virtualAddrToOffset(Address);		return CallAddressSet.count(Address);
return CallOffsets.count(Offset);
}		}
bool addressIsReturn(uint64_t Address) const {		bool addressIsReturn(uint64_t Address) const {
uint64_t Offset = virtualAddrToOffset(Address);		return RetAddressSet.count(Address);
return RetOffsets.count(Offset);
}		}
bool addressInPrologEpilog(uint64_t Address) const {		bool addressInPrologEpilog(uint64_t Address) const {
uint64_t Offset = virtualAddrToOffset(Address);		return ProEpilogTracker.PrologEpilogSet.count(Address);
return ProEpilogTracker.PrologEpilogSet.count(Offset);
}		}

bool offsetIsTransfer(uint64_t Offset) {		bool addressIsTransfer(uint64_t Address) {
return BranchOffsets.count(Offset) || RetOffsets.count(Offset) ||		return BranchAddressSet.count(Address) || RetAddressSet.count(Address) ||
CallOffsets.count(Offset);		CallAddressSet.count(Address);
}		}

bool rangeCrossUncondBranch(uint64_t Start, uint64_t End) {		bool rangeCrossUncondBranch(uint64_t Start, uint64_t End) {
if (Start >= End)		if (Start >= End)
return false;		return false;
auto R = UncondBranchOffsets.lower_bound(Start);		auto R = UncondBranchAddrSet.lower_bound(Start);
return R != UncondBranchOffsets.end() && *R < End;		return R != UncondBranchAddrSet.end() && *R < End;
}		}

uint64_t getAddressforIndex(uint64_t Index) const {		uint64_t getAddressforIndex(uint64_t Index) const {
return offsetToVirtualAddr(CodeAddrOffsets[Index]);		return CodeAddressVec[Index];
}		}

size_t getCodeOffsetsSize() const { return CodeAddrOffsets.size(); }		size_t getCodeAddrVecSize() const { return CodeAddressVec.size(); }

bool usePseudoProbes() const { return UsePseudoProbes; }		bool usePseudoProbes() const { return UsePseudoProbes; }
bool useFSDiscriminator() const { return UseFSDiscriminator; }		bool useFSDiscriminator() const { return UseFSDiscriminator; }
// Get the index in CodeAddrOffsets for the address		// Get the index in CodeAddressVec for the address
// As we might get an address which is not the code		// As we might get an address which is not the code
// here it would round to the next valid code address by		// here it would round to the next valid code address by
// using lower bound operation		// using lower bound operation
uint32_t getIndexForOffset(uint64_t Offset) const {
auto Low = llvm::lower_bound(CodeAddrOffsets, Offset);
return Low - CodeAddrOffsets.begin();
}
uint32_t getIndexForAddr(uint64_t Address) const {		uint32_t getIndexForAddr(uint64_t Address) const {
uint64_t Offset = virtualAddrToOffset(Address);		auto Low = llvm::lower_bound(CodeAddressVec, Address);
return getIndexForOffset(Offset);		return Low - CodeAddressVec.begin();
}		}

uint64_t getCallAddrFromFrameAddr(uint64_t FrameAddr) const {		uint64_t getCallAddrFromFrameAddr(uint64_t FrameAddr) const {
if (FrameAddr == ExternalAddr)		if (FrameAddr == ExternalAddr)
return ExternalAddr;		return ExternalAddr;
auto I = getIndexForAddr(FrameAddr);		auto I = getIndexForAddr(FrameAddr);
FrameAddr = I ? getAddressforIndex(I - 1) : 0;		FrameAddr = I ? getAddressforIndex(I - 1) : 0;
if (FrameAddr && addressIsCall(FrameAddr))		if (FrameAddr && addressIsCall(FrameAddr))
return FrameAddr;		return FrameAddr;
return 0;		return 0;
}		}

FuncRange *findFuncRangeForStartOffset(uint64_t Offset) {		FuncRange *findFuncRangeForStartAddr(uint64_t Address) {
auto I = StartOffset2FuncRangeMap.find(Offset);		auto I = StartAddrToFuncRangeMap.find(Address);
if (I == StartOffset2FuncRangeMap.end())		if (I == StartAddrToFuncRangeMap.end())
return nullptr;		return nullptr;
return &I->second;		return &I->second;
}		}

// Binary search the function range which includes the input offset.		// Binary search the function range which includes the input address.
FuncRange *findFuncRangeForOffset(uint64_t Offset) {		FuncRange *findFuncRange(uint64_t Address) {
auto I = StartOffset2FuncRangeMap.upper_bound(Offset);		auto I = StartAddrToFuncRangeMap.upper_bound(Address);
if (I == StartOffset2FuncRangeMap.begin())		if (I == StartAddrToFuncRangeMap.begin())
return nullptr;		return nullptr;
I--;		I--;

if (Offset >= I->second.EndOffset)		if (Address >= I->second.EndAddress)
return nullptr;		return nullptr;

return &I->second;		return &I->second;
}		}

// Get all ranges of one function.		// Get all ranges of one function.
RangesTy getRangesForOffset(uint64_t Offset) {		RangesTy getRanges(uint64_t Address) {
auto *FRange = findFuncRangeForOffset(Offset);		auto *FRange = findFuncRange(Address);
// Ignore the range which falls into plt section or system lib.		// Ignore the range which falls into plt section or system lib.
if (!FRange)		if (!FRange)
return RangesTy();		return RangesTy();

return FRange->Func->Ranges;		return FRange->Func->Ranges;
}		}

const std::unordered_map<std::string, BinaryFunction> &		const std::unordered_map<std::string, BinaryFunction> &
(19 lines not shown)	public:
uint32_t getFuncSizeForContext(const ContextTrieNode *ContextNode) {		uint32_t getFuncSizeForContext(const ContextTrieNode *ContextNode) {
return FuncSizeTracker.getFuncSizeForContext(ContextNode);		return FuncSizeTracker.getFuncSizeForContext(ContextNode);
}		}

// Load the symbols from debug table and populate into symbol list.		// Load the symbols from debug table and populate into symbol list.
void populateSymbolListFromDWARF(ProfileSymbolList &SymbolList);		void populateSymbolListFromDWARF(ProfileSymbolList &SymbolList);

const SampleContextFrameVector &		const SampleContextFrameVector &
getFrameLocationStack(uint64_t Offset, bool UseProbeDiscriminator = false) {		getFrameLocationStack(uint64_t Address, bool UseProbeDiscriminator = false) {
auto I = Offset2LocStackMap.emplace(Offset, SampleContextFrameVector());		auto I = AddressToLocStackMap.emplace(Address, SampleContextFrameVector());
if (I.second) {		if (I.second) {
InstructionPointer IP(this, Offset);		InstructionPointer IP(this, Address);
I.first->second = symbolize(IP, true, UseProbeDiscriminator);		I.first->second = symbolize(IP, true, UseProbeDiscriminator);
}		}
return I.first->second;		return I.first->second;
}		}

Optional<SampleContextFrame> getInlineLeafFrameLoc(uint64_t Offset) {		Optional<SampleContextFrame> getInlineLeafFrameLoc(uint64_t Address) {
const auto &Stack = getFrameLocationStack(Offset);		const auto &Stack = getFrameLocationStack(Address);
if (Stack.empty())		if (Stack.empty())
return {};		return {};
return Stack.back();		return Stack.back();
}		}

void flushSymbolizer() { Symbolizer.reset(); }		void flushSymbolizer() { Symbolizer.reset(); }

// Compare two addresses' inline context		// Compare two addresses' inline context
bool inlineContextEqual(uint64_t Add1, uint64_t Add2);		bool inlineContextEqual(uint64_t Add1, uint64_t Add2);

// Get the full context of the current stack with inline context filled in.		// Get the full context of the current stack with inline context filled in.
// It will search the disassembling info stored in Offset2LocStackMap. This is		// It will search the disassembling info stored in AddressToLocStackMap. This
// used as the key of function sample map		// is used as the key of function sample map
SampleContextFrameVector		SampleContextFrameVector
getExpandedContext(const SmallVectorImpl<uint64_t> &Stack,		getExpandedContext(const SmallVectorImpl<uint64_t> &Stack,
bool &WasLeafInlined);		bool &WasLeafInlined);
// Go through instructions among the given range and record its size for the		// Go through instructions among the given range and record its size for the
// inline context.		// inline context.
void computeInlinedContextSizeForRange(uint64_t StartOffset,		void computeInlinedContextSizeForRange(uint64_t StartAddress,
uint64_t EndOffset);		uint64_t EndAddress);

void computeInlinedContextSizeForFunc(const BinaryFunction *Func);		void computeInlinedContextSizeForFunc(const BinaryFunction *Func);

const MCDecodedPseudoProbe *getCallProbeForAddr(uint64_t Address) const {		const MCDecodedPseudoProbe *getCallProbeForAddr(uint64_t Address) const {
return ProbeDecoder.getCallProbeForAddr(Address);		return ProbeDecoder.getCallProbeForAddr(Address);
}		}

void getInlineContextForProbe(const MCDecodedPseudoProbe *Probe,		void getInlineContextForProbe(const MCDecodedPseudoProbe *Probe,
(43 lines not shown)
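
The arithmetic behind `canonicalizeVirtualAddress` can be checked with a small standalone sketch. `BinaryModel` and its two fields are hypothetical stand-ins for the corresponding `ProfiledBinary` state, and the addresses used with it are made up for illustration. The point is that once each runtime address is rebased onto the preferred load address at parse time, samples from MMAP events with different load addresses land on the same canonical address, which is also the address space that pseudo probe lookups expect according to the summary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two values ProfiledBinary consults here.
struct BinaryModel {
  // Runtime load address taken from the most recent MMAP event.
  uint64_t BaseAddress;
  // Preferred load address of the first executable segment.
  uint64_t PreferredBaseAddress;

  // Rebase a runtime address onto the preferred load address.
  uint64_t canonicalizeVirtualAddress(uint64_t Address) const {
    return Address - BaseAddress + PreferredBaseAddress;
  }
};
```

For example, with a preferred base of `0x200000`, runtime address `0x7f0000401234` under base `0x7f0000400000` and runtime address `0x7f1000401234` under base `0x7f1000400000` both canonicalize to `0x201234`.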

llvm/tools/llvm-profgen/ProfiledBinary.cpp

(159 lines not shown)

void ProfiledBinary::warnNoFuncEntry() {		void ProfiledBinary::warnNoFuncEntry() {
uint64_t NoFuncEntryNum = 0;		uint64_t NoFuncEntryNum = 0;
for (auto &F : BinaryFunctions) {		for (auto &F : BinaryFunctions) {
if (F.second.Ranges.empty())		if (F.second.Ranges.empty())
continue;		continue;
bool hasFuncEntry = false;		bool hasFuncEntry = false;
for (auto &R : F.second.Ranges) {		for (auto &R : F.second.Ranges) {
if (FuncRange *FR = findFuncRangeForStartOffset(R.first)) {		if (FuncRange *FR = findFuncRangeForStartAddr(R.first)) {
if (FR->IsFuncEntry) {		if (FR->IsFuncEntry) {
hasFuncEntry = true;		hasFuncEntry = true;
break;		break;
}		}
}		}
}		}

if (!hasFuncEntry) {		if (!hasFuncEntry) {
(42 lines not shown)	void ProfiledBinary::load() {
} else {		} else {
loadSymbolsFromDWARF(*cast<ObjectFile>(&ExeBinary));		loadSymbolsFromDWARF(*cast<ObjectFile>(&ExeBinary));
}		}

// Disassemble the text sections.		// Disassemble the text sections.
disassemble(Obj);		disassemble(Obj);

// Use function start and return address to infer prolog and epilog		// Use function start and return address to infer prolog and epilog
ProEpilogTracker.inferPrologOffsets(StartOffset2FuncRangeMap);		ProEpilogTracker.inferPrologAddresses(StartAddrToFuncRangeMap);
ProEpilogTracker.inferEpilogOffsets(RetOffsets);		ProEpilogTracker.inferEpilogAddresses(RetAddressSet);

warnNoFuncEntry();		warnNoFuncEntry();

// TODO: decode other sections.		// TODO: decode other sections.
}		}

bool ProfiledBinary::inlineContextEqual(uint64_t Address1, uint64_t Address2) {		bool ProfiledBinary::inlineContextEqual(uint64_t Address1, uint64_t Address2) {
uint64_t Offset1 = virtualAddrToOffset(Address1);		const SampleContextFrameVector &Context1 = getFrameLocationStack(Address1);
uint64_t Offset2 = virtualAddrToOffset(Address2);		const SampleContextFrameVector &Context2 = getFrameLocationStack(Address2);
const SampleContextFrameVector &Context1 = getFrameLocationStack(Offset1);
const SampleContextFrameVector &Context2 = getFrameLocationStack(Offset2);
if (Context1.size() != Context2.size())		if (Context1.size() != Context2.size())
return false;		return false;
if (Context1.empty())		if (Context1.empty())
return false;		return false;
// The leaf frame contains the location within the leaf, which		// The leaf frame contains the location within the leaf, which
// needs to be removed as it's not part of the calling context		// needs to be removed as it's not part of the calling context
return std::equal(Context1.begin(), Context1.begin() + Context1.size() - 1,		return std::equal(Context1.begin(), Context1.begin() + Context1.size() - 1,
Context2.begin(), Context2.begin() + Context2.size() - 1);		Context2.begin(), Context2.begin() + Context2.size() - 1);
}		}

SampleContextFrameVector		SampleContextFrameVector
ProfiledBinary::getExpandedContext(const SmallVectorImpl<uint64_t> &Stack,		ProfiledBinary::getExpandedContext(const SmallVectorImpl<uint64_t> &Stack,
bool &WasLeafInlined) {		bool &WasLeafInlined) {
SampleContextFrameVector ContextVec;		SampleContextFrameVector ContextVec;
if (Stack.empty())		if (Stack.empty())
return ContextVec;		return ContextVec;
// Process from frame root to leaf		// Process from frame root to leaf
for (auto Address : Stack) {		for (auto Address : Stack) {
uint64_t Offset = virtualAddrToOffset(Address);
const SampleContextFrameVector &ExpandedContext =		const SampleContextFrameVector &ExpandedContext =
getFrameLocationStack(Offset);		getFrameLocationStack(Address);
// An instruction without a valid debug line will be ignored by sample		// An instruction without a valid debug line will be ignored by sample
// processing		// processing
if (ExpandedContext.empty())		if (ExpandedContext.empty())
return SampleContextFrameVector();		return SampleContextFrameVector();
// Set WasLeafInlined to the size of inlined frame count for the last		// Set WasLeafInlined to the size of inlined frame count for the last
// address which is leaf		// address which is leaf
WasLeafInlined = (ExpandedContext.size() > 1);		WasLeafInlined = (ExpandedContext.size() > 1);
ContextVec.append(ExpandedContext);		ContextVec.append(ExpandedContext);
(128 lines not shown)

void ProfiledBinary::decodePseudoProbe() {		void ProfiledBinary::decodePseudoProbe() {
OwningBinary<Binary> OBinary = unwrapOrError(createBinary(Path), Path);		OwningBinary<Binary> OBinary = unwrapOrError(createBinary(Path), Path);
Binary &ExeBinary = *OBinary.getBinary();		Binary &ExeBinary = *OBinary.getBinary();
auto *Obj = dyn_cast<ELFObjectFileBase>(&ExeBinary);		auto *Obj = dyn_cast<ELFObjectFileBase>(&ExeBinary);
decodePseudoProbe(Obj);		decodePseudoProbe(Obj);
}		}

void ProfiledBinary::setIsFuncEntry(uint64_t Offset, StringRef RangeSymName) {		void ProfiledBinary::setIsFuncEntry(uint64_t Address, StringRef RangeSymName) {
// Note that the start offset of each ELF section can be a non-function		// Note that the start address of each ELF section can be a non-function
// symbol, we need to binary search for the start of a real function range.		// symbol, we need to binary search for the start of a real function range.
auto *FuncRange = findFuncRangeForOffset(Offset);		auto *FuncRange = findFuncRange(Address);
// Skip external function symbol.		// Skip external function symbol.
if (!FuncRange)		if (!FuncRange)
return;		return;

// Set IsFuncEntry to true if there is only one range in the function or the		// Set IsFuncEntry to true if there is only one range in the function or the
// RangeSymName from ELF is equal to its DWARF-based function name.		// RangeSymName from ELF is equal to its DWARF-based function name.
if (FuncRange->Func->Ranges.size() == 1 ||		if (FuncRange->Func->Ranges.size() == 1 ||
(!FuncRange->IsFuncEntry && FuncRange->getFuncName() == RangeSymName))		(!FuncRange->IsFuncEntry && FuncRange->getFuncName() == RangeSymName))
FuncRange->IsFuncEntry = true;		FuncRange->IsFuncEntry = true;
}		}

bool ProfiledBinary::dissassembleSymbol(std::size_t SI, ArrayRef<uint8_t> Bytes,		bool ProfiledBinary::dissassembleSymbol(std::size_t SI, ArrayRef<uint8_t> Bytes,
SectionSymbolsTy &Symbols,		SectionSymbolsTy &Symbols,
const SectionRef &Section) {		const SectionRef &Section) {
std::size_t SE = Symbols.size();		std::size_t SE = Symbols.size();
uint64_t SectionOffset = Section.getAddress() - getPreferredBaseAddress();		uint64_t SectionAddress = Section.getAddress();
uint64_t SectSize = Section.getSize();		uint64_t SectSize = Section.getSize();
uint64_t StartOffset = Symbols[SI].Addr - getPreferredBaseAddress();		uint64_t StartAddress = Symbols[SI].Addr;
uint64_t NextStartOffset =		uint64_t NextStartAddress =
(SI + 1 < SE) ? Symbols[SI + 1].Addr - getPreferredBaseAddress()		(SI + 1 < SE) ? Symbols[SI + 1].Addr : SectionAddress + SectSize;
: SectionOffset + SectSize;		setIsFuncEntry(StartAddress,
setIsFuncEntry(StartOffset,
FunctionSamples::getCanonicalFnName(Symbols[SI].Name));		FunctionSamples::getCanonicalFnName(Symbols[SI].Name));

StringRef SymbolName =		StringRef SymbolName =
ShowCanonicalFnName		ShowCanonicalFnName
? FunctionSamples::getCanonicalFnName(Symbols[SI].Name)		? FunctionSamples::getCanonicalFnName(Symbols[SI].Name)
: Symbols[SI].Name;		: Symbols[SI].Name;
bool ShowDisassembly =		bool ShowDisassembly =
ShowDisassemblyOnly && (DisassembleFunctionSet.empty() ||		ShowDisassemblyOnly && (DisassembleFunctionSet.empty() ||
DisassembleFunctionSet.count(SymbolName));		DisassembleFunctionSet.count(SymbolName));
if (ShowDisassembly)		if (ShowDisassembly)
outs() << '<' << SymbolName << ">:\n";		outs() << '<' << SymbolName << ">:\n";

auto WarnInvalidInsts = [](uint64_t Start, uint64_t End) {		auto WarnInvalidInsts = [](uint64_t Start, uint64_t End) {
WithColor::warning() << "Invalid instructions at "		WithColor::warning() << "Invalid instructions at "
<< format("%8" PRIx64, Start) << " - "		<< format("%8" PRIx64, Start) << " - "
<< format("%8" PRIx64, End) << "\n";		<< format("%8" PRIx64, End) << "\n";
};		};

uint64_t Offset = StartOffset;		uint64_t Address = StartAddress;
// Size of a consecutive invalid instruction range starting from Offset -1		// Size of a consecutive invalid instruction range starting from Address -1
// backwards.		// backwards.
uint64_t InvalidInstLength = 0;		uint64_t InvalidInstLength = 0;
while (Offset < NextStartOffset) {		while (Address < NextStartAddress) {
MCInst Inst;		MCInst Inst;
uint64_t Size;		uint64_t Size;
// Disassemble an instruction.		// Disassemble an instruction.
bool Disassembled =		bool Disassembled = DisAsm->getInstruction(
DisAsm->getInstruction(Inst, Size, Bytes.slice(Offset - SectionOffset),		Inst, Size, Bytes.slice(Address - SectionAddress), Address, nulls());
Offset + getPreferredBaseAddress(), nulls());
if (Size == 0)		if (Size == 0)
Size = 1;		Size = 1;

if (ShowDisassembly) {		if (ShowDisassembly) {
if (ShowPseudoProbe) {		if (ShowPseudoProbe) {
ProbeDecoder.printProbeForAddress(outs(),		ProbeDecoder.printProbeForAddress(outs(), Address);
Offset + getPreferredBaseAddress());
}		}
outs() << format("%8" PRIx64 ":", Offset + getPreferredBaseAddress());		outs() << format("%8" PRIx64 ":", Address);
size_t Start = outs().tell();		size_t Start = outs().tell();
if (Disassembled)		if (Disassembled)
IPrinter->printInst(&Inst, Offset + Size, "", *STI.get(), outs());		IPrinter->printInst(&Inst, Address + Size, "", *STI.get(), outs());
else		else
outs() << "\t<unknown>";		outs() << "\t<unknown>";
if (ShowSourceLocations) {		if (ShowSourceLocations) {
unsigned Cur = outs().tell() - Start;		unsigned Cur = outs().tell() - Start;
if (Cur < 40)		if (Cur < 40)
outs().indent(40 - Cur);		outs().indent(40 - Cur);
InstructionPointer IP(this, Offset);		InstructionPointer IP(this, Address);
outs() << getReversedLocWithContext(		outs() << getReversedLocWithContext(
symbolize(IP, ShowCanonicalFnName, ShowPseudoProbe));		symbolize(IP, ShowCanonicalFnName, ShowPseudoProbe));
}		}
outs() << "\n";		outs() << "\n";
}		}

if (Disassembled) {		if (Disassembled) {
const MCInstrDesc &MCDesc = MII->get(Inst.getOpcode());		const MCInstrDesc &MCDesc = MII->get(Inst.getOpcode());

// Record instruction size.		// Record instruction size.
Offset2InstSizeMap[Offset] = Size;		AddressToInstSizeMap[Address] = Size;

// Populate address maps.		// Populate address maps.
CodeAddrOffsets.push_back(Offset);		CodeAddressVec.push_back(Address);
if (MCDesc.isCall()) {		if (MCDesc.isCall()) {
CallOffsets.insert(Offset);		CallAddressSet.insert(Address);
UncondBranchOffsets.insert(Offset);		UncondBranchAddrSet.insert(Address);
} else if (MCDesc.isReturn()) {		} else if (MCDesc.isReturn()) {
RetOffsets.insert(Offset);		RetAddressSet.insert(Address);
UncondBranchOffsets.insert(Offset);		UncondBranchAddrSet.insert(Address);
} else if (MCDesc.isBranch()) {		} else if (MCDesc.isBranch()) {
if (MCDesc.isUnconditionalBranch())		if (MCDesc.isUnconditionalBranch())
UncondBranchOffsets.insert(Offset);		UncondBranchAddrSet.insert(Address);
BranchOffsets.insert(Offset);		BranchAddressSet.insert(Address);
}		}

if (InvalidInstLength) {		if (InvalidInstLength) {
WarnInvalidInsts(Offset - InvalidInstLength, Offset - 1);		WarnInvalidInsts(Address - InvalidInstLength, Address - 1);
InvalidInstLength = 0;		InvalidInstLength = 0;
}		}
} else {		} else {
InvalidInstLength += Size;		InvalidInstLength += Size;
}		}

Offset += Size;		Address += Size;
}		}

if (InvalidInstLength)		if (InvalidInstLength)
WarnInvalidInsts(Offset - InvalidInstLength, Offset - 1);		WarnInvalidInsts(Address - InvalidInstLength, Address - 1);

if (ShowDisassembly)		if (ShowDisassembly)
outs() << "\n";		outs() << "\n";

return true;		return true;
}		}

void ProfiledBinary::setUpDisassembler(const ELFObjectFileBase *Obj) {		void ProfiledBinary::setUpDisassembler(const ELFObjectFileBase *Obj) {
[... 68 lines not shown ...]	void ProfiledBinary::disassemble(const ELFObjectFileBase *Obj) {
// Disassemble a text section.		// Disassemble a text section.
for (section_iterator SI = Obj->section_begin(), SE = Obj->section_end();		for (section_iterator SI = Obj->section_begin(), SE = Obj->section_end();
SI != SE; ++SI) {		SI != SE; ++SI) {
const SectionRef &Section = *SI;		const SectionRef &Section = *SI;
if (!Section.isText())		if (!Section.isText())
continue;		continue;

uint64_t ImageLoadAddr = getPreferredBaseAddress();		uint64_t ImageLoadAddr = getPreferredBaseAddress();
uint64_t SectionOffset = Section.getAddress() - ImageLoadAddr;		uint64_t SectionAddress = Section.getAddress() - ImageLoadAddr;
uint64_t SectSize = Section.getSize();		uint64_t SectSize = Section.getSize();
if (!SectSize)		if (!SectSize)
continue;		continue;

// Register the text section.		// Register the text section.
TextSections.insert({SectionOffset, SectSize});		TextSections.insert({SectionAddress, SectSize});

StringRef SectionName = unwrapOrError(Section.getName(), FileName);		StringRef SectionName = unwrapOrError(Section.getName(), FileName);

if (ShowDisassemblyOnly) {		if (ShowDisassemblyOnly) {
outs() << "\nDisassembly of section " << SectionName;		outs() << "\nDisassembly of section " << SectionName;
outs() << " [" << format("0x%" PRIx64, Section.getAddress()) << ", "		outs() << " [" << format("0x%" PRIx64, Section.getAddress()) << ", "
<< format("0x%" PRIx64, Section.getAddress() + SectSize)		<< format("0x%" PRIx64, Section.getAddress() + SectSize)
<< "]:\n\n";		<< "]:\n\n";
[... 63 lines not shown ...]	for (const auto &DieInfo : CompilationUnit.dies()) {
// Different DWARF symbols can have same function name, search or create		// Different DWARF symbols can have same function name, search or create
// BinaryFunction indexed by the name.		// BinaryFunction indexed by the name.
auto Ret = BinaryFunctions.emplace(Name, BinaryFunction());		auto Ret = BinaryFunctions.emplace(Name, BinaryFunction());
auto &Func = Ret.first->second;		auto &Func = Ret.first->second;
if (Ret.second)		if (Ret.second)
Func.FuncName = Ret.first->first;		Func.FuncName = Ret.first->first;

for (const auto &Range : Ranges) {		for (const auto &Range : Ranges) {
uint64_t FuncStart = Range.LowPC;		uint64_t StartAddress = Range.LowPC;
uint64_t FuncSize = Range.HighPC - FuncStart;		uint64_t EndAddress = Range.HighPC;

if (FuncSize == 0 || FuncStart < getPreferredBaseAddress())		if (EndAddress <= StartAddress ||
		StartAddress < getPreferredBaseAddress())
continue;		continue;

uint64_t StartOffset = FuncStart - getPreferredBaseAddress();
uint64_t EndOffset = Range.HighPC - getPreferredBaseAddress();

// We may want to know all ranges for one function. Here group the		// We may want to know all ranges for one function. Here group the
// ranges and store them into BinaryFunction.		// ranges and store them into BinaryFunction.
Func.Ranges.emplace_back(StartOffset, EndOffset);		Func.Ranges.emplace_back(StartAddress, EndAddress);

auto R = StartOffset2FuncRangeMap.emplace(StartOffset, FuncRange());		auto R = StartAddrToFuncRangeMap.emplace(StartAddress, FuncRange());
if (R.second) {		if (R.second) {
FuncRange &FRange = R.first->second;		FuncRange &FRange = R.first->second;
FRange.Func = &Func;		FRange.Func = &Func;
FRange.StartOffset = StartOffset;		FRange.StartAddress = StartAddress;
FRange.EndOffset = EndOffset;		FRange.EndAddress = EndAddress;
} else {		} else {
WithColor::warning()		WithColor::warning()
<< "Duplicated symbol start address at "		<< "Duplicated symbol start address at "
<< format("%8" PRIx64, StartOffset + getPreferredBaseAddress())		<< format("%8" PRIx64, StartAddress) << " "
<< " " << R.first->second.getFuncName() << " and " << Name << "\n";		<< R.first->second.getFuncName() << " and " << Name << "\n";
}		}
}		}
}		}
}		}

void ProfiledBinary::loadSymbolsFromDWARF(ObjectFile &Obj) {		void ProfiledBinary::loadSymbolsFromDWARF(ObjectFile &Obj) {
auto DebugContext = llvm::DWARFContext::create(		auto DebugContext = llvm::DWARFContext::create(
Obj, DWARFContext::ProcessDebugRelocations::Process, nullptr, DWPPath);		Obj, DWARFContext::ProcessDebugRelocations::Process, nullptr, DWPPath);
[... 24 lines not shown ...]	void ProfiledBinary::loadSymbolsFromDWARF(ObjectFile &Obj) {

if (BinaryFunctions.empty())		if (BinaryFunctions.empty())
WithColor::warning() << "Loading of DWARF info completed, but no binary "		WithColor::warning() << "Loading of DWARF info completed, but no binary "
"functions have been retrieved.\n";		"functions have been retrieved.\n";
}		}

void ProfiledBinary::populateSymbolListFromDWARF(		void ProfiledBinary::populateSymbolListFromDWARF(
ProfileSymbolList &SymbolList) {		ProfileSymbolList &SymbolList) {
for (auto &I : StartOffset2FuncRangeMap)		for (auto &I : StartAddrToFuncRangeMap)
SymbolList.add(I.second.getFuncName());		SymbolList.add(I.second.getFuncName());
}		}

void ProfiledBinary::setupSymbolizer() {		void ProfiledBinary::setupSymbolizer() {
symbolize::LLVMSymbolizer::Options SymbolizerOpts;		symbolize::LLVMSymbolizer::Options SymbolizerOpts;
SymbolizerOpts.PrintFunctions =		SymbolizerOpts.PrintFunctions =
DILineInfoSpecifier::FunctionNameKind::LinkageName;		DILineInfoSpecifier::FunctionNameKind::LinkageName;
SymbolizerOpts.Demangle = false;		SymbolizerOpts.Demangle = false;
SymbolizerOpts.DefaultArch = TheTriple.getArchName().str();		SymbolizerOpts.DefaultArch = TheTriple.getArchName().str();
SymbolizerOpts.UseSymbolTable = false;		SymbolizerOpts.UseSymbolTable = false;
SymbolizerOpts.RelativeAddresses = false;		SymbolizerOpts.RelativeAddresses = false;
SymbolizerOpts.DWPName = DWPPath;		SymbolizerOpts.DWPName = DWPPath;
Symbolizer = std::make_unique<symbolize::LLVMSymbolizer>(SymbolizerOpts);		Symbolizer = std::make_unique<symbolize::LLVMSymbolizer>(SymbolizerOpts);
}		}

SampleContextFrameVector ProfiledBinary::symbolize(const InstructionPointer &IP,		SampleContextFrameVector ProfiledBinary::symbolize(const InstructionPointer &IP,
bool UseCanonicalFnName,		bool UseCanonicalFnName,
bool UseProbeDiscriminator) {		bool UseProbeDiscriminator) {
assert(this == IP.Binary &&		assert(this == IP.Binary &&
"Binary should only symbolize its own instruction");		"Binary should only symbolize its own instruction");
auto Addr = object::SectionedAddress{IP.Offset + getPreferredBaseAddress(),		auto Addr = object::SectionedAddress{IP.Address,
object::SectionedAddress::UndefSection};		object::SectionedAddress::UndefSection};
DIInliningInfo InlineStack = unwrapOrError(		DIInliningInfo InlineStack = unwrapOrError(
Symbolizer->symbolizeInlinedCode(SymbolizerPath.str(), Addr),		Symbolizer->symbolizeInlinedCode(SymbolizerPath.str(), Addr),
SymbolizerPath);		SymbolizerPath);

SampleContextFrameVector CallStack;		SampleContextFrameVector CallStack;
for (int32_t I = InlineStack.getNumberOfFrames() - 1; I >= 0; I--) {		for (int32_t I = InlineStack.getNumberOfFrames() - 1; I >= 0; I--) {
const auto &CallerFrame = InlineStack.getFrame(I);		const auto &CallerFrame = InlineStack.getFrame(I);
[... 15 lines not shown ...]	for (int32_t I = InlineStack.getNumberOfFrames() - 1; I >= 0; I--) {
LineLocation Line(LineOffset, Discriminator);		LineLocation Line(LineOffset, Discriminator);
auto It = NameStrings.insert(FunctionName.str());		auto It = NameStrings.insert(FunctionName.str());
CallStack.emplace_back(*It.first, Line);		CallStack.emplace_back(*It.first, Line);
}		}

return CallStack;		return CallStack;
}		}

void ProfiledBinary::computeInlinedContextSizeForRange(uint64_t StartOffset,		void ProfiledBinary::computeInlinedContextSizeForRange(uint64_t RangeBegin,
uint64_t EndOffset) {		uint64_t RangeEnd) {
uint64_t RangeBegin = offsetToVirtualAddr(StartOffset);
uint64_t RangeEnd = offsetToVirtualAddr(EndOffset);
InstructionPointer IP(this, RangeBegin, true);		InstructionPointer IP(this, RangeBegin, true);

if (IP.Address != RangeBegin)		if (IP.Address != RangeBegin)
WithColor::warning() << "Invalid start instruction at "		WithColor::warning() << "Invalid start instruction at "
<< format("%8" PRIx64, RangeBegin) << "\n";		<< format("%8" PRIx64, RangeBegin) << "\n";

if (IP.Address >= RangeEnd)		if (IP.Address >= RangeEnd)
return;		return;

do {		do {
uint64_t Offset = virtualAddrToOffset(IP.Address);
const SampleContextFrameVector &SymbolizedCallStack =		const SampleContextFrameVector &SymbolizedCallStack =
getFrameLocationStack(Offset, UsePseudoProbes);		getFrameLocationStack(IP.Address, UsePseudoProbes);
uint64_t Size = Offset2InstSizeMap[Offset];		uint64_t Size = AddressToInstSizeMap[IP.Address];

// Record instruction size for the corresponding context		// Record instruction size for the corresponding context
FuncSizeTracker.addInstructionForContext(SymbolizedCallStack, Size);		FuncSizeTracker.addInstructionForContext(SymbolizedCallStack, Size);

} while (IP.advance() && IP.Address < RangeEnd);		} while (IP.advance() && IP.Address < RangeEnd);
}		}

void ProfiledBinary::computeInlinedContextSizeForFunc(		void ProfiledBinary::computeInlinedContextSizeForFunc(
[... 17 lines not shown ...]

InstructionPointer::InstructionPointer(const ProfiledBinary *Binary,		InstructionPointer::InstructionPointer(const ProfiledBinary *Binary,
uint64_t Address, bool RoundToNext)		uint64_t Address, bool RoundToNext)
: Binary(Binary), Address(Address) {		: Binary(Binary), Address(Address) {
Index = Binary->getIndexForAddr(Address);		Index = Binary->getIndexForAddr(Address);
if (RoundToNext) {		if (RoundToNext) {
// we might get address which is not the code		// we might get address which is not the code
// it should round to the next valid address		// it should round to the next valid address
if (Index >= Binary->getCodeOffsetsSize())		if (Index >= Binary->getCodeAddrVecSize())
this->Address = UINT64_MAX;		this->Address = UINT64_MAX;
else		else
this->Address = Binary->getAddressforIndex(Index);		this->Address = Binary->getAddressforIndex(Index);
}		}
}		}

bool InstructionPointer::advance() {		bool InstructionPointer::advance() {
Index++;		Index++;
if (Index >= Binary->getCodeOffsetsSize()) {		if (Index >= Binary->getCodeAddrVecSize()) {
Address = UINT64_MAX;		Address = UINT64_MAX;
return false;		return false;
}		}
Address = Binary->getAddressforIndex(Index);		Address = Binary->getAddressforIndex(Index);
return true;		return true;
}		}

bool InstructionPointer::backward() {		bool InstructionPointer::backward() {
[... 16 lines not shown ...]