This is an archive of the discontinued LLVM Phabricator instance.

[HWASan] Add sizeof(global) in report even if symbols missing.
ClosedPublic

Authored by hctim on May 26 2020, 4:47 PM.

Details

Summary

Refactor the current global header iteration to be callback-based, and add a feature that reports the size of the global variable during reporting. This allows binaries without symbols to still report the size of the global variable, which is always available in the HWASan globals PT_NOTE metadata.

Diff Detail

Event Timeline

hctim created this revision.May 26 2020, 4:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2020, 4:47 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
pcc added a comment.May 26 2020, 6:17 PM

Instead of adding these callbacks would it be simpler to have a function that takes the pointer to phdrs and returns (global_begin, global_end) (both 0 if not present)?

compiler-rt/lib/hwasan/hwasan.cpp
198–199

Why add this? (same below)

200

*all the notes

compiler-rt/lib/hwasan/hwasan_report.cpp
254

Name (and comment above) implies that the size is stored in the phdr itself. Maybe s/phdr/descriptor/'?

264

Can you use dladdr to avoid iterating and the callback here? (i.e. find ELF headers via dli_fbase)

hctim marked 5 inline comments as done.May 27 2020, 5:35 PM
In D80599#2056315, @pcc wrote:

Instead of adding these callbacks would it be simpler to have a function that takes the pointer to phdrs and returns (global_begin, global_end) (both 0 if not present)?

That would require duplicating the global iteration logic across the reporting and initialisation paths, which I was trying to avoid. Having a callback that recieves a single global is easier for my brain, but please let me know if you'd prefer it the other way around.

compiler-rt/lib/hwasan/hwasan.cpp
198–199

Naively assumed these symbols were being exported. Removed.

hctim updated this revision to Diff 266710.May 27 2020, 5:36 PM
hctim marked an inline comment as done.
  • Addressed Peter's comments, added more comments to header describing how these functions operate.
pcc added a comment.May 27 2020, 6:09 PM
In D80599#2056315, @pcc wrote:

Instead of adding these callbacks would it be simpler to have a function that takes the pointer to phdrs and returns (global_begin, global_end) (both 0 if not present)?

That would require duplicating the global iteration logic across the reporting and initialisation paths, which I was trying to avoid. Having a callback that recieves a single global is easier for my brain, but please let me know if you'd prefer it the other way around.

With the function that returns (global_begin, global_end) it's just a for loop really. You could have member functions in hwasan_global like

size_t size() const { return info & 0xffffff; }

to make the client code easier to read.

pcc added a comment.May 27 2020, 6:17 PM

And instead of returning (global_begin, global_end) you could return something like an ArrayRef<hwasan_global> (I don't think we have ArrayRef in compiler-rt, but you could add one, it's pretty simple). Now the client code just looks like

for (hwasan_global &g : HwasanGlobalsFor(phdr, ...)) {
 ...
}
hctim updated this revision to Diff 266951.May 28 2020, 10:59 AM
  • Pull out hwasan globals into its own file, and rework callback based to return-range based.
  • Also add ArrayRef<T> to sanitizer common.
Herald added a project: Restricted Project. · View Herald TranscriptMay 28 2020, 10:59 AM
hctim updated this revision to Diff 266956.May 28 2020, 11:05 AM
  • Renamed a function, so update the comment.
  • Moved PT note descriptions out of the header.
pcc added inline comments.May 28 2020, 11:28 AM
compiler-rt/lib/hwasan/hwasan_globals.cpp
26

No trivial accessors please.

96

With the ArrayRef ctor changed this becomes return {}.

compiler-rt/lib/hwasan/hwasan_globals.h
25–33

I wouldn't add this stuff, this type of error is pretty unlikely and would be caught in code review.

38

I would just put the definitions inline so that readers don't need to go hunting for them, they're pretty small.

47

Remove obvious comment.

compiler-rt/lib/hwasan/hwasan_report.cpp
252

s/phdr/descriptor/

compiler-rt/lib/sanitizer_common/sanitizer_common.h
985

Shouldn't it create an empty ArrayRef?

hctim marked 8 inline comments as done.May 28 2020, 11:47 AM
hctim added inline comments.
compiler-rt/lib/hwasan/hwasan_globals.h
25–33

Done - but left the comment at the top of the class.

hctim updated this revision to Diff 266977.May 28 2020, 11:47 AM
hctim marked an inline comment as done.
  • Address pcc's comments.
pcc added inline comments.May 28 2020, 12:46 PM
compiler-rt/lib/hwasan/hwasan.cpp
211

Remove braces

258

Remove braces

318–319

Remove braces

compiler-rt/lib/hwasan/hwasan_globals.cpp
92

Newline

compiler-rt/lib/hwasan/hwasan_globals.h
28

2^24, "larger globals have multiple descriptors"

38

Rename this back to gv_relptr.

50

Newline

compiler-rt/lib/hwasan/hwasan_report.cpp
266

Remove braces

266

I'm not sure that this first argument can be ehdr -- it needs to be the load bias, which is normally the same as ehdr in position-independent binaries, but it can be different in e.g. non-PIE executables, binaries created using lld's partitioning feature [1] and possibly also binaries linked using a linker script.

To compute the load bias from the address of the ELF header, you can look for a PT_LOAD with p_offset=0. The load bias is found by subtracting that program header's p_vaddr from the address of the ELF header.

[1] https://lld.llvm.org/Partitions.html

363

Is size == 0 realistic? I don't think you can declare a zero-size global in C.

hctim updated this revision to Diff 267952.Jun 2 2020, 12:20 PM
hctim marked 8 inline comments as done.
  • Address some code style problems and ensure we get the correct load bias.
compiler-rt/lib/hwasan/hwasan_report.cpp
363

Sorry - I don't understand the comment. size == 0 here indicates that the descriptor lookup failed.

hctim updated this revision to Diff 267953.Jun 2 2020, 12:21 PM
hctim marked an inline comment as done.
  • Change sizeof(global) comment.
hctim added inline comments.Jun 3 2020, 9:04 AM
compiler-rt/lib/hwasan/hwasan_report.cpp
280

This is wrong. TODO(me).

hctim added inline comments.Jun 3 2020, 9:07 AM
compiler-rt/lib/hwasan/hwasan_report.cpp
270

And this is also wrong, not walking the headers here.

hctim updated this revision to Diff 268596.Jun 4 2020, 2:53 PM
hctim marked 2 inline comments as done.
  • Fix up phdr iteration and correctly find the load bias.

As it turns out, dli_fbase returned by dladdr() and dlpi_addr by dl_iterate_phdr both call themselves "base address", but are different things. Absolutely did my head in.

hctim added a comment.Jun 9 2020, 10:33 AM

Friendly ping

pcc accepted this revision.Jun 9 2020, 11:18 AM

LGTM

compiler-rt/lib/hwasan/hwasan_globals.h
28

nit: 2 << 24 == 2^25 != 2^24.

compiler-rt/lib/hwasan/hwasan_report.cpp
363

Ah okay. I'd probably leave a brief comment about that here then.

This revision is now accepted and ready to land.Jun 9 2020, 11:18 AM
hctim updated this revision to Diff 269626.Jun 9 2020, 11:51 AM
hctim marked 4 inline comments as done.
  • Rebase and address some of pcc's nits.
This revision was automatically updated to reflect the committed changes.
hctim added a comment.Jun 9 2020, 1:30 PM

Looks like my local commits didn't get squashed before upstream, but the net impact was [base .. a744142] (i.e. the last version before commit).