Refactor the current global header iteration to be callback-based, and add a feature that reports the size of the global variable during reporting. This allows binaries without symbols to still report the size of the global variable, which is always available in the HWASan globals PT_NOTE metadata.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Instead of adding these callbacks would it be simpler to have a function that takes the pointer to phdrs and returns (global_begin, global_end) (both 0 if not present)?
compiler-rt/lib/hwasan/hwasan.cpp | ||
---|---|---|
197 | Why add this? (same below) | |
246 | *all the notes | |
compiler-rt/lib/hwasan/hwasan_report.cpp | ||
252 | Name (and comment above) implies that the size is stored in the phdr itself. Maybe s/phdr/descriptor/'? | |
262 | Can you use dladdr to avoid iterating and the callback here? (i.e. find ELF headers via dli_fbase) |
That would require duplicating the global iteration logic across the reporting and initialisation paths, which I was trying to avoid. Having a callback that recieves a single global is easier for my brain, but please let me know if you'd prefer it the other way around.
compiler-rt/lib/hwasan/hwasan.cpp | ||
---|---|---|
197 | Naively assumed these symbols were being exported. Removed. |
- Addressed Peter's comments, added more comments to header describing how these functions operate.
With the function that returns (global_begin, global_end) it's just a for loop really. You could have member functions in hwasan_global like
size_t size() const { return info & 0xffffff; }
to make the client code easier to read.
And instead of returning (global_begin, global_end) you could return something like an ArrayRef<hwasan_global> (I don't think we have ArrayRef in compiler-rt, but you could add one, it's pretty simple). Now the client code just looks like
for (hwasan_global &g : HwasanGlobalsFor(phdr, ...)) { ... }
- Pull out hwasan globals into its own file, and rework callback based to return-range based.
- Also add ArrayRef<T> to sanitizer common.
- Renamed a function, so update the comment.
- Moved PT note descriptions out of the header.
compiler-rt/lib/hwasan/hwasan_globals.cpp | ||
---|---|---|
25 ↗ | (On Diff #266956) | No trivial accessors please. |
95 ↗ | (On Diff #266956) | With the ArrayRef ctor changed this becomes return {}. |
compiler-rt/lib/hwasan/hwasan_globals.h | ||
24–32 ↗ | (On Diff #266956) | I wouldn't add this stuff, this type of error is pretty unlikely and would be caught in code review. |
37 ↗ | (On Diff #266956) | I would just put the definitions inline so that readers don't need to go hunting for them, they're pretty small. |
46 ↗ | (On Diff #266956) | Remove obvious comment. |
compiler-rt/lib/hwasan/hwasan_report.cpp | ||
250 | s/phdr/descriptor/ | |
compiler-rt/lib/sanitizer_common/sanitizer_common.h | ||
985 ↗ | (On Diff #266956) | Shouldn't it create an empty ArrayRef? |
compiler-rt/lib/hwasan/hwasan_globals.h | ||
---|---|---|
24–32 ↗ | (On Diff #266956) | Done - but left the comment at the top of the class. |
compiler-rt/lib/hwasan/hwasan.cpp | ||
---|---|---|
311 | Remove braces | |
357 | Remove braces | |
411–412 | Remove braces | |
compiler-rt/lib/hwasan/hwasan_globals.cpp | ||
92 ↗ | (On Diff #266977) | Newline |
compiler-rt/lib/hwasan/hwasan_globals.h | ||
27 ↗ | (On Diff #266977) | 2^24, "larger globals have multiple descriptors" |
37 ↗ | (On Diff #266977) | Rename this back to gv_relptr. |
51 ↗ | (On Diff #266977) | Newline |
compiler-rt/lib/hwasan/hwasan_report.cpp | ||
264 | Remove braces | |
264 | I'm not sure that this first argument can be ehdr -- it needs to be the load bias, which is normally the same as ehdr in position-independent binaries, but it can be different in e.g. non-PIE executables, binaries created using lld's partitioning feature [1] and possibly also binaries linked using a linker script. To compute the load bias from the address of the ELF header, you can look for a PT_LOAD with p_offset=0. The load bias is found by subtracting that program header's p_vaddr from the address of the ELF header. | |
364 | Is size == 0 realistic? I don't think you can declare a zero-size global in C. |
- Address some code style problems and ensure we get the correct load bias.
compiler-rt/lib/hwasan/hwasan_report.cpp | ||
---|---|---|
364 | Sorry - I don't understand the comment. size == 0 here indicates that the descriptor lookup failed. |
compiler-rt/lib/hwasan/hwasan_report.cpp | ||
---|---|---|
278 | This is wrong. TODO(me). |
compiler-rt/lib/hwasan/hwasan_report.cpp | ||
---|---|---|
268 | And this is also wrong, not walking the headers here. |
- Fix up phdr iteration and correctly find the load bias.
As it turns out, dli_fbase returned by dladdr() and dlpi_addr by dl_iterate_phdr both call themselves "base address", but are different things. Absolutely did my head in.
Looks like my local commits didn't get squashed before upstream, but the net impact was [base .. a744142] (i.e. the last version before commit).
Why add this? (same below)