This is an archive of the discontinued LLVM Phabricator instance.

[hwasan] Handle possible failure of dladdr
ClosedPublic

Authored by mmalcomson on Nov 12 2020, 6:07 AM.

Details

Summary
In `GetGlobalSizeFromDescriptor` we use `dladdr` to get info on the the
current address.  `dladdr` returns 0 if it failed.
During testing on Linux this returned 0 to indicate failure, and
populated the `info` structure with a NULL pointer which was
dereferenced later.

This patch checks for `dladdr` returning 0, and in that case returns 0
from `GetGlobalSizeFromDescriptor` to indicate failure of identifying
the address.

N.b. the testcases where this failed were most tests using stack-tagging
with glibc and using the interception ABI.
I have not been able to trigger the problem any other way, which is why
there is no testcase here.

Diff Detail

Event Timeline

mmalcomson created this revision.Nov 12 2020, 6:07 AM
Herald added a subscriber: Restricted Project. · View Herald TranscriptNov 12 2020, 6:07 AM
mmalcomson requested review of this revision.Nov 12 2020, 6:07 AM

Any idea how this is possible?
We only call this if GetModuleNameAndOffsetForPC succeeds, which should only happen when an address is within a shared object.

This looks reasonable to me, but like Evgenii I don't understand how this happened. Was there a separate thread that did a racy dlclose()?

mmalcomson added a comment.EditedNov 13 2020, 2:28 AM

The problem is when hwasan detects a fault accessing a value on the stack in Linux.
What I believe happens is:

On Linux GetModuleNameAndOffsetForPC succeeds for the stack.

It uses FindModuleForAddress, which uses fallback_modules_, fallback_modules_ uses fallbackInit, which initialises the list of modules from /proc/self/maps.

On Linux /proc/self/maps has a mapping for the stack -- which means the fallback_modules_ has a module named "[stack]".

Hence GetModuleNameAndOffsetForPC returns the name "[stack]" and the offset into it (and dladdr doesn't work on a stack address).

I guess we could compare against specially named modules like "[stack]" and "[heap]" by checking for the first character of the module name, but I feel this check against dladdr failing would be more robust.

eugenis accepted this revision.Nov 13 2020, 4:30 PM

LGTM
Thank you.

This revision is now accepted and ready to land.Nov 13 2020, 4:30 PM
hctim accepted this revision.Nov 13 2020, 5:30 PM

Thanks for the clarification!

Just returning 0 from GetGlobalSizeFromDescriptor will result in HWASan also printing XXX is located to the right/left of a global variable in (foo.so+0x1234) for stack tag mismatches.

I guess we could compare against specially named modules like "[stack]" and "[heap]" by checking for the first character of the module name, but I feel this check against dladdr failing would be more robust.

I'm not sure this works on Android - as any pages in the bss segment that don't come from the pages occupied by the file mapping end up being mapped as [anon:bss]. I was going to say that maybe we can get away with ignoring non-fd based mappings, but this is incompatible with things that use anonymous maps for file-based content (like MTE globals + UBSan might have problems if UBSan needs the maps somewhere).

Thanks for pointing out this bug - I've just realised the same problem exists on Android as well. I'm not sure how we didn't catch the regression introduced here, all stack-based reports are cut off really short.

For now I'll approve this patch to unblock, and I'll circle back and rework slightly here to remove the unnecessary 0x007feca724bc is located to the left of a global variable in ([stack]+0x1f4c0).

This revision was automatically updated to reflect the committed changes.