This is an archive of the discontinued LLVM Phabricator instance.

Add NewAddressDescription, which can describe any type of address.
ClosedPublic

Authored by filcab on Sep 1 2016, 7:28 AM.

Details

Summary

This is useful for inclusion in the Error* structures.

The class will be renamed after getting rid of the current AddressDescription,
which is only used in asan_debugging.cc

Split between *Base and the "proper" object with constructors, because VS2013
doesn't allow us to include objects without a trivial default constructor in an
enum, and I'd like to be sure that a NewAddressDescription object is always
valid, so we need initialization.

Diff Detail

Repository
rL LLVM

Event Timeline

filcab updated this revision to Diff 70004.Sep 1 2016, 7:28 AM
filcab retitled this revision from to Add NewAddressDescription, which can describe any type of address..
filcab updated this object.
filcab added reviewers: kcc, vitalybuka, eugenis.
filcab added a subscriber: llvm-commits.
vitalybuka added inline comments.Sep 1 2016, 12:34 PM
lib/asan/asan_descriptions.h
176 ↗(On Diff #70004)

I don't like union like this, but I missed the background for this changes.
What is the ultimate goal?

185 ↗(On Diff #70004)

Why not regular C++ e.g. with interface with virtual methods?

218 ↗(On Diff #70004)

Just: *this = {};

vitalybuka requested changes to this revision.Sep 1 2016, 1:02 PM
vitalybuka edited edge metadata.
This revision now requires changes to proceed.Sep 1 2016, 1:02 PM
filcab updated this revision to Diff 70338.Sep 5 2016, 9:20 AM
filcab edited edge metadata.
filcab marked an inline comment as done.

Update after code review

filcab updated this revision to Diff 70399.Sep 6 2016, 7:41 AM
filcab edited edge metadata.

Revert back to the old constructor, otherwise we'd get an infinite loop.
Added an extra argument to the non-default constructor, which allows us to control wether the thread registry should be locked or not (pass false when it's already locked).

vitalybuka requested changes to this revision.Sep 6 2016, 11:58 AM
vitalybuka edited edge metadata.
vitalybuka added inline comments.
lib/asan/asan_descriptions.h
185 ↗(On Diff #70399)

Why not to replace existing ErrorDescription with this one n the same Patch so you don't need to introduce New* names

This revision now requires changes to proceed.Sep 6 2016, 11:58 AM

I like small patches, but they should be meaningful on it's own and incrementally improve code quality.
Here we have patch with "unused" type which is safe be remove by anyone in the next patch.

It would be nice to have smallest possible patch but which has code which explain things like this:

  1. Why we need NewAddressDescriptionBase. How this is different from ErrorDescription which is just fine without Base?
  2. Why we need shouldLockThreadRegistry?
  3. Why not to incrementally change AddressDescription?
lib/asan/asan_descriptions.h
233 ↗(On Diff #70399)

ThreadRegistryLock

filcab updated this revision to Diff 70530.Sep 7 2016, 7:10 AM
filcab edited edge metadata.
filcab marked 2 inline comments as done.

Address review comments. Merge D24132 into this patch.

filcab added a comment.Sep 7 2016, 7:10 AM

I like small patches, but they should be meaningful on it's own and incrementally improve code quality.
Here we have patch with "unused" type which is safe be remove by anyone in the next patch.

It would be nice to have smallest possible patch but which has code which explain things like this:

  1. Why we need NewAddressDescriptionBase. How this is different from ErrorDescription which is just fine without Base?

ErrorDescription is the class that contains the union which VS2013 doesn't like, so it won't need the *Base version since it can initialize itself. The classes *inside* that union can't have non-trivial default initializers in VS2013 because it doesn't support them yet.

  1. Why we need shouldLockThreadRegistry?

asan_locate_address doesn't need to lock while running, but if (and only if) the address being described is in the stack, we'll need to lock the thread registry. Since asan_locate_address doesn't even try to print or anything, I think it's a bit too much to always lock, hence the extra constructor argument.

  1. Why not to incrementally change AddressDescription?

AddressDescription is too basic and not very useful. The whole point of the {Stack,Global,Shadow,Heap}AddressDescription structures is to have a good amount of information about the address. Their purpose is to be serialized so we can properly support post-mortem debugging without the debugger needing to know about all the sanitizer internals and reproduce that work.

Since we already have all that information, but in different structs, we can replace the old AddressDescription with a more descriptive structure, which can contain any of the other descriptions, depending on the address.

lib/asan/asan_descriptions.h
218 ↗(On Diff #70399)

*this = {} will trigger an infinite loop, since it will call itself.

185 ↗(On Diff #70004)

I would like to be able to use the same structures internally and externally, when reporting errors.
Having a trivially copiable type makes it much easier. Not having to deal with the tagged union and having virtual methods would make the type non-trivially copiable and then we would have to add a bunch of code for (de)serializing the type so the debugger can get to it.

The rationale for the structured error reporting is here:
http://lists.llvm.org/pipermail/llvm-dev/2016-July/101933.html

Bring over a comment from D24132 that wasn't submitted.

lib/asan/asan_debugging.cc
87 ↗(On Diff #70530)

If we add the default case, we get a clang warning, though. I can add a check that we have one of those, before the switch, but it will end up ugly:

CHECK(descr.kind == kAddressKindWild || descr.kind == kAddressKindShadow || ...);
Should we also fix places like PrintHeapChunkAccess in asan_descriptions.cc, which does the same thing (not have a default case)?

vitalybuka requested changes to this revision.Sep 7 2016, 10:27 AM
vitalybuka edited edge metadata.
vitalybuka added inline comments.
lib/asan/asan_debugging.cc
87 ↗(On Diff #70530)

We have another internal switch so assert(region_kind) just before return should cover both

108 ↗(On Diff #70530)

name[0] = 0; is already done at entry

115 ↗(On Diff #70530)

name[0] = 0; is already done at entry

lib/asan/asan_descriptions.h
185 ↗(On Diff #70530)

I see.
How about following?

struct AddressDescriptionData {
 AddressKind kind;
 union {
    ShadowAddressDescription shadow;
    HeapAddressDescription heap;
    StackAddressDescription stack;
    GlobalAddressDescription global;
    uptr addr;
  };
  // No methods.
}

class AddressDescription {
 private:
  AddressDescriptionData data;

 public:
  AddressDescription() {
    ...
  }
   uptr Address() const ;
   uptr Print() const ;
   const ShadowAddressDescription* AsShadowAddress() const { 
      return "kind is ShadowAddress" ? &shadow : nullptr;
   };
}
This revision now requires changes to proceed.Sep 7 2016, 10:27 AM
filcab updated this revision to Diff 70694.Sep 8 2016, 6:47 AM
filcab edited edge metadata.
filcab marked 4 inline comments as done.

Addressed some code review comments.

lib/asan/asan_descriptions.h
185 ↗(On Diff #70530)

But then you'd either be copying AddressDescriptionData a bunch, to create new AddressDescription objects with the methods you want, or you'd end up with something like AddressDescriptionData, AddressDescription, and AddressDescriptionView, which would allow you to:

  • Store the info wherever you want (AddressDescriptionData)
  • Get the info from and address easily (AddressDescription. Owns the data)
  • "Interpret" an AddressDescriptionData (AddressDescriptionView, contains a reference)

This would end up duplicating the methods (AddressDescription's methods can be defined in terms of AddressDescriptionView's, but it would still be duplicating work.
I'm also not sure what we'd gain by not having the utility methods in the base object. I see the argument of disallowing property changes on AddressDescription and its private member, but we only use a full object like that once: in __asan_locate_address. Most of the other uses for this structure will be in the Error* structs. In there we'd have the fully public class, so we wouldn't be able to protect against inadvertent writes.

We can also do without AddressDescription, and just have the Data+View (not necessarily those names, of course). This would make __asan_locate_address slightly more complicated, since it would need to create a *Data object, and a *View object (probably the *View would populate the data).
I don't see that as a big win anywhere, but I'm ok with doing it if you prefer.

vitalybuka requested changes to this revision.Sep 8 2016, 11:28 AM
vitalybuka edited edge metadata.
vitalybuka added inline comments.
lib/asan/asan_descriptions.h
185 ↗(On Diff #70694)

I like this way: https://reviews.llvm.org/D24358
WDYT?

This revision now requires changes to proceed.Sep 8 2016, 11:28 AM
filcab updated this revision to Diff 70852.Sep 9 2016, 9:52 AM
filcab edited edge metadata.

Updated with review comments.
I wasn't looking at the class properly, it fits in an Error* structure and the type is still trivially-copyable.

Vitaly: I changed the patch a bit from yours. I think it ends up being simpler to have code like:

if (auto shadow = descr.AsShadow()) {
  ...
} else ...

Than to have the switch on .Kind(), since there's more code (reading) overhead with the switch.

I kept the const you added to the methods like Print(), but I can take them out and do a follow-up patch (or do it before) if you'd prefer.

Thank you,

Filipe

filcab updated this revision to Diff 70855.Sep 9 2016, 10:00 AM
filcab edited edge metadata.

Make it work on VS2013: only trivial initialization.
Added UNREACHABLE to the end of covered switches (silences VS warnings).

vitalybuka accepted this revision.Sep 9 2016, 11:28 AM
vitalybuka edited edge metadata.

LGTM with few comments

Vitaly: I changed the patch a bit from yours. I think it ends up being simpler to have code like:

if (auto shadow = descr.AsShadow()) {
  ...
} else ...

LGTM

Than to have the switch on .Kind(), since there's more code (reading) overhead with the switch.

I kept the const you added to the methods like Print(), but I can take them out and do a follow-up patch (or do it before) if you'd prefer.

Thanks

lib/asan/asan_descriptions.h
198 ↗(On Diff #70855)

Could you please hide this constructor into cpp?
also GetHeapAddressInformation and DescribeAddressIfStack can be removed from h.

227 ↗(On Diff #70855)

Please remove commented line

This revision is now accepted and ready to land.Sep 9 2016, 11:28 AM

Almost forgot, it could be useful if you copy/paste your proposal into new PR assigned to yourself, and include that PR into all patches. Can help future readers with context.

vitalybuka added inline comments.Sep 9 2016, 11:41 AM
lib/asan/asan_descriptions.h
197 ↗(On Diff #70855)

And I'd prefer for now remove shouldLockThreadRegistry argument and do ThreadRegistryLock inside of GetStackAddressInformation uconditionally

This revision was automatically updated to reflect the committed changes.
filcab marked 2 inline comments as done.