This is an archive of the discontinued LLVM Phabricator instance.

[sanitizer] Add iterator to AddrHashMap
AbandonedPublic

Authored by bruening on May 27 2016, 12:48 PM.

Details

Summary

Adds an iterator to the AddrHashMap class used by sanitizer runtime
libraries. Adds a test of the new iterator.

Diff Detail

Event Timeline

bruening updated this revision to Diff 58826.May 27 2016, 12:48 PM
bruening retitled this revision from to [sanitizer] Add iterator to AddrHashMap.
bruening updated this object.
bruening added a reviewer: aizatsky.
bruening added subscribers: llvm-commits, eugenis, kcc and 2 others.
kcc added a reviewer: dvyukov.May 27 2016, 1:44 PM

Please don't commit w/o Dmitry's LGTM

vitalybuka requested changes to this revision.May 27 2016, 1:48 PM
vitalybuka added a reviewer: vitalybuka.
vitalybuka added inline comments.
lib/sanitizer_common/sanitizer_addrhashmap.h
113

I think logic could be simpler if Iterator contained temp copy of HashValue<T*>
and operator* returns reference to it.

E.g.:
class Iterator {

const HashValue<T*>& operator*() {
  return curr_item_;
}

 Iterator& operator++() {
    // all logic here,
    curr_item_ = something.
  }

 HashValue<T*> curr_item_;

}

121

add == nullptr should not be possible according line 139

130

This will hide bugs. We should crash here if someone increments end().

136

Iterating concurrent hash map looks scary in general, and not very efficient.
This will hold bucket locked entire iterator lifetime.
Maybe safer to consider some alternatives, like exporting entire table as array.
So you can get array under the log, and do whatever you want after release.

This revision now requires changes to proceed.May 27 2016, 1:48 PM
dvyukov added inline comments.May 29 2016, 3:35 AM
lib/sanitizer_common/sanitizer_addrhashmap.h
91

remove the template, and just use T*
simpler and less typing

92

s/HashValue/Value/

It's already in the scope of Addr_Hash_Map.

101

please move all the code out of the class declaration to the bottom of the file

you can see that it is done for all member functions here, I wanted the declaration to remain at least minimally readable (to the point it is possible with C++)

105

this won't work: iterator can hold mutex lock

if you copy the iterator, you will double unlock the mutex and corrupt its state

do you plan to use them? if not, = delete

124

this must be acquire to pair with release in the release function

but unfortunately that's not enough: elements can be concurrently removed and overwritten
that's ok for normal usages because lookup for an element should not race with its removal, embed elements are not compacted, and accesses to add elements are protected with read lock

I think the simplest fix would be to read lock the cell whenever you access it (both embed and add elements).

136

unless I am missing something, this can double-lock the mutex if a cell contains more than 1 add element

need more tests to catch it

lib/sanitizer_common/tests/sanitizer_addrhashmap_test.cc
1

s/bitvector/addrhashmap/

11

s/bitvector/addrhashmap/

bruening abandoned this revision.Jul 7 2016, 9:02 AM

AddrHashMap has significant limitations that make it unsuitable for most of our use cases for hashtables and we're going to need to add a separate, more general-purpose table: the data being read-only after creation, and most importantly, the fixed-size table that cannot resize and adapt to the size of the app, are blockers for us. Abandoning this CL.