This is an archive of the discontinued LLVM Phabricator instance.

[GWP-ASan] Initial build files, implementation of PRNG [1].

Authored by hctim on May 13 2019, 11:08 AM.

Diff Detail


Event Timeline

hctim created this revision.May 13 2019, 11:08 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 13 2019, 11:08 AM
Herald added subscribers: llvm-commits, Restricted Project, mgorny and 2 others. · View Herald Transcript

Note that the empty compiler-rt/test/gwp_asan/CMakeLists.txt is required, as there are automatic build rules that build the tests for all sanitizers which fail if there is no test directory for gwp_asan.

function-static vs global-static TLS variables:

Looks like the optimiser has a hard time with both of these instances (TLS check on the load and subsequent save vs. excessive register spills). Turns out the perf is about even though, with the implemented version here just nudging it out (but within the noise).

eugenis added inline comments.May 13 2019, 11:28 AM
29 ↗(On Diff #199300)

I thought we switched to __thread?
Does the generated code become more compact if these two are combined in a single thread-local aggregate?

morehouse added inline comments.May 13 2019, 12:09 PM
10 ↗(On Diff #199300)

mutex.h doesn't exist yet.

17 ↗(On Diff #199300)

We have double-initialization if getRandomUnsigned64 is called before these initializers run.

17 ↗(On Diff #199300)

A random namespace might be overkill.

20 ↗(On Diff #199300)

1 in 2 billion is very rare. I don't see a use case for less frequent sampling than that.

hctim updated this revision to Diff 199475.May 14 2019, 10:02 AM
hctim marked 9 inline comments as done.
  • Updated to 32-bit PRNG, removed unneccessary definitions.h
morehouse added inline comments.May 14 2019, 10:29 AM
17 ↗(On Diff #199475)

I think static is actually implicit in this context. So static thread_local is equivalent to thread_local.

25 ↗(On Diff #199475)

Since this is a compile-time constant, static doesn't do anything.

28 ↗(On Diff #199475)

This seems like a lot of state to keep. Not a big deal, but is there a simpler RNG we can use? What about just xorshift32?

hctim updated this revision to Diff 199504.May 14 2019, 1:26 PM
hctim marked 3 inline comments as done.
  • /s/static thread_local/thread_local
hctim added inline comments.May 14 2019, 1:26 PM
28 ↗(On Diff #199475)

xorwow has a period of 2 ** 192 - 2**32, versus xorshift32 having a period of 2 ** 32. Also, surprisingly, it's about 1.2x faster than xorshift32 ( on my x86_64.

I personally think the 20B extra TLS/thread is worth it (given that it's likely zero overhead anyway unless the 20B pushes someone to alloc a new page for TLS). WDYT?

17 ↗(On Diff #199300)

Fixed w/ function-local statics.

29 ↗(On Diff #199300)

As discussed offline, the slow path + guaranteed always-correct PRNG.

17 ↗(On Diff #199300)


20 ↗(On Diff #199300)

Changed to 32-bit (using a slightly different algorithm).

morehouse accepted this revision.May 14 2019, 2:02 PM
morehouse added inline comments.
28 ↗(On Diff #199475)

2^32 seems plenty large to me. Speed isn't a huge concern since we're on the slow path, so I personally would prefer simpler. In fact if the RNG func is small enough, it might be simpler to put it directly in instead and reduce the number of files.

This revision is now accepted and ready to land.May 14 2019, 2:02 PM
hctim updated this revision to Diff 199512.May 14 2019, 2:21 PM
hctim marked an inline comment as done.
  • Updated getRandomUnsigned32() to simply use xorshift32, instead of xorwow.
hctim updated this revision to Diff 199517.May 14 2019, 2:37 PM
  • Merged in master ready for submission.
This revision was automatically updated to reflect the committed changes.