This is an archive of the discontinued LLVM Phabricator instance.

Implement __emutls_get_address
ClosedPublic

Authored by chh on Aug 12 2015, 5:16 PM.

Details

Summary

Implement __emutls_get_address for targets like Android that depends on libgcc's emulated thread local storage.

clang/llvm can generate calls to __emutls_get_address since
http://reviews.llvm.org/D10522 and http://reviews.llvm.org/D10524

Diff Detail

Repository
rL LLVM

Event Timeline

chh updated this revision to Diff 32008.Aug 12 2015, 5:16 PM
chh retitled this revision from to Implement __emutls_get_address.
chh updated this object.
chh added reviewers: rnk, joerg.
chh added subscribers: llvm-commits, davidxl, enh and 4 others.
danalbert added inline comments.Aug 14 2015, 3:34 PM
compiler-rt/lib/builtins/emutls.c
17 ↗(On Diff #32008)

Just checked with cferris. I don't think this is a default we even want on Android. Sounds like this would maybe be a win for dlmalloc with us, but for jemalloc posix_memalign should be fine. Older devices + N9 are dlmalloc, but he still thinks it's unlikely this is necessary.

33 ↗(On Diff #32008)

Only one space between void* and address (I believe, I'm not very familiar with style practices here)

35 ↗(On Diff #32008)

Or null

40 ↗(On Diff #32008)

Should probably find (or create) a common header for this.

67 ↗(On Diff #32008)

No space between the cast and the value

85 ↗(On Diff #32008)

text alignment again

90 ↗(On Diff #32008)

emutls_address_array* instead of void*? This is C, so there's no mangling concern.

93 ↗(On Diff #32008)

A non-trivial loop body should have braces.

107 ↗(On Diff #32008)

abort() (no space)

128 ↗(On Diff #32008)

Inconsistent pointer positioning. Most of the file has type* var, here is type *var. Not sure what the style is here, but stick to it (or stick to one for this particular file if there isn't consistent style across the whole tree).

133 ↗(On Diff #32008)

No space after the cast

142 ↗(On Diff #32008)

Probably better to be aligning this multiples of 16 or something instead of just adding 16 since index can be any integer here.

149 ↗(On Diff #32008)

This doesn't actually guarantee that we've made enough space, right? If I'm reading this correctly, we need the check on the line above because some other thread may have used many TLS variables that our thread hasn't encountered yet, so our array needs to expand to accommodate them, but if the index is significantly larger than our current array then we won't have made enough space.

compiler-rt/test/builtins/Unit/emutls_test.c
14 ↗(On Diff #32008)

sort

50 ↗(On Diff #32008)

include file and line number?

74 ↗(On Diff #32008)

If this isn't true we have a much bigger problem.

77 ↗(On Diff #32008)

The above three don't offer anything that the static asserts don't.

78 ↗(On Diff #32008)

Make it a static assert.

chh marked 10 inline comments as done.Aug 14 2015, 5:03 PM
chh added inline comments.
compiler-rt/lib/builtins/emutls.c
17 ↗(On Diff #32008)

If posix_memalign is used, current Android linker won't link because it wants to redefine malloc and free, but not posix_memalign. We can change Android linker, but I think making this module not depending on only basic malloc/free will be more portable. It's not going to be fast accessing emutls variables anyway.

40 ↗(On Diff #32008)

Still looking for a place to put this.

90 ↗(On Diff #32008)

It should be void* for pthread_key_create's argument type.

142 ↗(On Diff #32008)

This calloc or realloc does not need 16-byte or 16-slot aligned address.
It's just some extra slots to avoid too many calloc or realloc.
libgcc uses increment of 32 slots, which seems too big.
Most apps have only a few TLS variables.

149 ↗(On Diff #32008)

Other threads can use more than index + 16 variables, but the current thread only need to allocate space for index number of slots, for this call. When other TLS variables are used by this thread later, this thread can allocate more slots if necessary.

compiler-rt/test/builtins/Unit/emutls_test.c
50 ↗(On Diff #32008)

I would usually do that, but all other tests use this simple style without file/line number.
Guess it's the style preferred here.

74 ↗(On Diff #32008)

Yes, I hope the compile time check is useful to catch configuration errors in cross compilers.

77 ↗(On Diff #32008)

Yes, removing them.

chh updated this revision to Diff 32203.Aug 14 2015, 5:03 PM
joerg added inline comments.Aug 17 2015, 4:29 AM
compiler-rt/lib/builtins/emutls.c
18 ↗(On Diff #32203)

IMO it would still be cleaner to provide two inline functions at the start and
#undef posix_memalign / free and redefine them to map to those two functions. It leaves the complications of the Android issue out of the rest of the file.

143 ↗(On Diff #32203)

Just round up to next multiple of 16. No need to allocate some strange number depending on which TLS variable was access first.

150 ↗(On Diff #32203)

I still think it would make sense to try to remember the current maximum index. It doesn't have to be thread-safe, just a global variable. Essentially, use something like roundup16(max(index, max_index)) here.

chh marked an inline comment as done.Aug 17 2015, 11:48 AM
chh added inline comments.
compiler-rt/lib/builtins/emutls.c
18 ↗(On Diff #32203)

Please take a look of the new diff.
I am not sure that is what what you want.
It has the posix_memalign/malloc/free functions factored out into two inline functions. Maybe it is easier to read.

150 ↗(On Diff #32203)

I think emutls_num_object is what you want as a max_index here.
It's shared by all threads and thread safe.
For every newly used TLS variable, its index will be ++emutls_num_object.

chh updated this revision to Diff 32329.Aug 17 2015, 11:48 AM
danalbert added inline comments.Aug 24 2015, 9:54 AM
compiler-rt/lib/builtins/emutls.c
41 ↗(On Diff #32329)

Wow. Phab makes it hard to track comments across versions. This was about COMPILE_TIME_ASSERT. Feel free to make your own header in lib/builtins for this. We don't have one for this kind of thing yet.

compiler-rt/test/builtins/Unit/emutls_test.c
51 ↗(On Diff #32329)

No sense in continuing a bad thing. This should probably all come out to a shared header at some point.

chh updated this revision to Diff 33016.Aug 24 2015, 3:26 PM

Added FILE and LINE output in error messages.
Added comp_assert.h.

compiler-rt/lib/builtins/emutls.c
41 ↗(On Diff #32329)

Done. Added new header comp_assert.h.

compiler-rt/test/builtins/Unit/emutls_test.c
51 ↗(On Diff #32329)

Done. Added FILE and LINE to printf output.

compnerd added inline comments.Aug 24 2015, 7:22 PM
compiler-rt/lib/builtins/comp_assert.h
20 ↗(On Diff #33016)

Why not just put this in int_lib.h?

chh added inline comments.Aug 25 2015, 10:10 AM
compiler-rt/lib/builtins/comp_assert.h
20 ↗(On Diff #33016)

Maybe you mean int_util.h?
int_lib.h defines target dependent configuration macros and functions.
int_util.h defines aborting macros and declares aborting functions.
COMPILE_TIME_ASSERT is target independent compile-time checking and does not generate any code.

Now comp_assert.h is only included by one file.
If it is included by multiple files in the future, I think it would be better to separate different functionality into multiple header files.

compnerd added inline comments.Aug 25 2015, 8:05 PM
compiler-rt/lib/builtins/comp_assert.h
20 ↗(On Diff #33016)

Er, yes, int_util.h. That was just a slip up on my part. The *use* of the macro in your case is target dependent, nothing about the macro is target dependent though. There is a set of macros that we use pervasively though out the builtins, which are currently consolidated, and this would be the odd one out.

chh updated this revision to Diff 33228.Aug 26 2015, 11:19 AM

Move the macro COMPILE_TIME_ASSERT to int_util.h and change it to a typedef.

chh added a comment.Aug 26 2015, 11:22 AM

I think all suggested changes were made.
Please let me know if there is any other required change or is it ready to submit.
Thanks.

compiler-rt/lib/builtins/comp_assert.h
20 ↗(On Diff #33016)

Okay, I am moving the macro to int_util.h and also making it a typedef as suggested by Joerge.

chh removed a subscriber: danalbert.
joerg added inline comments.Aug 26 2015, 2:30 PM
compiler-rt/lib/builtins/int_util.h
31 ↗(On Diff #33228)

More like:

#define COMPILE_TIME_ASSERT0(pred,cnt) typedef char dummy#cnt[(pred) ? 1 : -1]
#define COMPILE_TIME_ASSERT(pred) COMPILE_TIME_ASSERT0(pred,LINE)

or so.

chh added inline comments.Aug 26 2015, 3:25 PM
compiler-rt/lib/builtins/int_util.h
31 ↗(On Diff #33228)

I think what you want is dummy##cnt to add line number to each dummy typedef.
However, above macros were expanded to dummy__LINE__, and I got multiple definition errors.

A trick to get cpp do the right thing is:

#define COMPILE_TIME_ASSERT(expr)    typedef char UNIQUE_NAME[(expr) ? 1 : -1] __attribute__((unused))
#define UNIQUE_NAME                     MAKE_NAME(__LINE__)
#define MAKE_NAME(line)                 MAKE_NAME2(line)
#define MAKE_NAME2(line)               constraint_ ## line

The __attribute__((unused)) is still necessary to avoid unused typedef warnings.

Would you rather to have that complicated macros?

My current definition should give line number on error and also avoid multiple definition errors.

joerg edited edge metadata.Aug 27 2015, 6:47 AM

That's why I said "or so" :) The attribute unused is fine, but your version doesn't work in file scope, which is where I normally would place compile time asserts.

chh updated this revision to Diff 33339.Aug 27 2015, 10:49 AM
chh edited edge metadata.

Make COMPILER_ASSERT work at also the file scope.
Use COUNTER instead of LINE to be unique in a compilation unit.

compnerd added inline comments.Aug 27 2015, 8:16 PM
compiler-rt/lib/builtins/CMakeLists.txt
44 ↗(On Diff #33339)

Please don't add this to GENERIC_SOURCES. This is really not applicable to all targets. It uses pthread, so it can't be used on say Windows.

This makes me wonder if this actually belongs in the compiler builtins in the first place.

compiler-rt/lib/builtins/emutls.c
47 ↗(On Diff #33339)

Can you also pull out the alignment math into a macro or an inline function please?

80 ↗(On Diff #33339)

Cant this be written more simply as

(align & (align - 1) == 0)
109 ↗(On Diff #33339)

Remove the unnecessary braces here.

danalbert added inline comments.Aug 27 2015, 8:27 PM
compiler-rt/lib/builtins/CMakeLists.txt
44 ↗(On Diff #33339)

GCC keeps it in libgcc, and it's a function that's only ever emitted by the compiler as a replacement for __thread. Seems like this is the right place to me. A windows equivalent could be implemented as well, but perhaps it's best to just wrap all this in an #if !defined(_WIN32) for now.

danalbert added inline comments.Aug 27 2015, 8:28 PM
compiler-rt/lib/builtins/emutls.c
109 ↗(On Diff #33339)

Bodies that aren't a one-liner should have braces.

chh updated this revision to Diff 33442.Aug 28 2015, 10:55 AM
chh marked an inline comment as done.

Changes with compnerd's and danalbert's suggestions.

chh added a comment.Aug 28 2015, 10:57 AM

PTAL new diff 7. Thanks.

compiler-rt/lib/builtins/CMakeLists.txt
44 ↗(On Diff #33339)

I will exclude it from WIN32 as danalbert suggested.

compiler-rt/lib/builtins/emutls.c
80 ↗(On Diff #33339)

Done, use if ((align & (align - 1)) != 0) abort();

109 ↗(On Diff #33339)

I think you meant that line 106 does not need brace and line 105 needs brace. Fixed it.

compnerd accepted this revision.Aug 28 2015, 7:52 PM
compnerd added a reviewer: compnerd.

LG with the one comment about the growth function.

compiler-rt/lib/builtins/emutls.c
151 ↗(On Diff #33442)

If you are trying to round up to 16, shouldn't this be more like:

(index + 16 - 1) & ~(16 - 1)

Which would be ideal as an inline function:

inline size_t align_up(size_t value, unsigned alignment) {
  return (value + alignment - 1) & ~(size_t)(alignment - 1);
}
This revision is now accepted and ready to land.Aug 28 2015, 7:52 PM
chh added inline comments.Aug 31 2015, 9:17 AM
compiler-rt/lib/builtins/emutls.c
151 ↗(On Diff #33442)

It's trickier than that as we try to be compatible with gcc's structure of emutls_address_arrary. It has a 'data' array plus an extra pointer. So, here we want to round up (index + 1) to 16, but returns the new size of 'data' array.

This revision was automatically updated to reflect the committed changes.