A lot of library calls cannot be used to synchronize with other threads
of execution. This is useful to know, e.g., for heap-2-stack on GPUs.
Repository: rG LLVM Github Monorepo
To be safe, what do you think about adding nosync only to ops that can be represented as a series of loads/stores or scalar ops? For example, I believe memset is nosync because it is equivalent to a series of non-atomic stores.
For side-effecting operations such as printf, I'm not 100% sure whether they are nosync. printf interacts with cout to flush buffers properly, which might involve some interaction with other threads.
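To make the memset argument above concrete, here is a minimal sketch in standard C++. The helper names `fill_bytes` and `memset_matches_store_loop` are hypothetical and not part of the patch; the sketch only shows that the observable effect of memset can be reproduced by ordinary byte stores, none of which order anything with respect to other threads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: the effect of memset written as plain byte stores.
void fill_bytes(unsigned char *dst, unsigned char value, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = value; // ordinary (non-atomic) store, no inter-thread ordering
}

// Returns true if memset and the store loop leave identical buffer contents.
bool memset_matches_store_loop(unsigned char value) {
  unsigned char a[64];
  unsigned char b[64];
  std::memset(a, value, sizeof a);
  fill_bytes(b, value, sizeof b);
  return std::memcmp(a, b, sizeof a) == 0;
}
```

Calling `memset_matches_store_loop(0xAB)` compares the two buffers byte for byte. This says nothing about printf or file operations, which is why those remain an open question here.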
llvm/lib/Transforms/Utils/BuildLibCalls.cpp | ||
---|---|---|
528 | C17's 7.22.3 Memory management functions has this paragraph:
Should we conservatively assume that allocation/deallocation fns may synchronize with other threads? | |
978 | This implies that if the comparator fn executes any atomic operation, then qsort raises UB, IIUC. | |
1100 | These functions may raise FE exceptions; would it be safe to assume that calling ldexp twice, with both calls setting FE exceptions, is UB? |
I tried to avoid all file operations; we can also avoid printf and friends.
llvm/lib/Transforms/Utils/BuildLibCalls.cpp | ||
---|---|---|
528 | My original purpose was to add nosync to malloc and free :( I guess what this says is that if you have two threads, T1 and T2, and T2 receives P from an allocation, T2 now knows that T1 deallocated P. Worst case, we could derive this for call sites if the pointer P was never observed (=captured). | |
978 | right. | |
1100 | I don't understand the question. |
llvm/lib/Transforms/Utils/BuildLibCalls.cpp | ||
---|---|---|
528 | I think it is safer and probably okay, but since some malloc implementations, such as glibc's, typically take a lock to manipulate the allocated areas correctly, showing that attaching nosync is valid seems non-trivial to me. I have a question about the background, by the way: does the GPU's malloc need to use atomic operations? If the GPU's malloc is simpler and is not guaranteed to synchronize, attaching nosync can be justified. | |
1100 | My question was whether math library fns can use atomic operations to update errno. I just found that errno is defined to have thread-local storage; I believe it's okay now. |
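
As a rough illustration of the errno point above, the following standard C++ sketch runs an overflowing ldexp on a second thread and records where that thread's errno lives. The names `WorkerReport`, `run_ldexp_on_worker`, and `errno_is_per_thread` are hypothetical. Whether ldexp sets errno at all depends on the implementation's `math_errhandling`, so the recorded value is informational only.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <cstdint>
#include <thread>

// Hypothetical record of what the worker thread saw.
struct WorkerReport {
  std::uintptr_t errno_address; // where the worker's errno object lives
  int errno_value;              // errno after the overflowing ldexp
};

WorkerReport run_ldexp_on_worker() {
  WorkerReport report{0, 0};
  std::thread worker([&report] {
    errno = 0;
    volatile double x = 1.0;                   // volatile: avoid constant folding
    volatile double r = std::ldexp(x, 100000); // result overflows
    (void)r;
    report.errno_value = errno; // implementation-dependent (often ERANGE)
    report.errno_address = reinterpret_cast<std::uintptr_t>(&errno);
  });
  worker.join(); // join orders the worker's writes before our reads
  return report;
}

// Returns true if the worker updated a different errno object than ours.
bool errno_is_per_thread() {
  const std::uintptr_t mine = reinterpret_cast<std::uintptr_t>(&errno);
  return run_ldexp_on_worker().errno_address != mine;
}
```

If `errno_is_per_thread()` holds, an errno update made by a math function is a plain store to per-thread state and cannot be observed by another thread without some other form of communication.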
I'll send an email to cfe-dev at some point soon to ask about the malloc/free semantics, wrt. nosync but also other things.
Thanks for the input. I might update this to only include the safe things before the email is sent.
llvm/lib/Transforms/Utils/BuildLibCalls.cpp | ||
---|---|---|
528 | The question is not necessarily whether the impl. uses an atomic or is synchronized, but whether that "leaks" out: can the user establish a proper happens-before relation between two threads with malloc/free? If not, nosync is fine regardless of the impl. because of the "as-if" rule. |
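
One way the synchronization might "leak" out, sketched in standard C++ under the assumption that a deallocation is ordered before the next allocation of the same region (as the C and C++ standards word it). The function `observe_free_via_address_reuse` is hypothetical. Whether the second thread ever receives the same address depends entirely on the allocator, so the sketch may simply report that nothing was observed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <thread>
#include <vector>

// Returns 42 if the second thread was handed the freed address again (and
// therefore read `payload`), or -1 if it never saw that address.
int observe_free_via_address_reuse(int attempts) {
  int payload = 0; // plain, non-atomic data
  void *p = std::malloc(64);
  if (p == nullptr) return -1;
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
  int observed = -1;

  std::thread t1([&payload, p] {
    payload = 42; // ordinary store ...
    std::free(p); // ... followed by the deallocation of P
  });

  std::thread t2([&payload, &observed, addr, attempts] {
    std::vector<void *> held; // keep blocks alive so addresses stay distinct
    for (int i = 0; i < attempts; ++i) {
      void *q = std::malloc(64);
      if (q == nullptr) break;
      held.push_back(q);
      if (reinterpret_cast<std::uintptr_t>(q) == addr) {
        // Same region handed out again: the free in t1 must have happened,
        // so the store to payload is ordered before this read.
        observed = payload;
        break;
      }
    }
    for (void *q : held) std::free(q);
  });

  t1.join();
  t2.join();
  return observed;
}
```

The second thread learns about the free only by comparing the returned address against a value it already knew, which is why the pointer being observed (captured) matters for deciding whether nosync is safe at a given call site.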
llvm/lib/Transforms/Utils/BuildLibCalls.cpp | ||
---|---|---|
528 | If no issue is found, I believe defining a malloc whose result never escapes as returning an isolated allocation that can never be used to communicate with the outside world (unknown functions or threads) would be great too. |
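
For context, the kind of code that would benefit from treating a non-escaping malloc as isolated might look like the following standard C++ sketch. The function names and `kCount` are hypothetical, and the stack version is written by hand here to show what a heap-2-stack style rewrite would aim for; it is not output of the actual transformation.

```cpp
#include <cassert>
#include <cstdlib>

constexpr int kCount = 8;

// Heap version: the pointer is never stored, returned, or passed to unknown
// code, so no other thread can name the allocation.
int sum_of_squares_heap() {
  int *buf = static_cast<int *>(std::malloc(kCount * sizeof(int)));
  if (buf == nullptr) return -1;
  int sum = 0;
  for (int i = 0; i < kCount; ++i) buf[i] = i * i;
  for (int i = 0; i < kCount; ++i) sum += buf[i];
  std::free(buf);
  return sum;
}

// Stack version: the same computation with the allocation moved to the stack.
int sum_of_squares_stack() {
  int buf[kCount];
  int sum = 0;
  for (int i = 0; i < kCount; ++i) buf[i] = i * i;
  for (int i = 0; i < kCount; ++i) sum += buf[i];
  return sum;
}
```

Both functions return the same value when the allocation succeeds. The rewrite is only valid if the malloc/free pair cannot be used to synchronize with another thread, which is the property under discussion.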