This is an archive of the discontinued LLVM Phabricator instance.

[LICM] When promoting scalars, allow inserting stores to thread-local allocas
ClosedPublic

Authored by mkuper on Dec 29 2016, 5:06 PM.

Details

Summary

This is similar to the allocfn case - if an alloca is not captured, then it's necessarily thread-local.

Diff Detail

Repository
rL LLVM

Event Timeline

mkuper updated this revision to Diff 82710.Dec 29 2016, 5:06 PM
mkuper retitled this revision from to [LICM] When promoting scalars, allow inserting stores to thread-local allocas.
mkuper updated this object.
mkuper added a reviewer: reames.
mkuper added a subscriber: llvm-commits.
reames accepted this revision.Dec 29 2016, 5:10 PM
reames edited edge metadata.

LGTM, thanks!

This revision is now accepted and ready to land.Dec 29 2016, 5:10 PM
This revision was automatically updated to reflect the committed changes.
trentxintong added inline comments.
llvm/trunk/test/Transforms/LICM/promote-tls.ll
67

Any particular reason why this can not be br i1 undef, label %a, label %b ? Understandably, this can be simplified by SimplifyCFG or SCCP. but we are only running LICM here ?

trentxintong added inline comments.Jan 18 2017, 2:16 PM
llvm/trunk/lib/Transforms/Scalar/LICM.cpp
1037

The question I really want to ask is whether the check for MayNotBeCaptured is necessary for Alloca. Should not alloca be implicitly thread local ? The language reference does not explicitly state it.

efriedma added inline comments.
llvm/trunk/lib/Transforms/Scalar/LICM.cpp
1037

In general, memory allocated using alloca can be used in all the same ways you could use memory allocated with malloc(). Not sure what sort of restriction you're expecting.

sanjoy added a subscriber: sanjoy.Jan 19 2017, 3:43 PM
sanjoy added inline comments.
llvm/trunk/lib/Transforms/Scalar/LICM.cpp
1037

(Slight digression) There is one difference -- alloca memory is inherently frame local. This means you don't have to worry about the memory escaping via a ret. DSE handles a specific instance of this, where the stores are in the return block itself:

i8* @f() {
  %t = alloca i32
  store i32 42, i32* %t
  ret i32* %t
}

but e.g. slightly more complex cases like

define i32* @f(i1* %cond) {
  %t = alloca i32
  store i32 42, i32* %t
  %c = load volatile i1, i1* %cond
  br i1 %c, label %a, label %b

a:
  %c2 = load volatile i1, i1* %cond
  ret i32* %t

b:
  ret i32* null
}

are not handled by -O3.