This is an archive of the discontinued LLVM Phabricator instance.

[LICM] When promoting scalars, allow inserting stores to thread-local allocas
ClosedPublic

Authored by mkuper on Dec 29 2016, 5:06 PM.

Details

Summary

This is similar to the allocfn case - if an alloca is not captured, then it's necessarily thread-local.

Diff Detail

Event Timeline

mkuper updated this revision to Diff 82710.Dec 29 2016, 5:06 PM
mkuper retitled this revision from to [LICM] When promoting scalars, allow inserting stores to thread-local allocas.
mkuper updated this object.
mkuper added a reviewer: reames.
mkuper added a subscriber: llvm-commits.
reames accepted this revision.Dec 29 2016, 5:10 PM
reames edited edge metadata.

LGTM, thanks!

This revision is now accepted and ready to land.Dec 29 2016, 5:10 PM
This revision was automatically updated to reflect the committed changes.
trentxintong added inline comments.
llvm/trunk/test/Transforms/LICM/promote-tls.ll
67 ↗(On Diff #82711)

Any particular reason why this can not be br i1 undef, label %a, label %b ? Understandably, this can be simplified by SimplifyCFG or SCCP. but we are only running LICM here ?

trentxintong added inline comments.Jan 18 2017, 2:16 PM
llvm/trunk/lib/Transforms/Scalar/LICM.cpp
1037 ↗(On Diff #82711)

The question I really want to ask is whether the check for MayNotBeCaptured is necessary for Alloca. Should not alloca be implicitly thread local ? The language reference does not explicitly state it.

efriedma added inline comments.
llvm/trunk/lib/Transforms/Scalar/LICM.cpp
1037 ↗(On Diff #82711)

In general, memory allocated using alloca can be used in all the same ways you could use memory allocated with malloc(). Not sure what sort of restriction you're expecting.

sanjoy added a subscriber: sanjoy.Jan 19 2017, 3:43 PM
sanjoy added inline comments.
llvm/trunk/lib/Transforms/Scalar/LICM.cpp
1037 ↗(On Diff #82711)

(Slight digression) There is one difference -- alloca memory is inherently frame local. This means you don't have to worry about the memory escaping via a ret. DSE handles a specific instance of this, where the stores are in the return block itself:

i8* @f() {
  %t = alloca i32
  store i32 42, i32* %t
  ret i32* %t
}

but e.g. slightly more complex cases like

define i32* @f(i1* %cond) {
  %t = alloca i32
  store i32 42, i32* %t
  %c = load volatile i1, i1* %cond
  br i1 %c, label %a, label %b

a:
  %c2 = load volatile i1, i1* %cond
  ret i32* %t

b:
  ret i32* null
}

are not handled by -O3.