This is an archive of the discontinued LLVM Phabricator instance.

Debug info: Support fragmented variables.
ClosedPublic

Authored by aprantl on Feb 3 2014, 3:16 PM.


Details

Reviewers

chandlerc
aprantl

Summary

Reposting in Phabricator as requested.

Currently, LLVM cannot represent locations for aggregate variables that
are fragmented across multiple Values. This situation arises, e.g.,
during SROA, or even as part of the x86_64 calling convention when a struct
is passed by value. This patch adds the functionality to emit
DWARF location expressions using DW_OP(_bit)_piece to express partial
values.

This is implemented by adding a new operation type OpPiece to complex
DIVariables which accepts an offset and a size.

The canonical example would be something like this:

typedef struct { long int a; int b;} S;

int foo(S s) {
     return s.b;
}

which at -O1 is now codegen’d into:

; Function Attrs: nounwind readnone ssp uwtable
define i32 @foo(i64 %s.coerce0, i32 %s.coerce1) #0 {
entry:
 tail call void @llvm.dbg.value(metadata !{i64 %s.coerce0}, i64 0, metadata !20)
 tail call void @llvm.dbg.value(metadata !{i32 %s.coerce1}, i64 0, metadata !21)
 ret i32 %s.coerce1, !dbg !24
}

With this patch we'd emit the following DWARF:

0x00000047:         TAG_formal_parameter [3]  
                  AT_name( "s" )
                  AT_decl_file( "struct.c" )
                  AT_decl_line( 3 )
                  AT_type( {0x00000069} ( S ) )
                  AT_location( 0x00000000
                     0x0000000000000000 - 0x0000000000000008: rdi, piece 0x00000008
                     0x0000000000000000 - 0x0000000000000006: rsi, bit-piece 32 64 
                     0x0000000000000006 - 0x0000000000000008: rax, bit-piece 32 64  )
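The byte offsets and sizes behind this location list can be reproduced with a small standalone sketch. It does not use any LLVM API: `Piece` and `piecesOfS` are hypothetical names introduced only for illustration, and the concrete numbers mentioned afterwards assume an LP64 target such as x86_64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The struct from the example above.
struct S {
  long a;
  int b;
};

// Hypothetical stand-in for the (offset, size) pair that a piece carries.
struct Piece {
  uint64_t OffsetInBytes;
  uint64_t SizeInBytes;
};

// One piece per scalar member of S, in declaration order.
std::vector<Piece> piecesOfS() {
  return {{offsetof(S, a), sizeof(long)}, {offsetof(S, b), sizeof(int)}};
}
```

On an LP64 target this gives the pairs (0, 8) and (8, 4), which line up with "piece 0x00000008" for the first member and "bit-piece 32 64" (32 bits at bit offset 64) for the second.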

cheers,
adrian

Diff Detail

Event Timeline

Could you show a little more of your example? You show the function's IR, but not the debug info metadata associated with it. I'm OK if you gloss over most of the metadata details, but highlighting the new metadata constructs you're adding would be helpful, rather than wandering through the full test case to get the general idea.

Also, a high level (higher than the code itself) description of the strategy you've taken would be helpful. I'm particularly curious about the use of an "entire" DIVariable and how/why it's built from constituent parts.

Hi David,

Here are the MDNodes for the two variable pieces:

!20 = metadata !{i32 786689, metadata !4, metadata !"s", metadata !5, i32 16777219, metadata !9, i32 0, i32 0, i32 3, i32 0, i32 8} ; [ DW_TAG_arg_variable ] [s] [line 3]
!21 = metadata !{i32 786689, metadata !4, metadata !"s", metadata !5, i32 16777219, metadata !9, i32 0, i32 0, i32 3, i32 8, i32 4} ; [ DW_TAG_arg_variable ] [s] [line 3]

All the way through ISel and the MC layer those two MDNodes behave like regular DIVariables with complex addresses. In DwarfDebug, we need to treat the pieces as one entity when calculating the Ranges/History. getEntireVariable() simply chops off the last two elements so it can be used as a shared index into DbgVariables while still indicating that this is a fragmented location. The extra code in the range calculation makes sure that a piece of a variable is not clobbered by a non-overlapping piece.
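The overlap test that the range calculation relies on can be sketched in isolation. This is a minimal illustration, assuming pieces are byte ranges of the form [offset, offset + size); `Piece` and `piecesOverlap` are hypothetical names, not the identifiers used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a variable piece: a half-open byte range.
struct Piece {
  uint64_t OffsetInBytes;
  uint64_t SizeInBytes;
};

// Two half-open ranges overlap iff each one starts before the other ends.
bool piecesOverlap(const Piece &A, const Piece &B) {
  return A.OffsetInBytes < B.OffsetInBytes + B.SizeInBytes &&
         B.OffsetInBytes < A.OffsetInBytes + A.SizeInBytes;
}
```

With the two MDNodes above, the pieces (0, 8) and (8, 4) do not overlap, so a new location for one of them should leave the other's range open.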

FYI: In a previous design I experimented with introducing a new type of MDNode

!Piece = metadata !{i32 PIECE, metadata !<ref to DIVariable>, metadata !<Offset>, metadata !<Size>}

but this meant adding an extra case to all the ISel/MC code handling DBG_VALUEs, which is why I ended up with the current design.

— adrian

Ping! Does anyone have a strong opinion on this design?

thanks,

adrian

I've got a few comments on the code, but there's no update for the documentation in SourceLevelDebugging.html. How about we get that solidified first so we've got a good idea of what format we're looking at for the metadata.

include/llvm/DebugInfo.h
684 ↗	(On Diff #6836)	Cut and paste.
lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1305 ↗	(On Diff #6836)	Would probably be nice to have a function for getDebugMetadata or something for MachineInstr instead of the subtraction.
1320 ↗	(On Diff #6836)	Typo in function name.
lib/Transforms/Scalar/SROA.cpp
426	Can't you just get the users here?
947	Instead of passing all of these down it should be possible to construct one.

aprantl updated this revision. Feb 11 2014, 3:42 PM

Added documentation and fixed typos.

Rebased patch on ToT and addressed review comments. Most importantly this resolves a conflict with r201190, which added DW_OP_piece support for variables occupying only part of a register (essentially the complementary problem).

Updated to also support variables split up by SelectionDAG’s integer type legalization. Thanks iains for providing additional testcases.

Hi Adrian,

This is a lot of great work, thanks!

I think the first question is how this is going to interact with DIBuilder::createComplexVariable? I think it should just be a concatenation of the two things. It looks like you have it as an instead? In general, I think I like the idea that variable locations just have expressions attached to them. And lacking anything else I think dwarf expressions are just fine here.

I've got more comments on the patch itself, but let's get this out of the way first.

-eric

Two high-level comments:

This needs a set of direct SROA tests. I don't think these need to use the C-compiled-to-debug stuff as you can use stub metadata nodes (I've tried this in a couple of other tests and it works fine) and just check that the intrinsics are being adjusted the way you want and re-pointed in the way you want. You *might* need one layer of the debug part, but I generally feel like you can make this testable in a more "normal" IR test kind of way.

Clang format. =] (ok, this is a minor comment)

Detailed questions below. The most important one is trying to understand how you handle un-split cases and iteratively split cases.

lib/Support/GraphWriter.cpp
90 ↗	(On Diff #7324)	Uh, huh?
lib/Transforms/Scalar/SROA.cpp
426–429	This seems like a total hack. It papers over the fact that this is not relevant for a particular store, but for the entire alloca. Why can't we find this by locating the use of the alloca in the metadata and then the intrinsic using the metadata? I know it's a non-standard edge in the IR, but I was pretty sure it was still an edge in the IR somewhere that would let us find this?
690	Bad reformatting here. Generally, I would encourage you to use clang-format.
2236	What happens if the original store was already a split? This will happen a lot with SROA when we iterate on the same alloca. Also, what about when this isn't a split at all? I feel like the first patch should just migrate dbg.declares from old allocas to new allocas when we're rewriting uses but not re-slicing anything.
2239	Note that you would want SliceSize here, as if we split up the store itself, BeginOffset and EndOffset won't reflect that split. You should add a test case for this as well -- example would be storing the low and high parts of an i64 separately, as well as storing the whole thing.
2241–2244	I'm probably missing something, but it looks like this has the real possibility of taking code which has no partial info (no dbg.value calls) and making it suddenly have more info. Is that right? Is that good? Should we be looking for relevant dbg.value calls pertaining to the original alloca and slicing them up rather than just synthesizing our own? Also, what about memset? memcpy? speculated stores around phis? I assume follow-up patches, but it might be good to document some of that.
3173	Why is it important to thread a single DIBuilder through all of these levels? Why not just create the DIBuilder in the AllocaSliceRewriter? Or if it is expensive to construct, create it as a member?

On Feb 26, 2014, at 3:06, Chandler Carruth <chandlerc@gmail.com> wrote:

Hi Chandler/Eric/David,

Thanks for your feedback, everyone!
I think I addressed every comment now, below are some notes regarding the non-obvious or controversial changes.

Two high-level comments:

This needs a set of direct SROA tests.

There are now three tests that explicitly run opt -sroa and check the IR output. The IR output is then also compiled with llc to verify the dwarf output, so they are still located in the tests/DebugInfo directory.

Detailed questions below. The most important one is trying to understand how you handle un-split cases and iteratively split cases.
...
What happens if the original store was already a split? This will happen a lot with SROA when we iterate on the same alloca.

This is not an issue because we can create a variable piece from an existing variable piece just fine.
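One way to picture a piece of a piece: the thread does not spell out how the patch composes the offsets, so the following is only a sketch under the assumption that a nested piece is re-expressed relative to the whole variable by adding offsets. `Piece` and `subPiece` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a variable piece, measured from the start of
// the whole variable.
struct Piece {
  uint64_t OffsetInBytes;
  uint64_t SizeInBytes;
};

// Describe a sub-range of Parent (given relative to Parent) as a piece of
// the whole variable.
Piece subPiece(const Piece &Parent, uint64_t RelOffset, uint64_t Size) {
  assert(RelOffset + Size <= Parent.SizeInBytes &&
         "sub-piece must lie within its parent");
  return {Parent.OffsetInBytes + RelOffset, Size};
}
```

Under this assumption, splitting an 8-byte piece at offset 8 into two 4-byte halves would give pieces at offsets 8 and 12 of the original variable.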

Also, what about when this *isn't* a split at all? I feel like the first patch should just migrate dbg.declares from old allocas to new allocas when we're rewriting uses but not re-slicing anything.

I moved the code that migrates the debug declares from the Store handler to the AllocaSlice visitor of AllocaRewriter. It also handles the case where the alloca is just rewritten without slicing. Currently it’s using the DIType of the DIVariable to determine whether this should be a VariablePiece or not. This is not strictly necessary, though: it forces us to build a DITypeIdentifierMap in SROA, but it saves us 3 int32 per variable MDNode when the alloca is not split. Alternatively we could also just always create pieces, and perform the check whether a piece covers the entire variable back in AsmPrinter.

Comment at: lib/Transforms/Scalar/SROA.cpp:426-429
@@ -419,1 +425,6 @@

+ // Make a best effort to find a dbg.declare intrinsic describing
+ // the alloca by peeking at the next instruction.
+ if (DbgDeclareInst *DDI = dyn_cast_or_null<DbgDeclareInst>(SI.getNextNode()))
+   S.DbgDeclare = DDI;
+
————————

This seems like a total hack.

It papers over the fact that this is not relevant for a particular store, but for the entire alloca.

Why can't we find this by locating the use of the alloca in the metadata and then the intrinsic using the metadata? I know it's a non-standard edge in the IR, but I was pretty sure it was still *an* edge in the IR somewhere that would let us find this?

The current implementation uses a DenseMap<AllocaInst, DbgDeclare> to find the debug intrinsic for an alloca.
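The side-table idea can be illustrated without LLVM. The sketch below uses `std::unordered_map` in place of `DenseMap`, and `Alloca`, `DbgDeclare`, `buildDeclareMap`, and `findDeclare` are hypothetical stand-ins rather than the patch's actual types or functions.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for llvm::AllocaInst and llvm::DbgDeclareInst.
struct Alloca {
  int Id;
};
struct DbgDeclare {
  const Alloca *Address; // May be null if the declare has no address.
};

using DeclareMap = std::unordered_map<const Alloca *, const DbgDeclare *>;

// Record which declare describes which alloca, skipping declares that
// have no address.
DeclareMap buildDeclareMap(const std::vector<DbgDeclare> &Declares) {
  DeclareMap Map;
  for (const DbgDeclare &D : Declares)
    if (D.Address)
      Map[D.Address] = &D;
  return Map;
}

// Look up the declare for an alloca; returns nullptr if none was recorded.
const DbgDeclare *findDeclare(const DeclareMap &Map, const Alloca *AI) {
  auto It = Map.find(AI);
  return It == Map.end() ? nullptr : It->second;
}
```

The map is built once up front, so each later lookup is a hash probe instead of peeking at neighbouring instructions.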

Comment at: lib/Transforms/Scalar/SROA.cpp:2241-2244
@@ +2240,6 @@
+ if (Var.getType().getSizeInBits() < Size*8 /* Assuming 8 Bits/Byte */) {
+ DIVariable Piece = DIB.createVariablePiece(Var, BeginOffset, Size);
+ DIB.insertDbgValueIntrinsic(V, 0, Piece, Store)
+ ->setDebugLoc(DbgDecl->getDebugLoc());
+ Pass.DeadInsts.insert(DbgDecl);
+ }
————————

I'm probably missing something, but it *looks* like this has the real possibility of taking code which has no partial info (no dbg.value calls) and making it suddenly have more info. Is that right? Is that good?

Maybe I’m misunderstanding what you said there, but as far as I understand it this:

Should we be looking for relevant dbg.value calls pertaining to the original alloca and slicing them up rather than just synthesizing our own?

is describing exactly what we are doing right now. Also note that only dbg.declare intrinsics can describe allocas.

Also, what about memset? memcpy? speculated stores around phis? I assume follow-up patches, but it might be good to document some of that.

This is not yet complete and I plan to add other cases over time!

Here is a rebased version of the patch that applies cleanly after Chandler’s movement of header files and Eric’s creation of AsmPrinterDwarf.

(sorry for delays, trying to get to this one...)

(Ditto)

Rebased on r204198.
Moved the DWARF expression output for DW_OP_pieces out of emitDebugLocEntry::emitDebugLocEntry() and into a DWARFPieceEmitter. Should be more readable this way.

Hi Eric & Chandler,

it’s been a while, and I was just thinking about what we could do to make this move forward a little faster. I understand that the more time passes, the larger the patch grows, which makes it harder to review. We could, for instance, split out the SROA changes into a separate patch, and then focus on the IR and DwarfDebug changes initially, with the type legalizer providing the necessary test cases.

cheers,
adrian

Mostly commenting on the SROA things.

lib/Transforms/Scalar/SROA.cpp
970–972	If these are actually indexed by AllocaInst*s, why not use that as the key type?
975–976	This is a pretty terrible interface... Why not just lazily create this and store it in the Module so that the Module provides a direct method to produce a TypeIdentifierMap? Also, why Optional rather than a pointer? I assume TypeIdentifierMap is some kind of smart pointer?
2107–2108	I don't think this belongs in the slice rewriter. The rewriter is about rewriting uses in the slice vector. There is a perfect place for this in the rewritePartition method right after we create the new AllocaInst.
2116–2117	This being an assert worries me. Does the verifier ensure this?
2124	This doesn't make sense to me. We may see the same DbgDecl many times? Wouldn't this become dead only when OldAI became dead?

Let's split the patch out with just the infrastructure changes needed and a testcase to use them and we'll go from there? In general, I think it's a good idea, just a big patch :)

-eric

Let's look into splitting this up and out a bit. I think being able to have complex addresses as part of a location in the debug value instruction should be fine. The multiple optional statements bother me a bit, I guess we'll need to add a null metadata in cases where you want to use this. What about turning it into a non-optional argument? Just keep a NULL for a complex address? How badly does that affect the existing tests?

Let's split the patch out with just the infrastructure changes needed and a testcase to use them and we'll go from there? In general, I think it's a good idea, just a big patch :)

I'd probably split out the "optimize" as well. Yeah, we'll emit non-optimal for a couple of patches, but as an add-on that seems like a more obvious patch.

I echo Chandler's thoughts about the type uniquing map. What's going on there?

Let's raise this back to everyone's queue again. Had some questions on the type uniquing map, but I think most of the prereqs were already reviewed and so this can be rebased?

Rebased on today's trunk. Otherwise mostly unchanged so far, I have a couple of questions regarding Chandler's comments that I'd like to clarify before the next iteration.
Thanks for all the feedback so far!

I'm curious why there are both SROA and SelDAG changes in this together - I assume the SelDAG (and any other LLVM CodeGen) changes could be separated/tested with raw IR inputs and the SROA changes should be implemented/tested as standalone IR->IR behavior, right?

So is there more separation to do in this patch, or are these changes somehow inextricably linked?

lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1285 ↗	(On Diff #12121)	Remove commented out code (& unrelated whitespace changes) - then this file shouldn't be included in the diff at all. Ideally this whole change shouldn't have any changes to core debug info APIs/IR APIs - those changes have already been committed. This change is now just about teaching certain optimizations to preserve/create the IR necessary to describe the variable location fragments, right?

dblaikie edited the test plan for this revision. Aug 1 2014, 4:27 PM

dblaikie removed a reviewer: deleted.

Added some questions/comments to Chandler's questions. Thanks!

lib/Transforms/Scalar/SROA.cpp
970–972	They could also be function arguments, not just allocas.
975–976	The main concern was memory usage. I wanted to create this only on demand when it's used and free the memory after the pass is done with the module. That said, I'm not overly attached to this model. Got rid of the optional.
2107–2108	What I'm doing here is replacing the one dbg.declare describing the old alloca with one for each of the slices. If I understand it correctly :-) it appears that rewritePartition would be too early for this because it hasn't created the slices yet.
2116–2117	The verifier (DIVariable::Verify()) cannot ensure this, because it would need access to a DITypeIdentifierMap to look up the type. There is, however, the same assertion in DwarfDebug just before the DWARF expression would be emitted. If we do make the DITypeIdentifierMap a feature of the module we could have the verifier take care of it.
2124	The goal is to replace the single DbgDecl that describes the old alloca with many smaller dbg intrinsics that describe the individual slices.

Splitting out TypeLegalizer changes into a separate review.

The type legalizer changes can be found at http://reviews.llvm.org/D4831.

And both of these patches are predicated on http://reviews.llvm.org/D3373 right?

Removed extra whitespace from patch.

dblaikie edited the test plan for this revision. Aug 20 2014, 7:46 PM

dblaikie added a reviewer: chandlerc.

dblaikie removed a subscriber: chandlerc.

Bumping this hopefully into Chandler's queue (by setting him as a reviewer, rather than just a subscriber) and some minor feedback from me on the test cases.

test/DebugInfo/X86/sroasplit-2.ll
9	Perhaps these two test cases could be named more precisely for what they do - it took me a bit of staring to see the specific difference between them (the type of the Inner::b member is different) and I still don't have any idea why that's important to test separately (what is it that's different about that case)? (if it is different and different code is needed for it, usual question applies about "is this different code that can be implemented/tested/committed separately", but perhaps it isn't) & do both cases need the two levels of indirection (a struct, with an array of other structs)? While a test case that covers a few cases in one go is sometimes nice, it'd help me at least if there was a comment explaining why the different pieces of indirection were important to the test - what parts of SROA they're exercising.

Rebased on today's trunk.

This patch teaches SROA to split the debug info for aggregate variables into pieces describing the individual scalar elements of the former aggregate.

Chandler, could you please review my responses to your previous review?

Fix an initializer for MSVC compatibility.

To finish reviewing this, it would help to have a full context patch? If that's a problem, I can patch it locally and take a look...

lib/Transforms/Scalar/SROA.cpp
959	These are of Value *s, but what Values? The loads? The allocas? Something else? Essentially, I think this comment needs to indicate what the synthetic "use" edge looks like for the purpose of SROA.
2117–2118	The order here seems a bit odd. I would group the references separately from the builders.... Also, nit-picky issue, clang-format please. =]
3724	Bad indentation...

Rebased on trunk and addressed Chandler's feedback.

Thanks for the full context diff, it did help. =D

lib/Transforms/Scalar/SROA.cpp
2112	You don't need this -- the SROA pass object is available directly as "Pass".
2166–2177	Actually, why do this here at all? This doesn't seem to really be related to rewriting the uses in the traditional sense. Notably, it seems excessive if OldAI == NewAI (which happens in some cases). I think the better place to put this logic is inside the rewritePartition method of the SROA pass itself right after we create the new alloca.
3713–3714	This should be a dyn_cast to AllocaInst and a re-use. Also, getAddress can return null which seems like it would crash here. Unless the verifier rejects such a debug declare inst (and if it does, it would be nice to have an "unsafe" version of the method that returns null for the verifier and have this method assert for us that the result is non-null) this should handle the null case gracefully. Somewhat separately (and apologies if this just is my ignorance of debug info) what cases would you expect where the address isn't directly an alloca inst?

aprantl added inline comments. Dec 18 2014, 1:42 PM

lib/Transforms/Scalar/SROA.cpp
2112	Thanks!
2166–2177	Done.
3713–3714	See the first testcase:
define i32 @foo(%struct.Outer* byval align 8 %outer) #0 {
  call void @llvm.dbg.declare(metadata %struct.Outer* %outer, metadata !25, metadata !2), !dbg !26
It could be an argument.

Addressed Chandler's feedback / moved the debug intrinsic generation right after we create the new alloca.

Changed isa/cast to dyn_cast.

aprantl added inline comments. Dec 18 2014, 1:52 PM

lib/Transforms/Scalar/SROA.cpp
3713–3714	No, the verifier does not reject it, although an UndefValue would be preferred over a nullptr. I'll change it to a dyn_cast_or_null.

s/dyn_cast/dyn_cast_or_null/

The code is looking quite good at this point. I'm mostly looking for some cleanups / clarifications of things I don't understand.

lib/Transforms/Scalar/SROA.cpp
3245–3247	This predicate is somewhat subtle. You're creating a piece expression any time the alloc size isn't the same, but you're using the slice size rather than the alloc size to create the piece.... Can you add a comment here? If I came to this code later, I wouldn't understand why the different sizes used were significant.
3717	Is there a particular cost to constructing DIBuilder objects? I ask because, were this an IRBuilder, I would have said to just construct it directly in each place we need it, because there is no real cost associated with it. If the same is true for DIBuilder, then I'd suggest the same change. If DIBuilder works differently, I'd like to know how to better use it. =]
test/DebugInfo/X86/sroasplit-1.ll
4	Honestly, I really dislike seeing rdar links even in tests. They add no information for any reader not at Apple, and so you end up needing to explain all of the context of the test anyways. PR links are at least a reasonable thing for your average contributor to go read.

aprantl added inline comments. Dec 22 2014, 12:28 PM

lib/Transforms/Scalar/SROA.cpp
3245–3247	I changed the condition and comment to this, which should be clearer:
// Create a piece expression describing the slice, if the new slice is
// smaller than the old alloca or the old alloca already was described
// with a piece. It would be even better to just compare against the size
// of the type described in the debug info, but then we would need to
// build an expensive DIRefMap.
if (SliceSize < DL->getTypeAllocSize(AI.getAllocatedType()) ||
    DbgDeclares[AI].getExpression().isVariablePiece())
  Piece = DIB->createPieceExpression(BeginOffset, SliceSize);
3717	Compared to an IRBuilder it looks as if DIBuilder has to initialize quite a few members (that will never be used). But I'm unsure whether there actually is a cost to initializing DenseMaps and SmallVectors? Otherwise I'd be happy to instantiate the DIBuilder on the fly. Alternatively we could factor out a less capable DIBuilder base class that does not have all these members.
Module &M;
LLVMContext &VMContext;
MDNode *TempEnumTypes;
MDNode *TempRetainTypes;
MDNode *TempSubprograms;
MDNode *TempGVs;
MDNode *TempImportedModules;
Function *DeclareFn; // llvm.dbg.declare
Function *ValueFn; // llvm.dbg.value
SmallVector<Metadata *, 4> AllEnumTypes;
/// Track the RetainTypes, since they can be updated later on.
SmallVector<TrackingMDNodeRef, 4> AllRetainTypes;
SmallVector<Metadata *, 4> AllSubprograms;
SmallVector<Metadata *, 4> AllGVs;
SmallVector<TrackingMDNodeRef, 4> AllImportedModules;
/// \brief Track nodes that may be unresolved.
SmallVector<TrackingMDNodeRef, 4> UnresolvedNodes;
bool AllowUnresolvedNodes;
/// Each subprogram's preserved local variables.
DenseMap<MDNode *, std::vector<TrackingMDNodeRef>> PreservedVariables;
test/DebugInfo/X86/sroasplit-1.ll
4	I can understand that. Removed.
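The condition quoted in the 3245–3247 reply above boils down to a two-part predicate. The sketch below restates it with plain integers so it can be read without the SROA context; `needsPieceExpression` and its parameter names are hypothetical, and the sizes are assumed to be in bytes.

```cpp
#include <cassert>
#include <cstdint>

// A slice needs a piece expression if it covers less than the old alloca,
// or if the old alloca was itself already described by a piece (so the
// new description must stay relative to the whole variable).
bool needsPieceExpression(uint64_t SliceSize, uint64_t OldAllocSize,
                          bool OldIsPiece) {
  return SliceSize < OldAllocSize || OldIsPiece;
}
```

The second operand is what keeps iterative splitting correct: a slice that covers all of an alloca which was already a piece still has to be emitted as a piece.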

Address previous round of feedback.

LGTM with a minor change below, thanks for all the work explaining things. =]

lib/Transforms/Scalar/SROA.cpp
3717	Interesting. I suggest you just build them on the fly here, and if these members ever end up being a problem, we can go in and avoid their cost much as you describe. No need to do that until we see a problem, but it seems nicer than having a heap-allocated builder.

Thanks for the review so far!
This

Initializes DIBuilder on the fly.
Improves the DebugInfo/X86/array2.ll testcase based on the new capabilities
Fixes a compile error in the above-cited expression.

Closing. Thanks everyone!!

This revision is now accepted and ready to land. Dec 22 2014, 2:23 PM

aprantl closed this revision. Aug 18 2015, 10:41 AM

Revision Contents

Path	Size
lib/Transforms/Scalar/SROA.cpp	34 lines
test/DebugInfo/X86/array2.ll	8 lines
test/DebugInfo/X86/sroasplit-1.ll	97 lines
test/DebugInfo/X86/sroasplit-2.ll	102 lines

Diff 17571

lib/Transforms/Scalar/SROA.cpp

[417 lines not shown]
if (SROAStrictInbounds && GEPI.isInBounds()) {

  // Handle a struct index, which adds its field offset to the pointer.
  if (StructType *STy = dyn_cast<StructType>(*GTI)) {
    unsigned ElementIdx = OpC->getZExtValue();
    const StructLayout *SL = DL.getStructLayout(STy);
    GEPOffset +=
        APInt(Offset.getBitWidth(), SL->getElementOffset(ElementIdx));
  } else {
    // For array or vector indices, scale the index by the size of the
    // type.
    APInt Index = OpC->getValue().sextOrTrunc(Offset.getBitWidth());
    GEPOffset += Index * APInt(Offset.getBitWidth(),
                               DL.getTypeAllocSize(GTI.getIndexedType()));
  }

  // If this index has computed an intermediate pointer which is not
  // inbounds, then the result of the GEP is a poison value and we can
  // delete it and all uses.
  if (GEPOffset.ugt(AllocSize))
    return markAsDead(GEPI);
[244 lines not shown]
if (!Size) {
  // This is a new PHI/Select, check for an unsafe use of it.
  if (Instruction *UnsafeI = hasUnsafePHIOrSelectUse(&I, Size))
    return PI.setAborted(UnsafeI);
}

// For PHI and select operands outside the alloca, we can't nuke the entire
// phi or select -- the other side might still be relevant, so we special
// case them here and use a separate structure to track the operands
// themselves which should be replaced with undef.
// FIXME: This should instead be escaped in the event we're instrumenting
// for address sanitization.
if (Offset.uge(AllocSize)) {
  AS.DeadOperands.push_back(U);
  return;
}

insertUse(I, Offset, Size);
▲ Show 20 Lines • Show All 240 Lines • ▼ Show 20 Lines	class SROA : public FunctionPass {
std::vector<AllocaInst *> PromotableAllocas;		std::vector<AllocaInst *> PromotableAllocas;

/// \brief A worklist of PHIs to speculate prior to promoting allocas.		/// \brief A worklist of PHIs to speculate prior to promoting allocas.
///		///
/// All of these PHIs have been checked for the safety of speculation and by		/// All of these PHIs have been checked for the safety of speculation and by
/// being speculated will allow promoting allocas currently in the promotable		/// being speculated will allow promoting allocas currently in the promotable
/// queue.		/// queue.
SetVector<PHINode , SmallVector<PHINode , 2>> SpeculatablePHIs;		SetVector<PHINode , SmallVector<PHINode , 2>> SpeculatablePHIs;

		echristoUnsubmitted Not Done Reply Inline Actions Instead of passing all of these down it should be possible to construct one. echristo: Instead of passing all of these down it should be possible to construct one.
/// \brief A worklist of select instructions to speculate prior to promoting		/// \brief A worklist of select instructions to speculate prior to promoting
/// allocas.		/// allocas.
///		///
/// All of these select instructions have been checked for the safety of		/// All of these select instructions have been checked for the safety of
/// speculation and by being speculated will allow promoting allocas		/// speculation and by being speculated will allow promoting allocas
/// currently in the promotable queue.		/// currently in the promotable queue.
SetVector<SelectInst , SmallVector<SelectInst , 2>> SpeculatableSelects;		SetVector<SelectInst , SmallVector<SelectInst , 2>> SpeculatableSelects;

		/// Debug intrinsics do not show up like regular uses in the
		/// IR. This side-table holds the missing use edges.
		DenseMap<AllocaInst , DbgDeclareInst > DbgDeclares;

		chandlercUnsubmitted Not Done Reply Inline Actions These are of Value s, but what Values? The loads? The allocas? Something else? Essentially, I think this comment needs to indicate what the synthetic "use" edge looks like for the purpose of SROA. chandlerc:* These are of Value *s, but what Values? The loads? The allocas? Something else? Essentially, I…
public:		public:
SROA(bool RequiresDomTree = true)		SROA(bool RequiresDomTree = true)
: FunctionPass(ID), RequiresDomTree(RequiresDomTree), C(nullptr),		: FunctionPass(ID), RequiresDomTree(RequiresDomTree), C(nullptr),
DL(nullptr), DT(nullptr) {		DL(nullptr), DT(nullptr) {
initializeSROAPass(*PassRegistry::getPassRegistry());		initializeSROAPass(*PassRegistry::getPassRegistry());
}		}
bool runOnFunction(Function &F) override;		bool runOnFunction(Function &F) override;
void getAnalysisUsage(AnalysisUsage &AU) const override;		void getAnalysisUsage(AnalysisUsage &AU) const override;

const char *getPassName() const override { return "SROA"; }		const char *getPassName() const override { return "SROA"; }
static char ID;		static char ID;

private:		private:
		chandlercUnsubmitted Not Done Reply Inline Actions If these are actually indexed by AllocaInsts, why not use that as the key type? chandlerc:* If these are actually indexed by AllocaInst*s, why not use that as the key type?
		aprantlAuthorUnsubmitted Not Done Reply Inline Actions They could also be function arguments, not just allocas. aprantl: They could also be function arguments, not just allocas.
friend class PHIOrSelectSpeculator;		friend class PHIOrSelectSpeculator;
friend class AllocaSliceRewriter;		friend class AllocaSliceRewriter;

bool rewritePartition(AllocaInst &AI, AllocaSlices &AS,		bool rewritePartition(AllocaInst &AI, AllocaSlices &AS,
		chandlerc: This is a pretty terrible interface... Why not just lazily create this and store it in the Module so that the Module provides a direct method to produce a TypeIdentifierMap? Also, why Optional rather than a pointer? I assume TypeIdentifierMap is some kind of smart pointer?
		aprantl: The main concern was memory usage. I wanted to create this only on demand, when it's used, and free the memory after the pass is done with the module. That said, I'm not overly attached to this model. Got rid of the Optional.
AllocaSlices::iterator B, AllocaSlices::iterator E,		AllocaSlices::iterator B, AllocaSlices::iterator E,
int64_t BeginOffset, int64_t EndOffset,		int64_t BeginOffset, int64_t EndOffset,
ArrayRef<AllocaSlices::iterator> SplitUses);		ArrayRef<AllocaSlices::iterator> SplitUses);
bool splitAlloca(AllocaInst &AI, AllocaSlices &AS);		bool splitAlloca(AllocaInst &AI, AllocaSlices &AS);
bool runOnAlloca(AllocaInst &AI);		bool runOnAlloca(AllocaInst &AI);
void clobberUse(Use &U);		void clobberUse(Use &U);
void deleteDeadInstructions(SmallPtrSetImpl<AllocaInst *> &DeletedAllocas);		void deleteDeadInstructions(SmallPtrSetImpl<AllocaInst *> &DeletedAllocas);
bool promoteAllocas(Function &F);		bool promoteAllocas(Function &F);
(1,114 lines not shown)	class AllocaSliceRewriter : public InstVisitor<AllocaSliceRewriter, bool> {
bool IsSplit;		bool IsSplit;
Use *OldUse;		Use *OldUse;
Instruction *OldPtr;		Instruction *OldPtr;

// Track post-rewrite users which are PHI nodes and Selects.		// Track post-rewrite users which are PHI nodes and Selects.
SmallPtrSetImpl<PHINode *> &PHIUsers;		SmallPtrSetImpl<PHINode *> &PHIUsers;
SmallPtrSetImpl<SelectInst *> &SelectUsers;		SmallPtrSetImpl<SelectInst *> &SelectUsers;

// Utility IR builder, whose name prefix is setup for each visited use, and		// Utility IR builder, whose name prefix is setup for each visited use, and
// the insertion point is set to point to the user.		// the insertion point is set to point to the user.
		chandlerc: I don't think this belongs in the slice rewriter. The rewriter is about rewriting uses in the slice vector. There is a perfect place for this in the rewritePartition method right after we create the new AllocaInst.
		aprantl: What I'm doing here is replacing the one dbg.declare describing the old alloca with one for each of the slices. If I understand it correctly :-) rewritePartition would be too early for this, because it hasn't created the slices yet.
IRBuilderTy IRB;		IRBuilderTy IRB;

public:		public:
AllocaSliceRewriter(const DataLayout &DL, AllocaSlices &AS, SROA &Pass,		AllocaSliceRewriter(const DataLayout &DL, AllocaSlices &AS, SROA &Pass,
		chandlerc: You don't need this -- the SROA pass object is available directly as "Pass".
		aprantl: Thanks!
AllocaInst &OldAI, AllocaInst &NewAI,		AllocaInst &OldAI, AllocaInst &NewAI,
uint64_t NewAllocaBeginOffset,		uint64_t NewAllocaBeginOffset,
uint64_t NewAllocaEndOffset, bool IsIntegerPromotable,		uint64_t NewAllocaEndOffset, bool IsIntegerPromotable,
VectorType *PromotableVecTy,		VectorType *PromotableVecTy,
SmallPtrSetImpl<PHINode *> &PHIUsers,		SmallPtrSetImpl<PHINode *> &PHIUsers,
		chandlerc: This being an assert worries me. Does the verifier ensure this?
		aprantl: The verifier (DIVariable::Verify()) cannot ensure this, because it would need access to a DITypeIdentifierMap to look up the type. There is, however, the same assertion in DwarfDebug just before the DWARF expression would be emitted. If we do make the DITypeIdentifierMap a feature of the module, we could have the verifier take care of it.
SmallPtrSetImpl<SelectInst *> &SelectUsers)		SmallPtrSetImpl<SelectInst *> &SelectUsers)
		chandlerc: The order here seems a bit odd. I would group the references separately from the builders.... Also, nit-picky issue, clang-format please. =]
: DL(DL), AS(AS), Pass(Pass), OldAI(OldAI), NewAI(NewAI),		: DL(DL), AS(AS), Pass(Pass), OldAI(OldAI), NewAI(NewAI),
NewAllocaBeginOffset(NewAllocaBeginOffset),		NewAllocaBeginOffset(NewAllocaBeginOffset),
NewAllocaEndOffset(NewAllocaEndOffset),		NewAllocaEndOffset(NewAllocaEndOffset),
NewAllocaTy(NewAI.getAllocatedType()),		NewAllocaTy(NewAI.getAllocatedType()),
IntTy(IsIntegerPromotable		IntTy(IsIntegerPromotable
? Type::getIntNTy(		? Type::getIntNTy(
		chandlerc: This doesn't make sense to me. We may see the same DbgDecl many times? Wouldn't this become dead only when OldAI became dead?
		aprantl: The goal is to replace the single DbgDecl that describes the old alloca with many smaller dbg intrinsics that describe the individual slices.
NewAI.getContext(),		NewAI.getContext(),
DL.getTypeSizeInBits(NewAI.getAllocatedType()))		DL.getTypeSizeInBits(NewAI.getAllocatedType()))
: nullptr),		: nullptr),
VecTy(PromotableVecTy),		VecTy(PromotableVecTy),
ElementTy(VecTy ? VecTy->getElementType() : nullptr),		ElementTy(VecTy ? VecTy->getElementType() : nullptr),
ElementSize(VecTy ? DL.getTypeSizeInBits(ElementTy) / 8 : 0),		ElementSize(VecTy ? DL.getTypeSizeInBits(ElementTy) / 8 : 0),
BeginOffset(), EndOffset(), IsSplittable(), IsSplit(), OldUse(),		BeginOffset(), EndOffset(), IsSplittable(), IsSplit(), OldUse(),
OldPtr(), PHIUsers(PHIUsers), SelectUsers(SelectUsers),		OldPtr(), PHIUsers(PHIUsers), SelectUsers(SelectUsers),
(25 lines not shown)	bool visit(AllocaSlices::const_iterator I) {
OldUse = I->getUse();		OldUse = I->getUse();
OldPtr = cast<Instruction>(OldUse->get());		OldPtr = cast<Instruction>(OldUse->get());

Instruction *OldUserI = cast<Instruction>(OldUse->getUser());		Instruction *OldUserI = cast<Instruction>(OldUse->getUser());
IRB.SetInsertPoint(OldUserI);		IRB.SetInsertPoint(OldUserI);
IRB.SetCurrentDebugLocation(OldUserI->getDebugLoc());		IRB.SetCurrentDebugLocation(OldUserI->getDebugLoc());
IRB.SetNamePrefix(Twine(NewAI.getName()) + "." + Twine(BeginOffset) + ".");		IRB.SetNamePrefix(Twine(NewAI.getName()) + "." + Twine(BeginOffset) + ".");

CanSROA &= visit(cast<Instruction>(OldUse->getUser()));		CanSROA &= visit(cast<Instruction>(OldUse->getUser()));
if (VecTy || IntTy)		if (VecTy || IntTy)
assert(CanSROA);		assert(CanSROA);
return CanSROA;		return CanSROA;
}		}

private:		private:
// Make sure the other visit overloads are visible.		// Make sure the other visit overloads are visible.
using Base::visit;		using Base::visit;

// Every instruction which can end up as a user must have a rewrite rule.		// Every instruction which can end up as a user must have a rewrite rule.
bool visitInstruction(Instruction &I) {		bool visitInstruction(Instruction &I) {
		chandlerc: Actually, why do this here at all? This doesn't seem to really be related to rewriting the uses in the traditional sense. Notably, it seems excessive if OldAI == NewAI (which happens in some cases). I think the better place to put this logic is inside the rewritePartition method of the SROA pass itself right after we create the new alloca.
		aprantl: Done.
DEBUG(dbgs() << " !!!! Cannot rewrite: " << I << "\n");		DEBUG(dbgs() << " !!!! Cannot rewrite: " << I << "\n");
llvm_unreachable("No rewrite rule for this instruction!");		llvm_unreachable("No rewrite rule for this instruction!");
}		}

Value *getNewAllocaSlicePtr(IRBuilderTy &IRB, Type *PointerTy) {		Value *getNewAllocaSlicePtr(IRBuilderTy &IRB, Type *PointerTy) {
// Note that the offset computation can use BeginOffset or NewBeginOffset		// Note that the offset computation can use BeginOffset or NewBeginOffset
// interchangeably for unsplit slices.		// interchangeably for unsplit slices.
assert(IsSplit || BeginOffset == NewBeginOffset);		assert(IsSplit || BeginOffset == NewBeginOffset);
(42 lines not shown)	unsigned getSliceAlign(Type *Ty = nullptr) {
unsigned Align =		unsigned Align =
MinAlign(NewAIAlign, NewBeginOffset - NewAllocaBeginOffset);		MinAlign(NewAIAlign, NewBeginOffset - NewAllocaBeginOffset);
return (Ty && Align == DL.getABITypeAlignment(Ty)) ? 0 : Align;		return (Ty && Align == DL.getABITypeAlignment(Ty)) ? 0 : Align;
}		}

unsigned getIndex(uint64_t Offset) {		unsigned getIndex(uint64_t Offset) {
assert(VecTy && "Can only call getIndex when rewriting a vector");		assert(VecTy && "Can only call getIndex when rewriting a vector");
uint64_t RelOffset = Offset - NewAllocaBeginOffset;		uint64_t RelOffset = Offset - NewAllocaBeginOffset;
assert(RelOffset / ElementSize < UINT32_MAX && "Index out of bounds");		assert(RelOffset / ElementSize < UINT32_MAX && "Index out of bounds");
		chandlerc: What happens if the original store was already a split? This will happen a lot with SROA when we iterate on the same alloca. Also, what about when this isn't a split at all? I feel like the first patch should just migrate dbg.declares from old allocas to new allocas when we're rewriting uses but not re-slicing anything.
uint32_t Index = RelOffset / ElementSize;		uint32_t Index = RelOffset / ElementSize;
assert(Index * ElementSize == RelOffset);		assert(Index * ElementSize == RelOffset);
return Index;		return Index;
		chandlerc: Note that you would want SliceSize here, as if we split up the store itself, BeginOffset and EndOffset won't reflect that split. You should add a test case for this as well -- an example would be storing the low and high parts of an i64 separately, as well as storing the whole thing.
}		}
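For readers skimming the diff context: the getIndex() helper above turns a byte offset within the old alloca into an element index of the promoted vector. A minimal standalone sketch of that arithmetic follows. The free function vectorIndexForOffset and its parameter list are hypothetical; only the computation mirrors the lines shown above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone version of getIndex(): maps a byte offset inside
// the original alloca to an element index of the promoted vector.
uint32_t vectorIndexForOffset(uint64_t Offset, uint64_t NewAllocaBeginOffset,
                              uint64_t ElementSize) {
  assert(ElementSize != 0 && "Element size must be non-zero");
  assert(Offset >= NewAllocaBeginOffset && "Offset precedes the new alloca");
  uint64_t RelOffset = Offset - NewAllocaBeginOffset;
  assert(RelOffset / ElementSize < UINT32_MAX && "Index out of bounds");
  uint32_t Index = static_cast<uint32_t>(RelOffset / ElementSize);
  // The offset must land exactly on an element boundary.
  assert(static_cast<uint64_t>(Index) * ElementSize == RelOffset);
  return Index;
}
```

For example, with a new alloca starting at byte 16 and 4-byte elements, byte offset 24 maps to element 2.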

void deleteIfTriviallyDead(Value *V) {		void deleteIfTriviallyDead(Value *V) {
Instruction *I = cast<Instruction>(V);		Instruction *I = cast<Instruction>(V);
if (isInstructionTriviallyDead(I))		if (isInstructionTriviallyDead(I))
		chandlerc: I'm probably missing something, but it looks like this has the real possibility of taking code which has no partial info (no dbg.value calls) and making it suddenly have more info. Is that right? Is that good? Should we be looking for relevant dbg.value calls pertaining to the original alloca and slicing them up rather than just synthesizing our own? Also, what about memset? memcpy? Speculated stores around phis? I assume follow-up patches, but it might be good to document some of that.
Pass.DeadInsts.insert(I);		Pass.DeadInsts.insert(I);
}		}

Value *rewriteVectorizedLoadInst() {		Value *rewriteVectorizedLoadInst() {
unsigned BeginIndex = getIndex(NewBeginOffset);		unsigned BeginIndex = getIndex(NewBeginOffset);
unsigned EndIndex = getIndex(NewEndOffset);		unsigned EndIndex = getIndex(NewEndOffset);
assert(EndIndex > BeginIndex && "Empty vector!");		assert(EndIndex > BeginIndex && "Empty vector!");

(912 lines not shown)
/// This routine drives both of the rewriting goals of the SROA pass. It tries		/// This routine drives both of the rewriting goals of the SROA pass. It tries
/// to rewrite uses of an alloca partition to be conducive for SSA value		/// to rewrite uses of an alloca partition to be conducive for SSA value
/// promotion. If the partition needs a new, more refined alloca, this will		/// promotion. If the partition needs a new, more refined alloca, this will
/// build that new alloca, preserving as much type information as possible, and		/// build that new alloca, preserving as much type information as possible, and
/// rewrite the uses of the old alloca to point at the new one and have the		/// rewrite the uses of the old alloca to point at the new one and have the
/// appropriate new offsets. It also evaluates how successful the rewrite was		/// appropriate new offsets. It also evaluates how successful the rewrite was
/// at enabling promotion and if it was successful queues the alloca to be		/// at enabling promotion and if it was successful queues the alloca to be
/// promoted.		/// promoted.
bool SROA::rewritePartition(AllocaInst &AI, AllocaSlices &AS,		bool SROA::rewritePartition(AllocaInst &AI, AllocaSlices &AS,
		chandlerc: Why is it important to thread a single DIBuilder through all of these levels? Why not just create the DIBuilder in the AllocaSliceRewriter? Or if it is expensive to construct, create it as a member?
AllocaSlices::iterator B, AllocaSlices::iterator E,		AllocaSlices::iterator B, AllocaSlices::iterator E,
int64_t BeginOffset, int64_t EndOffset,		int64_t BeginOffset, int64_t EndOffset,
ArrayRef<AllocaSlices::iterator> SplitUses) {		ArrayRef<AllocaSlices::iterator> SplitUses) {
assert(BeginOffset < EndOffset);		assert(BeginOffset < EndOffset);
uint64_t SliceSize = EndOffset - BeginOffset;		uint64_t SliceSize = EndOffset - BeginOffset;

// Try to compute a friendly type for this partition of the alloca. This		// Try to compute a friendly type for this partition of the alloca. This
// won't always succeed, in which case we fall back to a legal integer type		// won't always succeed, in which case we fall back to a legal integer type
(47 lines not shown)	if (SliceTy == AI.getAllocatedType()) {
// If we will get at least this much alignment from the type alone, leave		// If we will get at least this much alignment from the type alone, leave
// the alloca's alignment unconstrained.		// the alloca's alignment unconstrained.
if (Alignment <= DL->getABITypeAlignment(SliceTy))		if (Alignment <= DL->getABITypeAlignment(SliceTy))
Alignment = 0;		Alignment = 0;
NewAI =		NewAI =
new AllocaInst(SliceTy, nullptr, Alignment,		new AllocaInst(SliceTy, nullptr, Alignment,
AI.getName() + ".sroa." + Twine(B - AS.begin()), &AI);		AI.getName() + ".sroa." + Twine(B - AS.begin()), &AI);
++NumNewAllocas;		++NumNewAllocas;

		// Migrate debug information from the old alloca to the new alloca
		// and the individual slices.
		if (DbgDeclareInst *DbgDecl = DbgDeclares.lookup(&AI)) {
		DIVariable Var(DbgDecl->getVariable());
		DIExpression Piece;
		DIBuilder DIB(*AI.getParent()->getParent()->getParent(),
		/*AllowUnresolved*/ false);
		// Create a piece expression describing the slice, if the new slice is
		// smaller than the old alloca or the old alloca already was described
		// with a piece. It would be even better to just compare against the size
		chandlerc: This predicate is somewhat subtle. You're creating a piece expression any time the alloc size isn't the same, but you're using the slice size rather than the alloc size to create the piece.... Can you add a comment here? If I came to this code later, I wouldn't understand why the different sizes used were significant.
		aprantl: I changed the condition and comment to this, which should be clearer:
		    // Create a piece expression describing the slice, if the new slice is
		    // smaller than the old alloca or the old alloca already was described
		    // with a piece. It would be even better to just compare against the size
		    // of the type described in the debug info, but then we would need to
		    // build an expensive DIRefMap.
		    if (SliceSize < DL->getTypeAllocSize(AI.getAllocatedType()) ||
		        DbgDeclares[AI].getExpression().isVariablePiece())
		      Piece = DIB->createPieceExpression(BeginOffset, SliceSize);
		// of the type described in the debug info, but then we would need to
		// build an expensive DIRefMap.
		if (SliceSize < DL->getTypeAllocSize(AI.getAllocatedType()) ||
		DIExpression(DbgDecl->getExpression()).isVariablePiece())
		Piece = DIB.createPieceExpression(BeginOffset, SliceSize);
		Instruction *NewDDI = DIB.insertDeclare(NewAI, Var, Piece, &AI);
		NewDDI->setDebugLoc(DbgDecl->getDebugLoc());
		DbgDeclares.insert(std::make_pair(NewAI, cast<DbgDeclareInst>(NewDDI)));
		DeadInsts.insert(DbgDecl);
		}
}		}
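The predicate discussed in the comments above can be modeled without any LLVM types. The sketch below is only an illustration under stated assumptions: SlicePiece and pieceForSlice are hypothetical names, sizes are in bytes as in the patch, and OldIsPiece stands in for the isVariablePiece() query on the old dbg.declare's expression.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the (offset, size) pair of a piece expression.
// IsPiece == false models an empty expression (the whole variable).
struct SlicePiece {
  bool IsPiece;
  uint64_t Offset;
  uint64_t Size;
};

// Decide whether the new alloca for [BeginOffset, EndOffset) needs a piece
// expression: it does if the slice is smaller than the old alloca, or if the
// old alloca was itself already described by a piece.
SlicePiece pieceForSlice(uint64_t BeginOffset, uint64_t EndOffset,
                         uint64_t AllocSize, bool OldIsPiece) {
  assert(BeginOffset < EndOffset);
  uint64_t SliceSize = EndOffset - BeginOffset;
  if (SliceSize < AllocSize || OldIsPiece)
    return SlicePiece{true, BeginOffset, SliceSize};
  return SlicePiece{false, 0, 0};
}
```

A slice covering the whole of an unfragmented alloca keeps the plain variable description; anything smaller, or any re-slicing of an existing piece, gets an explicit offset and size.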

DEBUG(dbgs() << "Rewriting alloca partition "		DEBUG(dbgs() << "Rewriting alloca partition "
<< "[" << BeginOffset << "," << EndOffset << ") to: " << *NewAI		<< "[" << BeginOffset << "," << EndOffset << ") to: " << *NewAI
<< "\n");		<< "\n");

// Track the high watermark on the worklist as it is only relevant for		// Track the high watermark on the worklist as it is only relevant for
// promoted allocas. We will reset it to this point if the alloca is not in		// promoted allocas. We will reset it to this point if the alloca is not in
(355 lines not shown)	if (DT && !ForceSSAUpdater) {
DEBUG(dbgs() << "Promoting allocas with mem2reg...\n");		DEBUG(dbgs() << "Promoting allocas with mem2reg...\n");
PromoteMemToReg(PromotableAllocas, *DT, nullptr, AT);		PromoteMemToReg(PromotableAllocas, *DT, nullptr, AT);
PromotableAllocas.clear();		PromotableAllocas.clear();
return true;		return true;
}		}

DEBUG(dbgs() << "Promoting allocas with SSAUpdater...\n");		DEBUG(dbgs() << "Promoting allocas with SSAUpdater...\n");
SSAUpdater SSA;		SSAUpdater SSA;
DIBuilder DIB(*F.getParent(), /*AllowUnresolved*/ false);
SmallVector<Instruction *, 64> Insts;		SmallVector<Instruction *, 64> Insts;

// We need a worklist to walk the uses of each alloca.		// We need a worklist to walk the uses of each alloca.
SmallVector<Instruction *, 8> Worklist;		SmallVector<Instruction *, 8> Worklist;
SmallPtrSet<Instruction *, 8> Visited;		SmallPtrSet<Instruction *, 8> Visited;
SmallVector<Instruction *, 32> DeadInsts;		SmallVector<Instruction *, 32> DeadInsts;

		DIBuilder DIB(*F.getParent(), /*AllowUnresolved*/ false);

for (unsigned Idx = 0, Size = PromotableAllocas.size(); Idx != Size; ++Idx) {		for (unsigned Idx = 0, Size = PromotableAllocas.size(); Idx != Size; ++Idx) {
AllocaInst *AI = PromotableAllocas[Idx];		AllocaInst *AI = PromotableAllocas[Idx];
Insts.clear();		Insts.clear();
Worklist.clear();		Worklist.clear();
Visited.clear();		Visited.clear();

enqueueUsersInWorklist(*AI, Worklist, Visited);		enqueueUsersInWorklist(*AI, Worklist, Visited);

(55 lines not shown)	bool SROA::runOnFunction(Function &F) {
DL = &DLP->getDataLayout();		DL = &DLP->getDataLayout();
DominatorTreeWrapperPass *DTWP =		DominatorTreeWrapperPass *DTWP =
getAnalysisIfAvailable<DominatorTreeWrapperPass>();		getAnalysisIfAvailable<DominatorTreeWrapperPass>();
DT = DTWP ? &DTWP->getDomTree() : nullptr;		DT = DTWP ? &DTWP->getDomTree() : nullptr;
AT = &getAnalysis<AssumptionTracker>();		AT = &getAnalysis<AssumptionTracker>();

BasicBlock &EntryBB = F.getEntryBlock();		BasicBlock &EntryBB = F.getEntryBlock();
for (BasicBlock::iterator I = EntryBB.begin(), E = std::prev(EntryBB.end());		for (BasicBlock::iterator I = EntryBB.begin(), E = std::prev(EntryBB.end());
I != E; ++I)		I != E; ++I) {
if (AllocaInst *AI = dyn_cast<AllocaInst>(I))		if (AllocaInst *AI = dyn_cast<AllocaInst>(I))
Worklist.insert(AI);		Worklist.insert(AI);
		else if (DbgDeclareInst *DDI = dyn_cast<DbgDeclareInst>(I))
		if (auto AI = dyn_cast_or_null<AllocaInst>(DDI->getAddress()))
		DbgDeclares.insert(std::make_pair(AI, DDI));
		chandlerc: This should be a dyn_cast to AllocaInst and a re-use. Also, getAddress can return null, which seems like it would crash here. Unless the verifier rejects such a debug declare inst (and if it does, it would be nice to have an "unsafe" version of the method that returns null for the verifier and have this method assert for us that the result is non-null), this should handle the null case gracefully. Somewhat separately (and apologies if this is just my ignorance of debug info), what cases would you expect where the address isn't directly an alloca inst?
		aprantl: See the first testcase; the address could be an argument:
		    define i32 @foo(%struct.Outer* byval align 8 %outer) #0 {
		    call void @llvm.dbg.declare(metadata %struct.Outer* %outer, metadata !25, metadata !2), !dbg !26
		aprantl: No, the verifier does not reject it, although an UndefValue would be preferred over a nullptr. I'll change it to a dyn_cast_or_null.
		}
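The side-table population above, including the null-address case raised in the review, can be sketched with plain C++ types. Everything here is a simplified stand-in, not LLVM's API: Value, Alloca, Argument, DbgDeclare and collectDeclares are hypothetical, and dynamic_cast plays the role of dyn_cast_or_null because it yields null for a null operand.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins for llvm::Value and its subclasses.
struct Value { virtual ~Value() = default; };
struct Alloca : Value {};
struct Argument : Value {};

// A debug declare whose described address may be an alloca, an argument,
// or null.
struct DbgDeclare { Value *Address; };

using DeclareTable = std::unordered_map<const Alloca *, const DbgDeclare *>;

// Record only the declares that describe an alloca. The returned table holds
// pointers into Declares, so it must not outlive that vector.
DeclareTable collectDeclares(const std::vector<DbgDeclare> &Declares) {
  DeclareTable Table;
  for (const DbgDeclare &DDI : Declares) {
    // Null addresses and non-alloca addresses both produce a null result.
    if (auto *AI = dynamic_cast<const Alloca *>(DDI.Address))
      Table.insert({AI, &DDI}); // First declare for an alloca wins.
  }
  return Table;
}
```

Declares whose address is a function argument, such as the byval %outer in the first testcase, simply never enter the table in this model.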

bool Changed = false;		bool Changed = false;
		chandlerc: Is there a particular cost to constructing DIBuilder objects? I ask because, were this an IRBuilder, I would have said to just construct it directly in each place we need it, because there is no real cost associated with it. If the same is true for DIBuilder, then I'd suggest the same change. If DIBuilder works differently, I'd like to know how to better use it. =]
		aprantl: Compared to an IRBuilder it looks as if DIBuilder has to initialize quite a few members (that will never be used). But I'm unsure whether there actually is a cost to initializing DenseMaps and SmallVectors. Otherwise I'd be happy to instantiate the DIBuilder on the fly. Alternatively we could factor out a less capable DIBuilder base class that does not have all these members:
		    Module &M;
		    LLVMContext &VMContext;
		    MDNode *TempEnumTypes;
		    MDNode *TempRetainTypes;
		    MDNode *TempSubprograms;
		    MDNode *TempGVs;
		    MDNode *TempImportedModules;
		    Function *DeclareFn; // llvm.dbg.declare
		    Function *ValueFn; // llvm.dbg.value
		    SmallVector<Metadata *, 4> AllEnumTypes;
		    /// Track the RetainTypes, since they can be updated later on.
		    SmallVector<TrackingMDNodeRef, 4> AllRetainTypes;
		    SmallVector<Metadata *, 4> AllSubprograms;
		    SmallVector<Metadata *, 4> AllGVs;
		    SmallVector<TrackingMDNodeRef, 4> AllImportedModules;
		    /// \brief Track nodes that may be unresolved.
		    SmallVector<TrackingMDNodeRef, 4> UnresolvedNodes;
		    bool AllowUnresolvedNodes;
		    /// Each subprogram's preserved local variables.
		    DenseMap<MDNode *, std::vector<TrackingMDNodeRef>> PreservedVariables;
		chandlerc: Interesting. I suggest you just build them on the fly here, and if these members ever end up being a problem, we can go in and avoid their cost much as you describe. No need to do that until we see a problem, but it seems nicer than having a heap-allocated builder.
// A set of deleted alloca instruction pointers which should be removed from		// A set of deleted alloca instruction pointers which should be removed from
// the list of promotable allocas.		// the list of promotable allocas.
SmallPtrSet<AllocaInst *, 4> DeletedAllocas;		SmallPtrSet<AllocaInst *, 4> DeletedAllocas;

do {		do {
while (!Worklist.empty()) {		while (!Worklist.empty()) {
Changed |= runOnAlloca(*Worklist.pop_back_val());		Changed |= runOnAlloca(*Worklist.pop_back_val());
		chandlerc: Bad indentation...
deleteDeadInstructions(DeletedAllocas);		deleteDeadInstructions(DeletedAllocas);

// Remove the deleted allocas from various lists so that we don't try to		// Remove the deleted allocas from various lists so that we don't try to
// continue processing them.		// continue processing them.
if (!DeletedAllocas.empty()) {		if (!DeletedAllocas.empty()) {
auto IsInSet = [&](AllocaInst *AI) { return DeletedAllocas.count(AI); };		auto IsInSet = [&](AllocaInst *AI) { return DeletedAllocas.count(AI); };
Worklist.remove_if(IsInSet);		Worklist.remove_if(IsInSet);
PostPromotionWorklist.remove_if(IsInSet);		PostPromotionWorklist.remove_if(IsInSet);
(23 lines not shown)

test/DebugInfo/X86/array2.ll

	; ModuleID = 'array.c'			; ModuleID = 'array.c'
	;			;
	; From (clang -g -c -O0):			; From (clang -g -c -O0):
	;			;
	; void f(int* p) {			; void f(int* p) {
	; p[0] = 42;			; p[0] = 42;
	; }			; }
	;			;
	; int main(int argc, char** argv) {			; int main(int argc, char** argv) {
	; int array[4] = { 0, 1, 2, 3 };			; int array[4] = { 0, 1, 2, 3 };
	; f(array);			; f(array);
	; return array[0];			; return array[0];
	; }			; }
	;			;
	; RUN: opt %s -O2 -S -o - | FileCheck %s			; RUN: opt %s -O2 -S -o - | FileCheck %s
	; Test that we do not lower dbg.declares for arrays.			; Test that we correctly lower dbg.declares for arrays.
	;			;
	; CHECK: define i32 @main			; CHECK: define i32 @main
	; CHECK: call void @llvm.dbg.value			; CHECK: call void @llvm.dbg.value(metadata i32 42, i64 0, metadata ![[ARRAY:[0-9]+]], metadata ![[EXPR:[0-9]+]])
	; CHECK: call void @llvm.dbg.value			; CHECK: ![[ARRAY]] = {{.*}}; [ DW_TAG_auto_variable ] [array] [line 6]
	; CHECK: call void @llvm.dbg.declare			; CHECK: ![[EXPR]] = {{.*}}; [ DW_TAG_expression ] [DW_OP_piece offset=0, size=4]
	target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-apple-macosx10.9.0"			target triple = "x86_64-apple-macosx10.9.0"

	@main.array = private unnamed_addr constant [4 x i32] [i32 0, i32 1, i32 2, i32 3], align 16			@main.array = private unnamed_addr constant [4 x i32] [i32 0, i32 1, i32 2, i32 3], align 16

	; Function Attrs: nounwind ssp uwtable			; Function Attrs: nounwind ssp uwtable
	define void @f(i32* %p) #0 {			define void @f(i32* %p) #0 {
	entry:			entry:
	(78 lines not shown)

test/DebugInfo/X86/sroasplit-1.ll

This file was added.

				; RUN: opt %s -sroa -verify -S -o - | FileCheck %s
				;
				; Test that we can emit partial debug info for aggregates repeatedly
				; split up by SROA.
				chandlerc: Honestly, I really dislike seeing rdar links even in tests. They add no information for any reader not at Apple, and so you end up needing to explain all of the context of the test anyways. PR links are at least a reasonable thing for your average contributor to go read.
				aprantl: I can understand that. Removed.
				;
				; // Compile with -O1
				; typedef struct {
				; int a;
				; long int b;
				; } Inner;
				;
				; typedef struct {
				; Inner inner[2];
				; } Outer;
				;
				; int foo(Outer outer) {
				; Inner i1 = outer.inner[1];
				; return i1.a;
				; }
				;

				; Verify that SROA creates a variable piece when splitting i1.
				; CHECK: %[[I1:.*]] = alloca [12 x i8], align 4
				; CHECK: call void @llvm.dbg.declare(metadata [12 x i8]* %[[I1]], metadata ![[VAR:[0-9]+]], metadata ![[PIECE1:[0-9]+]])
				; CHECK: call void @llvm.dbg.value(metadata i32 %[[A:.*]], i64 0, metadata ![[VAR]], metadata ![[PIECE2:[0-9]+]])
				; CHECK: ret i32 %[[A]]
				; Read Var and Piece:
				; CHECK: ![[VAR]] = {{.*}} ; [ DW_TAG_auto_variable ] [i1] [line 11]
				; CHECK: ![[PIECE1]] = {{.*}} ; [ DW_TAG_expression ] [DW_OP_piece offset=4, size=12]
				; CHECK: ![[PIECE2]] = {{.*}} ; [ DW_TAG_expression ] [DW_OP_piece offset=0, size=4]

				target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-apple-macosx10.9.0"

				%struct.Outer = type { [2 x %struct.Inner] }
				%struct.Inner = type { i32, i64 }

				; Function Attrs: nounwind ssp uwtable
				define i32 @foo(%struct.Outer* byval align 8 %outer) #0 {
				entry:
				%i1 = alloca %struct.Inner, align 8
				call void @llvm.dbg.declare(metadata %struct.Outer* %outer, metadata !25, metadata !2), !dbg !26
				call void @llvm.dbg.declare(metadata %struct.Inner* %i1, metadata !27, metadata !2), !dbg !28
				%inner = getelementptr inbounds %struct.Outer* %outer, i32 0, i32 0, !dbg !28
				%arrayidx = getelementptr inbounds [2 x %struct.Inner]* %inner, i32 0, i64 1, !dbg !28
				%0 = bitcast %struct.Inner* %i1 to i8*, !dbg !28
				%1 = bitcast %struct.Inner* %arrayidx to i8*, !dbg !28
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* %1, i64 16, i32 8, i1 false), !dbg !28
				%a = getelementptr inbounds %struct.Inner* %i1, i32 0, i32 0, !dbg !29
				%2 = load i32* %a, align 4, !dbg !29
				ret i32 %2, !dbg !29
				}

				; Function Attrs: nounwind readnone
				declare void @llvm.dbg.declare(metadata, metadata, metadata) #1

				; Function Attrs: nounwind
				declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1) #2

				attributes #0 = { nounwind ssp uwtable }
				attributes #1 = { nounwind readnone }
				attributes #2 = { nounwind }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!22, !23}
				!llvm.ident = !{!24}

				!0 = !{!"0x11\0012\00clang version 3.5.0 \000\00\000\00\001", !1, !2, !2, !3, !2, !2} ; [ DW_TAG_compile_unit ] [ DW_TAG_compile_unit ] [sroasplit-1.c] [DW_LANG_C99]
				!1 = !{!"sroasplit-1.c", !""}
				!2 = !{}
				!3 = !{!4}
				!4 = !{!"0x2e\00foo\00foo\00\0010\000\001\000\006\00256\000\0010", !1, !5, !6, null, i32 (%struct.Outer*)* @foo, null, null, !2} ; [ DW_TAG_subprogram ] [ DW_TAG_subprogram ] [line 10] [def] [foo]
				!5 = !{!"0x29", !1} ; [ DW_TAG_file_type ] [ DW_TAG_file_type ] [sroasplit-1.c]
				!6 = !{!"0x15\00\000\000\000\000\000\000", i32 0, null, null, !7, null, null, null} ; [ DW_TAG_subroutine_type ] [ DW_TAG_subroutine_type ] [line 0, size 0, align 0, offset 0] [from ]
				!7 = !{!8, !9}
				!8 = !{!"0x24\00int\000\0032\0032\000\000\005", null, null} ; [ DW_TAG_base_type ] [ DW_TAG_base_type ] [int] [line 0, size 32, align 32, offset 0, enc DW_ATE_signed]
				!9 = !{!"0x16\00Outer\008\000\000\000\000", !1, null, !10} ; [ DW_TAG_typedef ] [ DW_TAG_typedef ] [Outer] [line 8, size 0, align 0, offset 0] [from ]
				!10 = !{!"0x13\00\006\00256\0064\000\000\000", !1, null, null, !11, null, null, null} ; [ DW_TAG_structure_type ] [line 6, size 256, align 64, offset 0] [def] [from ]
				!11 = !{!12}
				!12 = !{!"0xd\00inner\007\00256\0064\000\000", !1, !10, !13} ; [ DW_TAG_member ] [inner] [line 7, size 256, align 64, offset 0] [from ]
				!13 = !{!"0x1\00\000\00256\0064\000\000", null, null, !14, !20, i32 0, null, null, null} ; [ DW_TAG_array_type ] [line 0, size 256, align 64, offset 0] [from Inner]
				!14 = !{!"0x16\00Inner\004\000\000\000\000", !1, null, !15} ; [ DW_TAG_typedef ] [ DW_TAG_typedef ] [Inner] [line 4, size 0, align 0, offset 0] [from ]
				!15 = !{!"0x13\00\001\00128\0064\000\000\000", !1, null, null, !16, null, null, null} ; [ DW_TAG_structure_type ] [line 1, size 128, align 64, offset 0] [def] [from ]
				!16 = !{!17, !18}
				!17 = !{!"0xd\00a\002\0032\0032\000\000", !1, !15, !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
				!18 = !{!"0xd\00b\003\0064\0064\0064\000", !1, !15, !19} ; [ DW_TAG_member ] [b] [line 3, size 64, align 64, offset 64] [from long int]
				!19 = !{!"0x24\00long int\000\0064\0064\000\000\005", null, null} ; [ DW_TAG_base_type ] [ DW_TAG_base_type ] [long int] [line 0, size 64, align 64, offset 0, enc DW_ATE_signed]
				!20 = !{!21}
				!21 = !{!"0x21\000\002"} ; [ DW_TAG_subrange_type ] [0, 1]
				!22 = !{i32 2, !"Dwarf Version", i32 2}
				!23 = !{i32 1, !"Debug Info Version", i32 2}
				!24 = !{!"clang version 3.5.0 "}
				!25 = !{!"0x101\00outer\0016777226\000", !4, !5, !9} ; [ DW_TAG_arg_variable ] [ DW_TAG_arg_variable ] [outer] [line 10]
				!26 = !{i32 10, i32 0, !4, null}
				!27 = !{!"0x100\00i1\0011\000", !4, !5, !14} ; [ DW_TAG_auto_variable ] [ DW_TAG_auto_variable ] [i1] [line 11]
				!28 = !{i32 11, i32 0, !4, null}
				!29 = !{i32 12, i32 0, !4, null}

test/DebugInfo/X86/sroasplit-2.ll

This file was added.

				; RUN: opt %s -sroa -verify -S -o - | FileCheck %s
				;
				; Test that we can partial emit debug info for aggregates repeatedly
				; split up by SROA.
				;
				; // Compile with -O1
				; typedef struct {
				; int a;
				; int b;
				dblaikie: Perhaps these two test cases could be named more precisely for what they do. It took me a bit of staring to see the specific difference between them (the type of the Inner::b member is different), and I still don't have any idea why that's important to test separately: what is it that's different about that case? (If it is different and different code is needed for it, the usual question applies: "is this different code that can be implemented/tested/committed separately?" But perhaps it isn't.) And do both cases need the two levels of indirection (a struct with an array of other structs)? While a test case that covers a few cases in one go is sometimes nice, it would help me at least if there were a comment explaining why the different pieces of indirection are important to the test, i.e. which parts of SROA they are exercising.
				; } Inner;
				;
				; typedef struct {
				; Inner inner[2];
				; } Outer;
				;
				; int foo(Outer outer) {
				; Inner i1 = outer.inner[1];
				; return i1.a;
				; }
				;

				; Verify that SROA creates a variable piece when splitting i1.
				; CHECK: call void @llvm.dbg.value(metadata i64 %outer.coerce0, i64 0, metadata ![[O:[0-9]+]], metadata ![[PIECE1:[0-9]+]]),
				; CHECK: call void @llvm.dbg.value(metadata i64 %outer.coerce1, i64 0, metadata ![[O]], metadata ![[PIECE2:[0-9]+]]),
				; CHECK: call void @llvm.dbg.value({{.*}}, i64 0, metadata ![[I1:[0-9]+]], metadata ![[PIECE3:[0-9]+]]),
				; CHECK-DAG: ![[O]] = {{.*}} [ DW_TAG_arg_variable ] [outer] [line 10]
				; CHECK-DAG: ![[PIECE1]] = {{.*}} [ DW_TAG_expression ] [DW_OP_piece offset=0, size=8]
				; CHECK-DAG: ![[PIECE2]] = {{.*}} [ DW_TAG_expression ] [DW_OP_piece offset=8, size=8]
				; CHECK-DAG: ![[I1]] = {{.*}} [ DW_TAG_auto_variable ] [i1] [line 11]
				; CHECK-DAG: ![[PIECE3]] = {{.*}} [ DW_TAG_expression ] [DW_OP_piece offset=0, size=4]

				; ModuleID = 'sroasplit-2.c'
				target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-apple-macosx10.9.0"

				%struct.Outer = type { [2 x %struct.Inner] }
				%struct.Inner = type { i32, i32 }

				; Function Attrs: nounwind ssp uwtable
				define i32 @foo(i64 %outer.coerce0, i64 %outer.coerce1) #0 {
				%outer = alloca %struct.Outer, align 8
				%i1 = alloca %struct.Inner, align 4
				%1 = bitcast %struct.Outer* %outer to { i64, i64 }*
				%2 = getelementptr { i64, i64 }* %1, i32 0, i32 0
				store i64 %outer.coerce0, i64* %2
				%3 = getelementptr { i64, i64 }* %1, i32 0, i32 1
				store i64 %outer.coerce1, i64* %3
				call void @llvm.dbg.declare(metadata %struct.Outer* %outer, metadata !24, metadata !2), !dbg !25
				call void @llvm.dbg.declare(metadata %struct.Inner* %i1, metadata !26, metadata !2), !dbg !27
				%4 = getelementptr inbounds %struct.Outer* %outer, i32 0, i32 0, !dbg !27
				%5 = getelementptr inbounds [2 x %struct.Inner]* %4, i32 0, i64 1, !dbg !27
				%6 = bitcast %struct.Inner* %i1 to i8*, !dbg !27
				%7 = bitcast %struct.Inner* %5 to i8*, !dbg !27
				call void @llvm.memcpy.p0i8.p0i8.i64(i8* %6, i8* %7, i64 8, i32 4, i1 false), !dbg !27
				%8 = getelementptr inbounds %struct.Inner* %i1, i32 0, i32 0, !dbg !28
				%9 = load i32* %8, align 4, !dbg !28
				ret i32 %9, !dbg !28
				}

				; Function Attrs: nounwind readnone
				declare void @llvm.dbg.declare(metadata, metadata, metadata) #1

				; Function Attrs: nounwind
				declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture readonly, i64, i32, i1) #2

				attributes #0 = { nounwind ssp uwtable "no-frame-pointer-elim"="true" }
				attributes #1 = { nounwind readnone }
				attributes #2 = { nounwind }

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!21, !22}
				!llvm.ident = !{!23}

				!0 = !{!"0x11\0012\00clang version 3.5.0 \000\00\000\00\001", !1, !2, !2, !3, !2, !2} ; [ DW_TAG_compile_unit ] [ DW_TAG_compile_unit ] [sroasplit-2.c] [DW_LANG_C99]
				!1 = !{!"sroasplit-2.c", !""}
				!2 = !{}
				!3 = !{!4}
				!4 = !{!"0x2e\00foo\00foo\00\0010\000\001\000\006\00256\000\0010", !1, !5, !6, null, i32 (i64, i64)* @foo, null, null, !2} ; [ DW_TAG_subprogram ] [ DW_TAG_subprogram ] [line 10] [def] [foo]
				!5 = !{!"0x29", !1} ; [ DW_TAG_file_type ] [ DW_TAG_file_type ] [sroasplit-2.c]
				!6 = !{!"0x15\00\000\000\000\000\000\000", i32 0, null, null, !7, null, null, null} ; [ DW_TAG_subroutine_type ] [ DW_TAG_subroutine_type ] [line 0, size 0, align 0, offset 0] [from ]
				!7 = !{!8, !9}
				!8 = !{!"0x24\00int\000\0032\0032\000\000\005", null, null} ; [ DW_TAG_base_type ] [ DW_TAG_base_type ] [int] [line 0, size 32, align 32, offset 0, enc DW_ATE_signed]
				!9 = !{!"0x16\00Outer\008\000\000\000\000", !1, null, !10} ; [ DW_TAG_typedef ] [ DW_TAG_typedef ] [Outer] [line 8, size 0, align 0, offset 0] [from ]
				!10 = !{!"0x13\00\006\00128\0032\000\000\000", !1, null, null, !11, null, null, null} ; [ DW_TAG_structure_type ] [line 6, size 128, align 32, offset 0] [def] [from ]
				!11 = !{!12}
				!12 = !{!"0xd\00inner\007\00128\0032\000\000", !1, !10, !13} ; [ DW_TAG_member ] [inner] [line 7, size 128, align 32, offset 0] [from ]
				!13 = !{!"0x1\00\000\00128\0032\000\000", null, null, !14, !19, i32 0, null, null, null} ; [ DW_TAG_array_type ] [line 0, size 128, align 32, offset 0] [from Inner]
				!14 = !{!"0x16\00Inner\004\000\000\000\000", !1, null, !15} ; [ DW_TAG_typedef ] [ DW_TAG_typedef ] [Inner] [line 4, size 0, align 0, offset 0] [from ]
				!15 = !{!"0x13\00\001\0064\0032\000\000\000", !1, null, null, !16, null, null, null} ; [ DW_TAG_structure_type ] [line 1, size 64, align 32, offset 0] [def] [from ]
				!16 = !{!17, !18}
				!17 = !{!"0xd\00a\002\0032\0032\000\000", !1, !15, !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
				!18 = !{!"0xd\00b\003\0032\0032\0032\000", !1, !15, !8} ; [ DW_TAG_member ] [b] [line 3, size 32, align 32, offset 32] [from int]
				!19 = !{!20}
				!20 = !{!"0x21\000\002"} ; [ DW_TAG_subrange_type ] [0, 1]
				!21 = !{i32 2, !"Dwarf Version", i32 2}
				!22 = !{i32 1, !"Debug Info Version", i32 2}
				!23 = !{!"clang version 3.5.0 "}
				!24 = !{!"0x101\00outer\0016777226\000", !4, !5, !9} ; [ DW_TAG_arg_variable ] [ DW_TAG_arg_variable ] [outer] [line 10]
				!25 = !{i32 10, i32 0, !4, null}
				!26 = !{!"0x100\00i1\0011\000", !4, !5, !14} ; [ DW_TAG_auto_variable ] [ DW_TAG_auto_variable ] [i1] [line 11]
				!27 = !{i32 11, i32 0, !4, null}
				!28 = !{i32 12, i32 0, !4, null}
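
For readers trying to relate the piece offsets in the sroasplit-2.ll CHECK lines to the C types, the following is a rough sketch and not part of the patch. It mirrors the two Inner/Outer layouts with Python's ctypes and recomputes the byte offsets and sizes that appear in the metadata. The class names (Inner1, Inner2, Outer1, Outer2) and the helpers member_piece and coerced_pieces are hypothetical, and the sketch assumes an x86_64-style layout (8-byte alignment for 64-bit integers, 8-byte argument registers).

```python
import ctypes


# Hypothetical mirrors of the C types in the two test cases.
class Inner1(ctypes.Structure):  # sroasplit-1.c: { int a; long int b; }
    _fields_ = [("a", ctypes.c_int32), ("b", ctypes.c_int64)]


class Inner2(ctypes.Structure):  # sroasplit-2.c: { int a; int b; }
    _fields_ = [("a", ctypes.c_int32), ("b", ctypes.c_int32)]


class Outer1(ctypes.Structure):
    _fields_ = [("inner", Inner1 * 2)]


class Outer2(ctypes.Structure):
    _fields_ = [("inner", Inner2 * 2)]


def member_piece(struct, name):
    """Return (offset, size) in bytes of a member within its struct."""
    field = getattr(struct, name)
    return field.offset, field.size


def coerced_pieces(total_size, reg_size=8):
    """Split an aggregate into register-sized (offset, size) pieces.

    This only models the i64 coercion seen in the sroasplit-2.ll
    signature; it is an assumption, not the ABI lowering itself.
    """
    return [
        (off, min(reg_size, total_size - off))
        for off in range(0, total_size, reg_size)
    ]


if __name__ == "__main__":
    # Outer in sroasplit-2.c arrives as two i64 values.
    print("outer pieces:", coerced_pieces(ctypes.sizeof(Outer2)))
    # i1.a is the only part of i1 that the function reads.
    print("i1.a piece:  ", member_piece(Inner2, "a"))
    # The visible difference in sroasplit-1.c: b is 64 bits at offset 8.
    print("Inner1.b:    ", member_piece(Inner1, "b"))
```

Under these assumptions the two pieces computed for outer correspond to PIECE1 and PIECE2, and the piece for i1.a corresponds to PIECE3. The Inner1 figures match the size and offset recorded for member b in the sroasplit-1.ll metadata.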