Page MenuHomePhabricator

[WIP] Attach debug intrinsics to allocas, and use correct address space
Needs ReviewPublic

Authored by scott.linder on Oct 7 2020, 9:38 AM.

Details

Summary

A dbg.declare for a local/parameter describes the hardware location of
the source variable's value. This matches up with the semantics of the
alloca for the variable, whereas any addrspacecast inserted in order to
implement some source-level notion of address spaces does not.

When creating the dbg.declare intrinsic, attach it directly to the
alloca, not to any addrspacecast.

Update the DIExpression with the address space of the alloca, rather
than use the address space associated with the source level type.

Diff Detail

Event Timeline

scott.linder created this revision.Oct 7 2020, 9:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 7 2020, 9:38 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
scott.linder requested review of this revision.Oct 7 2020, 9:38 AM

I need to add more tests, but I wanted to float the idea of the change and get feedback first.

aprantl added inline comments.Oct 9 2020, 11:56 AM
clang/lib/CodeGen/CGDecl.cpp
1579

This is unintuitive — can you add a comment explaining why it may not be valid and why address should only be used then?

scott.linder added inline comments.Oct 9 2020, 2:08 PM
clang/lib/CodeGen/CGDecl.cpp
1579

This is kind of a cop-out on my part, the only path where this occurs is for OpenMP, and I think I just need to understand better what is happening. This also occurs for NRVO, but that is explicitly called out just below this. I'll try to understand this more completely and see if I can represent the possibilities more direclty.

Somewhat related, it is a bit unsettling reading through this, as the invariant seems to be that address.isValid() by the time the call to setAddrOfLocalVar is called, which makes sense but isn't explicit anywhere in the multiple nested ifs. I'll also add an assert of that.

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here? I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

Yes, right.

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here?

I assume, no.

I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

Yes, right.

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here?

I assume, no.

I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

Ok, thank you! I think the simplest thing is for me to update the patch to tie the debug info to the call to the runtime allocator, then.

I have no idea what from the commit message what this has to do with the OpenMP code gen but I saw this randomly while looking for something else so here we go:

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

Yes, right.

The frontend does *not* insert __kmpc_alloc calls for firstprivate, or almost anything else for that matter. Grep clang/lib and you can find 2 uses, both in very specialized cases not related to "regular" user allocations". alloca is used as with basically everything else: https://clang.godbolt.org/z/z8fEqG

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here?

I assume, no.

I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

Ok, thank you! I think the simplest thing is for me to update the patch to tie the debug info to the call to the runtime allocator, then.

SROA/mem2reg is happening as you expect it to. FWIW, we also have heap2stack and argument-promotion + constant prop for parallel regions implemented in the Attributor. That means we would/will apply SROA/mem2reg even if you have a runtime alloca and if the value is nominally "shared" but could be made firtprivate.

I have no idea what from the commit message what this has to do with the OpenMP code gen but I saw this randomly while looking for something else so here we go:

@ABataev Sorry if I'm pulling you in without enough context/work on my end, but I wanted to ask how the Clang codegen for OpenMP locals works at a high level?

Is the idea that instead of an alloc the frontend can insert calls into the runtime in some cases, like __kmpc_alloc (e.g. for firstprivate as in https://reviews.llvm.org/D5140)?

Yes, right.

The frontend does *not* insert __kmpc_alloc calls for firstprivate, or almost anything else for that matter. Grep clang/lib and you can find 2 uses, both in very specialized cases not related to "regular" user allocations". alloca is used as with basically everything else: https://clang.godbolt.org/z/z8fEqG

They are inserted for pragma allocate and privates with allocate clauses.

If that is the case, I assume there is no equivalent to SROA/Mem2Reg here?

I assume, no.

I am trying to understand conceptually where the debug info for the source level local should be tied to in the IR, and at least for locals which use alloc it has turned out to be much simpler to tie the variable directly to the alloc itself rather than bitcasts and things which obscure the relationship.

Ok, thank you! I think the simplest thing is for me to update the patch to tie the debug info to the call to the runtime allocator, then.

SROA/mem2reg is happening as you expect it to. FWIW, we also have heap2stack and argument-promotion + constant prop for parallel regions implemented in the Attributor. That means we would/will apply SROA/mem2reg even if you have a runtime alloca and if the value is nominally "shared" but could be made firtprivate.