This is an archive of the discontinued LLVM Phabricator instance.

[Clang][LLVM][Attr] support btf_type_tag attribute
ClosedPublic

Authored by yonghong-song on Oct 5 2021, 5:29 PM.

Details

Summary

This patch introduced btf_type_tag attribute. The attribute
is a type attribute and intends to address the below
linux use cases.

typedef int __user *__intp;
int foo(int __user *arg, ...)
static int do_execve(struct filename *filename,
       const char __user *const __user *__argv,
       const char __user *const __user *__envp)

Here __user in the kernel defined as

__attribute__((noderef, address_space(__user)))

for sparse ([1]) type checking mode.

For normal clang compilation, we intend to replace it with

__attribute__((btf_type_tag("user")))

and record such informaiton in dwarf and BTF so such
information later can be used in kernel for bpf verification
or for other tracing functionalities.

[1] https://www.kernel.org/doc/html/v4.11/dev-tools/sparse.html

Diff Detail

Event Timeline

yonghong-song created this revision.Oct 5 2021, 5:29 PM
yonghong-song requested review of this revision.Oct 5 2021, 5:29 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptOct 5 2021, 5:29 PM
  • Instead of piggyback on existing AttributedType, create a new child class AttributedBTFType. The code is reorganized to avoid intrusive to existing AttributedType.
  • Hack: the AttributedBTFType still not trivially destructible. Haven't find a best way how to fix it.
  • Missing: With AttributedBTFType, we need to handle more things, serialization/deserialization, TypeProperties, etc.

It would be good if I can get some feedback on the overall approach before diving into more details.
cc @aaron.ballman

@aaron.ballman Ping. This is to address your concern in D110127, could you take a look whether my proposal for a new attribute btf_type_tag will be okay for you or not? Thanks!

@aaron.ballman Ping. This is to address your concern in D110127, could you take a look whether my proposal for a new attribute btf_type_tag will be okay for you or not? Thanks!

Thank you for the new approach! On the whole, I think this seems like the better way to go. One thing that's not clear to me is whether we expect to give this type attribute any semantics beyond passing further information to debug info or not. e.g., do we have to worry about things like:

#define __tag1 __attribute__((btf_type_tag("tag1")))
#define __tag2 __attribute__((btf_type_tag("tag2")))

int * __tag1 t1;
int * __tag2 t2 = t1; // Diagnose a type mismatch?

(I know you don't intend to support that sort of thing right now, but I'm wondering about the future -- basically, I'm worried about the situation where we accept code like that without a diagnostic now, but want to diagnose it in the future and will force a breaking change on users.)

You should definitely fix up the clang-format issues and fix identifiers to meet the usual naming conventions, etc.

clang/include/clang/AST/Type.h
4806

I think we should be able to use a StringRef instead, rather than have to worry about allocations.

@aaron.ballman Ping. This is to address your concern in D110127, could you take a look whether my proposal for a new attribute btf_type_tag will be okay for you or not? Thanks!

Thank you for the new approach! On the whole, I think this seems like the better way to go.

Great. I will go with this approach then.

One thing that's not clear to me is whether we expect to give this type attribute any semantics beyond passing further information to debug info or not. e.g., do we have to worry about things like:

#define __tag1 __attribute__((btf_type_tag("tag1")))
#define __tag2 __attribute__((btf_type_tag("tag2")))

int * __tag1 t1;
int * __tag2 t2 = t1; // Diagnose a type mismatch?

(I know you don't intend to support that sort of thing right now, but I'm wondering about the future -- basically, I'm worried about the situation where we accept code like that without a diagnostic now, but want to diagnose it in the future and will force a breaking change on users.)

This is a good question. These type attributes are mostly for type checking. In linux kernel, they are used by sparse (https://www.kernel.org/doc/html/v4.11/dev-tools/sparse.html) for type checking. We could implement a clang phase to utilize such information for type checking as well. But this is not the goal so far. Even if we indeed implemented type checking based on btf_type_tag in the future, it is likely to issue as a warning instead of a hard error. The attribute is mostly used to compile linux kernel, so we should be okay, e.g., to fix all issues in the kernel before turning on type checking feature in clang, etc.

You should definitely fix up the clang-format issues and fix identifiers to meet the usual naming conventions, etc.

Yes, this is true. This is a RFC patch so I didn't do clang-format at all. Will make sure all clang-format issues fixed, naming conventions followed with the formal patch.

yonghong-song added inline comments.Oct 11 2021, 1:01 PM
clang/include/clang/AST/Type.h
4806

StringRef indeed works as a type. Previously, I used StringRef directly passed by the caller but later the underlying string is somehow freed so I got a "" string for tag. I will revisit this.

yonghong-song retitled this revision from [POC][BTF] support btf_type_tag attribute to [Clang][LLVM][Attr] support btf_type_tag attribute.
yonghong-song edited the summary of this revision. (Show Details)
  • remove POC tag and now the implementation is complete
  • in this patch, AttributedBTFType is created as a subclass of AttributedType. This makes quite some changes in the codebase for processing or handling AttributedBTFType. Alternatively, we could amend existing AttributedType to have a StringRef to store btf_tag. We can discuss which way is better.
  • I may have missed some testing. Let me know what I have missed and I can add them.
  • fix clang-format issues.
yonghong-song added a subscriber: urnathan.
  • fix a few clang-format issues.

@aaron.ballman ping. Did you get time to look at this patch? Esp. any comments on this approach with a subclass AttributedBTF vs. putting a StringRef directly in Attributed?

martong removed a subscriber: martong.Oct 21 2021, 4:44 AM

I reviewed this a bit, but I think it might be heading in a slightly wrong direction. I think you should be using an AttributedType but I don't think we need to add AttributedBTFType as a subclass to do this. An AttributedType holds the type kind information in it, but you should be able to access an AttributedTypeLoc object to get to the semantic attribute to access the string argument to the attribute. Generally, you'd call getTypeSourceInfo() on the declaration to get the type source info for its type, you can call getTypeLoc() on that object to get a generic TypeLoc object, and then call getAs<AttributedTypeLoc>() on that to turn it into an AttributedTypeLoc and from there you can call getAttr() to get the Attr * for the semantic attribute.

clang/include/clang/AST/ASTContext.h
1562–1563
clang/include/clang/AST/PropertiesBase.td
134

Is std::string really the correct type to be using for AST serialization? Can we use StringRef here as well?

clang/include/clang/AST/Type.h
4779–4783
clang/include/clang/AST/TypeLoc.h
896

Is this actually needed? I would have assumed the AttributedTypeLoc would suffice.

clang/include/clang/Basic/Attr.td
1850

No new, undocumented attributes please.

clang/include/clang/Serialization/ASTRecordWriter.h
286–287

This appears to be unused, can be removed?

clang/lib/AST/ASTContext.cpp
4636–4656
clang/lib/CodeGen/CGDebugInfo.cpp
1152

Should this be checking !BTFTypeTag.empty() instead?

I reviewed this a bit, but I think it might be heading in a slightly wrong direction. I think you should be using an AttributedType but I don't think we need to add AttributedBTFType as a subclass to do this. An AttributedType holds the type kind information in it, but you should be able to access an AttributedTypeLoc object to get to the semantic attribute to access the string argument to the attribute. Generally, you'd call getTypeSourceInfo() on the declaration to get the type source info for its type, you can call getTypeLoc() on that object to get a generic TypeLoc object, and then call getAs<AttributedTypeLoc>() on that to turn it into an AttributedTypeLoc and from there you can call getAttr() to get the Attr * for the semantic attribute.

Thanks for detailed explanation. This is very useful. I will look into this. I agree with your approach and I myself also have concern with AttributedBTFType. I will just use AttributedType going forward.

clang/include/clang/AST/PropertiesBase.td
134

I think StringRef may be possible. I will investigate.

clang/include/clang/AST/TypeLoc.h
896

Honestly I am not 100%. But since we will AttributedType and this should not be an issue any more.

clang/include/clang/Basic/Attr.td
1850

Will do in the next revision.

clang/include/clang/Serialization/ASTRecordWriter.h
286–287

Not 100% sure whether this will be used during serialization/deserialization. I will double check.

clang/lib/CodeGen/CGDebugInfo.cpp
1152

Yes, that is true. This probably from old previous revision leftover which I am using a char array and the first '\0' indicating no attribute.

  • removed AttributedBTFType subclass. Instead of using corresponding *TypeLoc to retrieve string argument for btf_type_tag attribute.
  • fix clang-format warnings

Adding some additional reviewers who may have more opinions about the changes in CodeGen.

clang/include/clang/Basic/AttrDocs.td
2029–2034 ↗(On Diff #382143)
clang/lib/CodeGen/CGDebugInfo.cpp
942–948

Phab's diff view is not helpful here -- removing the curly braces around single-line statements per our usual coding standards.

1295
1396–1397

Can you spell out the types here?

1575
3985
4388
5112
clang/lib/Sema/SemaType.cpp
6544

Should you also validate that the type is a pointer type?

All other comments make sense. Will fix them in the next revision.

clang/lib/Sema/SemaType.cpp
6544

Looks like we cannot do this. For example,

int __attribute__((btf_type_tag("tag"))) *p;

When we reach HandleBTFTypeTagAttribute, we actually have Type as int.
We don't have information for its parent type.

During code generation, we have

pointer_type
   attributed_type
      builtin_int type

at that point, we can check pointer type's next typeloc's to decide whether it has attributed_type or not.

  • fix various clang-format and coding style issue suggested by @aaron.ballman
aaron.ballman added inline comments.Oct 27 2021, 8:56 AM
clang/lib/Sema/SemaType.cpp
6544

Ugh, GNU-style attributes are particularly awful. My initial inclination was that's the exact sort of situation we want to ignore because the type attribute should appertain to the type it's specified on (int, not int *), but GNU-style attributes "slide" to whatever the compiler thinks makes the most sense, so they're unpredictable for these kinds of uses.

However, the rule of thumb in Clang is that ignoring the attribute should come with a diagnostic so the user knows it's being ignored, otherwise they have a harder time debugging problems when they've misused the attribute. Currently, I think this will give no diagnostics when ignoring the attribute in way too many situations (attaching it to a struct declaration, attaching it to a non-pointer, etc).

We could perhaps change GetDeclSpecTypeForDeclarator() to add additional checking logic after the call to distributeTypeAttrsFromDeclarator(). However, we don't currently have any type attributes that need this kind of special handling and so that could perhaps wait for a follow-up.

Agreed. If after semantic analysis for a declaration we can check declaration's typeloc chain again, we might be able to give proper warnings. I am happy to implement this as a followup as you suggested.

Attribute bits look good to me, but I'd appreciate if @dblaikie could weigh in on whether he thinks the CodeGen changes are fine. My concern there is around whether the changes to function signatures to accept a TypeLoc are reasonable or not.

clang/include/clang/Basic/AttrDocs.td
2031 ↗(On Diff #382518)

This makes it a bit more clear what the current behavior is and that we might change that behavior in the future,

Attribute bits look good to me, but I'd appreciate if @dblaikie could weigh in on whether he thinks the CodeGen changes are fine. My concern there is around whether the changes to function signatures to accept a TypeLoc are reasonable or not.

Thanks for pulling me in here - appreciate all the reviewing you've been doing on this whole series & glad to be pulled in anywhere you think's helpful.

Ah, yeah, I see what you mean - that does seem sort of unfortunate. Is it possible these attributes could only appear on typedefs and they'd be more readily carried through that without needing extra typeloc tracking? (sorry for not having read back through the rest of the review - which may've started there and ended up here as a more general form of the attribute?)

Ah, yeah, I see what you mean - that does seem sort of unfortunate. Is it possible these attributes could only appear on typedefs and they'd be more readily carried through that without needing extra typeloc tracking? (sorry for not having read back through the rest of the review - which may've started there and ended up here as a more general form of the attribute?)

For the question, "is it possible these attributes could only appear on typedefs?" The answer is "not really possible". We are targeting existing linux kernel where existing type attributes (user, rcu, ...) have been used in places other than typedef quite extensively (e.g., function argument type, function return type, field type, etc.).

In one of my earlier prototypes, I put the tag string itself in AttributedType and with this we can avoid TypeLoc, but I understand this is not conventional usage and that is why we do through TypeLoc mechanism. @aaron.ballman can further comment on this.

Ah, yeah, I see what you mean - that does seem sort of unfortunate. Is it possible these attributes could only appear on typedefs and they'd be more readily carried through that without needing extra typeloc tracking? (sorry for not having read back through the rest of the review - which may've started there and ended up here as a more general form of the attribute?)

For the question, "is it possible these attributes could only appear on typedefs?" The answer is "not really possible". We are targeting existing linux kernel where existing type attributes (user, rcu, ...) have been used in places other than typedef quite extensively (e.g., function argument type, function return type, field type, etc.).

In one of my earlier prototypes, I put the tag string itself in AttributedType and with this we can avoid TypeLoc, but I understand this is not conventional usage and that is why we do through TypeLoc mechanism. @aaron.ballman can further comment on this.

FWIW, I made that request because AttributedTypeLoc stores the Attr * for the attributed type, so we can get the attribute argument information from that rather than having to duplicate it within a new TypeLoc object.

Ah, yeah, I see what you mean - that does seem sort of unfortunate. Is it possible these attributes could only appear on typedefs and they'd be more readily carried through that without needing extra typeloc tracking? (sorry for not having read back through the rest of the review - which may've started there and ended up here as a more general form of the attribute?)

For the question, "is it possible these attributes could only appear on typedefs?" The answer is "not really possible". We are targeting existing linux kernel where existing type attributes (user, rcu, ...) have been used in places other than typedef quite extensively (e.g., function argument type, function return type, field type, etc.).

In one of my earlier prototypes, I put the tag string itself in AttributedType and with this we can avoid TypeLoc, but I understand this is not conventional usage and that is why we do through TypeLoc mechanism. @aaron.ballman can further comment on this.

FWIW, I made that request because AttributedTypeLoc stores the Attr * for the attributed type, so we can get the attribute argument information from that rather than having to duplicate it within a new TypeLoc object.

So could the debug info code be done with AttributedType (I guess currently CGDebugInfo skips over those - but we could not skip over them and check if it's one of these attributes, otherwise skip) rather than passing around the extra TypeLoc?

Ah, yeah, I see what you mean - that does seem sort of unfortunate. Is it possible these attributes could only appear on typedefs and they'd be more readily carried through that without needing extra typeloc tracking? (sorry for not having read back through the rest of the review - which may've started there and ended up here as a more general form of the attribute?)

For the question, "is it possible these attributes could only appear on typedefs?" The answer is "not really possible". We are targeting existing linux kernel where existing type attributes (user, rcu, ...) have been used in places other than typedef quite extensively (e.g., function argument type, function return type, field type, etc.).

In one of my earlier prototypes, I put the tag string itself in AttributedType and with this we can avoid TypeLoc, but I understand this is not conventional usage and that is why we do through TypeLoc mechanism. @aaron.ballman can further comment on this.

FWIW, I made that request because AttributedTypeLoc stores the Attr * for the attributed type, so we can get the attribute argument information from that rather than having to duplicate it within a new TypeLoc object.

So could the debug info code be done with AttributedType (I guess currently CGDebugInfo skips over those - but we could not skip over them and check if it's one of these attributes, otherwise skip) rather than passing around the extra TypeLoc?

That's my hope, but I wasn't certain if there were type modifications happening where we need to calculate a TypeLoc that's different from what we would get out of the type itself. I would expect that doesn't happen during CodeGen (seems awfully late to be changing types), but wasn't 100% sure.

Ah, yeah, I see what you mean - that does seem sort of unfortunate. Is it possible these attributes could only appear on typedefs and they'd be more readily carried through that without needing extra typeloc tracking? (sorry for not having read back through the rest of the review - which may've started there and ended up here as a more general form of the attribute?)

For the question, "is it possible these attributes could only appear on typedefs?" The answer is "not really possible". We are targeting existing linux kernel where existing type attributes (user, rcu, ...) have been used in places other than typedef quite extensively (e.g., function argument type, function return type, field type, etc.).

In one of my earlier prototypes, I put the tag string itself in AttributedType and with this we can avoid TypeLoc, but I understand this is not conventional usage and that is why we do through TypeLoc mechanism. @aaron.ballman can further comment on this.

FWIW, I made that request because AttributedTypeLoc stores the Attr * for the attributed type, so we can get the attribute argument information from that rather than having to duplicate it within a new TypeLoc object.

So could the debug info code be done with AttributedType (I guess currently CGDebugInfo skips over those - but we could not skip over them and check if it's one of these attributes, otherwise skip) rather than passing around the extra TypeLoc?

That's my hope, but I wasn't certain if there were type modifications happening where we need to calculate a TypeLoc that's different from what we would get out of the type itself. I would expect that doesn't happen during CodeGen (seems awfully late to be changing types), but wasn't 100% sure.

Hmm - not sure I really follow. Is there a greater risk of this than with other types we're emitting/lrelying entirely on types and not TypeLocs already? Any way we can validate whether this is an issue/is necessary to handle?

Otherwise I'd be inclined to go with the pure Type based solution, maybe leave a "watch out for bugs related to type changes" comment somewhere in case bugs do come up in the future?

Just to be sure my understanding is correct. Given an AttributedType node, we do have a way to get the corresponding Attr, is it right? @aaron.ballman

Just to be sure my understanding is correct. Given an AttributedType node, we do have a way to get the corresponding Attr, is it right? @aaron.ballman

Oh yeah, now I remember what the problem is here! Type * is the abstract representation of type information in Clang. e.g. it represents an int in general, not a specific use of int. For the specific use, you need to ask the declaration for its particular TypeSourceInfo * object. So AttributedType tracks what KIND of attribute it was, but not the actual attribute itself, which means there's no way to go from an AttributedType * to an Attr * without first going through a TypeSourceInfo *. You can get one of those from various declarations (DeclaratorDecl and TypedefNameDecl), but the functions where you need to get to the attribute instance from don't always have the declaration handy (it'd be good to check if there are any interfaces that do have the decl handy so we can skip passing the extra argument there). So we need to pass something in some places, either a TypeSourceInfo * or a TypeLoc. A TypeLoc is two pointers wide, so it might make sense to pass the TypeSourceInfo * instead, but I don't have strong opinions there.

@aaron.ballman I checked the source. Looks like we can easily get TypeLoc from TypeSourceInfo, but not from TypeSourceInfo to TypeLoc. But We need TypeLoc so we can get attr information and also traverse TypeLoc's.. We might be able to pass TypeSourceInfo in a few functions e.g., createFieldType(), but we still need to do TSI->getTypeLoc() and pass TypeLoc to other functions like getOrCreateType(), createType() etc. So I am inclined to just use TypeLoc.

@dblaikie Based on the discussion so far, I suspect we might have to use TypeLoc. Please let me know what you think.

@aaron.ballman I checked the source. Looks like we can easily get TypeLoc from TypeSourceInfo, but not from TypeSourceInfo to TypeLoc. But We need TypeLoc so we can get attr information and also traverse TypeLoc's.. We might be able to pass TypeSourceInfo in a few functions e.g., createFieldType(), but we still need to do TSI->getTypeLoc() and pass TypeLoc to other functions like getOrCreateType(), createType() etc. So I am inclined to just use TypeLoc.

@dblaikie Based on the discussion so far, I suspect we might have to use TypeLoc. Please let me know what you think.

That sounds reasonable to me, but one possibility would be to change createType() and getOrCreateType() to take a TypeSourceInfo * rather than a QualType (because you can go from the TypeSourceInfo * back to the QualType by calling getType() on it). However, that could also be a heavier lift due to the number of call sites.

That sounds reasonable to me, but one possibility would be to change createType() and getOrCreateType() to take a TypeSourceInfo * rather than a QualType (because you can go from the TypeSourceInfo * back to the QualType by calling getType() on it). However, that could also be a heavier lift due to the number of call sites.

I think this probably hard to do. If I understand correctly, TypeSourceInfo * is available for declarations. But majority of getOrCreateType() does not have TypeSourceInfo *. For example for "int *p", "int *" does have a corresponding TypeSourceInfo *, but "int **" or "int *" does not.

Unless you mean we can get QualType from TypeLoc which is actually true. But then the problem is currently I only handle limited cases, certainly all C++ template/Class etc. are not covered, so we cannot really remove QualType as in many cases TypeLoc.isNull() is true.

aaron.ballman accepted this revision.Nov 4 2021, 5:53 AM

That sounds reasonable to me, but one possibility would be to change createType() and getOrCreateType() to take a TypeSourceInfo * rather than a QualType (because you can go from the TypeSourceInfo * back to the QualType by calling getType() on it). However, that could also be a heavier lift due to the number of call sites.

I think this probably hard to do. If I understand correctly, TypeSourceInfo * is available for declarations. But majority of getOrCreateType() does not have TypeSourceInfo *. For example for "int *p", "int *" does have a corresponding TypeSourceInfo *, but "int **" or "int *" does not. Unless you mean we can get QualType from TypeLoc which is actually true. But then the problem is currently I only handle limited cases, certainly all C++ template/Class etc. are not covered, so we cannot really remove QualType as in many cases TypeLoc.isNull() is true.

Yeah, that's why I was thinking this was too heavy of a lift. So this LGTM!

This revision is now accepted and ready to land.Nov 4 2021, 5:53 AM
dblaikie accepted this revision.Nov 4 2021, 9:57 AM

Seems unfortunate that attributes on types are only available through TypeLocs rather than through sugar (like if we used a typedef it'd be visible in the type sugar, but not if it's written on the type usage itself) - but that's above my domain knowledge here & I'll take it there are some good reasons for that to be the way it is.

clang/lib/CodeGen/CGDebugInfo.cpp
1380

I'm /guessing/ this can be rewritten as:

if (TL)

? (similarly elsewhere in this patch)

1390–1391

Is this null check necessary, or does "getAs" return null if the underlying value is null already anyway? (oh, and this would apply above as well)

yonghong-song added inline comments.Nov 4 2021, 12:14 PM
clang/lib/CodeGen/CGDebugInfo.cpp
1380

Yes. Indeed. Thanks for the suggestion! Will make changes in the next revision.

1390–1391

Yes. null check is necessary. Otherwise, we will get an assertion error like

clang: /home/yhs/work/llvm-project/clang/include/clang/AST/Type.h:689: const clang::ExtQualsTypeCommonBase* clang::QualType::getCommonPtr() const: Assertion `!isNull() && "Cannot retrieve 
a NULL type pointer"' failed.                                                                                                                                                               
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.                                                       
Stack dump:                                                                                                                                                                                 
0.      Program arguments: clang -g -D__TARGET_ARCH_x86 -mlittle-endian -I/home/yhs/work/bpf-next/tools/testing/selftests/bpf/tools/include -I/home/yhs/work/bpf-next/tools/testing/selftest
s/bpf -I/home/yhs/work/bpf-next/tools/include/uapi -I/home/yhs/work/bpf-next/tools/testing/selftests/usr/include -idirafter /home/yhs/work/llvm-project/llvm/build.cur/install/lib/clang/14.
0.0/include -idirafter /usr/local/include -idirafter /usr/include -Wno-compare-distinct-pointer-types -DENABLE_ATOMICS_TESTS -O2 -target bpf -c progs/btf_dump_test_case_multidim.c -o /home
/yhs/work/bpf-next/tools/testing/selftests/bpf/btf_dump_test_case_multidim.o -mcpu=v3                                                                                                       
1.      <eof> parser at end of file                                                                                                                                                         
2.      progs/btf_dump_test_case_multidim.c:32:5: LLVM IR generation of declaration 'f'                                                                                                     
3.      progs/btf_dump_test_case_multidim.c:32:5: Generating code for declaration 'f'                                                                                                       
 #0 0x0000000001e9e03f PrintStackTraceSignalHandler(void*) Signals.cpp:0:0                                                                                                                  
 #1 0x0000000001e9be7c llvm::sys::CleanupOnSignal(unsigned long) (/home/yhs/work/llvm-project/llvm/build.cur/install/bin/clang-14+0x1e9be7c)                                                
 #2 0x0000000001dea448 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0                                                                                                         
 #3 0x00007f7b08be4c20 __restore_rt sigaction.c:0:0                                                                                                                                         
 #4 0x00007f7b078dc37f raise (/lib64/libc.so.6+0x3737f)                                                                                                                                     
 #5 0x00007f7b078c6db5 abort (/lib64/libc.so.6+0x21db5)                                                                                                                                     
 #6 0x00007f7b078c6c89 _nl_load_domain.cold.0 loadmsgcat.c:0:0                                                                                                                              
 #7 0x00007f7b078d4a76 .annobin___GI___assert_fail.end assert.c:0:0                                                                                                                         
 #8 0x0000000002193ba3 (/home/yhs/work/llvm-project/llvm/build.cur/install/bin/clang-14+0x2193ba3)                                                                                          
 #9 0x00000000021b4981 clang::CodeGen::CGDebugInfo::CreateType(clang::FunctionType const*, llvm::DIFile*, clang::TypeLoc) (/home/yhs/work/llvm-project/llvm/build.cur/install/bin/clang-14+0
x21b4981)                                                                                                                                                                                   
#10 0x00000000021b11d8 clang::CodeGen::CGDebugInfo::CreateTypeNode(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/yhs/work/llvm-project/llvm/build.cur/install/bin/clang-14+0x21b11d
8)                                                                                                                                                                                          
#11 0x00000000021b1416 clang::CodeGen::CGDebugInfo::getOrCreateType(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/yhs/work/llvm-project/llvm/build.cur/install/bin/clang-14+0x21b14
16)                                                                                                                                                                                         
#12 0x00000000021b340d clang::CodeGen::CGDebugInfo::CreatePointerLikeType(llvm::dwarf::Tag, clang::Type const*, clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/yhs/work/llvm-project
/llvm/build.cur/install/bin/clang-14+0x21b340d)                                                                                                                                             
#13 0x00000000021b3b01 clang::CodeGen::CGDebugInfo::CreateType(clang::PointerType const*, llvm::DIFile*, clang::TypeLoc) (/home/yhs/work/llvm-project/llvm/build.cur/install/bin/clang-14+0x
21b3b01)
...
  • Change TypeLoc.isNull() check to !TypeLoc
This revision was landed with ongoing or failed builds.Nov 4 2021, 2:03 PM
This revision was automatically updated to reflect the committed changes.

I went ahead and reverted this, as it caused crashes when compiling a number of projects. The most reduced testcase is this:

$ cat reduced.c 
void a(*);
void a() {}
$ clang -c reduced.c -O2 -g

A full case (which reduces into this) is this, https://martin.st/temp/iconv-preproc.c, built with clang -target i686-w64-mingw32 -w -c -O2 -g iconv-preproc.c.

I bisected a crash in the Linux kernel down to the reland commit (and it is not fixed as of https://github.com/llvm/llvm-project/commit/2249ecee8d9a6237f51485bd39f01ba031575987):

$ git bisect log
# bad: [bdaa181007a46d317a1665acb8234eddeee82815] [TwoAddressInstructionPass] Update existing physreg live intervals
# good: [d4b1cf8f9c48a49ab808df15c4ab378276a07b82] [OpenMP] Build device runtimes for sm_86
git bisect start 'bdaa181007a46d317a1665acb8234eddeee82815' 'd4b1cf8f9c48a49ab808df15c4ab378276a07b82'
# good: [f2703c3c3353031de8de8c465a59d31488b11fb3] [DAG] FoldConstantArithmetic - rename NumOps -> NumElts. NFC.
git bisect good f2703c3c3353031de8de8c465a59d31488b11fb3
# good: [c68183b81e5257186c9403cf91f8b958af7459bc] [gn build] Use `=` for of -fdebug-compilation-dir
git bisect good c68183b81e5257186c9403cf91f8b958af7459bc
# bad: [2d8ec3c61d3c2d47b303187bb882ca23544f6fc5] [libcxx] [test] Narrow down XFAILs regarding a MSVC mode specific bug to "windows-dll && msvc"
git bisect bad 2d8ec3c61d3c2d47b303187bb882ca23544f6fc5
# good: [97c899f3c5d9bbff2824b3252b21378bf96f3f3f] [mlir] Add callback to provide a pass pipeline to MlirOptMain
git bisect good 97c899f3c5d9bbff2824b3252b21378bf96f3f3f
# bad: [4070f305f9a0c488d7177754d77c0b367e8695bf] [mlir][DialectConversion] Legalize all live argument conversions
git bisect bad 4070f305f9a0c488d7177754d77c0b367e8695bf
# bad: [3466e00716e12e32fdb100e3fcfca5c2b3e8d784] Reland "[Attr] support btf_type_tag attribute"
git bisect bad 3466e00716e12e32fdb100e3fcfca5c2b3e8d784
# good: [f64580f8d2ce5e1161857f9c89c2eee7a74c5ab8] [AArch64][GISel] Optimize 8 and 16 bit variants of uaddo.
git bisect good f64580f8d2ce5e1161857f9c89c2eee7a74c5ab8
# first bad commit: [3466e00716e12e32fdb100e3fcfca5c2b3e8d784] Reland "[Attr] support btf_type_tag attribute"

cvise spits out:

$ cat efi.i
typedef unsigned long efi_query_variable_info_t(int);
typedef struct {
  struct {
    efi_query_variable_info_t __attribute__((regparm(0))) * query_variable_info;
  };
} efi_runtime_services_t;
efi_runtime_services_t efi_0;

$ clang -m32 -O2 -c -o /dev/null efi.i

$ clang -m32 -O2 -g -c -o /dev/null efi.i
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang -m32 -O2 -g -c -o /dev/null efi.i
1.      <eof> parser at end of file
 #0 0x0000000002b31d13 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2b31d13)
 #1 0x0000000002b2fb3e llvm::sys::RunSignalHandlers() (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2b2fb3e)
 #2 0x0000000002ab8d13 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) CrashRecoveryContext.cpp:0:0
 #3 0x0000000002ab8e8e CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #4 0x00007fa9dd533870 __restore_rt sigaction.c:0:0
 #5 0x0000000002eaa00e clang::CodeGen::CGDebugInfo::CreateType(clang::FunctionType const*, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eaa00e)
 #6 0x0000000002ea4598 clang::CodeGen::CGDebugInfo::getOrCreateType(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2ea4598)
 #7 0x0000000002ea7cc4 clang::CodeGen::CGDebugInfo::CreatePointerLikeType(llvm::dwarf::Tag, clang::Type const*, clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2ea7cc4)
 #8 0x0000000002eb6558 clang::CodeGen::CGDebugInfo::CreateTypeNode(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eb6558)
 #9 0x0000000002ea4598 clang::CodeGen::CGDebugInfo::getOrCreateType(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2ea4598)
#10 0x0000000002eaa610 clang::CodeGen::CGDebugInfo::createFieldType(llvm::StringRef, clang::QualType, clang::SourceLocation, clang::AccessSpecifier, unsigned long, unsigned int, llvm::DIFile*, llvm::DIScope*, clang::RecordDecl const*, llvm::MDTupleTypedArrayWrapper<llvm::DINode>, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eaa610)
#11 0x0000000002eab3c0 clang::CodeGen::CGDebugInfo::CollectRecordNormalField(clang::FieldDecl const*, unsigned long, llvm::DIFile*, llvm::SmallVectorImpl<llvm::Metadata*>&, llvm::DIType*, clang::RecordDecl const*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eab3c0)
#12 0x0000000002eab90c clang::CodeGen::CGDebugInfo::CollectRecordFields(clang::RecordDecl const*, llvm::DIFile*, llvm::SmallVectorImpl<llvm::Metadata*>&, llvm::DICompositeType*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eab90c)
#13 0x0000000002eb111a clang::CodeGen::CGDebugInfo::CreateTypeDefinition(clang::RecordType const*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eb111a)
#14 0x0000000002eb65bd clang::CodeGen::CGDebugInfo::CreateTypeNode(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eb65bd)
#15 0x0000000002ea4598 clang::CodeGen::CGDebugInfo::getOrCreateType(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2ea4598)
#16 0x0000000002eaa610 clang::CodeGen::CGDebugInfo::createFieldType(llvm::StringRef, clang::QualType, clang::SourceLocation, clang::AccessSpecifier, unsigned long, unsigned int, llvm::DIFile*, llvm::DIScope*, clang::RecordDecl const*, llvm::MDTupleTypedArrayWrapper<llvm::DINode>, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eaa610)
#17 0x0000000002eab3c0 clang::CodeGen::CGDebugInfo::CollectRecordNormalField(clang::FieldDecl const*, unsigned long, llvm::DIFile*, llvm::SmallVectorImpl<llvm::Metadata*>&, llvm::DIType*, clang::RecordDecl const*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eab3c0)
#18 0x0000000002eab90c clang::CodeGen::CGDebugInfo::CollectRecordFields(clang::RecordDecl const*, llvm::DIFile*, llvm::SmallVectorImpl<llvm::Metadata*>&, llvm::DICompositeType*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eab90c)
#19 0x0000000002eb111a clang::CodeGen::CGDebugInfo::CreateTypeDefinition(clang::RecordType const*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eb111a)
#20 0x0000000002eb65bd clang::CodeGen::CGDebugInfo::CreateTypeNode(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eb65bd)
#21 0x0000000002ea4598 clang::CodeGen::CGDebugInfo::getOrCreateType(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2ea4598)
#22 0x0000000002ea9864 clang::CodeGen::CGDebugInfo::CreateType(clang::TypedefType const*, llvm::DIFile*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2ea9864)
#23 0x0000000002eb65f1 clang::CodeGen::CGDebugInfo::CreateTypeNode(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2eb65f1)
#24 0x0000000002ea4598 clang::CodeGen::CGDebugInfo::getOrCreateType(clang::QualType, llvm::DIFile*, clang::TypeLoc) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2ea4598)
#25 0x0000000002ebfe3a clang::CodeGen::CGDebugInfo::EmitGlobalVariable(llvm::GlobalVariable*, clang::VarDecl const*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2ebfe3a)
#26 0x0000000002dea4cc clang::CodeGen::CodeGenModule::EmitGlobalVarDefinition(clang::VarDecl const*, bool) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2dea4cc)
#27 0x0000000002dec4a0 clang::CodeGen::CodeGenModule::EmitTentativeDefinition(clang::VarDecl const*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2dec4a0)
#28 0x0000000003f08882 clang::Sema::ActOnEndOfTranslationUnit() (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x3f08882)
#29 0x0000000003df2f8b clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, bool) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x3df2f8b)
#30 0x0000000003ded45d clang::ParseAST(clang::Sema&, bool, bool) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x3ded45d)
#31 0x00000000035f4d50 clang::FrontendAction::Execute() (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x35f4d50)
#32 0x000000000356c7ff clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x356c7ff)
#33 0x00000000036a1a23 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x36a1a23)
#34 0x000000000183866b cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x183866b)
#35 0x00000000018361cd ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#36 0x0000000003402952 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const::$_1>(long) Job.cpp:0:0
#37 0x0000000002ab8c27 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x2ab8c27)
#38 0x00000000034024b7 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x34024b7)
#39 0x00000000033ca628 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x33ca628)
#40 0x00000000033ca8f7 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) const (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x33ca8f7)
#41 0x00000000033e2ef1 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x33e2ef1)
#42 0x0000000001835ab1 main (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x1835ab1)
#43 0x00007fa9dcfd8b25 __libc_start_main (/usr/lib/libc.so.6+0x27b25)
#44 0x0000000001832ffe _start (/home/nathan/cbl/src/llvm-project/build/stage1/bin/clang-14+0x1832ffe)
clang-14: error: clang frontend command failed with exit code 139 (use -v to see invocation)
ClangBuiltLinux clang version 14.0.0 (https://github.com/llvm/llvm-project 3466e00716e12e32fdb100e3fcfca5c2b3e8d784)
Target: i386-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/nathan/cbl/src/llvm-project/build/stage1/bin
clang-14: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.

The sema portions of this change are still causing an issue. Although the revert (rGf95bd18b5faa6a5af4b5786312c373c5b2dce687) and the subsequent re-application in rG3466e00716e12e32fdb100e3fcfca5c2b3e8d784 addressed the debug info issue, the Sema issue remains.

// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -std=c11 -x c %s
struct dispatch_object_s;
void _dispatch_queue_get_head(struct dispatch_object_s * volatile dq_items_head) {
  (_Atomic __typeof__(dq_items_head) *)dq_items_head;
}

will trigger an assertion failure.

Herald added a project: Restricted Project. · View Herald TranscriptMay 20 2022, 11:53 AM

@compnerd Thanks for reporting. I will take a look as soon as possible.