This is an archive of the discontinued LLVM Phabricator instance.

[Attr] Fix a btf_type_tag AST generation bug
ClosedPublic

Authored by yonghong-song on Feb 21 2022, 7:18 PM.

Details

Summary

Current ASTContext.getAttributedType() takes attribute kind,
ModifiedType and EquivType as the hash to decide whether an AST node
has been generated or note. But this is not enough for btf_type_tag
as the attribute might have the same ModifiedType and EquivType, but
still have different string associated with attribute.

For example, for a data structure like below,

struct map_value {
      int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *a; 
      int __attribute__((btf_type_tag("tag2"))) __attribute__((btf_type_tag("tag4"))) *b; 
};

The current ASTContext.getAttributedType() will produce
an AST similar to below:

struct map_value {
      int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *a; 
      int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) *b; 
};

and this is incorrect.

It is very difficult to use the current AttributedType as it is hard to
get the tag information. To fix the problem, this patch introduced
BTFTagAttributedType which is similar to AttributedType
in many ways but with an additional BTFTypeTagAttr. The tag itself can
be retrieved with BTFTypeTagAttr.
With the new BTFTagAttributed type, the debuginfo code can be greatly
simplified compared to previous TypeLoc based approach.

Diff Detail

Event Timeline

yonghong-song requested review of this revision.Feb 21 2022, 7:18 PM
yonghong-song created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 21 2022, 7:18 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

fix clang-format issues.

@aaron.ballman ping, did you get time to look at this patch?

@aaron.ballman ping, did you get time to look at this patch?

Sorry, I was in C meetings all last week, still digging out from under the emails that piled up.

I think the approach taken here is likely incorrect. Not all AttributedType objects need this "extra info" as a string, and there's no reason to believe future attributes will want their extra info to be in the form of a string. However, there's also a fundamental flaw -- the AttributedType isn't tracking that extra information, so it cannot be profiled in all circumstances (I left a comment in the review for that).

As best I can tell, you want a different attribute argument to result in a different type in the type system. That needs to happen by adding a new type to the type system instead of using the generic AttributedType type. This new type can track the extra information needed for uniquing the type within the type system. As an example, VectorType is formed via an attribute, but it needs additional information about what kind of vector it is. I don't think you can do what you're trying in this patch because the AttributedType tracks the attribute *kind* but not the actual semantic attribute in use. You need an AttributedTypeLoc to get access to that via the type system, and you don't always have access to type locations.

clang/include/clang/AST/Type.h
4777

This form of type profiling was not updated (and can't be, because there's nothing tracking what extra info is associated with the type), which further suggests we need a new type in the type system.

@aaron.ballman Thanks for suggestion. Agree that Having a new Type is a better idea. I guess, I will add BTFTagAttributeType which extends AttributeType to see whether it works or not.

@aaron.ballman Thanks for suggestion. Agree that Having a new Type is a better idea. I guess, I will add BTFTagAttributeType which extends AttributeType to see whether it works or not.

It almost definitely does NOT want to extend AttributedType, but be its own type. VectorType/BitIntType are both somewhat good examples to follow (besides the 'dependent' version, since this is C only). That said, make sure you implement ExprConstant.cpp's visitors for your type so that you get these types to 'work' like normal types.

@erichkeane Thanks for suggestion. I will add as an independent type then.

yonghong-song edited the summary of this revision. (Show Details)
  • Create a separate type BTFTagAttributed instead of using Attributed type based on previous discussion
Herald added a project: Restricted Project. · View Herald TranscriptMar 5 2022, 9:09 AM

@aaron.ballman @erichkeane Just updated the patch with a new implementation, which added a new BTFTagAttributedType instead of reusing existing AttributedType. It is quite complicated and various parts of clang are touched. I mostly just followed what AttributedType did but considering BTFTagAttributedType is C only so my current implementation may unnecessarily complicate things. I didn't have much test coverage either. It would be great you can give a look and let me know (1). what I am missing, (2). where I can simplify things, (3) a different approach if you see fit, (4). what additional tests I should have. Thanks!

I only looked at the 'new type' code, and it generally looks correct with the exception of the comments below.

clang/include/clang/AST/Type.h
4796

I suspect both of these aren't necessary. If the intent here is to just wrap a single type, I think we can have only 1. Then the fields here are just the Tag, and the WrappedType.

clang/include/clang/Serialization/ASTRecordReader.h
131

This is a bug. readString returns a std::string temporary, which creating the StringRef to, it now no longer exists.

clang/lib/Sema/SemaType.cpp
194

This seems pretty absurdly expensive here... Should we instead have the BTFTagAttributedType store its Attr here?

clang/lib/Sema/TreeTransform.h
6872

Most of this tree-transform doesn't really do much, since this is a 'C' only type, but otherwise we'd probably want to allow the tag itself to be dependent.

We still need this though, since there are other non-template tree-transforms.

You also might need to make this not contain a StringRef based on the serialization issues above.

yonghong-song added inline comments.Mar 7 2022, 3:43 PM
clang/include/clang/AST/Type.h
4796

I didn't find WrappedType in clang code base. Maybe I missed something?
But I understand what you mean. Here, unlike AttributedType which needs to deal with attributes which may encode as qualifiers, BTFTagAttributedType won't have this issue so one 'WrappedType' (representing underlying type of the btf_tag attribute) should be enough.

clang/include/clang/Serialization/ASTRecordReader.h
131

Will fix.

clang/lib/Sema/SemaType.cpp
194

Indeed this is a good idea. This can simplify the code a lot.

clang/lib/Sema/TreeTransform.h
6872

I will try to do some experiment and simplify this. Indeed this is C and non templates are involved.

@erichkeane Thanks for the review. I will try to address your comments as soon as possible.

  • address @erichkeane comments
  • further TODO: add more tests

@erichkeane Just posted a patch to address your comments. Could you take a look? I will try to add more tests, esp. serialization and libclang. If you have any suggestions in these two areas (as I am not familiar with them), it would be great!

  • add a pch test to test serialization with btf_type_tag.
  • simplify the code
yonghong-song edited the summary of this revision. (Show Details)Mar 8 2022, 7:38 PM

1 comment, otherwise LGTM. I'd like @aaron.ballman to take a look once you've got the attribute transforming as well though.

clang/lib/Sema/TreeTransform.h
6876

The attribute needs to be transformed as well. This should basically be a 'no op', but we need to do so in case we end up using 'treetransform' for something like a 'clone a node' type thing.

Thanks for this! One thing I noticed is that we seem to be lacking a lot of Sema tests for the changes now that this is a type attribute. Of particular interest to me would be cases using _Generic and __attribute__((overloadable)) to demonstrate what kind of type system effects there are. Same for redeclaration behavior or other type system situations.

clang/include/clang/AST/ASTContext.h
1596
clang/include/clang/AST/Type.h
4799

You should fix the formatting issue, but also, the parameter names don't match the usual naming conventions (please don't replace attr with Attr though; that's a type name, so go with something else like BTFAttr).

4806

Should this be returning a const BTFTypeTagAttr * instead?

4811

You should fix the formatting issue (here and everywhere in the file, I'll stop commenting on them).

clang/include/clang/AST/TypeLoc.h
904–917

This seems a bit suspicious to me -- we are store the same attribute twice (once on the BTFTagAttributedType and once on its type loc). Do we need to do that, or can we have the type loc reach through the type pointer to get the attribute from there? (Alternatively, do we want to require callers to have access to an BTFTagAttributedTypeLoc in order to query the attribute, similar to how AttributedType/AttributedTypeLoc work?)

925

loc is unused? (and should have its identifier removed if this is on purpose)

clang/include/clang/Serialization/ASTRecordWriter.h
126–127

This doesn't appear to be used? Or is this called automagically because of the changes in TypeProperties.td?

clang/lib/AST/ASTContext.cpp
4690–4691

Parameter names don't match coding style (same advice as above).

4695

Locals should also make the coding style.

clang/lib/CodeGen/CGDebugInfo.cpp
1144
clang/lib/Sema/TreeTransform.h
6872

We still need this though, since there are other non-template tree-transforms.

Are we sure any of them can be hit for this new type? It'd be nice to keep this out of the template instantiation bits if possible.

@aaron.ballman Thanks for the review! I will address all your comments and will also add tests with _Generic and __attribute__((overloadable)).

clang/include/clang/AST/Type.h
4799

Will do. Sorry, I didn't spend effort on formatting itself as I know I will need to make revisions. But indeed should do it regardless!

4806

Yes, will use 'const BTFTypeTagAttr *` in the next revision.

clang/include/clang/AST/TypeLoc.h
904–917

Good point. Will change

struct BTFTagAttributedLocInfo {
  const Attr *TypeAttr;
};

to

struct BTFTagAttributedLocInfo {
};

since we can get the attribute from the BTFTagAttributedType itself.

925

loc is needed in the parameter for this function. For example, in QualifiedTypeLoc, we have

/// Initializes the local data of this type source info block to
/// provide no information.
void initializeLocal(ASTContext &Context, SourceLocation Loc) {
  // do nothing
}
clang/include/clang/Serialization/ASTRecordWriter.h
126–127

It is used because the changes in TypeProperties.td.

  • clang-format fix
  • change variable names to be llvm compliant.
  • remove Attr from BTFTagAttributedTypeLoc as it can be retrieved from the BTFTagAttributedType
  • Also transform the attribute in TreeTransform
  • add additional test with _Generic and overloadable attribute
  • further clang-format fixes.
  • use BTFTypeTagAttr instead of Attr in PropertiesBase.td

@aaron.ballman @erichkeane Hopefully I addressed your comments. Could you take a look again?

aaron.ballman added inline comments.Mar 11 2022, 9:35 AM
clang/include/clang/AST/TypeLoc.h
923
clang/lib/CodeGen/CGDebugInfo.cpp
1144

This one may have been missed.

clang/test/Sema/attr-btf_type_tag.c
31–37

This test looks confused -- those functions have different names, so they're not overloading one another. Even if they were named with the same identifier, it wouldn't be sufficient to test which gets called (you can redeclare overloads as in https://godbolt.org/z/aebMbY1aM).

I think this case needs a codegen test to see which function is actually dispatched to.

yonghong-song added inline comments.Mar 11 2022, 5:50 PM
clang/include/clang/AST/TypeLoc.h
923

I didn't see getLocalDataSize() in AttributedTypeLoc so it is not needed for AttributedTypeLoc. BTFTagAttributedTypeLoc usage is very similar to AttributedTypeLoc, so I think we are fine here.

clang/lib/CodeGen/CGDebugInfo.cpp
1144

You mean:

auto *BTFTagAttrTy = dyn_cast<BTFTagAttributedType>(PointeeTy);

I missed it. Will fix in the next revision.

clang/test/Sema/attr-btf_type_tag.c
31–37

sorry for confusion. the test is bad. For the example in the above godbolt.org link, we have

__attribute__((overloadable)) void func(int a);
__attribute__((overloadable)) void func(int a);

Such multiple declarations are actually okay since we allow duplicated declarations. As you mentioned, we really want to know which function definition it picks with overloadable attribute.
I tried a few examples:

$ cat test.c
#if 1
#define __attr __attribute__((btf_type_tag("tag1")))
#else
#define __attr __attribute__((noderef))
#endif

void bar1(int __attr *);
void bar2(int *);
void isdigit1(int __attr *a) __attribute__((overloadable)) {
  bar1(a);
}
void isdigit1(int *a) __attribute__((overloadable)) {
  bar2(a);
}

void foo(int __attr *a, int *b) {
  isdigit1(a);
  isdigit1(b);
}

$ clang -g -std=c2x test.c -S -O2 -emit-llvm -DUSE_TAG
test.c:12:6: error: redefinition of 'isdigit1'
void isdigit1(int *a) __attribute__((overloadable)) {
     ^
test.c:9:6: note: previous definition is here
void isdigit1(int __attr *a) __attribute__((overloadable)) {
     ^
1 error generated.
$ clang -g -std=c2x test.c -S -O2 -emit-llvm
test.c:12:6: error: redefinition of 'isdigit1'
void isdigit1(int *a) __attribute__((overloadable)) {
     ^
test.c:9:6: note: previous definition is here
void isdigit1(int __attr *a) __attribute__((overloadable)) {
     ^
1 error generated.
$

so just attribute itself cannot differentiate overloadable functions. In the above test.c, if I changed one of isdigle1 and corresponding bar?() function from 'int' to 'float', compilation can succeed.

So I will add overloadable failure and success case in Sema/ directory. I am not sure whether a codegen case is needed for the case like

void bar1(float __attr *);
void bar2(int *);
void isdigit1(float __attr *a) __attribute__((overloadable)) {
  bar1(a);
}
void isdigit1(int *a) __attribute__((overloadable)) {
  bar2(a);
}

or not. But if you think it is necessary, I am happy to add it.

  • fix one 'auto' declaration issue
  • rework on the test with overloadable attribute.
aaron.ballman added inline comments.Mar 14 2022, 11:55 AM
clang/include/clang/AST/TypeLoc.h
923

The main difference is that AttributedLocInfo has a member and BTFTagAttributedLocInfo is empty. This is why I think it's closer to an AdjustedLocInfo object which also is an empty struct.

clang/test/Sema/attr-btf_type_tag.c
29

I'd like a third test case:

int g3 = _Generic(0, int __tag1 * : 0, int * : 0, default : 0); // expected-error {{something about duplicate compatible associations}}
31–37

Okay, I am fine with that behavior. CodeGen tests are only necessary if the overloads can actually be defined -- since they can't, I think the Sema tests are sufficient to show that these are redeclarations and not independent overloads.

yonghong-song added inline comments.Mar 14 2022, 11:32 PM
clang/include/clang/AST/TypeLoc.h
923

I think adding getLocalDataSize() return 0 should be OK. But it looks like we don't need to do it, at least for BTFTagAttributedTypeLoc. The following are some details.

The parent class of BTFTagAttributedTypeLoc is ConcreteTypeLoc, which contains:

unsigned getLocalDataSize() const {
  unsigned size = sizeof(LocalData);
  unsigned extraAlign = asDerived()->getExtraLocalDataAlignment();
  size = llvm::alignTo(size, extraAlign);
  size += asDerived()->getExtraLocalDataSize();
  return size;
}

unsigned getExtraLocalDataSize() const {
  return 0;
}

unsigned getExtraLocalDataAlignment() const {
  return 1;
}

Manually calculating getLocalDataSize() can get the same return value 0. So I think it is okay not to define getLocalDataSize in BTFTagAttributedTypeLoc.

The AdjustedLocInfo seems having some inconsistency, at least based on comments:

struct AdjustedLocInfo {}; // Nothing.

unsigned getLocalDataSize() const {
  // sizeof(AdjustedLocInfo) is 1, but we don't need its address to be unique
  // anyway.  TypeLocBuilder can't handle data sizes of 1.
  return 0;  // No data.
}

Maybe previously AdjustedLocInfo size is 1 and the implementation of getLocalDataSize() to set size to 0 explicitly in which case, the function is needed inside AdjustedTypeLoc. It might be needed now since AdjustedLocInfo size is 0.

clang/test/Sema/attr-btf_type_tag.c
29

Will add. I will break the above test into multiple lines. Otherwise, there will be multiple errors in the same line.

31–37

Thanks for confirmation!

yonghong-song added inline comments.Mar 14 2022, 11:37 PM
clang/include/clang/AST/TypeLoc.h
923

For

Maybe previously AdjustedLocInfo size is 1 and the implementation of getLocalDataSize() to set size to 0 explicitly in which case, the function is needed inside AdjustedTypeLoc. It might be needed now since AdjustedLocInfo size is 0.

I mean "it might not be needed now since AdjustedLocInfo size is 0".

  • add one more _Generic test.

I think we're pretty close; it looks like there's still one unresolved conversation about template instantiation needs and whether we can remove that code or not.

clang/include/clang/AST/TypeLoc.h
923

Ahh, I see why we get away with it then. Thank you!

clang/lib/Sema/TreeTransform.h
6872

I think this may be the only unresolved conversation in the review.

yonghong-song added inline comments.Mar 15 2022, 9:54 AM
clang/lib/Sema/TreeTransform.h
6872

Are we sure any of them can be hit for this new type? It'd be nice to keep this out of the
template instantiation bits if possible.

Actually I am not sure. But looks like we have to have the function implemented due to
'automatic code generation'. Let me take a look. If we don't need it since the attribute for C
only. The implementation can be just a unreachable failure or somehow I will see whether I can
tweak the auto code generation to avoid instantiating this function.

aaron.ballman added inline comments.Mar 15 2022, 10:00 AM
clang/lib/Sema/TreeTransform.h
6872

If we need it, we need it. But the more we can remove (like replacing the implementation with an unreachable), the better, so thank you for looking into it!

  • change TransformBTFTagAttributedType() implementation with a simple llvm_unreachable() message. I tested linux kernel and its selftests/bpf compilation with clang and found the only TreeTransformation function is called with TransformAutoType() due to __auto_type usage.
This revision is now accepted and ready to land.Mar 16 2022, 6:40 AM
This revision was automatically updated to reflect the committed changes.

This change ends up leaving some unhandled enum with switches inside lldb, and it isn't obvious to me how to fix them. Can you take a quick look?

llvm-project/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4098:11: error: enumeration value 'BTFTagAttributed' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
          ^~~~~~~~~~~~~~~~~~~~~~~~~
llvm-project/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4757:11: error: enumeration value 'BTFTagAttributed' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
          ^~~~~~~~~~~~~~~~~~~~~~~~~
llvm-project/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:5140:11: error: enumeration value 'BTFTagAttributed' not handled in switch [-Werror,-Wswitch]
  switch (qual_type->getTypeClass()) {
          ^~~~~~~~~~~~~~~~~~~~~~~~~
3 errors generated.