This is an archive of the discontinued LLVM Phabricator instance.

[codeview] Added support for bitfield type
AbandonedPublic

Authored by aaboud on Jun 18 2016, 5:45 AM.

Details

Reviewers
majnemer
rnk
Summary

Added support for bitfield type for record members.

Diff Detail

Event Timeline

aaboud updated this revision to Diff 61164.Jun 18 2016, 5:45 AM
aaboud retitled this revision from to [codeview] Added support for bitfield type.
aaboud updated this object.
aaboud added reviewers: rnk, majnemer.
aaboud added subscribers: llvm-commits, bwyma.
majnemer added inline comments.Jun 18 2016, 10:44 AM
lib/CodeGen/AsmPrinter/CodeViewDebug.cpp
1549–1562

Inferring the properties of the bitfield is problematic, consider:

#pragma pack(1)
struct S {
  char x;
  int y : 2;
  int z : 2;
};
S s;

You will give back:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 2
  BitOffset: 8
}
BitField (0x1002) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 2
  BitOffset: 10
}
UnknownLeaf (0x1003) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: char (0x70)
    FieldOffset: 0x0
    Name: x
  }
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x0
    Name: y
  }
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1002
    FieldOffset: 0x0
    Name: z
  }
}

I would think the correct output to be:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 2
  BitOffset: 0
}
BitField (0x1002) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 2
  BitOffset: 2
}
UnknownLeaf (0x1003) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: char (0x70)
    FieldOffset: 0x0
    Name: x
  }
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x4
    Name: y
  }
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1002
    FieldOffset: 0x4
    Name: z
  }
}

Perhaps we need to encode the storage offset for the bitfield in the metadata?

aaboud updated this revision to Diff 61249.Jun 20 2016, 6:11 AM

Improved solution and LIT test, thanks to the example David provided.

majnemer edited edge metadata.Jun 20 2016, 9:27 AM

This approach still seems a little problematic, consider:

#pragma pack(1)
struct S {
  char  : 0;
  int y : 1;
};
S s;

Your code will produce:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 1
  BitOffset: 8
}
UnknownLeaf (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x0
    Name: y
  }
}

However, I think it should produce:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 1
  BitOffset: 0
}
UnknownLeaf (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x1
    Name: y
  }
}

I'm pretty sure that the only real way to get bitfields right is by encoding the storage offset in the DI metadata.

This approach still seems a little problematic, consider:

#pragma pack(1)
struct S {
  char  : 0;
  int y : 1;
};
S s;

Your code will produce:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 1
  BitOffset: 8
}
UnknownLeaf (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x0
    Name: y
  }
}

However, I think it should produce:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 1
  BitOffset: 0
}
UnknownLeaf (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x1
    Name: y
  }
}

I'm pretty sure that the only real way to get bitfields right is by encoding the storage offset in the DI metadata.

I think my code will produce:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 1
  BitOffset: 0
}
UnknownLeaf (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x0
    Name: y
  }
}

Which is the same as MS cl.

This approach still seems a little problematic, consider:

#pragma pack(1)
struct S {
  char  : 0;
  int y : 1;
};
S s;

Your code will produce:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 1
  BitOffset: 8
}
UnknownLeaf (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x0
    Name: y
  }
}

However, I think it should produce:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 1
  BitOffset: 0
}
UnknownLeaf (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x1
    Name: y
  }
}

I'm pretty sure that the only real way to get bitfields right is by encoding the storage offset in the DI metadata.

I think my code will produce:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: int (0x74)
  BitSize: 1
  BitOffset: 0
}
UnknownLeaf (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x0
    Name: y
  }
}

Which is the same as MS cl.

Would you mind adding it as a test case? Also, can you please rebase your patch? I cannot seem to apply it to trunk.

aaboud updated this revision to Diff 61927.Jun 26 2016, 8:08 PM
aaboud edited edge metadata.

Add more cases to the LIT test.
Updated patch to top of trunk.

Consider this input:

#pragma pack(1)
struct S {
  char    : 8;
  short   : 8;
  short x : 8;
} s;

Your patch appears to give:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: short (0x11)
  BitSize: 8
  BitOffset: 0
}
FieldList (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x2
    Name: x
  }
}

MSVC gives:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: short (0x11)
  BitSize: 8
  BitOffset: 8
}
FieldList (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x1
    Name: x
  }
}

Consider this input:

#pragma pack(1)
struct S {
  char    : 8;
  short   : 8;
  short x : 8;
} s;

Your patch appears to give:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: short (0x11)
  BitSize: 8
  BitOffset: 0
}
FieldList (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x2
    Name: x
  }
}

MSVC gives:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: short (0x11)
  BitSize: 8
  BitOffset: 8
}
FieldList (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x1
    Name: x
  }
}

You are right, this case is not handled correctly yet.
However, this is because Clang (FE) does not provide the information, this is how LLVM IR looks like:

!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "S", file: !1, line: 2, size: 24, align: 8, elements: !6, identifier: ".?AUS@@")
!6 = !{!7}
!7 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !5, file: !1, line: 5, baseType: !8, size: 8, align: 16, offset: 16)
!8 = !DIBasicType(name: "short", size: 16, align: 16, encoding: DW_ATE_signed)

and this how it should look like to fix the problem, once we make Clang emit the following the above code will work:

!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "S", file: !1, line: 2, size: 24, align: 8, elements: !6, identifier: ".?AUS@@")
!6 = !{!7, !9, 11}
!7 = !DIDerivedType(tag: DW_TAG_member, scope: !5, file: !1, line: 3, baseType: !8, size: 8, align: 8, offset: 0)
!8 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed)
!9 = !DIDerivedType(tag: DW_TAG_member, scope: !5, file: !1, line: 4, baseType: !10, size: 8, align: 16, offset: 8)
!10 = !DIBasicType(name: "short", size: 16, align: 16, encoding: DW_ATE_signed)
!11 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !5, file: !1, line: 5, baseType: !11, size: 8, align: 16, offset: 16)

Then all we need to do, is not to emit members with no name, but to consider the size and alignment they consume.

Consider this input:

#pragma pack(1)
struct S {
  char    : 8;
  short   : 8;
  short x : 8;
} s;

Your patch appears to give:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: short (0x11)
  BitSize: 8
  BitOffset: 0
}
FieldList (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x2
    Name: x
  }
}

MSVC gives:

BitField (0x1001) {
  TypeLeafKind: LF_BITFIELD (0x1205)
  Type: short (0x11)
  BitSize: 8
  BitOffset: 8
}
FieldList (0x1002) {
  TypeLeafKind: LF_FIELDLIST (0x1203)
  DataMember {
    AccessSpecifier: Public (0x3)
    Type: 0x1001
    FieldOffset: 0x1
    Name: x
  }
}

You are right, this case is not handled correctly yet.
However, this is because Clang (FE) does not provide the information, this is how LLVM IR looks like:

!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "S", file: !1, line: 2, size: 24, align: 8, elements: !6, identifier: ".?AUS@@")
!6 = !{!7}
!7 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !5, file: !1, line: 5, baseType: !8, size: 8, align: 16, offset: 16)
!8 = !DIBasicType(name: "short", size: 16, align: 16, encoding: DW_ATE_signed)

and this how it should look like to fix the problem, once we make Clang emit the following the above code will work:

!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "S", file: !1, line: 2, size: 24, align: 8, elements: !6, identifier: ".?AUS@@")
!6 = !{!7, !9, 11}
!7 = !DIDerivedType(tag: DW_TAG_member, scope: !5, file: !1, line: 3, baseType: !8, size: 8, align: 8, offset: 0)
!8 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed)
!9 = !DIDerivedType(tag: DW_TAG_member, scope: !5, file: !1, line: 4, baseType: !10, size: 8, align: 16, offset: 8)
!10 = !DIBasicType(name: "short", size: 16, align: 16, encoding: DW_ATE_signed)
!11 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !5, file: !1, line: 5, baseType: !11, size: 8, align: 16, offset: 16)

Then all we need to do, is not to emit members with no name, but to consider the size and alignment they consume.

I don't think this is a good idea. It would be far cheaper to encode the offset of the underlying allocation in the extraData.

aaboud updated this revision to Diff 61998.Jun 27 2016, 11:59 AM

Added support of unnamed bitfield.

I don't think this is a good idea. It would be far cheaper to encode the offset of the underlying allocation in the extraData.

Please, review the last version of the patch and the patch for clang (D21766).
I think these patches complete the support for all bitfield cases.
However, if you have a better way to implement this support, I do not mind to abandon these patches.

Thanks.

aaboud abandoned this revision.Jul 5 2016, 12:41 AM

Different implementation was committed at rL274200.