Page MenuHomePhabricator

Fix the handling of unnamed bit-fields when parsing DWARF

Authored by shafik on Jan 17 2020, 1:08 PM.



We ran into an assert when debugging clang and performing an expression on a class derived from DeclContext. The assert was indicating we were getting the offsets wrong for RecordDeclBitfields. We were getting both the size and offset of unnamed bit-field members wrong. We could fix this case with a quick change but as I extended the test suite to include more combinations we kept finding more cases that were being handled incorrectly. A fix that handled all the new cases as well as the cases already covered required a refactor of the existing technique.

I removed a duplicate of BitfieldInfo and renamed BitfieldInfo -> FieldInfo since it is being used for both bit-fields and non-bit-fields. I extended to cover five more cases we uncovered while fixing this bug.

Diff Detail

Event Timeline

shafik created this revision.Jan 17 2020, 1:08 PM

Very cool. I think we can improve the code quite a bit while we're here.

155 ↗(On Diff #238861)

Why don't we inspect the values of fields, too?


AT_foo makes git grep less useful


Can you please add comments explaining what each of these cases handle?


Looks like this could benefit from a refactoring with early exits. Can be a follow-up commit though.


for example, it's not obvious what happened to end up in this else branch


Not you fault, but: This data structure is a disaster waiting to happen. Instead of having magic sentinel values, should we use an Optional<FieldInfo> that is always valid if it exists? Or should the members be Optionals?

(Just some quick comments, will review this properly during normal working hours)

Without this fix debugging Clang with LLDB is essentially impossible, so I'm in favour of landing this with as few pre-commit refactorings as possible to make backporting easier and getting this in ASAP. We probably also want to ping Hans to get this into the 10 release branch (which was created 2 days ago, so that's just a simple cherry-pick).

Also you might want to check out the recently added expect_expr command instead of calling expect (see the inline comment for an example). The API currently doesn't support the whole fuzzy substr matching (which is on purpose) so you might not be able to use it everywhere (at least the current version).

207 ↗(On Diff #238861)

We can do this since last week: self.expect_expr("(lba.a)", result_type="unsigned int", result_value="2") which is a much more safer and convenient way of doing this.

128 ↗(On Diff #238861)

I would prefer generic names as otherwise Clang folks randomly see the test when they grep for usages of these variables by name in the LLVM repo.

(Jim pointed out that we land this without expect_expr to make back porting easier but somehow Phabricator didn't add his comment to the review here. expect_expr is not yet in the downstream Github branch but it is in the 10 release branch, so that makes sense to me).

Anyway, some more points from my side:

  1. I actually thought this would fix the crashes in CGRecordLowering::lower() that we kept seeing, but I've been dogfooding this patch today and I'm still getting the crashes when dumping arbitrary clang Decls from LLDB:
3  libsystem_platform.dylib 0x0000000102881000 _sigtramp + 18446603342990035968
4  liblldb.11.0.0git.dylib  0x0000000109d8eb3b (anonymous namespace)::CGRecordLowering::lower(bool) + 14555
5  liblldb.11.0.0git.dylib  0x0000000109d89aa9 clang::CodeGen::CodeGenTypes::ComputeRecordLayout(clang::RecordDecl const*, llvm::StructType*) + 249
6  liblldb.11.0.0git.dylib  0x0000000109e948b1 clang::CodeGen::CodeGenTypes::ConvertRecordDeclType(clang::RecordDecl const*) + 785
7  liblldb.11.0.0git.dylib  0x0000000109e91800 clang::CodeGen::CodeGenTypes::ConvertType(clang::QualType) + 2048
8  liblldb.11.0.0git.dylib  0x0000000109d8f269 (anonymous namespace)::CGRecordLowering::getStorageType(clang::FieldDecl const*) + 41
9  liblldb.11.0.0git.dylib  0x0000000109d8c9f3 (anonymous namespace)::CGRecordLowering::lower(bool) + 6035
10 liblldb.11.0.0git.dylib  0x0000000109d89aa9 clang::CodeGen::CodeGenTypes::ComputeRecordLayout(clang::RecordDecl const*, llvm::StructType*) + 249
11 liblldb.11.0.0git.dylib  0x0000000109e948b1 clang::CodeGen::CodeGenTypes::ConvertRecordDeclType(clang::RecordDecl const*) + 785
12 liblldb.11.0.0git.dylib  0x0000000109e91313 clang::CodeGen::CodeGenTypes::ConvertType(clang::QualType) + 787
13 liblldb.11.0.0git.dylib  0x0000000109d8f269 (anonymous namespace)::CGRecordLowering::getStorageType(clang::FieldDecl const*) + 41
14 liblldb.11.0.0git.dylib  0x0000000109d8c9f3 (anonymous namespace)::CGRecordLowering::lower(bool) + 6035
15 liblldb.11.0.0git.dylib  0x0000000109d89aa9 clang::CodeGen::CodeGenTypes::ComputeRecordLayout(clang::RecordDecl const*, llvm::StructType*) + 249
16 liblldb.11.0.0git.dylib  0x0000000109e948b1 clang::CodeGen::CodeGenTypes::ConvertRecordDeclType(clang::RecordDecl const*) + 785
17 liblldb.11.0.0git.dylib  0x0000000109e91800 clang::CodeGen::CodeGenTypes::ConvertType(clang::QualType) + 2048
18 liblldb.11.0.0git.dylib  0x0000000109f0a22a (anonymous namespace)::X86_64ABIInfo::classifyArgumentType(clang::QualType, unsigned int, unsigned int&, unsigned int&, bool) const + 634
19 liblldb.11.0.0git.dylib  0x0000000109f07515 (anonymous namespace)::X86_64ABIInfo::computeInfo(clang::CodeGen::CGFunctionInfo&) const + 2117
20 liblldb.11.0.0git.dylib  0x0000000109b513f6 clang::CodeGen::CodeGenTypes::arrangeLLVMFunctionInfo(clang::CanQual<clang::Type>, bool, bool, llvm::ArrayRef<clang::CanQual<clang::Type> >, clang::FunctionType::ExtInfo, llvm::ArrayRef<clang::FunctionType::ExtParameterInfo>, clang::CodeGen::RequiredArgs) + 2822
21 liblldb.11.0.0git.dylib  0x0000000109b51aec arrangeLLVMFunctionInfo(clang::CodeGen::CodeGenTypes&, bool, llvm::SmallVectorImpl<clang::CanQual<clang::Type> >&, clang::CanQual<clang::FunctionProtoType>) + 700
22 liblldb.11.0.0git.dylib  0x0000000109b51ba9 clang::CodeGen::CodeGenTypes::arrangeCXXMethodType(clang::CXXRecordDecl const*, clang::FunctionProtoType const*, clang::CXXMethodDecl const*) + 153
23 liblldb.11.0.0git.dylib  0x0000000109b51e0d clang::CodeGen::CodeGenTypes::arrangeCXXMethodDeclaration(clang::CXXMethodDecl const*) + 525
24 liblldb.11.0.0git.dylib  0x0000000109e15831 clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) + 177
25 liblldb.11.0.0git.dylib  0x0000000109e0b48b clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) + 2235
26 liblldb.11.0.0git.dylib  0x0000000109e19996 clang::CodeGen::CodeGenModule::EmitTopLevelDecl(clang::Decl*) + 150

So either there is another corner case not handled in this patch, we have another bug that screws up our layout or my local build is broken (I already dropped every patch beside this one and rebuild, so the last one is unlikely). Or I was mistaken what actually was causing this crash.

  1. I'm still kinda confused by some parts of the code but the undocumented in/out last_field_info thing we did before was also confusing, so if this gets bitfields working then I'm fine with getting this in. However I wish we didn't add another in/out parameter by now having a normal field info and a field info for bitfields. If we could model this as one FieldInfo with a is_bitfield flag or so that would be preferred.

IIUC then those two bitfield info variables are mutually exclusive? I.e., either the last field was bitfield or a normal field? If yes, would it make sense to model this as one FieldInfo variable and have a bool value for differentiating between them? That way we don't have the possibility that we forget to clear one when we set the other.


This code looks like we forgot to initialise the last_bitfield_info fields and would IMHO also be more readable if we had a bool value like is_bitfield etc.



teemperor requested changes to this revision.Jan 20 2020, 3:52 AM
This revision now requires changes to proceed.Jan 20 2020, 3:52 AM

I just went back to the radar we had about the CG crash and this *does* fix that issue (rdar://53932023). I could also not reproduce the CG crash again so I assume that was something else that fiddled around in that area. So LGTM minus the thing with the last_bitfield_info last_field_info duality which I would prefer we could get rid of before landing (but it's not a must and can be done as a follow-up commit).

shafik updated this revision to Diff 239732.Jan 22 2020, 4:28 PM
shafik marked 14 inline comments as done.

Realized that the assert that I was hitting in some cases only reproduced using C++ for example one case we needed to use a class w/ private members. So created a new bit-field test which is C++ specific.

Addressed comments:

  • Removed magic numbers from FieldInfo
  • Removed the need for last_field_info and last_bitfield_info
  • Refactored away more code

It was not obvious during the initial refactoring wave but yes, I removed one of them.


I removed a lot of in the next wave of refactoring.


This is a good point, it took a bit more refactoring of the code using this but I removed the magic numbers and used an Optional<FieldInfo> for the unnamed_field_info case.

LGTM modulo some minor points regarding the test. The refactoring of the parsing code and using expect_expr can be done as NFC follow-ups. Thanks for the patch, great work!


All the setup code can just be:

lldbutil.run_to_source_breakpoint(self, '// Set break point at this line.',
        lldb.SBFileSpec("main.cpp", False))

Do we need stdio and string.h?


//// instead of //.

teemperor accepted this revision.Jan 23 2020, 9:23 AM
This revision is now accepted and ready to land.Jan 23 2020, 9:23 AM
shafik updated this revision to Diff 239995.Jan 23 2020, 1:23 PM
shafik marked an inline comment as done.
  • Addressing minor comments
  • Adding fix for we need to guard last_field_info.NextBitfieldOffsetIsValid(...) with a check to make sure we have a bit-field. Objective C treats non-bitfields as all having offset zero.
shafik marked 2 inline comments as done.Jan 23 2020, 1:23 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 23 2020, 2:51 PM
aprantl added inline comments.Jan 23 2020, 3:06 PM

Is this needed?


Is this needed?


Since it's only used in bitfields we probably don't even need uint64_t? It speeds up compilation of the testcase quite a bit if the std module doesn't have to be built.


This is personal taste, but for a struct I probably wouldn't even bother writing a trivial accessor. Up to you.

labath added a subscriber: labath.Jan 24 2020, 3:19 AM
labath added inline comments.

BTW, I've changed this to '\0' because the memory following this array is uninitialized, and that would confuse the the "const char *" pretty printer, which would continue to print the garbage after the array. The actual value of arr does not seem to be relevant for this test (I hope), but the pretty printer behavior is suboptimal, so I've filed pr44649 to track that.

shafik marked 2 inline comments as done.Jan 24 2020, 11:14 AM
shafik added inline comments.

Thank you for catching that and patching!