This is an archive of the discontinued LLVM Phabricator instance.

Fix alignment issues in Clang.
ClosedPublic

Authored by jyknight on Jun 4 2015, 11:04 PM.

Details

Summary

Adds static_asserts to ensure alignment of concatenated objects is
correct, and fixes them where they are not.

Some const-correctness changes snuck in here too, since they were in the
area of code I was modifying.

This seems to make Clang actually work without Bus Error on 32bit
sparc.

Diff Detail

Repository
rL LLVM

Event Timeline

jyknight updated this revision to Diff 27179.Jun 4 2015, 11:04 PM
jyknight retitled this revision from to Fix alignment issues in Clang..
jyknight updated this object.
jyknight edited the test plan for this revision. (Show Details)
jyknight added a subscriber: Unknown Object (MLST).
majnemer added inline comments.
include/clang/AST/Type.h
3927–3933 ↗(On Diff #27179)

This code is commented out.

4276–4280 ↗(On Diff #27179)

As is this.

lib/CodeGen/CGCleanup.cpp
129 ↗(On Diff #27179)

deallocate sounds more natural to me, what do you think?

jyknight updated this revision to Diff 27807.Jun 16 2015, 6:30 PM

Address review comments.

Any further comments? (Note: I will globally replace the text "Alignment sufficient for" to "Alignment is not sufficient for", as per the review comments on the LLVM half of this change.)

jyknight updated this revision to Diff 28891.Jul 1 2015, 12:57 PM

Updated to head.

rnk added a subscriber: rnk.Jul 6 2015, 3:04 PM

Most of actual fixes look good, but this adds an awful lot of static_asserts, and they are very far from the code implementing the trailing object array idiom. I chatted with Richard, and he suggested that wrap up the reinterpret_cast<T>(this + 1) idiom into something higher level. We could write a helper like this:

namespace llvm {
template <TrailingTy, LeadingTy>
TrailingTy *getTrailingObjects(LeadingTy *Obj) {
  static_assert(llvm::AlignOf<LeadingTy>::Alignment >= llvm::AlignOf<TrailingTy>::Alignment,
                "TrailingTy requires more alignment than LeadingTy provides");
  return reinterpret_cast<TrailingTy *>(Obj + 1);
}
}

The static_assert message ends up being less specific, but clang at least will print out the template instantiation backtrace, which should make it easy to figure out. This pattern should work so long as the trailing type is complete before the definition of the accessor, but the TemplateSpecializationType and TemplateArgument case is probably won't work. We should probably do the static_assert manually in the constructor in that case. The constructor also implements the trailing object array idiom, so again it keeps the assert closer to the relevant implementation.

What do you think?

include/clang/AST/Type.h
3974–3980 ↗(On Diff #28891)

This still appears to be an issue?

So, a couple things:

Firstly:

I think it's pretty sad that this idiom of manually dealing with memory layout is used all over the place, and has basically no abstraction. In addition to having to coordinate carefully between allocation and access, as it stands, it's almost impossible to tell from the header what the layout of the object actually is -- for example, in Decl.h, there's no hint that ImportDecl has a SourceLocation array appended to it. This is somewhat unfortunate.

I understand that it's done this way for speed and memory efficiency, but I can't help but think there *must* be a better way to accomplish it than to splatter manual object size arithmetic and reinterpret casts so widely over the codebase.

It'd be great if the GCC zero-length-array extension could be used, or if C++ had adopted C11's flexible array members. That would help in the perhaps 75% of cases where it's just a single optional trailing object, or array of trailing objects of the same type.

That makes the allocation size arithmetic easier -- offsetof(Type, arrayfield[num_elements]). It makes the access easier (just reference the field). And it makes the alignment simply handled by the compiler. I believe GCC's extension is supported by GCC, Clang, and VC++. I guess there's a desire to not use extensions like that even if they are supported by the compilers we care about?

And besides being not standard C++, that also wouldn't work for the ~25% of cases which have multiple kinds of appended objects, that come one after another. As an alternative to the GCC zero-length-array option, I had thought of perhaps making a template base class that handles these sorts of layouts, somehow, but hadn't really thought it through terribly far how it might work. (Or if it'd be too complicated to even be worth it.)

Secondly:

There are more cases where the suggestion doesn't work than just forward declarations. "this + 1" or "X + 1" is only used some of the time, there's many other ways that code accesses appended objects.

E.g. DeclTemplate.h DependentFunctionTemplateSpecializationInfo has:

FunctionTemplateDecl * const *getTemplates() const {
  return reinterpret_cast<FunctionTemplateDecl*const*>(&getTemplateArgs()[NumArgs]);
}

Or then there's DeclRefExpr

return reinterpret_cast<ASTTemplateKWAndArgsInfo *>(
    llvm::alignAddr(&getInternalFoundDecl() + 1,
                    llvm::alignOf<ASTTemplateKWAndArgsInfo>()));

which re-aligns, and thus doesn't require that getInternalFoundDecl() is aligned properly for the ASTTemplateKWAndArgsInfo object -- but the base DeclRefExpr itself really ought to be aligned enough for ASTTemplateKWAndArgsInfo, or else the layout of the object would change depending on its address!

Annnnnnnnnnnnnnnnnyways:

I had originally started to put these asserts in the constructors, but that seemed far from the accesses and was hidden off in a random cpp file, and sometimes there are multiple of them, spread around. So I tried the accesses, but sometimes those were spread around multiple places too. So eventually I just settled on the class definition, because it's always right there and there's one of it. :)

I can make the requested change, in the subset of cases where it works, and will certainly do so if still requested. But I'm not really sure it's worth it, because it removes what is often the only documentation of Weirdness going on, at the class definition site.

And, I think the Right Thing is to address the deeper problem of what really seems to me like borderline unreadable code -- somehow. If that can be done, it will naturally allow alignment asserts to be removed at the same time as the casts and sizeof arithmetic, because it'd be unnecessary if the compiler or some common library code was ensuring that allocation and sizing and offsets and alignment are all done consistently and correctly.

rnk added a comment.Jul 6 2015, 5:00 PM

I think there were issues with trailing arrays, which is why Clang ended up settling on this pattern in the first place. Barring significant changes, I think wrapping this up with some library code is the way to go. That's kind of what we were thinking with getTrailingObjects(). :-)

I think for the oddball cases like multiple trailing arrays, we can do stuff like this:

FunctionTemplateDecl * const *getTemplates() const {
  return getTrailingObjects<FunctionTemplateDecl*const>(&getTemplateArgs()[NumArgs-1]);
}

In general, we probably want to lay out trailing arrays in order of higher alignments to lower alignments to avoid the extra alignment code, and if we get it wrong, the wrapper will tell us.

For the DeclRefExpr::getFoundDeclInternal() case, I could see adding some kind of wrapper struct with 8 byte alignment around the found found Decl pointer.

As far as documentation goes, I'd rather update the Doxygen to indicate which classes have trailing data. Most of the commonly used classes like DeclRefExpr are already annotated as such.

Another benefit of the getTrailingObjects wrapper is that we can static_assert that the class with trailing data is final.

In conclusion, I guess I would like to push for this wrapper. It seems like it moves incrementally closer towards the better end state that you want, which is less pointer arithmetic and casts sprinkled across the AST. Sound reasonable?

rnk added a reviewer: rsmith.Jul 6 2015, 5:01 PM

Yes, that does sounds entirely reasonable.

However, before implementing your suggestion, I'm going to spend at least a little while first trying to write a more complete solution that I actually like, in hopes of not spending time to rework things to get a mediocre improvement when it could be a large improvement.

E.g., something like:

class ImportDecl : public Decl, TrailingObjects<ImportDecl, SourceLocation> {...}

or

class DependentFunctionTemplateSpecializationInfo : TrailingObjects<DependentFunctionTemplateSpecializationInfo, TemplateArgumentLoc, FunctionTemplateDecl*> {...}

with TrailingObjects implementing some useful protected functions.

If this doesn't seem to be working out reasonably, I'll implement your suggestion. :)

Okay, so, I've deleted all the static-asserts from this CL -- so it is now basically just the fixes I had before, and not the static_asserts everywhere. Can we agree to commit a reduced-scope CL, so at least the alignment problems are resolved?

I'm also about to upload some CLs with a proposal for, and some example conversions to TrailingObjects class. If/Once there's agreement that it seems good and the initial few are submitted, I can finish up the remaining classes in follow-up CLs. (I'll cc everyone watching this on those new ones too.)

jyknight updated this revision to Diff 29925.Jul 16 2015, 11:03 AM

Reduce scope of patch.

rnk accepted this revision.Jul 16 2015, 12:14 PM
rnk added a reviewer: rnk.

lgtm

This revision is now accepted and ready to land.Jul 16 2015, 12:14 PM
This revision was automatically updated to reflect the committed changes.
ovyalov added inline comments.
cfe/trunk/lib/CodeGen/EHScopeStack.h
99

Android i386 build sets ScopeStackAlignment = 4 and breaking the build - http://lab.llvm.org:8011/builders/lldb-x86_64-ubuntu-14.04-android/builds/742
Could you handle this use case or consider reverting the CL?

Blah.

So, the issue is that LLVM_ALIGNAS needs an integer, not a constant
expression, because of MSVC limitations. Otherwise, EHCleanupScope would
also specify uint64_t alignment, instead of 8.

I'll just change ScopeStackAlignment to a fixed 8, to match.