This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Handle target-specific builtins returning aggregates.
ClosedPublic

Authored by simon_tatham on Jan 6 2020, 7:01 AM.

Details

Summary

A few of the ARM MVE builtins directly return a structure type. This
causes an assertion failure at code-gen time if you try to assign the
result of the builtin to a variable, because the RValue created in
EmitBuiltinExpr from the llvm::Value produced by codegen is always
made by RValue::get(), which creates a non-aggregate RValue that
will fail an assertion when AggExprEmitter::withReturnValueSlot calls
Src.getAggregatePointer(). A similar failure occurs if you try to use
the struct return value directly to extract one field, e.g.
vld2q(address).val[0].

The existing code-gen tests for those MVE builtins pass the returned
structure type directly to the C return statement, which apparently
managed to avoid that particular code path, so we didn't notice the
crash.

Now EmitBuiltinExpr checks the evaluation kind of the builtin's return
value, and does the necessary handling for aggregate returns. I've added
two extra test cases, both of which crashed before this change.

Event Timeline

simon_tatham created this revision.Jan 6 2020, 7:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 6 2020, 7:01 AM
rjmccall added inline comments.Jan 6 2020, 9:37 AM
clang/lib/CodeGen/CGBuiltin.cpp
4332

The requirement is that we construct an RValue of the right kind for the expression's result type, so the correct approach here is to switch over getEvaluationKind(E->getType()). This is also a more reliable signal than the presence of the return value slot.

We then need to think about how the return value will be returned to us in each case:

  • For scalars, RValue::get(V) is right.
  • We can assert for now that there are no target builtins returning a complex type; if we ever need to add this, we'll probably have to break apart an IR struct.
  • So that just leaves aggregates. I believe we can't rely on ReturnValue being non-null on entry to this function; for example, IIRC we'll pass down a null slot if the code does an immediate member access into the result, e.g. vld2q(...).val[0]. It looks like the custom logic for VLD24 does not do anything reasonable in this case: it returns a first-class struct, which the rest of IRGen will definitely not expect. Probably the most reasonable thing would be to create a slot if we have an aggregate result and the caller didn't give us a slot (out here in EmitBuiltinExpr), and then we can just require EmitTargetBuiltinExpr to return null or something else that makes it clear that they've done the right thing with the slot that was passed in. You can create a slot with CreateMemTemp.

Thanks for the helpful advice!

Here's a revised version that checks getEvaluationKind. I've also added a test for the vld2q(...).val[0] construction you mentioned: you're right, of course, that that also crashed existing clang, and it still crashed with my previous patch applied. This version seems to work in both cases.

and then we can just require EmitTargetBuiltinExpr to return null or something else that makes it clear that they've done the right thing with the slot that was passed in

The current return value of EmitTargetBuiltinExpr will be the llvm::Value * corresponding to the last of a sequence of store instructions that writes the constructed aggregate into the return value slot. There's no need to actually use that Value for anything in this case, but we can still test that it's non-null, to find out whether EmitTargetBuiltinExpr has successfully recognized a builtin and generated some code, or whether we have to fall through to the ErrorUnsupported.

simon_tatham edited the summary of this revision. (Show Details)Jan 7 2020, 5:56 AM
simon_tatham marked an inline comment as done.Jan 7 2020, 5:58 AM

The current return value of EmitTargetBuiltinExpr will be the llvm::Value * corresponding to the last of a sequence of store instructions that writes the constructed aggregate into the return value slot. There's no need to actually use that Value for anything in this case, but we can still test that it's non-null, to find out whether EmitTargetBuiltinExpr has successfully recognized a builtin and generated some code, or whether we have to fall through to the ErrorUnsupported.

Ah, yes, good point.

clang/lib/CodeGen/CGBuiltin.cpp
4350

I would just structure this as:

// If the result is an aggregate, force ReturnValue to be non-null so that
// the target-specific emission code can always just emit into it.
TypeEvaluationKind EvalKind = getEvaluationKind(E->getType());
if (EvalKind == TEK_Aggregate && ReturnValue.isNull()) { ... }

// Try to emit a target builtin.
if (Value *V = EmitTargetBuiltinExpr(BuiltinID, E, ReturnValue)) {
 // Turn the result into the appropriate kind of RValue.
 switch (EvalKind) { ... }
}
simon_tatham marked an inline comment as done.
simon_tatham edited the summary of this revision. (Show Details)

Restructured as suggested.

rjmccall added inline comments.Jan 8 2020, 10:17 AM
clang/lib/CodeGen/CGBuiltin.cpp
4352

We generally don't add default cases to exhaustive switches. You can put this llvm_unreachable after the switch.

simon_tatham marked 2 inline comments as done.

Moved the llvm_unreachable.

rjmccall accepted this revision.Jan 9 2020, 8:54 AM

Thanks, LGTM!

This revision is now accepted and ready to land.Jan 9 2020, 8:54 AM
This revision was automatically updated to reflect the committed changes.