This is an archive of the discontinued LLVM Phabricator instance.

Don't assume promotable integers are zero/sign-extended already in x86-64 ABI.
Needs ReviewPublic

Authored by emilio on Jan 14 2020, 3:55 PM.

Details

Summary

Sign-extension is not guaranteed by the ABI, and thus the callee cannot assume
it.

This fixes PR44228 and PR12207.

With these changes, there are still two tests that need updating:

  • CodeGenObjC/optimized-setter.m fails with:
clang: llvm/lib/IR/Instructions.cpp:400: void llvm::CallInst::init(llvm::FunctionType *, llvm::Value *, ArrayRef<llvm::Value *>, ArrayRef<llvm::OperandBundleDef>, const llvm::Twine &): Assertion `(i >= FTy->getNumParams() || FTy->getParamType(i) == Args[i]->getType()) && "Calling a function with a bad signature!"' failed.
  • OpenMP/parallel_for_simd_codegen.cpp fails with:
clang: clang/lib/CodeGen/CGCall.cpp:3858: clang::CodeGen::RValue clang::CodeGen::CodeGenFunction::EmitCall(const clang::CodeGen::CGFunctionInfo &, const clang::CodeGen::CGCallee &, clang::CodeGen::ReturnValueSlot, const clang::CodeGen::CallArgList &, llvm::CallBase **, clang::SourceLocation): Assertion `IRFuncTy == TypeFromVal' failed.

These two are probably bad assumptions in some of the ObjC / OpenMP-specific
code, but I want to check this patch is on the right track before digging more.

Diff Detail

Event Timeline

emilio created this revision.Jan 14 2020, 3:55 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
jdoerfert resigned from this revision.Jan 14 2020, 5:35 PM
nikic added a subscriber: nikic.Jan 15 2020, 1:03 AM

Sign-extension is not guaranteed by the ABI, and thus the callee cannot assume it.

This is a bit too vague.
Can you quote specific parts of the appropriate standard/document?

The relevant discussion is here. From the "Parameter Passing" section in https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf, there's no mention of sign-extension requirements for arguments.

The only related thing is
"When a value of type _Bool is returned or passed in a register or on the stack,
bit 0 contains the truth value and bits 1 to 7 shall be zero."
with a footnote:
"Other bits are left unspecified, hence the consumer side of those values can rely on it being 0 or 1 when truncated to 8 bit."

which says that _Bool has only significant low 8 bits and the rest is unspecified.

Then Michael Matz (one of the editors of that document) clarifies below:

Yes, we (intentionally) haven't required any extensions to happen for arguments

or return values smaller than 64bit (e.g. we haven't even specified that arguments <= 32bit would be zero-extended in the high bits, as would have been
natural with the instruction set). If LLVM relies on that it would be a bug.

There's more information on that bug and related ones, but I think the above should be enough :)

+@chandlerc @rjmccall

Didn't we work this out already when John added the alignment tracking stuff? I remember this bug involving libjpegturbo standalone assembly receiving a 32-bit argument, and then using the full 64-bit RDI register to read it, but clang stopped zero extending it. I thought the ABI doc was ambiguous at the time, and we studied GCC's behavior, and that is how we arrived at clang's current behavior. I am generally skeptical about changing behavior in this area just because the ABI doc says something now. It keeps changing. What did it say in the past? Shouldn't we pay more attention to that?

Found the libjpegturbo thing:
https://github.com/libjpeg-turbo/libjpeg-turbo/commit/498d9bc92fcf39124b6f08e57326944dedd2ddd6

In D72742#1822554, @rnk wrote:

+@chandlerc @rjmccall

Didn't we work this out already when John added the alignment tracking stuff? I remember this bug involving libjpegturbo standalone assembly receiving a 32-bit argument, and then using the full 64-bit RDI register to read it, but clang stopped zero extending it. I thought the ABI doc was ambiguous at the time, and we studied GCC's behavior, and that is how we arrived at clang's current behavior. I am generally skeptical about changing behavior in this area just because the ABI doc says something now. It keeps changing. What did it say in the past? Shouldn't we pay more attention to that?

Found the libjpegturbo thing:
https://github.com/libjpeg-turbo/libjpeg-turbo/commit/498d9bc92fcf39124b6f08e57326944dedd2ddd6

That seems somewhat related. This would be a similar change to the one that clang did back then, but for the bytes in the lower 32-bits when the argument is smaller than that.

The ABI document, afaict, has never defined the inputs to be zero-extended. See https://groups.google.com/forum/?hl=en#!topic/x86-64-abi/E8O33onbnGQ for an old thread discussing this, and the two bugs that are referenced in the commit message (one of them fairly old).

GCC does avoid the zero-extension on the caller sometimes, when not required by the language and when its optimizer finds it suitable.

rnk added a comment.Jan 15 2020, 2:00 PM

GCC does avoid the zero-extension on the caller sometimes, when not required by the language and when its optimizer finds it suitable.

I believe at the time, faced with ambiguity in the document, experiments were done to try to uncover what GCC actually did in practice. I guess LLVM settled on the "things are extended to 32-bits" model.

This change seems like it has backwards compatibility concerns with clang. If new clang stops extending things and it calls old clang code, won't that cause breakages? Based on the test code updates, it seems like you are updating both call sites and function prototypes. Won't that be problematic for the users that care about ABI stability over GCC ABI compatibility?

clang/test/CodeGen/2007-06-18-SextAttrAggregate.c
11

This test was originally added for x86_64, see the bug:
https://bugs.llvm.org/show_bug.cgi?id=1513

It needs to be updated.

clang/test/CodeGenCXX/const-init-cxx11.cpp
536

I agree it would be nice to avoid emitting these coercions through memory for such simple cases.

Didn't we work this out already when John added the alignment tracking stuff? I remember this bug involving libjpegturbo standalone assembly receiving a 32-bit argument, and then using the full 64-bit RDI register to read it, but clang stopped zero extending it.

So, this is complicated. LLVM IR's zext/sext representation is not explicit about the width to which types are extended. Per the language reference, the choice of extension width is at least target-specific, and I say "at least" because nothing is obviously stopping it from also being type-specific; that is, in theory i1 zeroext could mean an extension to 8-bit but i8 zeroext could mean an extension to 32-bit or beyond. The use of isPromotableIntegerType() here means that this only applies to integer types of lower rank than int, which on x86_64 are all 16-bit or less; that apparently embeds an assumption that the promotion width is 32-bit for this target. libjpegturbo was incorrectly assuming that a formally 32-bit argument was extended to the full 64-bit register width; obviously that's related to what we're talking about here, but it's not actually affected by the proposed patch.

Extending arguments to at least the width of int is necessary if you want unprototyped functions with small arguments to be callable through a prototype. The reverse isn't true, and no such concern applies to return types, so all in all I doubt that this was an influence on Clang's behavior. We should already be explicitly extending arguments to i32 when passing an unprototyped or non-required argument.

This change seems like it has backwards compatibility concerns with clang. If new clang stops extending things and it calls old clang code, won't that cause breakages? Based on the test code updates, it seems like you are updating both call sites and function prototypes. Won't that be problematic for the users that care about ABI stability over GCC ABI compatibility?

What defines the ABI for a particular target is also complicated. For established targets with binary-compatibility requirements, ABI is arguably best defined by what you're not willing to break compatibility with. Clang supports several targets for which the system compiler is not GCC, and therefore GCC's behavior is not particularly relevant to the discussion (for those targets). I suspect that at least PS4 would strongly object to an ABI change here, no matter what the ABI document says; I don't know about Darwin, BSD, etc. So I think it is going to be necessary to make this target-OS-specific.

For targets for which GCC clearly defines the ABI, though, I think we clearly need to take this change.

clang/test/CodeGenCXX/const-init-cxx11.cpp
536

In the short term, yes, we should update the code so that we don't need a coercion through memory to go from i1 to i8.

The right long-term change is that LLVM's zext/sext attributes should encode the width to which the value is extended.

emilio added a comment.Feb 3 2020, 5:29 AM

Could anyone update me with how do they want me to proceed here? Is fixing the coercions enough to allow this to land? Do I need to make it target-specific? If so, which targets should keep the current behavior?

Another slightly more backwards-compatible alternative (though hacky, arguably) is to just not use the zext for optimization purposes in the backend. This would be simpler and should keep clang always sign-extending on the caller too, for now. We could then after a while, switch to this approach. D71178 contains such a patch, for comparison.

Thoughts?

Could anyone update me with how do they want me to proceed here? Is fixing the coercions enough to allow this to land? Do I need to make it target-specific?

It needs to be target-specific.

If so, which targets should keep the current behavior?

At least Darwin and PS4, maybe some others. BSD?

Another slightly more backwards-compatible alternative (though hacky, arguably) is to just not use the zext for optimization purposes in the backend. This would be simpler and should keep clang always sign-extending on the caller too, for now. We could then after a while, switch to this approach. D71178 contains such a patch, for comparison.

It wouldn't be unreasonable to have a way to request this behavior, but I'm not sure about switching to it for all zext / sext.

I wrote some stuff on https://bugs.llvm.org/show_bug.cgi?id=44228, probably best to continue the discussion there. I really don't think we should take this patch -- at least not without reopening the ABI discussion first.

Passing-by remark: i'm not sure it was noted already,
but i believe that the function linkage should be differentiated here,
the ABI mismatch can only exist if the function can actually be called
from the external code, which isn't always the case.

Herald added a project: Restricted Project. · View Herald TranscriptApr 2 2022, 8:54 AM