This is an archive of the discontinued LLVM Phabricator instance.

[test-suite] Add prototypes to functions in Olden and MiBench
AbandonedPublic

Authored by arichardson on Jan 26 2018, 6:00 AM.

Details

Reviewers
MatzeB
Summary

When calling a function without a prototype it gets called using the
variadic calling convention. For a small number of arguments this is
usually the same as calling a non-variadic function. However, on
architectures such as CHERI the variadic and non-variadic calling
conventions use different registers (even for less than 4 arguments) and
therefore the code will crash at runtime.

This change is required in order to run the test suite on CHERI (and
possibly other architectures that have different variadic and non-variadic
calling conventions). It also allows compiling this code with
-Werror=strict-prototypes, which is currently the default on the
CI for CHERI in order to avoid these runtime crashes.

Event Timeline

arichardson created this revision.Jan 26 2018, 6:00 AM
MatzeB added a comment.EditedJan 26 2018, 11:01 AM

As far as I can tell, those are correct C programs.

Using this old K&R style definitions:

void func(x,y)
  int x;
  int y;
{ ... }

does indeed give you a different calling convention than

void func(int x, int y) { ... }

and the former is supposed to work with a void func() declaration.

This style of coding is luckily getting scarce. But it is still allowed by C and there is software out there written this way. From that perspective I consider it a good thing that we have some code like this in the llvm test-suite so we get some testing.

May I propose a compromise here: Leave the benchmarks as is but add a cmake variable (similar to the existing TEST_SUITE_BENCHMARKING_ONLY) allowing users to optionally skip benchmarks using K&R style functions.

arichardson added a comment.EditedJan 28 2018, 9:37 AM

Calling K&R functions is not a problem. It is fine to use them as long as there is a declaration with the parameter types in the LLVM IR and not just an declare i32 @foo(...). The problem is that we are calling functions that don't have a declaration and therefore the compiler assumes that it is a variadic function call. This leads to runtime crashes on our CHERI platform.

They are not necessarily correct C programs since they include undefined behaviour

Calling K&R functions is not a problem. It is fine to use them as long as there is a declaration with the parameter types in the LLVM IR and not just an declare i32 @foo(...). The problem is that we are calling functions that don't have a declaration and therefore the compiler assumes that it is a variadic function call. This leads to runtime crashes on our CHERI platform.

They are not necessarily correct C programs since they include undefined behaviour

To the best of my knowledge, this is perfectly fine (even in C11):

/* a.c: */
short bar(x)
  short x;
{
    return x;
}

/* b.c: */
short bar();
int main(void) {
    bar(42);
}

If a compiler cannot handle it then it has a bug and I'm not in favor of changing the benchmarks/tests because of a bug. On the contrary: We need tests to uncover such bugs!

I agree with you that llvm representing prototype-less functions and variadic functions the same is unfortunate (and appears to only allow implementations where they are the same at a first glance?). Maybe ask clang or llvm mailing lists for comments...

Your example will work for x86 and probably ARM, but as far as I can tell it is up to the ABI which calling convention should be used for functions without prototypes.
For us these calling conventions are incompatible since the variadic one will pass arguments on the stack instead of in registers so if the target function is not actually variadic this will cause it to read garbage from undefined registers. While it *should* work if there is only zero or one arguments it is not guaranteed to work with more than 1.

I also found that apparently WebAssembly does the same: https://github.com/WebAssembly/tool-conventions/issues/16 and it also causes issues with functions without prototypes (https://bugs.llvm.org/show_bug.cgi?id=35385)

MatzeB requested changes to this revision.Feb 9 2018, 8:41 AM

Your example will work for x86 and probably ARM, but as far as I can tell it is up to the ABI which calling convention should be used for functions without prototypes.
For us these calling conventions are incompatible since the variadic one will pass arguments on the stack instead of in registers so if the target function is not actually variadic this will cause it to read garbage from undefined registers. While it *should* work if there is only zero or one arguments it is not guaranteed to work with more than 1.

I also found that apparently WebAssembly does the same: https://github.com/WebAssembly/tool-conventions/issues/16 and it also causes issues with functions without prototypes (https://bugs.llvm.org/show_bug.cgi?id=35385)

I am repeating myself:

  • This is perfectly legal C, so I consider it a legitimate test.
  • If your compiler cannot handle it (even if it is llvms fault) then you need to fix the compiler not the test.
  • I am fine to add a cmake option to skip this test for cases where it doesn’t work.
This revision now requires changes to proceed.Feb 9 2018, 8:41 AM
arichardson abandoned this revision.Feb 9 2018, 9:29 AM

Okay I won't bother submitting further improvements that actually allow us to run the test suite without crashing.

I'll just add that what we do in WebAssembly right now for unprototyped functions is to codegen them using the fixed-arg calling convention; at link-time if there are any mismatches between the definition and any calls, then the program has UB, and the resulting WebAssembly module will be invalid. I think this currently has the problem that if the definition does turn out to be vararg, then we have generated an invalid module even though the program is correct (I consider that a bug or shortcoming in our toolchain though). In any case, we do aim to support correct programs with prototypeless functions. If the test suite has non-vararg cases where there are mismatches between the callsites and/or the definition (I have seen that in some other test/benchmark suites), then even though that works on most X86 and ARM ABIs that I know of it's still UB and I'd be in favor of removing it from LLVM's test suite.

I'll just add that what we do in WebAssembly right now for unprototyped functions is to codegen them using the fixed-arg calling convention; at link-time if there are any mismatches between the definition and any calls, then the program has UB, and the resulting WebAssembly module will be invalid. I think this currently has the problem that if the definition does turn out to be vararg, then we have generated an invalid module even though the program is correct (I consider that a bug or shortcoming in our toolchain though). In any case, we do aim to support correct programs with prototypeless functions. If the test suite has non-vararg cases where there are mismatches between the callsites and/or the definition (I have seen that in some other test/benchmark suites), then even though that works on most X86 and ARM ABIs that I know of it's still UB and I'd be in favor of removing it from LLVM's test suite.

  • My reading of the C standard is that using a prototype less declaration and defining that as a vararg functions is invalid in C. If you find bugs in this area we should fix them.
  • This patch simply removes all prototype less functions which is not fine.
  • The llvm IR coming out of clang appears to mark all prototype less functions as varargs which from what I suspect makes it impossible to implement varargs different from prototype-less functions. But that is a problem in clang/IR, it doesn't change the fact that the tests are valid C.

I'll just add that what we do in WebAssembly right now for unprototyped functions is to codegen them using the fixed-arg calling convention; at link-time if there are any mismatches between the definition and any calls, then the program has UB, and the resulting WebAssembly module will be invalid. I think this currently has the problem that if the definition does turn out to be vararg, then we have generated an invalid module even though the program is correct (I consider that a bug or shortcoming in our toolchain though). In any case, we do aim to support correct programs with prototypeless functions. If the test suite has non-vararg cases where there are mismatches between the callsites and/or the definition (I have seen that in some other test/benchmark suites), then even though that works on most X86 and ARM ABIs that I know of it's still UB and I'd be in favor of removing it from LLVM's test suite.

  • My reading of the C standard is that using a prototype less declaration and defining that as a vararg functions is invalid in C. If you find bugs in this area we should fix them.

We should eliminate UB from the test suite. I'm also not sure that this is UB, however. Looking at http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf section 6.5.2.2 on Function calls, it seems that this is okay so long as the types actually used to define the function agree with the types used for the call (after default promotion rules are applied). The most relevant text seems to be in p6, " If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined."

  • This patch simply removes all prototype less functions which is not fine.
  • The llvm IR coming out of clang appears to mark all prototype less functions as varargs which from what I suspect makes it impossible to implement varargs different from prototype-less functions. But that is a problem in clang/IR, it doesn't change the fact that the tests are valid C.

Agreed.

hfinkel added a subscriber: rsmith.Feb 9 2018, 11:47 AM

I'll just add that what we do in WebAssembly right now for unprototyped functions is to codegen them using the fixed-arg calling convention; at link-time if there are any mismatches between the definition and any calls, then the program has UB, and the resulting WebAssembly module will be invalid. I think this currently has the problem that if the definition does turn out to be vararg, then we have generated an invalid module even though the program is correct (I consider that a bug or shortcoming in our toolchain though). In any case, we do aim to support correct programs with prototypeless functions. If the test suite has non-vararg cases where there are mismatches between the callsites and/or the definition (I have seen that in some other test/benchmark suites), then even though that works on most X86 and ARM ABIs that I know of it's still UB and I'd be in favor of removing it from LLVM's test suite.

  • My reading of the C standard is that using a prototype less declaration and defining that as a vararg functions is invalid in C. If you find bugs in this area we should fix them.

We should eliminate UB from the test suite. I'm also not sure that this is UB, however. Looking at http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf section 6.5.2.2 on Function calls, it seems that this is okay so long as the types actually used to define the function agree with the types used for the call (after default promotion rules are applied). The most relevant text seems to be in p6, " If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined."

@rsmith, is this right?

  • This patch simply removes all prototype less functions which is not fine.
  • The llvm IR coming out of clang appears to mark all prototype less functions as varargs which from what I suspect makes it impossible to implement varargs different from prototype-less functions. But that is a problem in clang/IR, it doesn't change the fact that the tests are valid C.

Agreed.

There is also an upside to not supporting unprototyped functions. Not only did people writing C in the distant past before prototypes not know what we know today about what it takes to have reasonably robust C code, but leaving C code unmaintained for decades is the opposite of what it takes to have reasonably robust C code, so such code is in double trouble. Obliging people who have unmaintained C codebases to either maintain them or forego using them on new platforms is a positive outcome. It's within reason to assign a variety of relative weights to that outcome, relative to the other outcomes.

  • My reading of the C standard is that using a prototype less declaration and defining that as a vararg functions is invalid in C. If you find bugs in this area we should fix them.

Interesting, I'll look closer at that. We haven't gotten any complaints about it so far.

  • This patch simply removes all prototype less functions which is not fine.

Sorry, in case I wasn't clear, I agree that these tests are valid C and we should not change them.

  • The llvm IR coming out of clang appears to mark all prototype less functions as varargs which from what I suspect makes it impossible to implement varargs different from prototype-less functions. But that is a problem in clang/IR, it doesn't change the fact that the tests are valid C.

Yeah, you can sort of cheat because in C, vararg functions must have at least one non-vararg parameter (at least AFAIK no C standard implemented by Clang allows that). So you can tell a real vararg function declare i32 @foo(i32, ...) from a non-prototype function declare i32 @foo(...). I suspect that only works for C though.