Page MenuHomePhabricator

[Clang/Test]: Update tests where `noundef` attribute is necessary
Needs ReviewPublic

Authored by guiand on Jun 22 2020, 10:29 AM.

Details

Summary

The noundef attribute generally applies to all scalar function
arguments and return values. In the interest of having Clang's
test suite reflect its codegen, it has been updated to include
the new attribute.

Most tests were updated programatically, although some
required more manual intervention, usually due to regexps or
other complications in parsing. Some tests were too burdensome
to update immediately, and to these were applied a cc1 flag to
mask emitting noundef attributes in argument position.

Diff Detail

Event Timeline

guiand created this revision.Jun 22 2020, 10:29 AM

This very large diff affects more than 1,000 files. Use the Changeset List to browse changes.

This we will "look" at once the rest is settled ;)

guiand updated this revision to Diff 273570.Jun 25 2020, 5:56 PM
guiand retitled this revision from [Clang/Test]: Update tests where `frozen` attribute is necessary to [Clang/Test]: Update tests where `noundef` attribute is necessary.
guiand edited the summary of this revision. (Show Details)

Updated almost all tests to rename frozen->noundef. Still needs rebase to master

guiand updated this revision to Diff 275771.Jul 6 2020, 11:12 AM

Another wave of test updates

lenary removed a subscriber: lenary.Jul 6 2020, 11:13 AM
guiand updated this revision to Diff 277992.Jul 14 2020, 3:00 PM

Update tests again

guiand updated this revision to Diff 280588.Jul 24 2020, 2:36 PM

Update tests to reflect more strict noundef rules

guiand updated this revision to Diff 281452.Jul 28 2020, 6:53 PM

All tests up to date. Of particular note are the ppc-*mmintrin.c tests, which seemed to drastically change upon rerunning the test autogen script.

@jdoerfert what would the procedure be for reviewing these test changes / getting this landed with the noundef patch?

guiand updated this revision to Diff 282328.Jul 31 2020, 3:11 PM

Rebased; all tests passing again. Removed the change to the ppc-*mmintrin.c tests, instead I just use the -disable-noundef-args flag`. Cleaned up typos.

jdoerfert resigned from this revision.Aug 3 2020, 12:06 AM
jdoerfert edited reviewers, added: rsmith, rjmccall; removed: jdoerfert, sscalpone.
jdoerfert added subscribers: rjmccall, rsmith.

@jdoerfert what would the procedure be for reviewing these test changes / getting this landed with the noundef patch?

Hmmm. I guess @rsmith @rjmccall and others need to decide how this proceeds.

Are you seriously adding an attribute to literally every argument and return value? Why is this the right representation?

If changes like this are required for all these tests this is going to be a complete pain for downstream forks like ours. At the very least, make whatever script you used to update these public, as I don't want to have to recreate it from scratch when merging this in. I had enough "fun" with the LLD mass-renaming (UpperCamel -> lowerCamel) and that was _with_ a supposedly-working script (it didn't quite do the right thing and I seem to recall there being two refactoring commits, only one of which had a script); I do not want a repeat of that experience.

Are you seriously adding an attribute to literally every argument and return value? Why is this the right representation?

As far as I understand, this should not be literally every argument. If it is, I did get something wrong.

guiand added a comment.Aug 6 2020, 1:16 PM

Are you seriously adding an attribute to literally every argument and return value? Why is this the right representation?

This adds an attribute to every argument and return value where the language rules denote it cannot be undef (which is ideally in most places, but definitely not everywhere). In a previous incarnation of this idea, the attribute was called partialinit, and had the inverse meaning -- the idea being that it would be a less invasive change because it is more common for these values to be fully defined (@rsmith at some point noted that in C, return values in general may be undef, which would complicate that a bit, but anyway was the idea). partialinit was scrapped as pretty much everybody agreed it wasn't ideal to have a negative attribute: the presence of partialinit meaning "proceed as normal", and its absence denoting a different behavior.

As for being an attribute, that was the level of granularity we had decided on. The alternative I suppose would be to some form of metadata.

As for being enabled by default, the idea was that transformation maintainers could get to using the attribute right away. This might be a good place to compromise.

And as for applying them to all the tests (rather than suppressing the attribute by default or something along those lines), we figured it would be best to keep the tests reflecting the actual common-case codegen. If noundef was on and showing up everywhere, the idea was to have the tests reflect that.

eugenis added a subscriber: eugenis.Aug 6 2020, 1:22 PM
guiand added a comment.Aug 6 2020, 1:23 PM

At the very least, make whatever script you used to update these public, as I don't want to have to recreate it from scratch when merging this in. I had enough "fun" with the LLD mass-renaming (UpperCamel -> lowerCamel) and that was _with_ a supposedly-working script (it didn't quite do the right thing and I seem to recall there being two refactoring commits, only one of which had a script); I do not want a repeat of that experience.

That's totally fair, and I can try and prepare a more polished version of the script I was using to get started with these changes. But the script isn't perfect, unfortunately, and follows roughly the 80-20 rule. The tests aren't written in a regular language so there are lots of places where manual intervention was necessary.

Sadly I think you're right in that this will cause headaches for downstream forks. Maybe this is another good reason to have noundef turned off by default, at least for now.

Are you seriously adding an attribute to literally every argument and return value? Why is this the right representation?

We've discussed quite a few options in D81678, which is a better place for this conversation.

The original patch used an inverted attribute, partialinit, which would be a lot less common. The problem with that is that it redefines the meaning of when an argument does not have the attribute, which breaks compatibility with existing bitcode and requires updating all frontends and any code that emits new functions or edits function signatures in IR.

There was an option of a function-level attribute in addition to this one, which would mean "all arguments and the return value are noundef". That would be a bit uncomfortable to work with in IR transforms, and, arguably, would not reduce the total number of attributes that much.

There are also a bunch of options that introduce divergence between

  • tests and real use - a test-only cc1 flag to suppress attribute generation
  • msan and non-msan compilation - but the reviewers in D81678 seem pretty excited to use this info for undef/poison optimizations in all configurations

In fact, we already have the test-only cc1 flag that disables the new attribute, and use it in some of the tests. This could be extended to all tests, that should make the tests diff smaller and more mechanical, and hopefully help with the downstream concerns.

Are you seriously adding an attribute to literally every argument and return value? Why is this the right representation?

This adds an attribute to every argument and return value where the language rules denote it cannot be undef (which is ideally in most places, but definitely not everywhere).

Okay, so it's just 90% of arguments and return values. Can you link the discussion? I seem to have completely missed this.

I agree that a positive attribute seems like the right polarity in the abstract, and since undef-ness is a core static property in LLVM, it probably does need to be fairly invasive. It's just unfortunate for a couple of reasons: first, this is going to be quite disruptive for downstream users and forks; second, it's yet another contribution towards the giant pile of attributes that seem to have become necessary to do any work in LLVM; and third, it's progress towards working around the problems with undef rather than towards replacing it entirely with a proper indeterminate-but-consistent abstraction, which is what's been badly needed for years.

guiand added a comment.Aug 6 2020, 3:04 PM

Most of the discussion has been in https://reviews.llvm.org/D81678 (implementing the attribute in clang) and https://reviews.llvm.org/D82316 (adding the attribute to langref).

After discussing with @eugenis, for the meantime it might be best to do the following:

  • Change the masking attribute to be -fdisable-noundef-analysis (name notwithstanding), and have it completely turn off all noundefs
  • Change the llvm-lit configuration to use the new codegen flag for all the tests by default
  • Have noundef emitted in the frontend by default (when the codegen flag isn't present)
jdoerfert added a comment.EditedAug 11 2020, 7:42 PM

After discussing with @eugenis, for the meantime it might be best to do the following:

  • Change the masking attribute to be -fdisable-noundef-analysis (name notwithstanding), and have it completely turn off all noundefs
  • Change the llvm-lit configuration to use the new codegen flag for all the tests by default
  • Have noundef emitted in the frontend by default (when the codegen flag isn't present)

TBH, I don't see how this solves any problem. It just makes it a problem for someone in the future... (FWIW, I say this being in full support of noundef)

TBH, I don't see how this solves any problem. It just makes it a problem for someone in the future... (FWIW, I say this being in full support of noundef)

That's true. At the same time, it might allow us to have a more baby-steps approach: be able to use the attribute now, update tests in their own time, and hopefully amortize the effect on downstream forks as @jrtc27 mentioned.

But the work for updating the tests is done now, in this patch, and as time goes on more and more of this work will need to be duplicated as the tests change every day.

There's one other option here, which has been mentioned a few times but I'll flesh it out for the purposes of discussion.

  • Instead of updating the default test configuration to use -disable-noundef-analysis, we update every failing test (due to noundef) to use that flag in the RUN lines. This unblocks the clang noundef change.
  • Then I can split up this patch into many smaller patches, probably organized by subfolder of clang/test, and these smaller patches would remove the -disable-noundef-analysis flags and add in the relevant noundef attributes.
  • Test authors which don't want theirs changed could voice that they want to just keep the masking flag for their files.

After discussing with @eugenis, for the meantime it might be best to do the following:

  • Change the masking attribute to be -fdisable-noundef-analysis (name notwithstanding), and have it completely turn off all noundefs
  • Change the llvm-lit configuration to use the new codegen flag for all the tests by default
  • Have noundef emitted in the frontend by default (when the codegen flag isn't present)

TBH, I don't see how this solves any problem. It just makes it a problem for someone in the future... (FWIW, I say this being in full support of noundef)

It solves the problem of merge conflicts in downstream forks.

This seems the best we can do given the constraints. I understand the concern that now we are not testing exactly the same mode that is normally used, but on the other hand, the majority of clang tests do not care about noundef - it's an orthogonal feature that just happens to affect their CHECK lines.

@rjmccall We've discussed several different possibilities here. Does any of them strike you as a good step forward here?

We really need to do something about this.
How about a change that adds -fdisable-noundef-analysis to every RUN line with %clang?
(-) We are not testing exactly the same mode that is used by the users - but that's already true for many other flags that clang driver passes to -cc1!
(+) Easy to automate, an update script can be provided to downstream users.
(+) Less "magic" than the llvm-lit idea (4 comments above)

aqjune added a subscriber: aqjune.Nov 5 2020, 6:11 AM

second, it's yet another contribution towards the giant pile of attributes that seem to have become necessary to do any work in LLVM

I don't think this is true. There are a few optimizations disabled (either fully or conditionally) because it is incorrect when its input is undefined (D85765, D85684, a few optimizations in SimplifyCFG). There are even optimizations that are incorrect w.r.t. undef but still running simply because removing all of them isn't practical.
Giving a guarantee that a value is well defined is very helpful because these optimizations can be revived.
Currently, there is no such guarantee in function boundaries because it is legal to pass undef to a function argument. This explains dead argument elimination and function outlining (which may introduce fn call with undef).