This is an archive of the discontinued LLVM Phabricator instance.

[BuildLibCalls] Add more noundef to library functions
ClosedPublic

Authored by aqjune on Aug 13 2020, 4:40 AM.

Details

Summary

This patch follows D85345 and adds more noundef attributes to return values/arguments of library functions
that are mostly about accessing the file system or processes.

A few functions like chmod or times use typedef mode_t and clock_t.
They are neither struct nor union, so they cannot contain undef even if they're lowered to iN in IR. So, it is fine to add noundef to them.

  • clock_t's actual type is size_t (C17, 7.27.1.3), so it isn't struct or union.
  • For mode_t, either int or long is used in practice because programmers use bit manipulation. So, I think it is okay that it's never aggregate in practice.

After this patch, the remaining library functions are those that eagerly participate in optimizations: they can be removed, reordered, or
introduced by a transformation from primitive IR operations.
For them, a few testings is needed, since it may not be valid to add noundef anymore even if C standard says it's okay.

Diff Detail

Event Timeline

aqjune created this revision.Aug 13 2020, 4:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 13 2020, 4:40 AM
aqjune requested review of this revision.Aug 13 2020, 4:40 AM
aqjune edited the summary of this revision. (Show Details)Aug 13 2020, 4:41 AM

Added reviewers who work on noundef

LGTM, although I'm just curious about the omissions here. why not mark something like round noundef?

LGTM, although I'm just curious about the omissions here. why not mark something like round noundef?

Thanks!

I think round can have noundef as well.
Calling round can set floating point exception (FE_INEXACT) if the result does not fit in the type, so round isn't expected to freely move around (e.g. hoist it out of a loop). So I don't think having noundef (which makes round raise UB in undef inputs) will block existing optimizations.

For noundefs for arithmetic functions, I'll make a separate patch (and another separate patch for malloc/free functions which was promised before).

LGTM

Would it be better / easier to apply this to all functions by default, and then opt out some? Which ones would we even need to opt out? I imagine that would only be a handful of "weird" functions like open/openat (for the mode argument), maybe ioctl/fcntl.

LGTM

Would it be better / easier to apply this to all functions by default, and then opt out some? Which ones would we even need to opt out? I imagine that would only be a handful of "weird" functions like open/openat (for the mode argument), maybe ioctl/fcntl.

There were a few tricky functions which could be introduced by optimizations. MergeICmps may transform icmps of loads into memcmp: since icmps of loads can yield poison, having noundef at memcmp's return value may introduce UB.
But, I agree that most of the functions will have noundef, and it should be applied by default after all noundef patches to library fns are landed. Noundefs are added one by one here for review process.

For functions that have conditionally unused arguments like open/openat - I think this is an interesting topic. If the mode argument is allowed to be undef, we can support dead argument elimination by analyzing whether it isn't creating a file. On the other hand, if the mode argument has noundef attribute, we can analyze that the value given to it is never undef.

aqjune added a comment.Sep 8 2020, 1:01 AM

Kindly ping

jdoerfert accepted this revision.Sep 8 2020, 7:36 AM

@eugenis accepted before and this looks OK to me.

This revision is now accepted and ready to land.Sep 8 2020, 7:36 AM
This revision was automatically updated to reflect the committed changes.