This is an archive of the discontinued LLVM Phabricator instance.

Detect source location overflow due includes
ClosedPublic

Authored by dnsampaio on Nov 13 2019, 7:45 AM.

Details

Summary

As discussed in http://lists.llvm.org/pipermail/cfe-dev/2019-October/063459.html
the overflow of the souce locations (limited to 2^31 chars) can generate all sorts of
weird things (bogus warnings, hangs, crashes, miscompilation and correct compilation).
In debug mode this assert would fail. So it might be a good start, as in PR42301,
to detect the failure and exit with a proper error message.

Diff Detail

Event Timeline

dnsampaio created this revision.Nov 13 2019, 7:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 13 2019, 7:45 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
dnsampaio marked an inline comment as done.Nov 13 2019, 7:49 AM
dnsampaio added inline comments.
clang/lib/Basic/SourceManager.cpp
587

For debug builds, I could not find any other way to not reach an assert failure other than exiting here. Perhaps there is a more llvm specific way to die? llvm_unreachable() ?

miyuki added inline comments.Nov 13 2019, 8:25 AM
clang/lib/Basic/SourceManager.cpp
587

There is a similar error err_file_too_large. How is it handled? llvm_unreachable is intended for code that is unreachable: it causes an assertion failure in debug builds and undefined behavior in release builds.

miyuki added inline comments.Nov 13 2019, 8:26 AM
clang/include/clang/Basic/DiagnosticCommonKinds.td
288

... >, DefaultFatal;

dnsampaio updated this revision to Diff 229113.Nov 13 2019, 8:38 AM
dnsampaio marked 2 inline comments as done.
  • Add ", DefaultFatal";
clang/lib/Basic/SourceManager.cpp
587

The err_file_too_large above in this file allows the function to finish and return further on. Here, we assert false before the message being printed.
If llvm_unreachable is undefined behavior, then I should stick to exit(1) ?

miyuki added inline comments.Nov 13 2019, 8:57 AM
clang/lib/Basic/SourceManager.cpp
587

I think you should return an invalid file ID (i.e. return FileID(); and propagate the error through the call stack. Clang can be used as a library, so you can't just call exit(), this would terminate the user's program.

dnsampaio updated this revision to Diff 229254.Nov 14 2019, 2:33 AM
dnsampaio marked an inline comment as done.
  • Return an invalid FileID instead of exiting.
dnsampaio marked an inline comment as done.Nov 14 2019, 2:34 AM
dnsampaio added inline comments.
clang/lib/Basic/SourceManager.cpp
587

It will still reach an false assert in builds that enable them. But in release it linger on and ends with the correct warning.

dnsampaio marked an inline comment as done.Nov 19 2019, 3:22 AM

Ping

miyuki added inline comments.Nov 19 2019, 3:33 AM
clang/lib/Basic/SourceManager.cpp
587

Has this problem been fixed?

dnsampaio marked 2 inline comments as done.Nov 19 2019, 5:50 AM

Yes. It does return a non-valid FileID, and in builds without assert you get the expected error message.

Yes. It does return a non-valid FileID, and in builds without assert you get the expected error message.

I was asking about "It will still reach an false assert in builds that enable them". Has this been fixed?

dnsampaio marked an inline comment as done.Nov 19 2019, 6:27 AM
dnsampaio added inline comments.
clang/lib/Basic/SourceManager.cpp
587

No. Don't intend to fix the in this patch, as they would be failing at the moment anyway for such cases.

miyuki requested changes to this revision.Nov 19 2019, 6:38 AM

This essentially means that if there is a problem related to this change, say something goes wrong after return FileID(); it would be impossible to debug until the corresponding assertions are fixed. Also the patch lacks tests (and adding a test is impossible because it would fail in debug builds).

This revision now requires changes to proceed.Nov 19 2019, 6:38 AM

Git diff 979ae80af7ec49624b932954d22cb91900f17121 did not send a test as well. Feel free to send me a reasonable sized reproducer, the one I have is about 36MB. Don't think it will be that well received.

Git diff 979ae80af7ec49624b932954d22cb91900f17121 did not send a test as well. Feel free to send me a reasonable sized reproducer, the one I have is about 36MB. Don't think it will be that well received.

#!/usr/bin/env python3

def gen_100k():
    s = '/*' + (' '*95) + '*/\n'
    with open('inc100k.h', 'wt') as out_f:
        for _ in range(0, 1000):
            out_f.write(s)

def gen_100M():
    s = '#include "inc100k.h"\n'
    with open('inc100M.h', 'wt') as out_f:
        for _ in range(0, 1000):
            out_f.write(s)

def gen_main():
    s = '#include "inc100M.h"\n'
    with open('reproducer.c', 'wt') as out_f:
        for _ in range(0, 25):
            out_f.write(s)

def main():
    gen_100k()
    gen_100M()
    gen_main()

if __name__ == '__main__':
    main()

Just wondering, I don't fully understand why is that important from the point I do a return FileID();. The critical error has already been inserted to the error list. clang shall either die due an assert in debug, or with a nice message in release. Unless clang is all broken, and the error message would be corrupted, which I don't believe is the case.

miyuki added a comment.Dec 2 2019, 7:58 AM

clang shall either die due an assert in debug [...]

I think I have already mentioned the reasons why I don't think this is acceptable:

  • it would break tests, as they are usually run with enabled assertions (and since this diagnostic is quite easy to test, it should be tested)
  • it would make the code harder to debug (i.e. if someone finds another problem related to large includes, they would first need to fix the failing assertions before they could start debugging their problem)

IMHO we should discuss this in person, that would be quicker.

Rebased

Created a early exit when a source location overflow is detected,
avoiding further processing or reaching other assert failures.
For that, it was required to create a 'Failure' kind for includes,
as well add return points along the callstack.

Added test that works in both release and debug modes.

miyuki accepted this revision.Jan 17 2020, 11:00 AM

LGTM, but since we both work at Arm, let's wait a week for other folks to chime in if they have any objections.

This revision is now accepted and ready to land.Jan 17 2020, 11:00 AM
  • Fixed to return correct token
This revision was automatically updated to reflect the committed changes.