This is an archive of the discontinued LLVM Phabricator instance.

[Diagnostics] Add error handling to FormatDiagnostic()
Needs ReviewPublic

Authored by jkorous on Sep 10 2018, 10:02 AM.

Details

Reviewers
arphaman
vsapsai
Summary

There seem to be couple implicit assumptions that might be better represented explicitly by asserts.

Diff Detail

Repository
rC Clang

Event Timeline

jkorous created this revision.Sep 10 2018, 10:02 AM

Regarding the asserts to catch potential problems, seems most of them are for buffer overflows. Aren't sanitizers catching those cases, specifically Address Sanitizer? I haven't checked, just seems it would be good to check buffer overflow automatically instead of using explicit asserts.

Also there are a few changes I wouldn't call NFC. Those change loop iteration from "iterator != end" to "iterator < end". As it is functionality change, I'd like to have tests to cover that. Also I've fixed a few bugs with going past the end of buffer and bugs were actually inside the loop, not with buffer range check. It is tempting to play safe but it has a risk of hiding real bugs.

lib/Basic/Diagnostic.cpp
804

For example, I wouldn't call this one NFC.

jkorous planned changes to this revision.Sep 10 2018, 11:36 AM

Hi Volodymyr,
Thanks for the feedback - interesting points!

I see your point regarding NFC - I am going to drop it as you are right.

Two things about sanitizers come to mind:

  1. You'd need to guarantee that you have all possible code paths (or at least those you are able to cover with asserts) covered in tests to be able to replace asserts with sanitizers. (I think that even if that would be feasible asserts would prove to be way simpler.)
  2. I prefer explicit assert right in the place where an assumption is made to test somewhere else as it make understanding the code much easier.

Those change loop iteration from "iterator != end" to "iterator < end". As it is functionality change, I'd like to have tests to cover that.

Technically you are right but I assume (ok, busted) that without any bug in the iteration this is NFC. I will try to look if I can find some simple input that would break current implementation.

Also I've fixed a few bugs with going past the end of buffer and bugs were actually inside the loop, not with buffer range check. It is tempting to play safe but it has a risk of hiding real bugs.

But that almost sounds as that we should write fragile code so bugs from other parts of codebase show up... Anyway, I will think about this a bit more, it's an interesting point.

lib/Basic/Diagnostic.cpp
804

You are right, I am gonna drop the NFC.

jkorous retitled this revision from [Diagnostics][NFC] Add error handling to FormatDiagnostic() to [Diagnostics] Add error handling to FormatDiagnostic().Sep 10 2018, 11:36 AM

I tried to come up with some input that breaks current implementation so I could add the test. Problem is that invalid memory read doesn't guarantee deterministic crash.
E. g. with this input the current implementation is definitely reading way past the buffer:

SmallVector<char, 1> IgnoreMe;
const char* Foo = "foo%";
const char* FooEnd = Foo + 4;
Diag.FormatDiagnostic(Foo, FooEnd, IgnoreMe);

...and it actually found some string there yet it didn't crash until it hit some unrelated assert

(lldb) p DiagStr
(const char *) $0 = 0x0000000100adc53b " SplatSizeInBits == 0 && \"SplatSizeInBits must divide width!\""
(lldb) p *DiagStr
(const char) $1 = ' '
(lldb) p DiagEnd
(const char *) $2 = 0x0000000100ad4155 "0"

The only reliable fail is passing nullptr which currently leads to SIGABRT (nullptr dereferenced)

SmallVector<char, 1> IgnoreMe;
const char* Foo = "foo";
Diag.FormatDiagnostic(Foo, nullptr, IgnoreMe);

I am reconsidering the necessity of such tests here. WDYT?

jkorous requested review of this revision.Sep 10 2018, 2:55 PM

Hi Volodymyr, do you think you might take another look?

It seems like there are too many asserts and they are too specific, they seem to be aimed at specific potential bugs. What about asserts that make sure we maintain some invariants? For example, check DiagStr < DiagEnd once in a loop instead of every place we increment DiagStr. Do you think it should catch the same problems but maybe a little bit later?

My suggestion is based on assumption that FormatDiagnostic works only with predefined format strings and not with strings provided by compiler users (arguments can be provided by users but not format strings).

I won't be able for some time to check the review again, so it's OK not to wait for my approval.