This is an archive of the discontinued LLVM Phabricator instance.

[llvm-symbolizer]Fix printing of malformed address values not passed via stdin
ClosedPublic

Authored by TH3CHARLie on Jan 6 2020, 8:28 PM.

Details

Summary

relates https://bugs.llvm.org/show_bug.cgi?id=44443

Adding missing newline when printing bad input values.

Fix testcase

Diff Detail

Event Timeline

TH3CHARLie created this revision.Jan 6 2020, 8:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 6 2020, 8:28 PM

Thanks for the patch! Could you add an additional test case for passing the values via a response file, rather than directly on the command-line, i.e. llvm-symbolizer <some args> @response.txt where response.txt contains "some text", inctwo and "some text2" etc.

Also, I think it would be good to expand invalid-input-address.test to cover this. In fact, you might want to put both your new case and something equivalent to the existing one in sym.test in the same place. That way, we're not lumping multiple different kinds of checks into the same test file.

llvm/test/tools/llvm-symbolizer/sym.test
32

You should probably add the same test case as you have above for llvm-addr2line (the GNU-compatible version of llvm-symbolizer).

llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp
332–338

LLVM style for comments is a leading capital letter and trailing full stop:
// Strip newline characters.

334

Don't forget that on some systems (i.e. Windows), new lines end in '\r\n'. You need to make sure both are stripped.

Also, I think it would be good to expand invalid-input-address.test to cover this. In fact, you might want to put both your new case and something equivalent to the existing one in sym.test in the same place. That way, we're not lumping multiple different kinds of checks into the same test file.

I'm not quite sure about this so I'd like to confirm my understanding again before updating. Feel free to correct me if I misunderstand this: adding new tests(instead of moving existing ones in sym.test) to invalid-input-address.test that checks for invalid value like the sometext used in sym.test. And the testcases cover command line, ordinary textual file and response file.

Also, I think it would be good to expand invalid-input-address.test to cover this. In fact, you might want to put both your new case and something equivalent to the existing one in sym.test in the same place. That way, we're not lumping multiple different kinds of checks into the same test file.

I'm not quite sure about this so I'd like to confirm my understanding again before updating. Feel free to correct me if I misunderstand this: adding new tests(instead of moving existing ones in sym.test) to invalid-input-address.test that checks for invalid value like the sometext used in sym.test. And the testcases cover command line, ordinary textual file and response file.

That's correct. It tests the three different ways you can get inputs into llvm-symbolizer (command-line, response file, stdin). You may want to include llvm-addr2line and llvm-symbolizer versions too.

TH3CHARLie updated this revision to Diff 236540.Jan 7 2020, 2:43 AM

Update according to comments

Also, I think it would be good to expand invalid-input-address.test to cover this. In fact, you might want to put both your new case and something equivalent to the existing one in sym.test in the same place. That way, we're not lumping multiple different kinds of checks into the same test file.

I'm not quite sure about this so I'd like to confirm my understanding again before updating. Feel free to correct me if I misunderstand this: adding new tests(instead of moving existing ones in sym.test) to invalid-input-address.test that checks for invalid value like the sometext used in sym.test. And the testcases cover command line, ordinary textual file and response file.

That's correct. It tests the three different ways you can get inputs into llvm-symbolizer (command-line, response file, stdin). You may want to include llvm-addr2line and llvm-symbolizer versions too.

Well, if so, doesn't it duplicates what has been tested in sym.test?

TH3CHARLie marked 3 inline comments as done.Jan 7 2020, 2:45 AM
TH3CHARLie added inline comments.Jan 7 2020, 2:48 AM
llvm/test/tools/llvm-symbolizer/Inputs/response_addr.txt
2 ↗(On Diff #236540)

I'll update a newline here

Well, if so, doesn't it duplicates what has been tested in sym.test?

Yes, but sym.test is not a very good test as it tries to test too many different things at once, which means it's hard to actually tell what it's testing or to maintain the test. I'd like to split it up/rewrite it/delete it etc at some point, although it's not high on my priority list. If we keep adding things to that test, it will actually make things worse. By putting this testing in the other test case, we have all our testing for this piece of functionality (i.e. what to do about unknown addresses) in one place, making it easy to find etc.

Does that make sense?

Well, if so, doesn't it duplicates what has been tested in sym.test?

Yes, but sym.test is not a very good test as it tries to test too many different things at once, which means it's hard to actually tell what it's testing or to maintain the test. I'd like to split it up/rewrite it/delete it etc at some point, although it's not high on my priority list. If we keep adding things to that test, it will actually make things worse. By putting this testing in the other test case, we have all our testing for this piece of functionality (i.e. what to do about unknown addresses) in one place, making it easy to find etc.

Does that make sense?

Thanks for the explanation! Now it's much more clear to me

jhenderson added inline comments.Jan 7 2020, 3:01 AM
llvm/test/tools/llvm-symbolizer/Inputs/response_addr.txt
2 ↗(On Diff #236540)

A more appropriate thing to do is have these tests generate their input files on the fly with something like the following:

# RUN: echo "some text" 0x40054d "some text2" > %t.rsp
# RUN: llvm-symbolizer @%t.rsp ...

A similar pattern can be used for the stdin reading.

(note: I'm not sure if those quotes will appear in the output or not - you'd best double-check that).

llvm/test/tools/llvm-symbolizer/sym.test
21–22

If you're adding the test cases in a new test, you don't need these cases in this test too.

TH3CHARLie updated this revision to Diff 236565.Jan 7 2020, 5:58 AM

update invalid-input-address.test

TH3CHARLie marked an inline comment as done.Jan 7 2020, 5:58 AM
jhenderson added inline comments.Jan 7 2020, 8:33 AM
llvm/test/tools/llvm-symbolizer/invalid-input-address.test
3–4

I don't think the "good" address is important here, so you can remove it. Also, the comment is referring to the original RUN line that was there before, so move the echo runs to nearer where they're used.

5–23

I think LARGE-ADDR would be preferable here.

7

You should add a comment before this line showing what these are intended to test.

7–9

print-address isn't important for these checks (no address is applicable for bad inputs).

Also, prefer '--' for options isntead of '-'.

Also, I'd recommend naming the CHECK prefixes here something like "LLVM" and llvm-addr2line's to "GNU".

15–17

Let's put the checks nearer their corresponding RUN lines, like this:

RUN: llvm-symbolizer ... | FileCheck --check-prefix=LARGE-ADDR %s

LARGE-ADDR: ...

RUN: llvm-symbolizer ... | FileCheck --check-prefix=BAD-INPUT %s

BAD-INPUT: ...
TH3CHARLie updated this revision to Diff 236733.Jan 7 2020, 6:02 PM
TH3CHARLie marked 5 inline comments as done.

Update invalid-input-address.test according to comments

jhenderson added inline comments.Jan 8 2020, 1:21 AM
llvm/test/tools/llvm-symbolizer/invalid-input-address.test
3

Could you change this to "--check-prefix" instead of "-check-prefix" whilst you're here, please? Same comment applies throughout the test.

17

Nit: add some spaces here, so that it lines up with the check below, like the LARGE-ADDR case:

LLVM-BAD-INPUT:      some text
LLVM-BAD-INPUT-NEXT: some text2
25–26

This was silly of me, sorry, I should have noticed earlier - you don't need separate CHECK sets for llvm-symbolizer and llvm-addr2line, because they test the same things:

Here's what I'd recommend:

RUN: llvm-symbolizer ... | FileCheck %s --check-prefix=BAD-INPUT
# <other llvm-symbolizer cases>

RUN: llvm-addr2line ... | FileCheck %s --check-prefix=BAD-INPUT
# <other llvm-addr2line cases>

BAD-INPUT: ...
TH3CHARLie updated this revision to Diff 236781.Jan 8 2020, 2:01 AM

response to reviews

TH3CHARLie marked 3 inline comments as done.Jan 8 2020, 2:01 AM
jhenderson accepted this revision.Jan 8 2020, 2:21 AM

Assuming the tests all pass, LGTM, thanks! Do you have commit access yet?

This revision is now accepted and ready to land.Jan 8 2020, 2:21 AM

Assuming the tests all pass, LGTM, thanks! Do you have commit access yet?

Thanks for the review! All tests passed and I've recently gained commit access and want to try to commit this. If there's nothing else, I will commit this

Assuming the tests all pass, LGTM, thanks! Do you have commit access yet?

Thanks for the review! All tests passed and I've recently gained commit access and want to try to commit this. If there's nothing else, I will commit this

Great, good luck! Only thing I would say is to make sure to not forget to include "Differential Revision: https://reviews.llvm.org/D72313" as a line in your commit message so that this review gets auto-closed.

This revision was automatically updated to reflect the committed changes.