This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Implement --prefix option
ClosedPublic

Authored by tinti on Jul 31 2020, 5:26 AM.

Details

Summary

[llvm-objdump] Implement --prefix option

The prefix given to --prefix will be added to GNU absolute paths when
used with --source option (source interleaved with the disassembly).

This matches GNU's objdump behavior.

GNU and C++17 rules for absolute paths are different.

Depends on D87667.
Fixes PR46368.

Diff Detail

Event Timeline

tinti created this revision.Jul 31 2020, 5:26 AM
tinti requested review of this revision.Jul 31 2020, 5:27 AM

Thanks for looking at this! I don't have time to review today, but will try to do so early next week. In the meantime, please could you reupload your patch with full context (see https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface for how to do this).

rengolin added inline comments.Jul 31 2020, 5:50 AM
llvm/test/tools/llvm-objdump/X86/disassemble-archive-with-modified-source-path.ll
10 ↗(On Diff #282201)

These are Unix tools and not always available on Windows (PowerShell emulate some but not all behaviour). Do we need a tag to make sure it only runs on platforms we know will cope with the command lines above?

I'm not sure anyone cares about GNU compatibility on Windows anyway. :)

tinti added inline comments.Jul 31 2020, 8:52 AM
llvm/test/tools/llvm-objdump/X86/disassemble-archive-with-modified-source-path.ll
10 ↗(On Diff #282201)

Agreed. But I believe in this case I need to fix.

Current:

  • myprefixC:/ws/w16c2-1

Expected:

  • C:/myprefix/ws/w16c2-1

    8: c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\llvm-objdump.exe: warning: 'a.o': failed to find source myprefixC:/ws/w16c2-1/llvm-project/premerge-checks/build/test/tools/llvm-objdump/X86/Output/disassemble-archive-with-modified-source-path.ll.tmp/subdir\a.c
MaskRay requested changes to this revision.Jul 31 2020, 11:37 AM
MaskRay added inline comments.
llvm/test/tools/llvm-objdump/X86/disassemble-archive-with-modified-source-path.ll
1 ↗(On Diff #282201)

Please don't just copy disassemble-archive-with-source.ll

The test is named disassemble-archive-with-modified-source-path.ll
I wonder whether using an archive is significant to the test. If not, just use a plain .o

If an archive can trigger a slightly different code path you also want to test, you can add a second llvm-objdump RUN line for an archive.

Consider reusing an existing .ll file and not duplicating LLVM IR, e.g. source-interleave-x86_64.test

This revision now requires changes to proceed.Jul 31 2020, 11:37 AM
tinti added inline comments.Jul 31 2020, 5:33 PM
llvm/test/tools/llvm-objdump/X86/disassemble-archive-with-modified-source-path.ll
1 ↗(On Diff #282201)

Please don't just copy disassemble-archive-with-source.ll

The test is named disassemble-archive-with-modified-source-path.ll
I wonder whether using an archive is significant to the test. If not, just use a plain .o

Understood. I believe the both .o and .c are needed.

If an archive can trigger a slightly different code path you also want to test, you can add a second llvm-objdump RUN line for an archive.

Consider reusing an existing .ll file and not duplicating LLVM IR, e.g. source-interleave-x86_64.test

I will. The source-interleave-x86_64.test has source-interleave-x86_64.c too.

tinti updated this revision to Diff 282363.Jul 31 2020, 6:55 PM

Expected to fix failed test case on Windows.

Use source-interleave-x86_64.test as base implementation for the test case.

tinti updated this revision to Diff 282411.Aug 1 2020, 12:27 PM

Make old test unsupported for Windows.
Add new test for Windows.
Reword --prefix help text.

Please remember to update the documentation located at llvm/docs/CommandGuide for llvm-objdump, to include the new option.

llvm/test/tools/llvm-objdump/X86/disassemble-archive-with-modified-source-path.ll
1 ↗(On Diff #282201)

I will. The source-interleave-x86_64.test has source-interleave-x86_64.c too.

That test to me sounds like the right one to be basing your test off-of. I agree the archive part doesn't sound important.

10 ↗(On Diff #282201)

sed is one of the required tools to run the lit test suite on Windows. There are multiple other tests that have similar tools.

I care about GNU compatibility on Windows - we ship LLVM-based tools for Windows developers such as llvm-objdump, and the option seems useful to me!

tinti added a comment.Aug 3 2020, 6:27 PM

Please remember to update the documentation located at llvm/docs/CommandGuide for llvm-objdump, to include the new option.

My bad.

I believe that there are no more docs to be updated.

I have also removed the help text --source from the command line.
Both --source or --line-numbers will use the behavior implemented by --prefix.

I believe this is correct because:

GNU's objdump will not run with --line-numbers only.
LLVM's objdump assumes --source with --line-numbers.

I found a bug also in the code. By fixing Windows implementation I made it wrong. --prefix my should prepend only my and not /my.

I will compile binutils for Windows and do more tests.

Hold for a bit please but I am adding the new patch with a FIXME anyway.

tinti updated this revision to Diff 282782.Aug 3 2020, 6:28 PM
  • Add documentation.
  • Add a FIXME on Windows.
  • Fix a forced absolute paths bug on Unix-lie.
MaskRay added inline comments.Aug 3 2020, 10:22 PM
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths-windows.test
1 ↗(On Diff #282782)

Tests which only run on Windows should be added very carefully. Many developers don't have Windows testing method. When they do refactoring, they can easily neglect Windows-only tests. /\ has some pain but you can get round with {{[/\\]}}

llvm/tools/llvm-objdump/llvm-objdump.cpp
1035

What if LineInfo.FileName is relative?

Hi @tinti - please remember to upload with full diff context. I previously pointed you at the page on how to do so. Not including full context makes things harder to review.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
7 ↗(On Diff #282782)

In both tests, it probably makes sense to only have a --prefix case - there is existing testing for --source so we don't need to show that behaviour works. I also think you need a case where you have --prefix and it allows you to find the path. Perhaps you could use just /Inputs in the sed line, and then use --prefix to add %/p to the path.

9 ↗(On Diff #282782)

When using a mixture of CHECK and CHECK-NEXT lines, it's a little more readable to me if you indent the CHECK lines so that their value start lines up with those for CHECK-NEXT:

; CHECK:      0000000000000000 <foo>:
; CHECK-NEXT: ; int foo() {
tinti added inline comments.Aug 4 2020, 3:56 AM
llvm/tools/llvm-objdump/llvm-objdump.cpp
1035

--prefix should only touch absolute paths.

I have made a wrong assumption that --prefix maps absolute paths to others absolute path.

That is not the case --prefix maps absolute path to any type of path.

tinti added inline comments.Aug 4 2020, 4:02 AM
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths-windows.test
1 ↗(On Diff #282782)

I just found the error because the buildbot pointed it.

I don't have too a Windows testing method. I am thinking on removing this test and add only a comment depending on the behavior of GNU's objdump on Windows.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
7 ↗(On Diff #282782)

Agreed. I will try.

9 ↗(On Diff #282782)

Agreed.

tinti added a comment.Aug 4 2020, 6:32 PM

Hi @tinti - please remember to upload with full diff context. I previously pointed you at the page on how to do so. Not including full context makes things harder to review.

I will do it properly now. My bad.

tinti updated this revision to Diff 283794.Aug 6 2020, 7:13 PM

Add documentation.
Remove Windows test.
Implement remove trailing separators on prefix.
Add a real usage test case (wrong prefix fixed by --prefix).

jhenderson added inline comments.Aug 10 2020, 1:38 AM
llvm/docs/CommandGuide/llvm-objdump.rst
172 ↗(On Diff #283794)

I'd change this text to "When disassembling with the --source option, add prefix to..." (where --source is formatted in the same way as other option references - see the reference e.g. to --disassemble in the --source help text).

I'm also a little confused what is meant by "absolute" paths here. Is it literally gluing the specified prefix onto the front of the path (e.g. "/foo/bar" becomes "prefix/foo/bar")?

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
2 ↗(On Diff #283794)

Is this option not intended to be supported on Windows? I couldn't figure out GNU objdump's behaviour for Windows, so it's possible it doesn't work, but it would be nice, if we could get it to. One option would be to insert the prefix after the drive letter. Thus C:\foo would become C:\prefix\foo. It might be necessary to look at GNU's source code to identify its Windows behaviour.

10–13 ↗(On Diff #283794)

As I said before, I don't think you need the cases where llvm-objdump is called without --prefix. They are not necessary because --source is tested elsewhere. Keeping them in complicates the test making it harder to see what is actually being tested, and also slows it down.

However, I can potentially be persuaded otherwise - I just need to see a justification for this.

17 ↗(On Diff #283794)

What is the purpose of the {{.*}} following myprefix here? If it's representing %/p, then you can use FileCheck's -D option to ensure the output is correct. Furthermore, you can use the same option for the file name. I believe something like the following would work:

; RUN: ... | FileCheck %s --check-prefix CHECK-PREFIX -DFILE=%t.o -DROOT=%/p
; CHECK-PREFIX:      0000000000000000 <foo>:
; CHECK-PREFIX-NEXT: warning: '[[FILE]]': failed to find source myprefix[[ROOT]]/Inputs/source-interleave-x86_64.c
31–34 ↗(On Diff #283794)

I'd make this case the "primary" case (i.e. first in the file), since it is the actually useful one.

You also don't need a unique set of check lines - they are identical to other cases, and thus can be reused (try removing the --check-prefix command - you will see the test still passes).

39–41 ↗(On Diff #283794)

These three checks aren't being used - you are using the CHECK cases for this test case (and thus showing that the behaviour is the same as if --prefix wasn't specified). You probably want a comment saying what this case is actually testing (i.e. that --prefix / is the same as without specifying --prefix).

45 ↗(On Diff #283794)

Why does this case not need the {{.*}} that is present in the first "myprefix" case?

llvm/tools/llvm-objdump/llvm-objdump.cpp
1033

I don't think you need this comment. Most options are GNU compatible, but even if they weren't, we probably wouldn't highlight them.

1035–1037

Can you get rid of the llvm:: prefix here?

2900–2903

I think you can get rid of the outer braces here.

2901

Can you get rid of the llvm:: prefix here? We have a using namespace llvm directive at the start of this function.

tinti added a comment.Aug 10 2020, 4:47 AM
llvm/docs/CommandGuide/llvm-objdump.rst
172 ↗(On Diff #283794)

I'd change this text to "When disassembling with the --source option, add prefix to..." (where --source is formatted in the same way as other option references - see the reference e.g. to --disassemble in the --source help text).

I tried to avoid this here because I saw other examples that don't do this here. I will change.

I'm also a little confused what is meant by "absolute" paths here. Is it literally gluing the specified prefix onto the front of the path (e.g. "/foo/bar" becomes "prefix/foo/bar")?

Yes. It is a dead simple prepend.

The --prefix will not touch a path like ../foo/filename.c. It only touches absolute paths like /foo/filename.c.

So with --prefix bar:

  1. ../foo/filename.c stays the same way.
  2. /foo/filename.c becomes bar/foo/filename.c.
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
2 ↗(On Diff #283794)

Is this option not intended to be supported on Windows? I couldn't figure out GNU objdump's behaviour for Windows, so it's possible it doesn't work, but it would be nice, if we could get it to. One option would be to insert the prefix after the drive letter. Thus C:\foo would become C:\prefix\foo. It might be necessary to look at GNU's source code to identify its Windows behaviour.

It is intended to be supported on Windows. The behavior implemented up to now is correct and matches GNU objdump. On Windows path C:\foo with prefix bar is supposed and will to become barC:\foo. It looks wrong but it is not. Now I understand why it is supposed to be that way. Please read below.

There is another option that I have already implemented but not submitted called --prefix-strip. --prefix-strip receives an integer and strip parts by the separators from the original absolute path. So to make C:\foo\filename.c become C:\prefix\foo\filename.c one need to use --prefix C:\prefix\ and --prefix-strip 1.

10–13 ↗(On Diff #283794)

The reason I keep that was to make clear the flow of changes done by --prefix.

How the untouched output was and what has changed. This can be changed to a comment too.

17 ↗(On Diff #283794)

Is to ensure that the output is correct.

I did not know this option on FileCheck. I will use that one.

31–34 ↗(On Diff #283794)

Ok.

I will change.

45 ↗(On Diff #283794)

Because in %t2.o I did the incomplete prefix (without %p).

Complete and correct:
; RUN: sed -e "s,SRC_COMPDIR,%/p/Inputs,g" %p/Inputs/source-interleave.ll > %t.ll

Incomplete:
; RUN: sed -e "s,SRC_COMPDIR,/Inputs,g" %p/Inputs/source-interleave.ll > %t2.ll

llvm/tools/llvm-objdump/llvm-objdump.cpp
1033

Ok.

I will remove.

1035–1037

Ok.

2900–2903

Ok.

2901

Ok.

tinti updated this revision to Diff 284953.Aug 11 2020, 7:32 PM

Reword llvm/docs/CommandGuide/llvm-objdump.rst
Keep Windows as unsupported. [1]
Use -D in FileCheck for better testing.
Put primary test on top of the file. The primary use is to fix a wrong or incomplete path.
Fix missing --check-prefix on trailing test.
Remove comments about compatibility with GNU's objdump.
Remove namespace llvm:: where it is not needed.
Remove braces.
Add test comments.
Remove test checks that can be removed. [2]

[1] I believe this code will work fine on Windows but I have not tested.
[2] Tests are running faster.

tinti marked 13 inline comments as done.Aug 11 2020, 7:34 PM

I'm out of time. There are probably some more bits in the test for me to look at, but here are some comments to keep you going!

llvm/docs/CommandGuide/llvm-objdump.rst
170 ↗(On Diff #284953)

Just realised that you need to show that this option takes an argument.

172 ↗(On Diff #284953)

A couple of suggestions to further clarify the text.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
2 ↗(On Diff #283794)

Thanks, that makes sense to me. If you remove the UNSUPPORTED line, the per-merge bot might tell you whether it works on Windows. I also primarily develop on Windows, so can double-check too if you want.

My suspicion is that this might not work, due to how --prefix and absolute paths work on Windows. Once you've implemented --prefix-strip, that one can be supported in a separate test, which will work on Windows then hopefully.

tinti updated this revision to Diff 285905.Aug 16 2020, 3:11 PM

Fix documentation.

tinti marked 4 inline comments as done.Aug 16 2020, 3:16 PM
tinti added inline comments.
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
2 ↗(On Diff #283794)

As far as I understood it should pass.

Would be awesome if you could test and remove UNSUPPORTED if it pass.

jhenderson added inline comments.Aug 19 2020, 2:01 AM
llvm/docs/CommandGuide/llvm-objdump.rst
172 ↗(On Diff #285905)
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
15 ↗(On Diff #285905)

Hmmm... this is interesting. With llvm-objdump, this test case fails for Windows. However, if I replace llvm-objdump with GNU objdump and run the same invocation, it works. I dug into this a little. The problem is with the meaning of "absolute paths" for Windows - it's ambiguous. C++17's filesystem::is_absolute, whose behaviour is matched by sys::path::is_absolute essentially says that the path must be unambiguous (i.e. that there must be no drive letter for Windows). However, other languages, e.g. Python, treat a path as absolute with or without the drive letter. GNU objdump appears to follow the Python etc approach, so '/Inputs\foo' is treated as an absolute path and is prefixed.

I think we need extra logic to get this to work on Windows, to ensure GNU compatibility. Basically, you need to identify the case where a path is absolute within the current drive (i.e. starts with '/', though it might be better, if possible to rely on the underlying functions to do that), and treat that as absolute. What you can't do is modify the behaviour of sys::path::is_absolute itself.

2 ↗(On Diff #283794)

I've tested, and it doesn't work - see below for why.

jhenderson added inline comments.Aug 19 2020, 2:01 AM
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
1 ↗(On Diff #285905)

Please use double coment markers for true comments (as opposed to RUN/CHECK etc lines), i.e. ;;. This helps them stand out from the actual functional lines of the test. Older tests may not follow this approach, but we can for newer ones.

11–12 ↗(On Diff #285905)

No need for a double blank line here and below - just use a single one and make the comments stand out more as suggested above.

25 ↗(On Diff #285905)
28–29 ↗(On Diff #285905)

You can avoid the duplicate check pattern here, since the output should be the same as earlier.

Sorry for the delay in getting back to you - I wasn't working on Monday and had a busy day yesterday.

tinti marked an inline comment as done.Aug 24 2020, 2:22 PM
tinti added inline comments.
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
15 ↗(On Diff #285905)

I did a small test.

bool is_absolute2(const llvm::Twine &path, llvm::sys::path::Style style) {
  if (style == llvm::sys::path::Style::windows) {
    llvm::SmallString<128> path_storage;
    llvm::StringRef p = path.toStringRef(path_storage);
    
    if (p.front() == '/' || p.front() == '\\')
      return true;

    if (p.size() >= 2 && (p[0] && p[1] == ':'))
      return true;
  }

  return llvm::sys::path::is_absolute(path, style); 
}

bool is_separator2(char value, llvm::sys::path::Style style) {
  // not needed
  return llvm::sys::path::is_separator(value, style);
}

...

  std::string test1("/input/foo.c");
  std::string test2("c:/input/foo.c");
  std::string test3("\\\\input\\foo.c");
  std::string test4("c:/");
  std::string test5("c:");
  std::string test6("//");
  std::string test7("\\");
  std::string test8(":");

  std::vector<std::string> tests;
  tests.push_back(test1);
  tests.push_back(test2);
  tests.push_back(test3);
  tests.push_back(test4);
  tests.push_back(test5);
  tests.push_back(test6);
  tests.push_back(test7);
  tests.push_back(test8);

  for (auto it = tests.begin(); it < tests.end(); ++it) {
    llvm::outs() << "is_absolute?(\"" << *it << "\")" << "\n";
    llvm::outs() << "  gnu::windows:   " << IS_DOS_ABSOLUTE_PATH(it->c_str()) << "\n";
    llvm::outs() << "  llvm::windows:  " << llvm::sys::path::is_absolute(*it, llvm::sys::path::Style::windows) << "\n";
    llvm::outs() << "  llvm2::windows: " << is_absolute2(*it, llvm::sys::path::Style::windows) << "\n";

    llvm::outs() << "  gnu::posix:     " << IS_UNIX_ABSOLUTE_PATH(it->c_str()) << "\n";
    llvm::outs() << "  llvm::posix:    " << llvm::sys::path::is_absolute(*it, llvm::sys::path::Style::posix) << "\n";
    llvm::outs() << "  llvm2::posix:   " << is_absolute2(*it, llvm::sys::path::Style::posix) << "\n";
    
    llvm::outs() << "is_separator?(\"" << *it << "\")" << "\n";
    llvm::outs() << "  gnu::windows:   " << IS_DOS_DIR_SEPARATOR(it->back()) << "\n";
    llvm::outs() << "  llvm::windows:  " << llvm::sys::path::is_separator(it->back(), llvm::sys::path::Style::windows) << "\n";
    llvm::outs() << "  llvm2::windows: " << is_separator2(it->back(), llvm::sys::path::Style::windows) << "\n";

    llvm::outs() << "  gnu::posix:     " << IS_UNIX_DIR_SEPARATOR(it->back()) << "\n";
    llvm::outs() << "  llvm::posix:    " << llvm::sys::path::is_separator(it->back(), llvm::sys::path::Style::posix) << "\n";
    llvm::outs() << "  llvm2::posix:   " << is_separator2(it->back(), llvm::sys::path::Style::posix) << "\n";
    llvm::outs() << "\n";
  }

The output is:

is_absolute?("/input/foo.c")
  gnu::windows:   1
  llvm::windows:  0
  llvm2::windows: 1
  gnu::posix:     1
  llvm::posix:    1
  llvm2::posix:   1
is_separator?("/input/foo.c")
  gnu::windows:   0
  llvm::windows:  0
  llvm2::windows: 0
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0

is_absolute?("c:/input/foo.c")
  gnu::windows:   1
  llvm::windows:  1
  llvm2::windows: 1
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0
is_separator?("c:/input/foo.c")
  gnu::windows:   0
  llvm::windows:  0
  llvm2::windows: 0
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0

is_absolute?("\\input\foo.c")
  gnu::windows:   1
  llvm::windows:  1
  llvm2::windows: 1
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0
is_separator?("\\input\foo.c")
  gnu::windows:   0
  llvm::windows:  0
  llvm2::windows: 0
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0

is_absolute?("c:/")
  gnu::windows:   1
  llvm::windows:  1
  llvm2::windows: 1
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0
is_separator?("c:/")
  gnu::windows:   1
  llvm::windows:  1
  llvm2::windows: 1
  gnu::posix:     1
  llvm::posix:    1
  llvm2::posix:   1

is_absolute?("c:")
  gnu::windows:   1
  llvm::windows:  0
  llvm2::windows: 1
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0
is_separator?("c:")
  gnu::windows:   0
  llvm::windows:  0
  llvm2::windows: 0
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0

is_absolute?("//")
  gnu::windows:   1
  llvm::windows:  0
  llvm2::windows: 1
  gnu::posix:     1
  llvm::posix:    1
  llvm2::posix:   1
is_separator?("//")
  gnu::windows:   1
  llvm::windows:  1
  llvm2::windows: 1
  gnu::posix:     1
  llvm::posix:    1
  llvm2::posix:   1

is_absolute?("\")
  gnu::windows:   1
  llvm::windows:  0
  llvm2::windows: 1
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0
is_separator?("\")
  gnu::windows:   1
  llvm::windows:  1
  llvm2::windows: 1
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0

is_absolute?(":")
  gnu::windows:   0
  llvm::windows:  0
  llvm2::windows: 0
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0
is_separator?(":")
  gnu::windows:   0
  llvm::windows:  0
  llvm2::windows: 0
  gnu::posix:     0
  llvm::posix:    0
  llvm2::posix:   0

GNU and LLVM don't agree with /input/foo.c, c:, / and \. The is_absolute2 function agrees with GNU. I believe it is enough for this patch.

Working on it!

tinti updated this revision to Diff 287538.Aug 24 2020, 5:56 PM
tinti marked 6 inline comments as done.

Use ;; for comments.
Remove double blanks.
Remove unsupported for Windows [1].
Add helper function of objdump::is_absolute().
Remove duplicated checks.
Rename TRAILING check.

[1] Not tested.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
15 ↗(On Diff #285905)

I believe this time it will work. I have added a helper function bool objdumpIsAbsolute(const Twine &path, sys::path::Style style = sys::path::Style::native); at llvm-objdump.cpp. At first I have added it on llvm-objdump.h but I changed now. Should I keep it this way?

About Windows, would you mind to test again?

25 ↗(On Diff #285905)

Ok.

At least it appears that GNU's and LLVM's objdump agree on is_separator.

28–29 ↗(On Diff #285905)

I think I understand now what you were trying to say in the last replies.

I have changed to CHECK-MISS-PREFIX-FIX which produces the same output as the removed CHECK-TRAILING1.

jhenderson added inline comments.Aug 25 2020, 1:36 AM
llvm/docs/CommandGuide/llvm-objdump.rst
172 ↗(On Diff #285905)

Ping this comment.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
15 ↗(On Diff #285905)

In general terms, put the function near to where it is actually used. Where you've got it is fine.

7 ↗(On Diff #287538)

I'm not sure this or the above similar comment line are useful - this test doesn't have any control over how source-interleave.ll is generated, so the comment could become misleading. It's clear from hthe sed line that you are using the source-interleave.ll input too. No need to restate it.

29 ↗(On Diff #287538)

You don't need CHECK-TRAILING. You can use CHECK-BROKEN-PREFIX with -D ROOT=''.

llvm/tools/llvm-objdump/llvm-objdump.cpp
424

We tend to refer to the GNU tools as "GNU objdump", "GNU objcopy" etc (without the "'s")

427–428
429–430

I'm not sure this last sentence makes sense to me. Maybe you can just delete it.

432

Please change all your variable names in this function to use LLVM naming styles. See the Coding Standards for the details.

433

You never specify the style parameter so unsurprisingly, the check for style == sys::path::Style::windows doesn't work, and the test still fails on Windows. You can see this by looking at the test failures listed by the pre-merge checks near the top of this page. You might want a call to real_style.

438

I think this will crash if LineInfo.FileName is empty. This might be caused by a malformed input, but we shouldn't crash still.

You also should consider crafting a test for this case.

Also, use sys::path::is_separator instead of directly querying the character.

tinti marked 4 inline comments as done.Aug 25 2020, 9:13 AM
tinti added inline comments.
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
29 ↗(On Diff #287538)

In this case I believe I need.

CHECK-BROKEN-PREFIX uses %t2.
CHECK-TRAILING uses %t.

llvm/tools/llvm-objdump/llvm-objdump.cpp
433

You are correct. I miss understood the meaning of Style::native. I thought it would be set to either Style::posix or Style::native.

Unfortunately real_style is restricted to https://github.com/llvm-mirror/llvm/blob/master/lib/Support/Path.cpp#L39.

Thinking about it I can:

  • Add a #ifdef _WIN32 in llvm-objdump. I am really not willing to go with this one but I have seen in other tools such as llvm-objcopy [1].
  • Make real_style public or be able to query the style outside llvm::sys::path::* with a helper function. Doesn't look great also. It seems to be designed this way to avoid such queries.
  • Try another way to answer "was I built on Windows?"

LLDB has this issue too since GDB does use IS_DIR_ABSOLUTE [3]. It solves it in several ways (usage [4] and [5]):

With all this in mind I think I would go with Triple(sys::getProcessTriple()).isOSWindows() implementation. Correct me if I am wrong but on GNU you can only have one binary for one target whereas LLVM one may have one binary to many targets (I don't know if this applies here).

Taking into account CROSS-COMPILING between Linux and Windows: My guess would be to behave based on the running target rather than the compile target.

Out of ideas. It is becoming a bit complex.

Two more things. Should I care about __CYGWIN__, __MSDOS__, __OS2__? GNU tools care [2].

I would answer no. Maybe only care for __CYGWIN__.

git grep __CYGWIN__ | wc -l
43
git grep __MSDOS__ | wc -l
3
git grep __OS2__ | wc -l
3

Should I use isAlpha() like LLDB? There are some restriction about <drive letter>: format [6].

[1] See https://github.com/llvm/llvm-project/commit/6dc59629570ae6fe1836cff3ff53912917d17af7#diff-a074a8872d4fe193550d7a651ea18cabR229.
[2] https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=include/filenames.h;hb=1ab8d928977f4d9f137f972d03e079555d0f29fa#l35
[3] https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/buildsym.c;hb=1ab8d928977f4d9f137f972d03e079555d0f29fa#l519
[4] https://github.com/llvm/llvm-project/blob/e22f0dabcf976668d35595e17645095e97a8177a/lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp#L653.
[5] https://github.com/llvm/llvm-project/blob/bc9b6b33a0d5760370e72ae06c520c25aa8d61c6/lldb/source/Utility/FileSpec.cpp#L59
[6] https://en.wikipedia.org/wiki/Drive_letter_assignment#Common_assignments

tinti updated this revision to Diff 287689.Aug 25 2020, 9:16 AM
tinti marked an inline comment as done.

Just adding my work so far.

tinti added a subscriber: lhames.Aug 25 2020, 7:34 PM
tinti added inline comments.
llvm/tools/llvm-objdump/llvm-objdump.cpp
433

Even more complex: https://reviews.llvm.org/D34446

Please see @lhames last comment.

tinti updated this revision to Diff 287815.Aug 25 2020, 7:35 PM

Work done so far.

Having given it some thought, I think the right thing to do might be to put a method alongside the existing is_absolute method called something like either is_rooted, or is_gnu_style_absolute or similar. That function would then look something like this:

bool is_rooted(StringRef Path, Style St) {
  if (is_absolute(Path, St))
    return true;
  if (real_style(St) == windows)
    return !Path.empty() && is_separator(Path.front(), St);
  return false;
}

I would probably recommend that that be added in a separate patch, with dedicated gtest unit tests and reviewers taken from among those who have recently worked in this area (plus myself). You would then rebase this patch on top of that one.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
32 ↗(On Diff #287815)

Nit: please remember to end comments with a '.'.

29 ↗(On Diff #287538)

CHECK-BROKEN-PREFIX and CHECK-TRAILING do NOT use %t or %t2. What do you think the -DFILE bit of the FileCheck invocation does?

Please take this in the well-meaning way it's intended, but I'd highly recommend you read through the FileCheck documentation at https://llvm.org/docs/CommandGuide/FileCheck.html. It seems that you may have several misunderstandings about how FileCheck works.

llvm/tools/llvm-objdump/llvm-objdump.cpp
3011–3014

I don't think you need this additional complexity. is_separator uses native style by default, which should be sufficient for our purposes.

tinti added a comment.Aug 26 2020, 4:31 PM

Having given it some thought, I think the right thing to do might be to put a method alongside the existing is_absolute method called something like either is_rooted, or is_gnu_style_absolute or similar. That function would then look something like this:

bool is_rooted(StringRef Path, Style St) {
  if (is_absolute(Path, St))
    return true;
  if (real_style(St) == windows)
    return !Path.empty() && is_separator(Path.front(), St);
  return false;
}

I would probably recommend that that be added in a separate patch, with dedicated gtest unit tests and reviewers taken from among those who have recently worked in this area (plus myself). You would then rebase this patch on top of that one.

Ok. I will be working on that.

I prefer using is_gnu_style_absolute. It makes clear that this function was created for compatibility reasons.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
32 ↗(On Diff #287815)

Ok.

29 ↗(On Diff #287538)

You can be crystal clear with me.

I red again the docs. Now it is more clear what you were saying.

There is no need to use CHECK-TRAILING: warning: ... because I can reuse CHECK-BROKEN-PREFIX: warning: ... with different parameters from -D.

My understanding was wrong. Given a RUN command, I thought FileCheck would only look for CHECKs after it.

llvm/tools/llvm-objdump/llvm-objdump.cpp
3011–3014

Ok.

jhenderson added inline comments.Aug 27 2020, 12:26 AM
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths.test
29 ↗(On Diff #287538)

Yes, exactly. Perhaps the best way to think about it is to imagine all the CHECK patterns in a separate file. FileCheck is just another process, which happens to take an input file to parse patterns from, and another input (via stdin or --input-file), which contains the text to check. By convention, the same file is used for the lit directives and the FileCheck patterns (hence '%s' in the FileCheck invocation, which means "this file"), but the lit directives (e.g. RUN lines) and FileCheck directives (CHECK lines) are actually completely unrelated and parsed by two separate processes, and thus have no bearing on one another.

I'm not familiar with that testing, sorry. You'll need to get someone else who knows that area to answer you (based on what is already there, I'm guessing no, since the existing ones don't already, but I don't really know).

tinti marked 9 inline comments as done.Oct 5 2020, 6:44 PM
tinti marked an inline comment as done.Oct 5 2020, 7:49 PM

Hi @jhenderson,

I believe I should create more tests addressing:

  • On Windows trailing '\' should be removed.
  • On Posix trailing '\' should be kept.

As far as I know I would need to duplicate source-interleave-absolute-paths.test and add REQUIRES: x (or UNSUPPORTED: x).

Is there a better way of doing it?

Hi @jhenderson,

I believe I should create more tests addressing:

  • On Windows trailing '\' should be removed.
  • On Posix trailing '\' should be kept.

As far as I know I would need to duplicate source-interleave-absolute-paths.test and add REQUIRES: x (or UNSUPPORTED: x).

Is there a better way of doing it?

I believe where there's a difference in behaviour based on the OS, you should have two separate tests, which use REQUIRES and UNSUPPORTED accordingly. I'm not aware of a way within a single test to customise what is checked for otherwise.

tinti updated this revision to Diff 296828.Oct 7 2020, 4:29 PM
tinti edited the summary of this revision. (Show Details)

Add specific tests.
Reword.

jhenderson added inline comments.Oct 8 2020, 1:26 AM
llvm/docs/CommandGuide/llvm-objdump.rst
173 ↗(On Diff #296828)

I don't think you need to mention GNU here. In general the concept of "absolute paths" is a little under-defined, plus people will likely expect the behaviour to match GNU.

llvm/docs/llvm-objdump.1
111 ↗(On Diff #296828)

Ditto.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths-all.test
1 ↗(On Diff #296828)

Maybe just drop "-all" from the test name. In fact, the test probably just wants to be named something like "prefix.test" or possibly "source-interleave-prefix.test" to show it tests the prefix option (in general, tests should be named after the option they are testing). This second point applies to all the new tests.

3–10 ↗(On Diff #296828)

I'd prefer it if you created these inputs immediately before their first use rather than early on. It would also be good if you named the output files according to the property they are trying to show.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths-non-windows.test
1 ↗(On Diff #296828)

This comment is ambiguous - it could imply behaviour that is not specific to Windows, which would also apply to the generic tests in the -all test.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths-windows.test
1 ↗(On Diff #282782)

Tests which only run on Windows should be added very carefully. Many developers don't have Windows testing method. When they do refactoring, they can easily neglect Windows-only tests. /\ has some pain but you can get round with {{[/\\]}}

I strongly disagree with making a test more general so that it isn't dependent on Windows behaviour - if we did that, we'd have no testing for Windows-specific behaviour which isn't good. I personally don't usually test my patches on Linux, as I almost exclusively develop on Windows. Does that mean I shouldn't write tests that are Linux-specific?

There are pre-merge bots that report test failures. Developers can and should rely on these to catch places where they aren't able to properly test things themselves locally.

To me, this test looks good.

llvm/tools/llvm-objdump/llvm-objdump.cpp
357

Same as for the documentation.

tinti marked 6 inline comments as done.Oct 8 2020, 5:36 AM
tinti added inline comments.
llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths-all.test
3–10 ↗(On Diff #296828)

Good point.

llvm/test/tools/llvm-objdump/X86/source-interleave-absolute-paths-non-windows.test
1 ↗(On Diff #296828)

I am struggling to find a good comment here.

Can I use 'POSIX' or 'unix-like' behavior? If yes, I would also rename the properly.

tinti updated this revision to Diff 296935.Oct 8 2020, 5:43 AM
tinti marked 2 inline comments as done.

Reword comments.
Substitute CHECK-TRAILING.
Remove extra sed pipe.
Add test for relative paths.

jhenderson accepted this revision.Oct 8 2020, 6:09 AM

LGTM, aside from some smaller formatting bits, but please wait for others.

llvm/test/tools/llvm-objdump/X86/source-interleave-prefix.test
3 ↗(On Diff #296935)
19 ↗(On Diff #296935)
31–32 ↗(On Diff #296935)

Here and elsewhere, my personal preference is to format it as in the suggested edit. The motivation is 1) The first line clearly indicates that there is no more of the specific command indicated by that line, whilst 2) the indentation on the second line shows the line is a continuation of the previous line.

tinti updated this revision to Diff 296952.Oct 8 2020, 6:35 AM
tinti marked 3 inline comments as done.

Reword.
Use James suggestions.

@MaskRay are you happy with the latest changes? If so, please change your review. Thanks!

MaskRay added inline comments.Oct 8 2020, 12:02 PM
llvm/test/tools/llvm-objdump/X86/source-interleave-prefix.test
8 ↗(On Diff #296952)

Since you have used = as the separator of a long option, consider changing --check-prefix to --check-prefix=

llvm/tools/llvm-objdump/llvm-objdump.cpp
1039

The two if can be merged

llvm/tools/llvm-objdump/llvm-objdump.h
51

Sort in an alphabetical order.

tinti updated this revision to Diff 297164.Oct 9 2020, 2:45 AM
tinti marked 3 inline comments as done.

Concise use of = in tests.
Merge ifs.
Reorder alphabetically the options.

rengolin accepted this revision.Oct 16 2020, 6:32 AM

@MaskRay ping?

You seem to have addressed his comments, and they were all in non-functional areas (comments, formatting, order), so, LGTM. Thanks!

MaskRay accepted this revision.Oct 16 2020, 8:39 AM
This revision is now accepted and ready to land.Oct 16 2020, 8:39 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Oct 16 2020, 11:15 AM

This breaks tests on Windows: http://45.33.8.238/win/25999/step_11.txt

PTAL, and revert for now if it takes a while to fix.

Looks like you maybe need to replace a few / with {{[/\\]}}.

@MaskRay you seem to have fixed partially, and @thakis solution is better than what I did. To avoid a similar clash, I'll revert my change and if it breaks again, you can fix it directly.

thanks!

tinti added a comment.Oct 16 2020, 1:53 PM

Thanks @MaskRay and @thakis.

@thakis I believe your solution will fix the issue.

I will be testing again.