Page MenuHomePhabricator

Fix SelectionDAG Graph Printing on Windows
ClosedPublic

Authored by justice_adams on Mar 26 2020, 9:40 AM.

Details

Summary

Currently, when compiling to IR (presumably at the clang level) LLVM mangles symbols and sometimes they have illegal file characters including ?'s in them. This causes a problem when building a graph via llc on Windows because the code currently passes the machine function name all the way down to the Windows API which frequently returns error 123 ERROR_INVALID_NAME
https://docs.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499-

Thus, we need to remove those illegal characters from the machine function name before generating a graph, which is the purpose of this patch.
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

I've created a static helper function replace_illegal_filename_chars which within GraphWriter.cpp to help with replacing illegal file character names before generating a dot graph filename.

Diff Detail

Event Timeline

justice_adams created this revision.Mar 26 2020, 9:40 AM

I've attached a reproducible.zip folder. To reproduce this bug, unzip the folder and compile multiply.cpp to LLVM IR and then run

llc -view-dag-combine1-dags multiply.ll

I've also included the .ll file produced from my local environment, titled multiply.ll

  • removed static declaration from header file
  • cleaned up code according to clang-tidy specifications

Ping requesting review

RKSimon added a reviewer: rnk.Apr 13 2020, 2:17 PM
rnk added a comment.Apr 15 2020, 2:22 PM

Sorry about the delay, I find this whole feature kind of concerning. Are you telling me that LLVM lets you write a file with an arbitrary filename, if not path?

So, if I want clang to overwrite some file, all I have to do is pass clang -mllvm -view-dag-combine1-dags and I can create a file with a name from the input source? That seems dangerous.

Can you please move this code from the GraphWriter.h header to here?

// Windows can't always handle long paths, so limit the length of the name.
std::string N = Name.str();
N = N.substr(0, std::min<std::size_t>(N.size(), 140));
if (Filename.empty()) {
  Filename = createGraphFilename(N, FD);

N is only used if we call createGraphFilename anyway, and it feels related.

rnk added a comment.Apr 15 2020, 2:27 PM

(Despite the tone of incredulity in my last comment, thanks for the fix, functionally this seems like an improvement.)

@rnk

So, if I want clang to overwrite some file, all I have to do is pass clang -mllvm -view-dag-combine1-dags and I can create a file with a name from the input source? That seems dangerous.

Currently, the dag printer will create a file with a name corresponding to the machine-function from the IR that it's representing with the graph, not the source file name. Forgive me if I misinterpreted this comment.

Can you please move this code from the GraphWriter.h header to here?

Yes that seems to make sense. Does it make sense to move that into a separate patch?

rnk added a comment.Apr 15 2020, 4:23 PM

@rnk

So, if I want clang to overwrite some file, all I have to do is pass clang -mllvm -view-dag-combine1-dags and I can create a file with a name from the input source? That seems dangerous.

Currently, the dag printer will create a file with a name corresponding to the machine-function from the IR that it's representing with the graph, not the source file name. Forgive me if I misinterpreted this comment.

Yes, that was my interpretation. Functions can be named all kinds of evil things. It's pretty common for compiler developers to feed test cases from bug reports to the compiler, and this seems like an easy way for an attacker to write a temp file with a name of their choice. But, that is the existing functionality. I see that path separators are banned, so it's not an arbitrary file, it's a particular file in some temp directory.

Can you please move this code from the GraphWriter.h header to here?

Yes that seems to make sense. Does it make sense to move that into a separate patch?

I feel like it should be part of this patch.

  • moved handling of long paths to createGraphFilename()

@rnk Sorry for the delay, I moved the file path length handling accordingly

Apologies for the drive-by questions.

llvm/lib/Support/GraphWriter.cpp
101

If sys::fs::createTemporaryFile can return a too-long path here, then it could do so for anyone else, too. Should the length limit be handled there?

Is 140 arbitrary, a guestimate, or based on some actual limit?

justice_adams added inline comments.Apr 30 2020, 10:33 AM
llvm/lib/Support/GraphWriter.cpp
101

@amccarth It appears to be arbitrary https://reviews.llvm.org/D3883

I'm open to moving the length limiting to

sys::fs::createTemporaryFile

If that's preferred

rnk accepted this revision.Apr 30 2020, 11:44 AM

Thanks, I think this looks good.

Do you need someone to push this? If so, let me know, I'll try to get to it.

llvm/lib/Support/GraphWriter.cpp
87–88

This can be a range-based for.

101

I don't think that's necessary. I think most callers of createTemporaryFile supply a fixed prefix. The graph writer code here is special in that it is taking non-path-like user input and using that to form a path.

This revision is now accepted and ready to land.Apr 30 2020, 11:44 AM
justice_adams set the repository for this revision to rG LLVM Github Monorepo.
  • Converted to range-based for loop

@rnk Thanks for the suggestions, I've updated it. If it looks good on your end, yes, I would appreciate it if you could commit this for me, I don't have commit access.

  • Fix formatting
This revision was automatically updated to reflect the committed changes.