This is an archive of the discontinued LLVM Phabricator instance.

Fix the "TypeError: a bytes-like object is required, not 'str'" in exploded-graph-rewriter.py on Python 3.5+
ClosedPublic

Authored by psamolysov on Dec 20 2019, 2:45 AM.

Details

Summary

When I run the 'exploded-graph-rewriter.py' tool on Windows using Python 3.5 and above, the following error and stack trace occur:

Traceback (most recent call last):
  File "C:\Work\Dev\llvm\llvm-monorepo\clang\utils\analyzer\exploded-graph-rewriter.py", line 1061, in <module>
    main()
  File "C:\Work\Dev\llvm\llvm-monorepo\clang\utils\analyzer\exploded-graph-rewriter.py", line 1057, in main
    explorer.explore(graph, visitor)
  File "C:\Work\Dev\llvm\llvm-monorepo\clang\utils\analyzer\exploded-graph-rewriter.py", line 911, in explore
    visitor.visit_end_of_graph()
  File "C:\Work\Dev\llvm\llvm-monorepo\clang\utils\analyzer\exploded-graph-rewriter.py", line 879, in visit_end_of_graph
    svg = graphviz.pipe('dot', 'svg', self.output())
  File "C:\Program Files\Python37\lib\site-packages\graphviz\backend.py", line 229, in pipe
    out, _ = run(cmd, input=data, capture_output=True, check=True, quiet=quiet)
  File "C:\Program Files\Python37\lib\site-packages\graphviz\backend.py", line 166, in run
    out, err = proc.communicate(input)
  File "C:\Program Files\Python37\lib\subprocess.py", line 920, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "C:\Program Files\Python37\lib\subprocess.py", line 1238, in _communicate
    self._stdin_write(input)
  File "C:\Program Files\Python37\lib\subprocess.py", line 854, in _stdin_write
    self.stdin.write(input)
TypeError: a bytes-like object is required, not 'str'

Because Python 3 distinguishes Unicode strings from bytes, the output string must be encoded before it is passed to graphviz.pipe(), so I added code that detects whether the script is running on Python 3.5+ and calls the encode() method if so. After these changes, the exploded-graph-rewriter.py script works fine on Windows with Python 3.7.
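
A minimal sketch of the approach described above, not the committed diff. The helper name to_pipe_input is hypothetical, UTF-8 is an assumed encoding, and the check is on the major version only (a choice made for this sketch). It encodes the DOT text on Python 3 and leaves a Python 2 str, which is already a byte string, untouched.

```python
import sys


def to_pipe_input(dot_text):
    """Return dot_text in a form a binary subprocess pipe accepts.

    Hypothetical helper for illustration; the UTF-8 encoding is an assumption.
    """
    if sys.version_info[0] >= 3 and isinstance(dot_text, str):
        # Python 3: str is Unicode text, so encode it to bytes first.
        return dot_text.encode('utf-8')
    # Python 2: str is already a byte string, so pass it through.
    return dot_text


data = to_pipe_input('digraph G { a -> b; }')
```

In the script, the result would take the place of self.output() in the graphviz.pipe('dot', 'svg', ...) call shown in the traceback.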

I haven't tried the script on Python 2.

Event Timeline

psamolysov created this revision. Dec 20 2019, 2:45 AM
psamolysov edited the summary of this revision. Dec 20 2019, 2:51 AM
NoQ accepted this revision. Dec 20 2019, 11:25 AM
NoQ edited reviewers, added: NoQ; removed: dergachev.a.
NoQ added a subscriber: NoQ.

Yay, thanks! It does seem to work on python2 after the fix.

Do you have commit access or should I commit it?

@NoQ Could you commit, please?

This revision was not accepted when it landed; it landed in state Needs Review. Dec 21 2019, 11:04 AM
This revision was automatically updated to reflect the committed changes.

Python makes a clear distinction between bytes and strings. Bytes objects contain raw data (a sequence of octets), whereas strings are sequences of Unicode code points. Conversion between these two types is explicit: you encode a string to get bytes, specifying an encoding (which defaults to UTF-8 in Python 3), and you decode bytes to get a string. Callers of these functions should be aware that such conversions may fail, and should decide how failures are handled.
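
As an illustration of how these conversions can fail, here is a small sketch using only built-in str and bytes methods. The sample values are made up for the example; the errors="replace" handler is one of several standard options, chosen here only to show a lossy fallback.

```python
text = "na\u00efve \u2192 graph"  # contains non-ASCII characters

# Round trip: encode to bytes, then decode back to the same string.
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")

# Encoding fails when the target encoding cannot represent a character.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Decoding fails when the bytes are not valid in the chosen encoding.
invalid = b"\xff\xfe"
try:
    invalid.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# An error handler trades the exception for a lossy result:
# each undecodable byte becomes U+FFFD.
lossy = invalid.decode("utf-8", errors="replace")
```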

To convert bytes to a string, call the decode() method on the bytes object. In Python 3 the default encoding is "utf-8", so naming it explicitly, as below, is equivalent to calling decode() with no arguments:

b"python byte to string".decode("utf-8")
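
For comparison, a short sketch of the equivalent spellings in Python 3, along with one common pitfall: calling str() on bytes without an encoding does not decode them, it returns the repr of the bytes object. The variable names are illustrative only.

```python
raw = b"python byte to string"

explicit = raw.decode("utf-8")  # encoding named explicitly
implicit = raw.decode()         # relies on the UTF-8 default
via_str = str(raw, "utf-8")     # str() with an encoding also decodes

# Pitfall: without an encoding, str() returns the repr, prefix and quotes included.
naive = str(raw)
```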