This is an archive of the discontinued LLVM Phabricator instance.

Fix broken escaping of commands in the build
ClosedPublic

Authored by ldrumm on Nov 16 2016, 11:37 AM.

Details

Summary

A combination of broken escaping in CMake and in the python swig generation scripts meant that the swig generation step would fail whenever there were spaces or special characters in parameters passed to swig.

The fix for this in CMakeLists is to use the VERBATIM option on all COMMAND-based custom builders relying on CMake to properly escape each argument in the generated file.

Within the python swig scripts, the fix is to call subprocess.Popen with a list of raw argument strings rather than ones that are incorrectly manually escaped, then passed to a shell subprocess via subprocess.Popen(' '.join(params)). This also prevents nasty things happening such as accidental command-injection.

This allows us to have the swig / python executables in paths containing special chars and spaces, (or on shared storage on Win32, e.g \\some\path or C:\Program Files\swig\swig.exe).

Diff Detail

Repository
rL LLVM

Event Timeline

ldrumm updated this revision to Diff 78228.Nov 16 2016, 11:37 AM
ldrumm retitled this revision from to Fix broken escaping of commands in the build.
ldrumm updated this object.
ldrumm added reviewers: zturner, granata.enrico.
ldrumm added a subscriber: Restricted Project.

Chris, can you take a look at this? I am far from a CMake expert

scripts/Python/prepare_binding_Python.py
229 ↗(On Diff #78228)

This worries me a little bit.. Are we sure we are not in any way relying on the shell in executing the command line?

ldrumm added inline comments.Nov 16 2016, 11:43 AM
scripts/Python/prepare_binding_Python.py
229 ↗(On Diff #78228)

The features of the shell are not used in this expression at all, and the environment is identical to the previous invocation.

bryant added a subscriber: bryant.Nov 16 2016, 12:02 PM
bryant added inline comments.
CMakeLists.txt
46 ↗(On Diff #78228)

You can reduce diff noise by leaving formatting changes for a separate patch.

scripts/CMakeLists.txt
38 ↗(On Diff #78228)

Move this back up.

scripts/Python/prepare_binding_Python.py
222 ↗(On Diff #78228)

You can reduce diff noise by limiting your changes to removing the %s. So,

python
    # Build the SWIG args list
        options.swig_executable,
        "-c++",
        "-shadow",
        "-python",
        "-threads",
        "-I" + os.path.normcase(
            os.path.join(options.src_root, "include")),
        "-I" + os.path.normcase("./."),
        "-D__STDC_LIMIT_MACROS",
        "-D__STDC_CONSTANT_MACROS"]
    if options.target_platform == "Darwin":
        command.append("-D__APPLE__")
    if options.generate_dependency_file:
        command.extend(["-MMD", " -MF", temp_dep_file_path])
    command.extend([
        "-outdir", config_build_dir,
        "-o", settings.output_file,
        settings.input_file
    ])
    logging.info("running swig with: %s", command)
229 ↗(On Diff #78228)
bryant added inline comments.Nov 16 2016, 12:03 PM
CMakeLists.txt
53 ↗(On Diff #78228)

The indentation here could match the above.

ldrumm added inline comments.Nov 16 2016, 12:04 PM
scripts/Python/prepare_binding_Python.py
229 ↗(On Diff #78228)

@bryant

shell=False for both python 2 and 3: https://docs.python.org/2/library/subprocess.html#subprocess.Popen ; https://docs.python.org/3/library/subprocess.html#subprocess.Popen , unless I've missed your meaning.

Yes - the intended behaviour is to not use the shell on any python

ldrumm added inline comments.Nov 16 2016, 12:06 PM
CMakeLists.txt
46 ↗(On Diff #78228)

Seeing as I'm touching all these lines anyway, formatting seems appropriate. I tend to get pushback for formatting only commits, so I'd rather keep this as it is.

53 ↗(On Diff #78228)

Good point. Will fix

ldrumm added inline comments.Nov 16 2016, 12:09 PM
scripts/Python/prepare_binding_Python.py
222 ↗(On Diff #78228)

But logging.info is not a pretty printer - if the command fails for some reason we need to see why. repr allows this, and the diff noise is again minimal because that line is changing anyway and command is now a list, not a string

ldrumm added inline comments.Nov 16 2016, 12:11 PM
scripts/CMakeLists.txt
38 ↗(On Diff #78228)

Will do

bryant added inline comments.Nov 16 2016, 12:12 PM
scripts/Python/prepare_binding_Python.py
222 ↗(On Diff #78228)

Yes, that was a typo. Keep %r too.

ldrumm added inline comments.Nov 16 2016, 12:31 PM
scripts/Python/prepare_binding_Python.py
222 ↗(On Diff #78228)

Sorry - I just realised you meant the whole block (I only caught on slowly beyond the logging call).

I don't see a problem with the current implementation - it has some technical advantages over the existing code:

  • The list of args is built as a single expression, which is notionally nicer than modifying it piecemeal
  • The definition of the optional constants upfront is done such that they are always defined one way or another
  • os.path.normcase(os.path.join(options.src_root, "include")) is incorrect as there can be case-sensitive filesystems on win32
  • os.path.normcase("./."), doesn't make sense on any OS I’m aware of (though I need to change this to os.path.curdir as it's possible some OS or filesystem doesn't use ".")

Given that I'm fixing at least 4 broken assumptions here, I feel that the cleanup is worth a little diff noise

beanz edited edge metadata.Nov 16 2016, 4:06 PM

The CMake all LGTM.

ldrumm updated this revision to Diff 79578.Nov 29 2016, 8:24 AM
ldrumm edited edge metadata.
ldrumm added a subscriber: lldb-commits.

Addressed some unwanted whitespace changes as suggested by @bryant; use os.dir.curdir as instead of .

granata.enrico resigned from this revision.Nov 29 2016, 10:04 AM
granata.enrico removed a reviewer: granata.enrico.

I am not an Apple employee working on LLDB anymore

ldrumm marked 2 inline comments as done.Dec 15 2016, 9:46 AM

ping

This revision was automatically updated to reflect the committed changes.