This is an archive of the discontinued LLVM Phabricator instance.

Don't depend on psutil on AIX
ClosedPublic

Authored by daltenty on Jul 5 2019, 8:46 AM.

Details

Summary

On AIX psutil can run into problems with permissions to read the process
tree, which causes problems for python timeout tests which need to kill off
a test and it's children.

This patch adds a workaround by invoking shell via subprocess and using a
platform specific option to ps to list all the descendant processes so we can
kill them. We add some checks so lit can tell whether timeout tests are
supported with out exposing whether we are utilizing the psutil
implementation or the alternative.

Diff Detail

Repository
rL LLVM

Event Timeline

daltenty created this revision.Jul 5 2019, 8:46 AM
Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptJul 5 2019, 8:46 AM
llvm/utils/lit/lit/LitConfig.py
98 ↗(On Diff #208190)

Minor nit: The quote indentation no longer lines up with the previous line.

llvm/utils/lit/tests/lit.cfg
62 ↗(On Diff #208190)

Removing python-psutil as a feature entirely may be a bit aggressive. It has the potential of quietly disabling "out-of-tree" tests. I'm not sure that a Phabricator patch about AIX has the right level of visibility for making such a change. Can you send an RFC about the cleanup to the mailing list?

@daltenty The "This removes our dependency on psutil." text sounds broader than it actually is. Looking at the implementation the removal of the dependency is only for AIX. All other platforms still depends on the psutil module. I think the commit message should be made clearer on this point.

llvm/utils/lit/tests/lit.cfg
62 ↗(On Diff #208190)

The python-psutil "feature" is only available in lit's testsuite AFAIK. Hopefully that means that removing it shouldn't effect the testsuites of other projects apart from lit itself. However, giving the mailing list a heads up about this seems like a nice courtesy .

daltenty edited the summary of this revision. (Show Details)Jul 5 2019, 1:46 PM
delcypher added inline comments.Jul 5 2019, 1:47 PM
llvm/utils/lit/lit/LitConfig.py
93 ↗(On Diff #208190)

If we're going to start specializing how we support this feature on different problems then I think we need to refactor this so we don't have a platform specific code littered throughout the lit code base.

There probably should be a function that determines if killProcessAndChildren() is supported. Here's a sketch

def killProcessAndChildrenIsSupported():
   """
      Returns a tuple (<supported> , <error message>)
      where 
       `<supported>` is True if `killProcessAndChildren()` is supported on the current host, returns False otherwise.
      `<error message>` is an empty string if `<supported>` is True, otherwise is contains a string describing why the function is not supported.
    """
    if platform.system() == 'AIX':
       return (True, "")
     try:
       import psutil
       return (True, "")
     except ImportError:
        return (False, "Python psutil module could not be found")

This killProcessAndChildrenIsSupported() function can then called in any place where we currently try to do import psutil. This hides the platform specific logic that you are trying to introduce here.

llvm/utils/lit/lit/util.py
440 ↗(On Diff #208190)

This from statement seems unnecessary given that we already have subprocess imported and we just use call() once in this context. I'd rather this was just

subprocess.call('kill -kill $(ps -o pid= -L{})'.format(pid), shell=True)
daltenty updated this revision to Diff 208805.Jul 9 2019, 2:05 PM
  • Use subprocess.call instead of importing it
  • Create killProcessAndChildrenIsSupported check and rewrite things to use it
daltenty marked 3 inline comments as done.Jul 9 2019, 2:06 PM
delcypher added inline comments.Jul 9 2019, 4:27 PM
libcxx/utils/libcxx/util.py
256 ↗(On Diff #208805)

Wait what... why is this duplicated in libcxx/utils/libcxx/util.py?

lldb/lit/lit.cfg.py
81 ↗(On Diff #208805)

Shouldn't this be

if lit_config.killProcessAndChildrenIsSupported():

?

llvm/utils/lit/lit/LitConfig.py
80 ↗(On Diff #208805)

I'd prefer if this code lived in llvm/utils/lit/lit/util.py next to killProcessAndChildren() given that this function describes if that function works.

I realise you need to use lit_config.killProcessAndChildrenIsSupported() from inside a lit config but I'd suggest you add another level of indirection to hide this implementation detail so that the API of LitConfig doesn't leak it.

class LitConfig(object):
   ...
   @property
   def maxIndividualTestTimeIsSupported(self):
     """
            Returns a tuple (<supported> , <error message>)
            where
            `<supported>` is True if `killProcessAndChildren()` is supported on
                the current host, returns False otherwise.
            `<error message>` is an empty string if `<supported>` is True,
                otherwise is contains a string describing why the function is
                not supported.
      """
      return lit.util.killProcessAndChildrenIsSupported()

Alternatively you could change the implementation of LitObject.maxIndividualTestTime setter to throw a custom exception if an attempt is made to set the timeout when the platform doesn't support it. This custom exception would store the error message so it could be programmatically accessed.

davide requested changes to this revision.Jul 9 2019, 5:09 PM
davide added a subscriber: davide.

This is adding a fair amount of complexity on something that just works fine on basically every platform but AIX.
If AIX has issue with psutil, maybe the fix should be submitted to psutil upstream instead of having this dance here?

This revision now requires changes to proceed.Jul 9 2019, 5:09 PM

This is adding a fair amount of complexity on something that just works fine on basically every platform but AIX.
If AIX has issue with psutil, maybe the fix should be submitted to psutil upstream instead of having this dance here?

I appreciate the concern, however in the original change which added timeout functionality to lit (review D14706), it was already noted that depending on psutil wasn't supposed to be permanent. Removing this external dependency from lit would be desirable and hopefully other platforms will follow suit in the future.

daltenty updated this revision to Diff 209409.Jul 11 2019, 8:43 PM
  • Address review comments round 2
daltenty marked an inline comment as done.Jul 11 2019, 8:44 PM
delcypher added inline comments.Jul 11 2019, 9:21 PM
llvm/utils/lit/lit/LitConfig.py
81 ↗(On Diff #209409)

Sorry to be pedantic but this (LitConfig.killProcessAndChildrenIsSupported) should be called maxIndividualTestTimeIsSupported to be consistent with the rest of LitConfig's public interface. The name killProcessAndChildrenIsSupported is leaking implementation details.

llvm/utils/lit/lit/util.py
426 ↗(On Diff #209409)

I don't really like how we're now coupling this function with the LitConfig object just so we can print out the text Found python psutil module. Do we actually need to print out that message? If the tests don't rely on this I either suggest we remove this.

daltenty marked 6 inline comments as done.Jul 12 2019, 3:27 PM
daltenty added inline comments.
llvm/utils/lit/lit/util.py
426 ↗(On Diff #209409)

I don't really think it's necessary, this was just to mimic the previous behavior. I will remove it.

daltenty updated this revision to Diff 209619.Jul 12 2019, 3:27 PM
daltenty marked an inline comment as done.
  • Address review comments round 3
daltenty updated this revision to Diff 209621.Jul 12 2019, 3:30 PM
  • Add a space to warning in LLDB lit config

@daltenty Other than the minor nit, LGTM.

llvm/utils/lit/lit/util.py
449 ↗(On Diff #209621)

Minor nit. s/plaftorms/platforms/

daltenty updated this revision to Diff 211104.Jul 22 2019, 8:06 AM
  • Fix typo in comment
daltenty marked an inline comment as done.Jul 22 2019, 8:07 AM

Ping for an explicit acceptance.

delcypher accepted this revision.Jul 23 2019, 1:37 PM

LGTM.

@davide While I agree with sentiment of fixing psutil, the use of psutil was never meant to be permanent. As the owner of this code I don't mind making accommodations for other platforms provided we keep implementation details well contained as this patch now does.

daltenty edited the summary of this revision. (Show Details)Jul 24 2019, 7:50 AM
daltenty edited the summary of this revision. (Show Details)
This revision was not accepted when it landed; it landed in state Needs Review.Jul 24 2019, 8:05 AM
This revision was automatically updated to reflect the committed changes.