This is an archive of the discontinued LLVM Phabricator instance.

Refactor getHostCPUName to allow testing on non-native hardware.
ClosedPublic

Authored by kristof.beyls on Mar 22 2017, 2:49 AM.

Details

Summary

This refactors getHostCPUName so that for the architectures that get the
host cpu info on linux from /proc/cpuinfo, the /proc/cpuinfo parsing
logic is present in the build, even if it wasn't built on a linux system
for that architecture.

Since the code is present in the build, we can then test that code also
on other systems, i.e. we don't need to have buildbots setup for all
architectures on linux to be able to test this. Instead, developers will
test this as part of the regression test run.

As an example, a unit test is added to test getHostCPUName for a
Cortex-A9 processor running linux. A unit test is preferred over a
lit-based test, since the expectation is that in the future, the
functionality here will grow over what can be tested with "llc
-mcpu=native".

This is a preparation step to enable implementing the range of
improvements discussed on PR30516, such as adding AArch64 support,
support for big.LITTLE systems, reducing code duplication.

Diff Detail

Repository
rL LLVM

Event Timeline

kristof.beyls created this revision.Mar 22 2017, 2:49 AM
rengolin added inline comments.Mar 22 2017, 3:27 AM
include/llvm/Support/Host.h
83 ↗(On Diff #92607)

These don't need to be virtual, do they? They could even be static.

unittests/Support/Host.cpp
53 ↗(On Diff #92607)

Do we have tests for the other platforms?

90 ↗(On Diff #92607)

There are a number of different ways to find cpu names on ARM cpuinfo, and it would be good to know that they're all working. Maybe having a few different small snippets, instead of one large and redundant one?

kristof.beyls added inline comments.Mar 22 2017, 5:23 AM
include/llvm/Support/Host.h
83 ↗(On Diff #92607)

Ah right.
To be able to mock methods, they need to be virtual, so a Mock class can override them.
There are work-arounds to be able to mock non-virtual methods, but the work-arounds are worse than using virtual methods, IMHO.
See https://github.com/google/googletest/blob/master/googlemock/docs/CookBook.md, section "Mocking Nonvirtual Methods".

That being said, indeed, probably the getHostCPUName_xxx methods can be static non-virtual, as I don't see how these would need to be mocked.

unittests/Support/Host.cpp
50 ↗(On Diff #92607)

No need to mock getHostCPUName_powerpc here.

53 ↗(On Diff #92607)

Not at this point. I don't think this patch needs to introduce them, just make it easy for platform experts to add them in the test framework this patch introduces.

90 ↗(On Diff #92607)

I imagine that over time, we'll add more tests and that most tests will uses a small snippet.
But, for example, in the future to be able to test big.LITTLE, probably it's best to use most of the /proc/cpuinfo content.
Also, right now, the implementation only actually reads the first 1024 bytes. When fixing that, we'll need a test with cpuinfo content larger than 1024 bytes.

In summary, in this patch I'm just aiming to demonstrate that it becomes possible to test this functionality also on other platforms, where before that wasn't possible. As a first test, I thought a simple full /proc/cpuinfo from a real system makes it slightly easier to read and understand rather than a further cut down input.

rengolin added inline comments.Mar 22 2017, 5:32 AM
include/llvm/Support/Host.h
83 ↗(On Diff #92607)

Makes sense...

unittests/Support/Host.cpp
50 ↗(On Diff #92607)

Right.

53 ↗(On Diff #92607)

Ok.

90 ↗(On Diff #92607)

Sure, but in this case you just need to look at 0x09. :)

I mean, at this moment, it would be more value to have a few strings and testing for different CPUs than the whole string testing for a single CPU.

I'm curious why you chose to take this approach rather than add some option that allows us to change the file name being read? If we do that, then we can test this with lit tests. I generally think of our practice as using mocking, and unit tests in general, for cases where lit tests aren't practical (or, to put it another way, the infrastructure necessary to enable them is more complicated than unit testing in C++). This does not seem to be the case here. It seems straightforward to make -proc-cpuinfo-file=/foo/bar/cpuinfo.txt (modulo bikeshed) work.

As I recall, there are several other places in Clang where this is also a problem (we have hard-coded file names for /etc/lsb-release, /etc/redhat-release, etc.).

I'm curious why you chose to take this approach rather than add some option that allows us to change the file name being read? If we do that, then we can test this with lit tests. I generally think of our practice as using mocking, and unit tests in general, for cases where lit tests aren't practical (or, to put it another way, the infrastructure necessary to enable them is more complicated than unit testing in C++). This does not seem to be the case here. It seems straightforward to make -proc-cpuinfo-file=/foo/bar/cpuinfo.txt (modulo bikeshed) work.

As I recall, there are several other places in Clang where this is also a problem (we have hard-coded file names for /etc/lsb-release, /etc/redhat-release, etc.).

I honestly hadn't thought this could fit in our lit testing framework. But maybe it could be made to do so as you outline above.
I see that tools/clang/unittests/Driver/DistroTest.cpp uses unittests with vfs::InMemoryFileSystem objects to mock the contents of /etc/lsb-release etc. I wasn't aware of the approach taken there, I'll take a closer look.

I think the main issue with a -proc-cpuinfo-file=%s command line option might be in which tool to attach it to. I guess it would have to be llc, run with -mcpu=native, and then somehow detecting the cpu it would set for code generation.

I think that might work for current functionality (returning a single CPU), but when we're trying to extend this to big.LITTLE systems, and introducing a call like getHostCPUNames(), returning all different cores in the system, I'm not sure anymore if it would be easily tested using "llc -mcpu=native", as it's unclear which core to pick for "native". FWIW, I expect getHostCPUNames() to initially mainly be used by JIT engines and they may make different choices than ahead-of-time compilers with -mcpu=native.

I'll look into this a bit further, but at the moment my feel is that testing via llc -mcpu=native may be too indirect for e.g. extending this API to big.LITTLE systems. I'm not sure if there is another tool already where we could test closer to getHostCPUName, but I don't think so.

I'm curious why you chose to take this approach rather than add some option that allows us to change the file name being read? If we do that, then we can test this with lit tests. I generally think of our practice as using mocking, and unit tests in general, for cases where lit tests aren't practical (or, to put it another way, the infrastructure necessary to enable them is more complicated than unit testing in C++). This does not seem to be the case here. It seems straightforward to make -proc-cpuinfo-file=/foo/bar/cpuinfo.txt (modulo bikeshed) work.

As I recall, there are several other places in Clang where this is also a problem (we have hard-coded file names for /etc/lsb-release, /etc/redhat-release, etc.).

I honestly hadn't thought this could fit in our lit testing framework. But maybe it could be made to do so as you outline above.
I see that tools/clang/unittests/Driver/DistroTest.cpp uses unittests with vfs::InMemoryFileSystem objects to mock the contents of /etc/lsb-release etc. I wasn't aware of the approach taken there, I'll take a closer look.

I think the main issue with a -proc-cpuinfo-file=%s command line option might be in which tool to attach it to. I guess it would have to be llc, run with -mcpu=native, and then somehow detecting the cpu it would set for code generation.

I think that might work for current functionality (returning a single CPU), but when we're trying to extend this to big.LITTLE systems, and introducing a call like getHostCPUNames(), returning all different cores in the system, I'm not sure anymore if it would be easily tested using "llc -mcpu=native", as it's unclear which core to pick for "native". FWIW, I expect getHostCPUNames() to initially mainly be used by JIT engines and they may make different choices than ahead-of-time compilers with -mcpu=native.

I'll look into this a bit further, but at the moment my feel is that testing via llc -mcpu=native may be too indirect for e.g. extending this API to big.LITTLE systems. I'm not sure if there is another tool already where we could test closer to getHostCPUName, but I don't think so.

Okay. I agree, for heterogeneous environments where '-mcpu=native' does not fully express the data you need, this doing it this way can make more sense.

tstellar edited edge metadata.Mar 22 2017, 12:51 PM

I'm curious why you chose to take this approach rather than add some option that allows us to change the file name being read? If we do that, then we can test this with lit tests. I generally think of our practice as using mocking, and unit tests in general, for cases where lit tests aren't practical (or, to put it another way, the infrastructure necessary to enable them is more complicated than unit testing in C++). This does not seem to be the case here. It seems straightforward to make -proc-cpuinfo-file=/foo/bar/cpuinfo.txt (modulo bikeshed) work.

As I recall, there are several other places in Clang where this is also a problem (we have hard-coded file names for /etc/lsb-release, /etc/redhat-release, etc.).

I honestly hadn't thought this could fit in our lit testing framework. But maybe it could be made to do so as you outline above.
I see that tools/clang/unittests/Driver/DistroTest.cpp uses unittests with vfs::InMemoryFileSystem objects to mock the contents of /etc/lsb-release etc. I wasn't aware of the approach taken there, I'll take a closer look.

I think the main issue with a -proc-cpuinfo-file=%s command line option might be in which tool to attach it to. I guess it would have to be llc, run with -mcpu=native, and then somehow detecting the cpu it would set for code generation.

llc -version prints the Host CPU name, so you could just check the output of that.

I think that might work for current functionality (returning a single CPU), but when we're trying to extend this to big.LITTLE systems, and introducing a call like getHostCPUNames(), returning all different cores in the system, I'm not sure anymore if it would be easily tested using "llc -mcpu=native", as it's unclear which core to pick for "native". FWIW, I expect getHostCPUNames() to initially mainly be used by JIT engines and they may make different choices than ahead-of-time compilers with -mcpu=native.

I'll look into this a bit further, but at the moment my feel is that testing via llc -mcpu=native may be too indirect for e.g. extending this API to big.LITTLE systems. I'm not sure if there is another tool already where we could test closer to getHostCPUName, but I don't think so.

kristof.beyls edited the summary of this revision. (Show Details)
  • By refactoring a little bit more, it became possible to test using a regular unit test, not having to use a mock, which makes the test a lot cleaner.
  • During review discussions with Hal, it become clear that it's probably best to stick to unit testing rather than using lit-based testing for this feature.
kristof.beyls marked 8 inline comments as done.Mar 24 2017, 7:43 AM

. Added a few short tests for a few more cores.

kristof.beyls marked 3 inline comments as done.Mar 29 2017, 9:42 AM
rengolin accepted this revision.Mar 29 2017, 10:10 AM

LGTM. Thanks!

This revision is now accepted and ready to land.Mar 29 2017, 10:10 AM
This revision was automatically updated to reflect the committed changes.
chandlerc added inline comments.Mar 30 2017, 1:29 AM
llvm/trunk/include/llvm/Support/Host.h
80–85

Capitalize 'helper' to make the comment prose.

Also, we more commonly use a 'detail' or 'internal' namespace. That would seem more consistent here.

And please follow the normal naming conventions rather than using a '_<arch>' suffix. Perhaps: getHostCPUNameForPowerPC, getHostCPUNameForARM, and getHostCPUNameForS390x.

kristof.beyls added inline comments.Mar 30 2017, 4:45 AM
llvm/trunk/include/llvm/Support/Host.h
80–85

Thanks for the feedback!
I tried to fix those issues in r299062, which resulted in the windows builds breaking, due to error messages like the following:

C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\include\llvm/ADT/DenseSet.h(215): error C2872: 'detail': ambiguous symbol
C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\include\llvm/Support/Chrono.h(78): note: could be 'llvm::detail'
C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\include\llvm/Support/Host.h(80): note: or       'llvm::sys::detail'

A few attempts at fixing the Windows builds by fixing the issues reported in the buildbot logs showed that every fix just uncovered more "ambiguous symbol errors", so I reverted the changes.
I'll probably need to investigate if introducing an llvm::sys::detail namespace is still a good idea (which means the "detail::xxx" syntaxes in quite a few places will need to be disambiguated to "llvm::detail::xxx"); or if an alternative solution needs to be found.

kristof.beyls marked 2 inline comments as done.Mar 31 2017, 11:32 AM
kristof.beyls added inline comments.
llvm/trunk/include/llvm/Support/Host.h
80–85

This has landed now, together with a series of namespace pollution cleanups that were necessary to unbreak the windows bots.
Main commit in r299211; namespace cleanups in r299203, r299218, r299222 and r299224.