This is an archive of the discontinued LLVM Phabricator instance.

Have lit preserve SOURCE_DATE_EPOCH
ClosedPublic

Authored by serge-sans-paille on Aug 18 2021, 2:57 PM.

Details

Summary

This environment variable has been standardized for reproducible builds. Setting
it can help to have reproducible tests too, so keep it as part of the testing
env when set.

See https://reproducible-builds.org/docs/source-date-epoch/

Diff Detail

Event Timeline

serge-sans-paille requested review of this revision.Aug 18 2021, 2:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2021, 2:57 PM

What sort of use would we have for this variable inside lit tests?

llvm/utils/lit/lit/TestingConfig.py
31

Looks like the quoting is incomplete - which makes it seem like maybe this patch is untested?

Fix typo + perform some manual validation

What sort of use would we have for this variable inside lit tests?

Some llvm and clang tests rely on compression through zlib (e.g. when testing compressed debug info). In some case, compression output may differ, e.g. when some hardware acceleration happen (see https://github.com/madler/zlib/pull/410) and that issue can be mitigated by setting SOURCE_DATE_EPOCH

This revision is now accepted and ready to land.Aug 19 2021, 1:13 AM
dblaikie accepted this revision.Aug 19 2021, 12:32 PM

What sort of use would we have for this variable inside lit tests?

Some llvm and clang tests rely on compression through zlib (e.g. when testing compressed debug info). In some case, compression output may differ, e.g. when some hardware acceleration happen (see https://github.com/madler/zlib/pull/410) and that issue can be mitigated by setting SOURCE_DATE_EPOCH

oh, neat - good to know!

Though this change would only have an effect on the test behavior, right? Is that a problem for some build systems? I know some are really pedantic about reproducibility even of the test results, but haven't seen any that pedantic outside of Google (& probably Facebook) - and at least at Google I don't think we've hit this, perhaps because we don't build zlib with that hardware support?

(

Though this change would only have an effect on the test behavior, right? Is that a problem for some build systems? I know some are really pedantic about reproducibility even of the test results, but haven't seen any that pedantic outside of Google (& probably Facebook) - and at least at Google I don't think we've hit this, perhaps because we don't build zlib with that hardware support?

That seems to be specific to SystemZ, which may not be used at Google / Facebook (?)

Though this change would only have an effect on the test behavior, right? Is that a problem for some build systems? I know some are really pedantic about reproducibility even of the test results, but haven't seen any that pedantic outside of Google (& probably Facebook) - and at least at Google I don't think we've hit this, perhaps because we don't build zlib with that hardware support?

That seems to be specific to SystemZ, which may not be used at Google / Facebook (?)

Yeah, I don't think it is - but it was specifically the overlap of SystemZ and a pedantically strict build system (that requires/values bit-identical reproducibility even for test artifacts). Do you have experience with such a build system that motivated this change?

I don't think Google's build system (at least the way we run LLVM's tests) requires bit-identical reproducibility for intermediate files created during test execution like this.

Though this change would only have an effect on the test behavior, right? Is that a problem for some build systems? I know some are really pedantic about reproducibility even of the test results, but haven't seen any that pedantic outside of Google (& probably Facebook) - and at least at Google I don't think we've hit this, perhaps because we don't build zlib with that hardware support?

That seems to be specific to SystemZ, which may not be used at Google / Facebook (?)

Yeah, I don't think it is - but it was specifically the overlap of SystemZ and a pedantically strict build system (that requires/values bit-identical reproducibility even for test artifacts). Do you have experience with such a build system that motivated this change?

It's not the build system that's being pedantic here, it is the reference output in the clang test cases:

FAIL: Clang :: Modules/embed-files-compressed.cpp (8052 of 29120)
FAIL: Clang :: Modules/stress1.cpp (8330 of 29120)

Though this change would only have an effect on the test behavior, right? Is that a problem for some build systems? I know some are really pedantic about reproducibility even of the test results, but haven't seen any that pedantic outside of Google (& probably Facebook) - and at least at Google I don't think we've hit this, perhaps because we don't build zlib with that hardware support?

That seems to be specific to SystemZ, which may not be used at Google / Facebook (?)

Yeah, I don't think it is - but it was specifically the overlap of SystemZ and a pedantically strict build system (that requires/values bit-identical reproducibility even for test artifacts). Do you have experience with such a build system that motivated this change?

It's not the build system that's being pedantic here, it is the reference output in the clang test cases:

FAIL: Clang :: Modules/embed-files-compressed.cpp (8052 of 29120)
FAIL: Clang :: Modules/stress1.cpp (8330 of 29120)

Oh, thanks!

If that's the case - maybe these tests are over-constrained? If zlib doesn't guarantee this behavior, but our tests are depending on it, the tests should probably be changed? (I can't read the regex straight up, but it looks like embed-files-compressed.cpp is at least trying to be a /bit/ fuzzy about the specific compressed size it's looking for, if it's not fuzzy enough perhaps it could be made more fuzzy? stress1.cpp doesn't immediately pop out to me - oh, it's looking for identical output from two compressed files & this comes up against the zlib hardware compression "and when the respective buffers have the same offset relative to the start of the page." condition, I take it? That does seem like a bit of a heavy burden in terms of reproducibility even on the same machine, making the same library calls...

@dblaikie are you suggesting we should drop this change and update the tests instead? Although I tend to agree that the tests above are over-constrained, I still think that keeping SOURCE_DAT_EPOCH is a good idea.

@dblaikie are you suggesting we should drop this change and update the tests instead? Although I tend to agree that the tests above are over-constrained, I still think that keeping SOURCE_DAT_EPOCH is a good idea.

Mixed feelings - if someone's got a super pedantic build system that requires tests to produce identical intermediate files, I could live with it. But without that requirement, I wonder if it's better to allow the variation - since that means that running tests will look more like the real world. (well, I guess this is only an issue for people who specifically set SOURCE_DATE_EPOCH and by setting it they're specifically asking for that behavior)

So, I guess: passing/respecting SOURCE_DATE_EPOCH seems like the right thing to do, but for those who are passing it I wonder if they'd be better off/it'd be more suitable not to do so, and to make the tests more robust & continue testing the way the compiler's likely to be run by users on that platform.

This revision was automatically updated to reflect the committed changes.