Page MenuHomePhabricator

Add support for computing SHA1 in LLVM

Authored by mehdi_amini on Jan 19 2016, 11:55 AM.



Provide a class to generate a SHA1 from a sequence of bytes, and
a convenience raw_ostream adaptor.
This is not used anywhere yet in LLVM, but I plan to base some
ThinLTO incremental scheme on this.

Diff Detail

Event Timeline

mehdi_amini retitled this revision from to Add support for computing SHA1 in LLVM.
mehdi_amini updated this object.
mehdi_amini added reviewers: rafael, tejohnson.
mehdi_amini added subscribers: dexonsmith, llvm-commits.
filcab added a subscriber: filcab.Jan 20 2016, 4:13 AM

I have some minor comments.
I think having a patch using this before it lands would be nice. :-)

Thank you for working on this.


I think usually (confirmed in OpenSSL and CommonCrypto) these functions are called Init(), Update(), and Final().
I don't care about the case, but we should probably use the same words (just like MD5.h is doing).


We probably want to tell people to not reuse SHA1 objects, and might want to assert in debug mode.
After all, you could end up hashing weird things like the message from the first call to result() + some padding (and not any bytes you added later).

With this, you could even re-use the internal state for the result, but that might not be worth it :-)


Nit: What does "override" mean in this context? Maybe "default" or "we assume little endian" or something else?


In the compilers we support, do we need to resort to this? I don't know about MSVC, but IIRC, gcc and clang always define {BIG,LITTLE}_ENDIAN, no?

We might not need so many tests.


Nit: Why do the round constants get names, but the seeds don't?



while(len--) //?

Why here?


Nit: I'd prefer a positive test (ifdef, like the one in line 102)

Thanks for the review.
I don't mind waiting for the use case to be ready before landing this.


Didn't know, I'll update.


You mean there is no interest in supporting this:

TEST(raw_sha1_ostreamTest, Intermediate) {
  llvm::raw_sha1_ostream Sha1Stream;
  Sha1Stream << "Hello";
  auto Hash = toHex(Sha1Stream.sha1());

  ASSERT_EQ("F7FF9E8B7BB2E09B70935A5D785E0CC5D9D0ABF0", Hash);

  Sha1Stream << " World!";
  Hash = toHex(Sha1Stream.sha1());

  ASSERT_EQ("2EF7BDE608CE5404E97D5F042F95F89F1C232871", Hash);

Is it because it is potentially allowing misuse?


All this file is almost copy pasted (and moved to a LLVM style) :

rafael edited edge metadata.Jan 20 2016, 9:36 AM
rafael added a subscriber: rafael.

We already have md5. Is it not strong enough for what you want?


filcab added inline comments.Jan 20 2016, 9:43 AM

As written we don't support that (you'll addUncounted one bit, and then the padding. Afterwards, adding to the stream won't do what it would need to do). I'd like that to be explicit in the documentation (and I'd like an assert or two to make it trigger in debug mode, if that's attempted).

mehdi_amini added inline comments.Jan 20 2016, 5:30 PM

I'm not sure what you mean, the InternalState is saved and restored in SHA1::result() so that we can resume as is. (this unittest is passing)


I would do 2 loops

mehdi_amini marked 14 inline comments as done.
mehdi_amini edited edge metadata.

Take comments into account

Thanks for all the comments, I tried to address them.
As long as we're OK with the API, we'll still be able to improve the implementation later performance-wise if necessary.


support both behavior through different API calls now.


not sure what you mean?

ruiu added a subscriber: ruiu.Mar 30 2016, 11:58 AM
ruiu added inline comments.

Why don't you pass ArrayRef<uint8_t>?


It is not clear in this comment whether this function returns a hex string or raw bytes.

Act on Rui's comments

mehdi_amini marked an inline comment as done.Mar 30 2016, 12:05 PM

Thanks Rui for the comments.


Just mimic the API I was wrapping. That's a good point, updated!



mehdi_amini marked an inline comment as done.Mar 31 2016, 8:54 AM

Anything else to change or can we move on and fix whatever comes up post-commit?
(this is blocking D18213 which is accepted)

No blocks on my side. Thank you.

P.S: It seems I had a comment to be submitted for a few weeks... Sorry for the lag. :-(


Sorry, didn't notice that line before pad() (and the one that restores it).



Nit: 20 bytes (not "characters"), since it's just the raw bits.

mehdi_amini accepted this revision.Mar 31 2016, 6:34 PM
mehdi_amini added a reviewer: mehdi_amini.

Will proceed and address further comment post-commit.

This revision is now accepted and ready to land.Mar 31 2016, 6:34 PM

Hi folks,

This change breaks on Linux X86 targetting X86, when running check-all or if LLVM_BUILD_TESTS is on:

15:53:55 [ 44%] Building CXX object unittests/Support/CMakeFiles/SupportTests.dir/raw_sha1_ostream_test.cpp.o
15:53:56 [ 44%] Linking CXX executable SupportTests
15:53:56 CMakeFiles/SupportTests.dir/formatted_raw_ostream_test.cpp.o: In function `(anonymous namespace)::formatted_raw_ostreamTest_Test_Tell_Test::TestBody()':
15:53:56 formatted_raw_ostream_test.cpp:(.text._ZN12_GLOBAL__N_140formatted_raw_ostreamTest_Test_Tell_Test8TestBodyEv+0x1ae): undefined reference to `llvm::formatted_raw_ostream::current_pos() const'
15:53:56 collect2: error: ld returned 1 exit status
15:53:56 make[3]: * [unittests/Support/SupportTests] Error 1
15:53:56 make[2]:
* [unittests/Support/CMakeFiles/SupportTests.dir/all] Error 2
15:53:56 make[1]: * [CMakeFiles/check-all.dir/rule] Error 2
15:53:56 make:
* [check-all] Error 2

Is there a bot that exhibit this issue?

Otherwise what is your config?

This came from a Jenkins job I'm setting up (I can reproduce it manually on a different machine too).

I've uploaded the full console log and build scripts here:

The main entrypoints into the scripts are and

The failure is strange, I suspect DLLVM_LINK_LLVM_DYLIB is playing tricks here...

Also the link failure is in formatted_raw_ostream_test.cpp, which this revision didn't touch. What makes you think this commit is responsible?

Sorry, I was barking up the wrong tree, this commit is unrelated.