This is an archive of the discontinued LLVM Phabricator instance.

[llvm][StringExtras] Use a lookup table for `hexDigitValue`
ClosedPublic

Authored by rriddle on Oct 28 2020, 10:50 AM.

Details

Summary

This method is at the core of the conversion from hex to binary, and using a lookup table great improves the compile time of hex conversions.

Context: In MLIR we use hex strings to represent very large constants in the textual format of the IR. These changes lead to a large decrease in compile time when parsing these constants (>1 second -> 350 miliseconds).

Depends On D90265

Diff Detail

Event Timeline

rriddle created this revision.Oct 28 2020, 10:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2020, 10:50 AM
rriddle requested review of this revision.Oct 28 2020, 10:50 AM
dblaikie accepted this revision.Oct 28 2020, 10:57 AM

If you like you could try rephrasing this as a programmatic initialization rather than writing it all out manually, might make it easier to read/look more like the original code (except applied as a loop over the range/table) and hopefully can be phrased in such a way that the compiler boils it down to the same values anyway.

llvm/include/llvm/ADT/StringExtras.h
70

const?

This revision is now accepted and ready to land.Oct 28 2020, 10:57 AM
dexonsmith accepted this revision.Oct 28 2020, 11:06 AM

LGTM, thanks for splitting this out.

llvm/include/llvm/ADT/StringExtras.h
72

I wonder if this would feel more ordered / be easier to read if each line of -1U was a (possibly incomplete) group of 8.

  • 0-47 would have 6 groups of 8;
  • 58-64 would have a group of 7;
  • 71-96 would have a group of 1, three groups of 8, and a group of 1;
  • 103-255 would have a group of 1, and 19 groups of 8.

I chose 8 because 16 will be too long for 80-column.

If you try it out and it looks worse, then what you have now is probably best.

If you like you could try rephrasing this as a programmatic initialization rather than writing it all out manually, might make it easier to read/look more like the original code (except applied as a loop over the range/table) and hopefully can be phrased in such a way that the compiler boils it down to the same values anyway.

Hmmm, yeah, that's better than my idea.

rriddle updated this revision to Diff 301358.Oct 28 2020, 11:45 AM

Reformat LUT.

rriddle marked 2 inline comments as done.Oct 28 2020, 11:54 AM

If you like you could try rephrasing this as a programmatic initialization rather than writing it all out manually, might make it easier to read/look more like the original code (except applied as a loop over the range/table) and hopefully can be phrased in such a way that the compiler boils it down to the same values anyway.

Thanks for the suggestion! That would definitely be much nicer. I tried about 4-5 different ways of doing the initialization and I could only get clang to generate a proper LUT in C++17(using constexpr lambda initialization with std::array), not C++14. I couldn't get GCC to do the right thing in either C++14 or C++17. Do you have any suggestions on how to do the initialization?

rriddle updated this revision to Diff 301373.Oct 28 2020, 12:10 PM

Simplify LUT

If you like you could try rephrasing this as a programmatic initialization rather than writing it all out manually, might make it easier to read/look more like the original code (except applied as a loop over the range/table) and hopefully can be phrased in such a way that the compiler boils it down to the same values anyway.

Thanks for the suggestion! That would definitely be much nicer. I tried about 4-5 different ways of doing the initialization and I could only get clang to generate a proper LUT in C++17(using constexpr lambda initialization with std::array), not C++14. I couldn't get GCC to do the right thing in either C++14 or C++17. Do you have any suggestions on how to do the initialization?

NVM, found something that works.

dblaikie accepted this revision.Oct 28 2020, 1:58 PM

Nice!

I guess in C++20 we could change this to something like:

constexpr std::array<int, 3> a = [] {
    std::array<int, 3> r;
    int i = 0;
    for (auto &v : r) {
        v = ++i;
    }
    return r;
}();

But based on godbolt testing at least, std::array's ctor isn't constexpr until 20. Oh well.

This revision was landed with ongoing or failed builds.Oct 28 2020, 5:04 PM
This revision was automatically updated to reflect the committed changes.