This is an archive of the discontinued LLVM Phabricator instance.

[libc] automemcpy README and main include file
ClosedPublic

Authored by gchatelet on Oct 11 2021, 8:40 AM.

Details

Reviewers
courbet
Summary

"automemcpy: A framework for automatic generation of fundamental memory operations"
https://research.google/pubs/pub50338/

This patch implements the concepts presented in the paper, the overall approach is the following:

  • Makes use of constraint programming to model the implementation of a memory function (memcpy, memset, memcmp, bzero, bcmp).
  • Generate the code for all valid implementations
  • Compile the implementations and benchmark them on a set of machines. The benchmark makes use of representative distributions for the function's arguments.
  • Analyze the result and pick "the best" performing function according to the specific environement.

Diff Detail

Event Timeline

gchatelet created this revision.Oct 11 2021, 8:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 11 2021, 8:40 AM
gchatelet requested review of this revision.Oct 11 2021, 8:40 AM
courbet added inline comments.Oct 12 2021, 2:50 AM
libc/benchmarks/automemcpy/FunctionDescriptor.h
2

This is supposed to include the filename: https://llvm.org/docs/CodingStandards.html#file-headers

24

Do you also want to add a static_assert for PODness ?

27

Can you comment on what the comparison is used for ?

34

doc ?

36

doc ?

53

Should the naming be ContiguousStrategy, OverlapStrategy, ... ?

I'm seeing this as describing a strategy to be applied to a given size range.

59–60

"an overlapping strategy" ?

61

"The span"

81

How ? (I mean which strategy)

118–119

Should that be an error ? The Z3 model should not be generating these given the constraints, right ?

libc/benchmarks/automemcpy/README.md
3

Maybe make it clear that this is not built be default:

This is not enabled by default, as it is mostly useful when working on tuning the library implementation. To build it, use `LIBC_BUILD_AUTOMEMCPY=ON`.
4

Did you mean to give a real link here ?

42

can you explain more about this ? (in particular, that the way to express this depends on the target)

73

information

77

is

84

picks

87

A

89

A, M

gchatelet updated this revision to Diff 379304.Oct 13 2021, 1:45 AM
gchatelet marked 16 inline comments as done.
  • Address comments
libc/benchmarks/automemcpy/FunctionDescriptor.h
2

The whole libc project would need to be fixed :-/
Can we fix it separately?

24

Actually I'd only need them to be std::trivial but I just figured out that the Optional fields in FunctionDescriptor are stepping in the way...
The generated code is still pretty efficient https://godbolt.org/z/zdoh5vWMj

53

In theory yes, but in practice the type is serialized in the autogenerated C++ file and adding a trailing "Strategy" on every type would really hinder readability.
e.g. of one of the several thousand lines of serialized NamedFunctionDescriptor

{"memcpy_0xE01C197FDF1FC6D3",{FunctionType::MEMCPY,Contiguous{{0,2}},Overlap{{2,64}},Loop{{64,128},16},AlignedLoop{Loop{{128,256},32},16,AlignArg::_1},Accelerator{{256,kMaxSize}},ElementTypeClass::NATIVE}},

Note: because the fields are Optional we have to write the type name so the compiler can distinguish between llvm::None construction and T construction, it also helps readability. Contrary to most TableGen files in LLVM, the autogenerated file will eventually be read to compare implementations.

81

I've reworded it, let me know what you think.

118–119

I wrote this to allow having an individual size in the middle of an overlap:
e.g.

...
if(size == 24) return Op<24>();
if(size < 16) return Op<Overlap<8>>();
if(size < 32) return Op<Overlap<16>>();

In this example the range 8-31 is handled by an overlapping strategy expect the size 24 which could be particularly hot and worth optimizing for. It is not in the automemcpy paper but though it would be a nice addition.
I'll remove it for now since it's not implemented and probably confusing.

libc/benchmarks/automemcpy/README.md
4

It really is the parent directory.
It has to be relative path so it works in github.

courbet accepted this revision.Oct 13 2021, 2:29 AM
courbet added inline comments.
libc/benchmarks/automemcpy/FunctionDescriptor.h
128

detail*

This revision is now accepted and ready to land.Oct 13 2021, 2:29 AM
gchatelet updated this revision to Diff 379358.Oct 13 2021, 6:07 AM
  • Fix typo
gchatelet updated this revision to Diff 379681.Oct 14 2021, 5:50 AM
  • Fix typo in README
  • move FunctionDescriptor in include folder
gchatelet closed this revision.Oct 28 2021, 4:32 AM

Submitted within D111801.