This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Define _m intrinsics as builtins, instead of macros.
ClosedPublic

Authored by HsiangKai on Sep 29 2021, 12:52 AM.

Details

Summary

In the original design, we levarage _mt intrinsics to define macros for
_m intrinsics. Such as,

#define vadd_vv_i8m1_m(op0, op1, op2, op3, op4) \
__builtin_rvv_vadd_vv_i8m1_mt((vbool8_t)(op0), (vint8m1_t)(op1), (vint8m1_t)(op2), (vint8m1_t)(op3), (size_t)(op4), (size_t)VE_TAIL_AGNOSTIC)

However, we could not define generic interface for mask intrinsics any
more due to clang_builtin_alias only accepts clang builtins as its
argument.

In the example,

__rvv_overloaded
__attribute__((clang_builtin_alias(__builtin_rvv_vadd_vv_i8m1_mt)))
 vint8m1_t vadd(vbool8_t op0, vint8m1_t op1, vint8m1_t op2, vint8m1_t
 op3, size_t op4, size_t op5);

op5 is the tail policy argument. When users want to use vadd generic
interface for masked vector add, they need to specify tail policy in the
previous design. In this patch, we define _m intrinsics as clang
builtins to solve the problem.

Diff Detail

Event Timeline

HsiangKai created this revision.Sep 29 2021, 12:52 AM
HsiangKai requested review of this revision.Sep 29 2021, 12:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 29 2021, 12:52 AM
HsiangKai edited the summary of this revision. (Show Details)Sep 29 2021, 12:56 AM
HsiangKai updated this revision to Diff 376890.Oct 4 2021, 7:24 AM

Update test cases.

This patch was uploaded without full context.

This revision is now accepted and ready to land.Oct 11 2021, 2:01 PM
nikic added a subscriber: nikic.Oct 12 2021, 1:13 AM

I noticed this a few time in the past already, but this is the worst one yet:

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=6641d29b70993bce6dbd7e0e0f1040753d38842f&to=97f0c63783f52389bd8842df205379ceade7a89d&stat=instructions
Max-rss: https://llvm-compile-time-tracker.com/compare.php?from=6641d29b70993bce6dbd7e0e0f1040753d38842f&to=97f0c63783f52389bd8842df205379ceade7a89d&stat=max-rss
Plus a >1% increase in clang total binary size.

This happens every time a new slew of RISCV intrinsic permutations gets added, which seem to suffer from an extreme degree of redundancy. Is it possible to do something about this?

I noticed this a few time in the past already, but this is the worst one yet:

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=6641d29b70993bce6dbd7e0e0f1040753d38842f&to=97f0c63783f52389bd8842df205379ceade7a89d&stat=instructions
Max-rss: https://llvm-compile-time-tracker.com/compare.php?from=6641d29b70993bce6dbd7e0e0f1040753d38842f&to=97f0c63783f52389bd8842df205379ceade7a89d&stat=max-rss
Plus a >1% increase in clang total binary size.

This happens every time a new slew of RISCV intrinsic permutations gets added, which seem to suffer from an extreme degree of redundancy. Is it possible to do something about this?

+1, it seems bad to regress overall Clang memory usage by 1% by adding some intrinsics, especially when we're not targeting RISCV.

I noticed this a few time in the past already, but this is the worst one yet:

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=6641d29b70993bce6dbd7e0e0f1040753d38842f&to=97f0c63783f52389bd8842df205379ceade7a89d&stat=instructions
Max-rss: https://llvm-compile-time-tracker.com/compare.php?from=6641d29b70993bce6dbd7e0e0f1040753d38842f&to=97f0c63783f52389bd8842df205379ceade7a89d&stat=max-rss
Plus a >1% increase in clang total binary size.

This happens every time a new slew of RISCV intrinsic permutations gets added, which seem to suffer from an extreme degree of redundancy. Is it possible to do something about this?

+1, it seems bad to regress overall Clang memory usage by 1% by adding some intrinsics, especially when we're not targeting RISCV.

Is the max-rss increase due to the binary size increase?

This has been the first time RISC-V added something and caused performance/memory/test execution regression for other users.

The problem should be considered seriously (I don't even mind reverting the patch.)

We are working on a patch, D111617, to reduce the large header size caused by abundant RISC-V vector intrinsics.
From the measurement depicted in D103228, it should be helpful for compile time.