This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Mask constant operands in bit permutation calculation
AbandonedPublic

Authored by qiucf on Nov 23 2022, 1:09 AM.

Details

Reviewers
nemanjai
lkail
shchenz
Esme
Group Reviewers
Restricted Project
Summary

This patch fixes issue 59074.

In IR or C code, left/right shift amount larger than value size is undefined behavior. But in practise, backend lowering for srl_parts/sra_parts/shl_parts produces add/sub of shift amounts, thus constant shift amounts might be negative or larger than value size. And the lowering depends on behavior in ISA.

PowerPC ISA says, the lowest 7 bits (6 bits if in 32-bit instruction) will be taken, and if the highest among them is 1, result will be zero, otherwise the low 6 bits (or 5 on 32-bit) are used as shift amount.

This patch emulates the behavior and avoids array overflow in bit permutation's value bits calculator.

Diff Detail

Event Timeline

qiucf created this revision.Nov 23 2022, 1:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2022, 1:09 AM
qiucf requested review of this revision.Nov 23 2022, 1:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 23 2022, 1:09 AM
lkail added inline comments.Nov 23 2022, 11:43 PM
llvm/test/CodeGen/PowerPC/pr59074.ll
1

Any idea why it doesn't crash on pwr8?

103

Any idea why it doesn't crash for scalars?

tingwang added inline comments.
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
1487

If I understand correctly, ShiftAmt is unsigned so a "negative value" means very large shift amount here. Can we simply cap ShiftAmt to be no larger than NumBits, and then seems additional boundary checks are not necessary?

llvm/test/CodeGen/PowerPC/pr59074.ll
1
qiucf added inline comments.Sep 28 2023, 1:44 AM
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
1487

Although LLVM langref says when shift amount is larger than number bits, here the case also matches for PPCISD::SHL, which fully conforms to behavior of Power's shift instructions (sld, for example):

sld r3, r3, r4 when r4=0b110111111 or 0b0000111111 the result is the same (the instruction only cares about lowest 7 bits).

qiucf added inline comments.Sep 28 2023, 2:15 AM
llvm/test/CodeGen/PowerPC/pr59074.ll
1

Yes, -mcpu=pwr8 touches another codepath

103

We can't write shl i64 %a, 65 to reproduce it, because such simple pattern will be converted to poison value by combiner.

The crash is actually from srl_parts a, b, 0 in SDAG, which then legalizes into a group of PPCISD::SHL/SRL a, b, 0/64/-64.

Legalizer splits 2xi128 into two 1xi128, during the process some sub a, a is generated and consumed by srl_parts as shift amounts. This may expose some limitations of the combiner.