This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Fix incorrect usage of MCPhysReg for diff list elements
ClosedPublic

Authored by barannikov88 on May 20 2023, 1:44 PM.

Details

Summary

The lists contain differences between register numbers, not the register
numbers themselves. Since a difference can also be negative, this also
changes its type to signed.

Changing the type to signed exposed a "bug". For AMDGPU, which has many
registers, the first element of a sequence could be as big as ~45k.
The value does not fit into int16_t, but fits into uint16_t. The bug
didn't show up because of unsigned wrapping and truncation of the Val
field in the advance() method.

To fix the issue, I changed the way regunit difflists are encoded. The
4-bit 'scale' field of MCRegisterDesc::RegUnit was replaced by 12-bit
number of the first regunit, and the first element of each of the lists
was removed. The higher 20 bits of RegUnit field contain the initial
offset into DiffLists array.
AMDGPU has 1'409 regunits (2^12 = 4'096), and the biggest offset is
80'041 (2^20 = 1'048'576). That is, there is enough room.

Changing the encoding method also resulted in a smaller array size, the
numbers are below (I omitted targets with less than 100 elements).

AMDGPU   | 80052 | 78741 |  -1,6%
RISCV    |  6498 |  6297 |  -3,1%
ARM      |  4181 |  3966 |  -5,1%
AArch64  |  2770 |  2592 |  -6,4%
PPC      |  1578 |  1441 |  -8,7%
Hexagon  |   994 |   740 | -25,6%
R600     |   508 |   398 | -21,7%
VE       |   471 |   459 |  -2,5%
Sparc    |   381 |   363 |  -4,7%
X86      |   326 |   208 | -36,2%
Mips     |   253 |   200 | -20,9%
SystemZ  |   186 |   162 | -12,9%

Diff Detail

Event Timeline

barannikov88 created this revision.May 20 2023, 1:44 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 20 2023, 1:44 PM

clang-format

barannikov88 edited the summary of this revision. (Show Details)May 20 2023, 1:48 PM
barannikov88 published this revision for review.May 20 2023, 1:50 PM
barannikov88 added reviewers: foad, arsenm.
arsenm accepted this revision.May 21 2023, 11:45 AM
This revision is now accepted and ready to land.May 21 2023, 11:45 AM
foad accepted this revision.May 22 2023, 8:15 AM

Nice!

llvm/include/llvm/MC/MCRegisterInfo.h
699

Maybe inline advance into its one remaining caller.

barannikov88 added inline comments.May 22 2023, 6:12 PM
llvm/include/llvm/MC/MCRegisterInfo.h
699

Changed prior to commit.

barannikov88 marked an inline comment as done.May 22 2023, 6:13 PM
This revision is now accepted and ready to land.May 22 2023, 7:20 PM
barannikov88 added a comment.EditedMay 22 2023, 9:00 PM

The issue with the first buildbot breakage is that needsStackFrame in HexagonFrameLowering.cpp calls subregs_inclusive for the zero register.
https://github.com/llvm/llvm-project/blob/14bc3748109c73adde7350da60366149ae0d4c3a/llvm/lib/Target/Hexagon/HexagonFrameLowering.cpp#L314
It used to work because MCRegisterDesc for the zero register referred to the beginning of the DiffList array, where it contained two consecutive zeros.
The effect was as if the zero register didn't have any sub- or super-registers.
After this patch, these two zero entries are no longer emitted.

It feels like subregs_inclusive and related methods (and maybe even MCRegisterInfo::get) should not allow zero register input,
but it appears that many places rely on this behavior (probably unintended). Fixing all this places is out of scope of this patch,
so I'll just restore the original behavior by special-casing zero register input with a fixme.

Add a couple of assertions and fix detected bugs

barannikov88 added a comment.EditedMay 23 2023, 11:17 AM

I added a couple of assertions.
Although there were many crashes, they narrowed down to just a few places, so I thought I might be able to fix them is this patch.

llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
568 ↗(On Diff #524807)

@arsenm @foad Please take a look. Is this a correct change?
MFI->getGITPtrLoReg(MF) above returns null register for !ST.isAmdPalOS().
Also, is it ok for this change to be part of this patch or should it be fixed separately?

Fix a couple more places

Fix another occurrence in bolt

Rebase on top of D151285

This revision is now accepted and ready to land.May 24 2023, 10:14 PM

Drop Hexagon changes (extracted into separate review)