This is an archive of the discontinued LLVM Phabricator instance.

Replace -print-whole-regmask with a threshold.
ClosedPublic

Authored by arsenm on Jun 30 2017, 10:14 AM.

Details

Reviewers
MatzeB
qcolombet
Summary

The previous flag/default of printing everything is
not helpful when there are thousands of registers
in the mask.

Event Timeline

arsenm created this revision. Jun 30 2017, 10:14 AM
MatzeB edited edge metadata. Jun 30 2017, 11:56 AM

Some thoughts:

  • Is this motivated by AMDGPU having a huge number of registers in the mask?
  • For X86, AArch64, ARM the output certainly takes up a bunch of space but I find the information important enough that I'd like to see it by default
  • How about a compromise: We stop using an On/Off switch but instead influence the number of registers printed in the regmask instead of the hardcoded 10 at the moment. Then choose a big default value that easily fits X86, AArch64, ARM etc.
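The compromise proposed above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the patch: the function name `format_regmask`, the default of 32, the convention that a negative threshold means "print everything", and the exact output wording are all assumptions made for the example.

```python
def format_regmask(clobbered, threshold=32):
    """Render a register mask, listing at most `threshold` register names.

    `clobbered` is an iterable of register names (hypothetical input shape).
    A negative `threshold` is treated as "no limit" (an assumed convention).
    """
    names = list(clobbered)
    if threshold < 0 or len(names) <= threshold:
        # Small enough (or unlimited): list every register.
        parts = names
    else:
        # Too many: list the first `threshold` names and summarize the rest.
        parts = names[:threshold] + [f"and {len(names) - threshold} more..."]
    return "<" + " ".join(["regmask"] + parts) + ">"
```

With a large default, targets such as X86, AArch64 and ARM would still get the full list, while a target with thousands of registers in the mask would get a truncated line instead of a screenful.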

Some thoughts:

  • Is this motivated by AMDGPU having a huge number of registers in the mask?

Yes. It more than fills my fullscreen terminal window on any call. It wouldn't be so bad if it printed just regunits, but this also prints all of the register tuple combinations as well.

  • For X86, AArch64, ARM the output certainly takes up a bunch of space but I find the information important enough that I'd like to see it by default
  • How about a compromise: We stop using an On/Off switch but instead influence the number of registers printed in the regmask instead of the hardcoded 10 at the moment. Then choose a big default value that easily fits X86, AArch64, ARM etc.

The MIR printer comes up with a symbolic name for the regmask. I was wondering why the debug dump doesn't use that.

  • You could extend the debug printer to do the same.
  • I would find it inconvenient if I had to look up what registers belong to a certain symbol name, so I wouldn't like this as default either.
  • There will be cases where no name exists as some x86 fastcall convention dynamically creates clobber masks depending on the arguments of the call.

Yes. It more than fills my fullscreen terminal window on any call. It wouldn't be so bad if it printed just regunits, but this also prints all of the register tuple combinations as well.

  • Unfortunately regmasks cannot be expressed in regunits today. The canonical example is xmm0 and ymm0 on X86 occupying the exact same register units (because the extra bits of ymm0 cannot be addressed/named independently), but they are still separate registers having different clobbering behavior in some calling conventions.
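The point about register units can be illustrated with a small sketch. The table below is a toy model, not LLVM's register description: the unit name `u0` and the helper functions are hypothetical, and only the xmm0/ymm0 relationship is taken from the comment above.

```python
# Toy model: both registers map to the same set of register units.
# The unit name "u0" is made up for illustration.
REG_UNITS = {
    "xmm0": frozenset({"u0"}),
    "ymm0": frozenset({"u0"}),
}

def units_of(regs):
    """Collect the register units touched by a set of registers."""
    units = set()
    for reg in regs:
        units |= REG_UNITS[reg]
    return units

def regs_covered_by(units):
    """Return every register whose units all lie inside `units`."""
    return {reg for reg, reg_units in REG_UNITS.items() if reg_units <= units}

# A mask that clobbers ymm0 but preserves xmm0 ...
clobbered = {"ymm0"}
# ... cannot survive a round trip through register units:
round_trip = regs_covered_by(units_of(clobbered))
```

Here `round_trip` contains both `xmm0` and `ymm0`, so the distinction between "clobbers only ymm0" and "clobbers both" is lost once the mask is reduced to units.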
arsenm updated this revision to Diff 105099. Jul 3 2017, 9:05 AM
arsenm retitled this revision from "Change default for -print-whole-regmask to false" to "Replace -print-whole-regmask with a threshold."
arsenm edited the summary of this revision.

Replace with threshold

MatzeB accepted this revision. Jul 19 2017, 4:56 PM

LGTM

This revision is now accepted and ready to land. Jul 19 2017, 4:56 PM
arsenm closed this revision. Jul 19 2017, 5:38 PM

r308572