This is an archive of the discontinued LLVM Phabricator instance.

Tests for D121320
ClosedPublic

Authored by MatzeB on Mar 9 2022, 11:31 AM.

Diff Detail

Event Timeline

MatzeB created this revision.Mar 9 2022, 11:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2022, 11:31 AM
MatzeB requested review of this revision.Mar 9 2022, 11:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2022, 11:31 AM
MatzeB retitled this revision from Precommit tests to Precommit tests for D121320.
MatzeB retitled this revision from Precommit tests for D121320 to Tests for D121320.
MatzeB updated this revision to Diff 414234.Mar 9 2022, 4:02 PM
spatel accepted this revision.Mar 10 2022, 5:18 AM
spatel added a subscriber: spatel.

LGTM - if there's any overlap between D121320 and D121147, it will be easier to see if all tests are committed.

This revision is now accepted and ready to land.Mar 10 2022, 5:18 AM
spatel added a comment.EditedMar 10 2022, 6:33 AM

I looked at the effect of the 2 proposed patches together, and we should add at least 2 more tests (can be added to this commit or separately):

define i1 @shifted_mask_testb_lower32(i64 %a) {
  %v0 = and i64 %a, 66846720  ; 0xff << 18
  %v1 = icmp ne i64 %v0, 0
  ret i1 %v1
}

define i1 @shifted_mask_testw_lower32(i64 %a) {
  %v0 = and i64 %a, 131072 ; 0xffff << 1
  %v1 = icmp ne i64 %v0, 0
  ret i1 %v1
}

The difference for these is that the shifted mask does not extend into the upper 32-bits of the value. We probably want to convert these to use shifts too, and we'll handle that with both patches applied.

I looked at the effect of the 2 proposed patches together, and we should add at least 2 more tests (can be added to this commit or separately):

define i1 @shifted_mask_testb_lower32(i64 %a) {
  %v0 = and i64 %a, 66846720  ; 0xff << 18
  %v1 = icmp ne i64 %v0, 0
  ret i1 %v1
}

This codegens as testl $66846720, %edi ; setne %al without any of our changes which to me seems to be as good as it gets. Using a shift for the "shifted_mask" cases only makes sense when the constant value becomes so big that it no longer fits the 32bit immediates on X86...

I will add the function anyway as an example for something that should not be optimized, does that make sense?

define i1 @shifted_mask_testw_lower32(i64 %a) {
  %v0 = and i64 %a, 131072 ; 0xffff << 1

I guess this should have been 131070 :)

%v1 = icmp ne i64 %v0, 0
ret i1 %v1

}

This ends up being just another variant of the other function where it is best to just use a testl with immediate, so I think I can skip this.

The difference for these is that the shifted mask does not extend into the upper 32-bits of the value. We probably want to convert these to use shifts too, and we'll handle that with both patches applied.

I think those should keep using testl + immediate.

MatzeB updated this revision to Diff 414405.Mar 10 2022, 9:22 AM

I looked at the effect of the 2 proposed patches together, and we should add at least 2 more tests (can be added to this commit or separately):

define i1 @shifted_mask_testb_lower32(i64 %a) {
  %v0 = and i64 %a, 66846720  ; 0xff << 18
  %v1 = icmp ne i64 %v0, 0
  ret i1 %v1
}

This codegens as testl $66846720, %edi ; setne %al without any of our changes which to me seems to be as good as it gets. Using a shift for the "shifted_mask" cases only makes sense when the constant value becomes so big that it no longer fits the 32bit immediates on X86...

I will add the function anyway as an example for something that should not be optimized, does that make sense?

define i1 @shifted_mask_testw_lower32(i64 %a) {
  %v0 = and i64 %a, 131072 ; 0xffff << 1

I guess this should have been 131070 :)

Oops - yes, I missed dropping the last bit.

The difference for these is that the shifted mask does not extend into the upper 32-bits of the value. We probably want to convert these to use shifts too, and we'll handle that with both patches applied.

I think those should keep using testl + immediate.

I'm not sure if there's a universal answer. As you mentioned, it may depend on throughput vs. saving on instruction size. Either way, the tests can be there to show current/expected codegen.

MatzeB updated this revision to Diff 415490.Mar 15 2022, 10:22 AM

Added test variants with more than 1 use for the constant/add.

This revision was landed with ongoing or failed builds.Mar 15 2022, 2:18 PM
Closed by commit rGbaae814377bc: Add tests for D121320 (authored by MatzeB). · Explain Why
This revision was automatically updated to reflect the committed changes.