This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel][Localizer] Enable intra-block localization of already-local uses.
ClosedPublic

Authored by aemerson on Mar 3 2020, 11:50 AM.

Details

Summary

This changes the localizer to attempt intra-block localizer of instructions that have local uses. This is useful because sometimes the entry block itself has many uses of constant-like instructions, which would benefit from shortening live ranges. Previously if an inst had no non-local uses, we wouldn't add it to the list of instructions to attempt further intra-block localization.

This gives a 0.7% geomean code size improvement on CTMark.

Diff Detail

Event Timeline

aemerson created this revision.Mar 3 2020, 11:50 AM

Actually this is missing a bunch of test updates....

aemerson updated this revision to Diff 248028.Mar 3 2020, 2:10 PM

Now with test updates.

paquette accepted this revision.Mar 3 2020, 3:38 PM

LGTM

This revision is now accepted and ready to land.Mar 3 2020, 3:38 PM
This revision was automatically updated to reflect the committed changes.
omjavaid reopened this revision.Mar 4 2020, 2:14 PM
omjavaid added a subscriber: omjavaid.

Hi @aemerson this change caused a regression in lldb testsuite on AArch64/Linux. Apparently stepping gets affected by your change. Kindly revisit.

I have temporarily reverted this change.

http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu/builds/2182

This revision is now accepted and ready to land.Mar 4 2020, 2:14 PM
omjavaid requested changes to this revision.Mar 4 2020, 2:16 PM
This revision now requires changes to proceed.Mar 4 2020, 2:16 PM

Hi @aemerson this change caused a regression in lldb testsuite on AArch64/Linux. Apparently stepping gets affected by your change. Kindly revisit.

I have temporarily reverted this change.

http://lab.llvm.org:8011/builders/lldb-aarch64-ubuntu/builds/2182

That test looks...wrong. single_step_only_steps_one_instruction() looks to be checking that exactly *one* instruction is needed to step between each line of the test code:

static void swap_chars() {
  g_c1 = '1';
  g_c2 = '0';

  g_c1 = '0';
  g_c2 = '1';
}

Before it was just pure luck that the 4 stores were scheduled consecutively:

__ZL10swap_charsv:
  mov w8, #0x31
  adrp  x9, 9 ; 0x100010000
  add x9, x9, #0x240 
  mov w10, #0x30 
  adrp  x11, 9 ; 0x100010000 
  add x11, x11, #0x241 
  strb  w8, [x9] 
  strb  w10, [x11] 
  strb  w10, [x9] 
  strb  w8, [x11] 
  ret

with this patch, we emit some address computation after the first store:

__ZL10swap_charsv:
  adrp  x8, 9 ; 0x100010000
  add x8, x8, #0x240
  mov w9, #0x31
  strb  w9, [x8]
  adrp  x10, 9 ; 0x100010000
  add x10, x10, #0x241
  mov w11, #0x30
  strb  w11, [x10]
  strb  w11, [x8]
  strb  w9, [x10]
  ret

which is perfectly valid. I'm not sure how to fix this test because even if we change the assertion to be something like assert(step_count <= max_steps), it's still fragile.

aemerson updated this revision to Diff 248567.Mar 5 2020, 12:00 PM

@omjavaid can you look over the lldb changes? I don't have the hardware to be able to actually run this test but I've tried to relax the checks.

Herald added a project: Restricted Project. · View Herald TranscriptMar 5 2020, 12:00 PM

@omjavaid can you look over the lldb changes? I don't have the hardware to be able to actually run this test but I've tried to relax the checks.

@aemerson I have moved these test to skip list as they dont seem to be valid at least for ARM/AArch64 as you pointed out above. As there already exist a bug against them I ll look into fixing them later.
Thanks for your help and apologies for inconvenience.

omjavaid accepted this revision.Mar 6 2020, 1:31 AM
This revision is now accepted and ready to land.Mar 6 2020, 1:31 AM
labath added a subscriber: labath.Mar 6 2020, 2:10 AM

Yep, that test really shouldn't be doing that. Historically, lldb has been avoiding architecture specific artifacts (like assembly) in its tests, but that didn't really work out here. That test has become a nightmare of architecture-specific assertions.

If we're going to be testing assembly-level properties, we really should be using assembly inputs. I have now rewritten it to use inline assembly to guarantee a known sequence of instructions. I think the arm assembly is correct (it compiles), but I don't have the hardware to try it on.

@omjavaid, if you're able to test that locally, can you remove the skip decorator and give this a spin? Otherwise, I can just remove the decorator and watch the bot...

This revision was automatically updated to reflect the committed changes.

Yep, that test really shouldn't be doing that. Historically, lldb has been avoiding architecture specific artifacts (like assembly) in its tests, but that didn't really work out here. That test has become a nightmare of architecture-specific assertions.

If we're going to be testing assembly-level properties, we really should be using assembly inputs. I have now rewritten it to use inline assembly to guarantee a known sequence of instructions. I think the arm assembly is correct (it compiles), but I don't have the hardware to try it on.

@omjavaid, if you're able to test that locally, can you remove the skip decorator and give this a spin? Otherwise, I can just remove the decorator and watch the bot...

Hi @labath thanks for the quick fix moving to inline assembly fixed this on AArch64 linux. I have removed skipIf decorators now.