Page MenuHomePhabricator

[GlobalISel] Import patterns containing INSERT_SUBREG

Authored by paquette on Aug 20 2019, 2:41 PM.



This teaches the importer to handle INSERT_SUBREG instructions. We were missing patterns involving INSERT_SUBREG in AArch64.

To meaningfully import it, the GlobalISelEmitter needs to know how to infer a super register class for a given register class.

This patch introduces the following to achieve this:

  • getSuperRegForSubReg, a function which finds the largest register class which supports a value type and subregister index
  • inferSuperRegisterClass, a function which finds the appropriate super register class for an INSERT_SUBREG
  • inferRegClassFromPattern, a function which allows for some trivial lookthrough into instructions
  • getRegClassFromLeaf, a helper function which returns the register class for a leaf TreePatternNode
  • Support for subregister index operands in importExplicitUseRenderer

It also

  • Updates tests in each backend which are impacted by the change
  • Adds to test that we import and skip the expected patterns

As a result of this patch, INSERT_SUBREG patterns in X86 may use the LOW32_ADDR_ACCESS_RBP register class instead of GR32. This is correct, since the register class contains the same registers as GR32 (except with the addition of RBP). So, this also teaches X86 to handle that register class. This is in line with X86ISelLowering, which treats this as a GR class.

Diff Detail


Event Timeline

paquette created this revision.Aug 20 2019, 2:41 PM
aemerson added inline comments.Aug 22 2019, 2:01 PM
4149 ↗(On Diff #216254)

getSuperRegForSubReg() seems to return an Optional, but here you get the value without checking if it's None.

paquette updated this revision to Diff 216933.Aug 23 2019, 1:14 PM
paquette edited the summary of this revision. (Show Details)
  • Check for None in inferSuperRegisterClass.
  • Rename test to I have some changes that implement SUBREG_TO_REG and I decided to reuse the test for it, so might as well do the rename here.
  • Remove unused flag from FileCheck line in the test.
This revision is now accepted and ready to land.Aug 26 2019, 11:43 AM
This revision was automatically updated to reflect the committed changes.
paquette added a comment.EditedAug 27 2019, 9:48 AM

@RKSimon, I'm trying to reproduce the test failures locally without much luck right now. For me, with EXPENSIVE_CHECKS enabled, the modified tests all pass.

Do the Windows builders build -mtriple=aarch64-- differently?

As for the Tablegen test, it looks like it's not redirecting the output properly on Windows.

After a bit of digging, it looks like we're not actually getting stable output here when the match tables are optimized. If they aren't optimized, it's stable.

Going to back this out and recommit once I figure out what's going on.

For some reason, I thought llvm::sort was stable. So, what was happening is that sometimes you would end up with different register classes in getSuperRegForSubReg. As a result, when match tables are optimized, sometimes you would end up missing entire cases in the generated C++. (It would sometimes take a few tries to get it to fail, which is why I wasn't seeing it.)

Changing the llvm::sort to llvm::stable_sort fixes this on my end.

Recommitted in r370084. Hopefully this will make the bots happy. :)

mgrang added inline comments.Aug 27 2019, 12:10 PM

If A and B have the same size then you would end up with a tie. This can result in non-deterministic sorting order. I guess you would need a tie breaker in that case.

paquette marked an inline comment as done.Aug 27 2019, 1:11 PM
paquette added inline comments.

Yes, that was a bug. :) I switched it to llvm::stable_sort in r370084 to fix that. I think that's sufficient, and it's passing on the EXPENSIVE_CHECKS bots now as far as I can see.

(I believe that getRegClasses() is stable, so I think llvm::stable_sort should produce a deterministic result.)