This is an archive of the discontinued LLVM Phabricator instance.

[ImplicitNullCheck] Handle Nonzero faulting pages and complex addressing
ClosedPublic

Authored by anna on Sep 4 2020, 8:43 AM.

Details

Summary

This patch:

  1. introduces a new TII API for calculating the address of a memory operation in complex addressing mode. This is a general function, which checks for both bas and index registers. This is used in isSuitableMemoryOp in ImplicitNullChecks and all the validity checks are done in this caller function. Earlier, we used to bail out if the memory operation had an index register or a scale != 1 (i.e. we supported offsets with simple addressing mode).
  2. handles memory accesses (and makes the null checks implicit) where the faulting page is a non-zero one as

specified through a named module level metadata startaddress_faulting_pages introduced in this patch. The optional metadata is added by front-ends.

See #2 in action with added MIR tests for positive and negative case. #1
allows us to handle a case which was not handled earlier (see
implicit-null-checks.ll).

Diff Detail

Event Timeline

anna created this revision.Sep 4 2020, 8:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 4 2020, 8:43 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
anna requested review of this revision.Sep 4 2020, 8:43 AM
anna planned changes to this revision.Sep 4 2020, 1:35 PM

While working on a follow-up change, I realized we don't need to separate out whether we're handling simple or complex addressing mode based on whether we fault in zero page or not. Both are orthogonal issues. The reason I coupled them is because we still need a way to get the "total constant address" *if it exists* and make a distinction between when we succeeded to get one versus when there was no such constant address apart from the offset.

anna updated this revision to Diff 290395.Sep 7 2020, 7:50 PM

handles complex addressing mode using base and index operands. All test cases pass.
I plan to add more counter examples once the basic idea is agreed upon.

anna added inline comments.Sep 7 2020, 7:52 PM
llvm/test/CodeGen/X86/implicit-null-check-negative.ll
132

I added these to show that none of the negative test cases start failing once we add the metadata with "other faulting pages".

dantrushin added inline comments.Sep 8 2020, 7:29 AM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
1305

These names are confusing. I've been scrolling back and forth for several minutes :)
Why not to pass and name them separately?

llvm/lib/CodeGen/ImplicitNullChecks.cpp
390

Do you really need to check that BaseRegUsedInConstAddr.first is not zero?
Would simple BaseRegUsedInConstAddr.first != PointerReg work here?

I also found positive names more readable, e.g. bool BaseRegIsNullChecked = BaseRegUsedInConstAddr.first == PointerReg

I think it is reasonable to assert Scale !=0 || IndexRegUsedInConstAddr.first == 0 and remove Scale check.

llvm/lib/Target/X86/X86InstrInfo.cpp
3680

Any reason to use auto * instead of auto & ?
If I read it correctly, BaseOp is used to get base register only. Using reference you'd have less characters to type :)

3693

ditto. auto &IndexOp = MI.getOperand() ?

anna edited reviewers, added: reames; removed: philip.Sep 8 2020, 8:04 AM
anna updated this revision to Diff 291011.Sep 10 2020, 10:04 AM

addressed review comments. Added more testcases (positive and countercases).

anna added inline comments.Sep 10 2020, 10:08 AM
llvm/lib/CodeGen/ImplicitNullChecks.cpp
390

Changed the names and hopefully more readable.
We need to check that the register exists before checking if it is equal to the null checked register because we don't want to fail when we don't have the register (in that case, we wont be using it in the addres calculation). Here we will state that the memory op is unsuitable and then fail. I realized this through some failing test case imp_null_check_load in implicit-null-checks.ll. There, we do not use the index register, so it is zero.

Thanks, it reads much better for me now.
Looks good.

anna marked 2 inline comments as done.Sep 10 2020, 10:44 AM
anna edited the summary of this revision. (Show Details)Sep 10 2020, 10:58 AM
reames requested changes to this revision.Sep 11 2020, 9:32 AM

Must fix:

  • Your change does not include a LangRef update for the new metadata.
  • The name chosen for the metadata is ambiguous. I'd also suggest phrasing it in terms of ranges, not page starts. Maybe: guaranteed_faulting_ranges, or implicit_check_faulting_ranges?
  • As you note in the review, this is mixing two sets of changes in a way which makes it hard to reason about either in isolation. Please split. Either order is fine as you should be able to test either piece in isolation without the other.
llvm/include/llvm/CodeGen/TargetInstrInfo.h
1301

I find the terminology used here to be very confusing. A constant address to me is either a literal constant or a constant expression. You appear to be referring to complex addressing instead. I believe that your ConstantAddr is the Displacement field in a x86 complex addressing, is that true?

If so, rename required. If not, please explain.

llvm/lib/CodeGen/ImplicitNullChecks.cpp
476

From your comment, it sounds like the existing code handles two built in ranges, not one.
0, PageSize
0x100..00, 0x111..11

(i.e. the zeroth page and kernel address range for Linux)

I'd suggest adjusting comments and code structure to reflect same, and to make range handling (not page starts) explicit.

llvm/lib/Target/X86/X86InstrInfo.cpp
3701

This really doesn't seem like it belongs here. I think this could be cleanly moved into the caller, simplifying the behaviour of this routine greatly.

This revision now requires changes to proceed.Sep 11 2020, 9:32 AM
anna updated this revision to Diff 294075.Sep 24 2020, 8:38 AM

addressed review comments about splitting out patches. This patch handles complex addressing mode.

Looks close to ready. Please address review comments, and we'll probably be ready for an LGTM.

llvm/include/llvm/CodeGen/TargetInstrInfo.h
980

"is an instruction"

llvm/lib/CodeGen/ImplicitNullChecks.cpp
406

These should be impossible, use asserts.

424

There's a potential overflow case here which needs consideration. Consider 0xF...F as constant and scale as 2.

You probably need to return some information about the register width/wrapping of the addressing mode to do this in a target independent way.

428

Please return true or false, then set the per register flag. This is too much coupling.

anna added inline comments.Sep 24 2020, 9:49 AM
llvm/lib/CodeGen/ImplicitNullChecks.cpp
406

Will use assert for the multiplier and add a comment why RegUsedInAddr can be zero reg, i.e. something like: movq 8(,%rdi,8), %rax here the BaseReg is X86::NoRegister and ScaleReg is rdi.

424

good catch! will take a look.

428

yes, agreed.

reames requested changes to this revision.Sep 29 2020, 9:47 AM
This revision now requires changes to proceed.Sep 29 2020, 9:47 AM
anna updated this revision to Diff 295023.Sep 29 2020, 9:55 AM

addressed review comments. Bails out in case of overflow (added testcase for multiply overflow).

reames accepted this revision.Sep 29 2020, 11:41 AM

LGTM, but with required changes.

llvm/lib/CodeGen/ImplicitNullChecks.cpp
393

Please add a check here that the size of BaseReg, ScaledReg, and PointerReg are all equal.

I'm concerned about implicit sign extension of registers, and the interaction with the displacement code below. The code you have may be correct, but I'd prefer a explicit bail out just to make it easier to tell.

405

You could if you wished do Displacement mod 2^64, but I don't see any point in that.

411

Add comment here: "If a register used in the address is constant, fold it's effect into the displacement for ease of analysis."

432

Replace IsSignedImmVal with constant true.

This isn't "is negative", this is "interpret this as a signed number instead of unsigned".

434

You should be able to just define the APInt in the definition below. The bitwidth will come from the RHS.

441

e,g, APInt Product =

Also, remember to constant fold isSignedImmVal

llvm/test/CodeGen/X86/implicit-null-check.ll
625

You don't appear to actually have a test here which uses a displacement large enough to need a base register, please add one.

(i.e. modify this test to add not 3526, but 0xFFFFF0000 (or something similarly large)

This revision is now accepted and ready to land.Sep 29 2020, 11:41 AM
anna marked 5 inline comments as done.Oct 7 2020, 1:05 PM
anna added inline comments.
llvm/test/CodeGen/X86/implicit-null-check.ll
625

I have a test in the negative test cases file above where the displacement is large and outside the faulting page. I'll modify it to a very large value so that it is in a register and not an immediate.

anna updated this revision to Diff 296776.Oct 7 2020, 1:07 PM

addressed review comments, rebased and updated for AArch64 (this patch is NFC in behaviour for AArch64).

This revision was landed with ongoing or failed builds.Oct 7 2020, 5:55 PM
This revision was automatically updated to reflect the committed changes.

Hey JFYI this created some unused variables. I fixed with https://reviews.llvm.org/D89022