Page MenuHomePhabricator

[InstCombine] Disable insertRangeTest() optimization for BPF target

Authored by yonghong-song on Nov 17 2019, 8:43 PM.



The LLVM generated BPF byte codes need go through kernel
verifier before being allowed to execute in kernel.

Kernel verifier

performs path sensitive analysis to verify safety of the program.
The verification is done during bpf program loading time, and
typically right before the program starts to run.

Since verifier is executed in kernel space and it runs during program
loading time, there is a great effort to avoid introducing complexity
and running time overhead for it. Sometime, in order to add analysis
to verifier, user code hacking is conducted to workaround the issue.

Related to this patch, the following kernel patch is a workaround
for code generated by LLVM instcombine insertRangeTest().

   107:       w1 = w0
   108:       w1 += -1
   109:       if w1 > 6 goto -24 <LBB0_5>
   110:       w0 += w8

Basically since verifier does not record and propagate copy state
for performance and memory reasons. Register "w0" value range
will become conservative and later on may cause verification failure.

Another example is

People has to come up with weird ways to workaround this issue.
To improve user space usability, this patch proposed to disable
insertRangeTest() for bpf target. All other targets are not

Diff Detail

Event Timeline

yonghong-song created this revision.Nov 17 2019, 8:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 17 2019, 8:43 PM

I added a few reviewers based on the recent commits at InstCombine directory. Please let me know if I missed anybody who should review the patch. Thanks!

spatel requested changes to this revision.Nov 19 2019, 5:28 AM

I can't tell exactly what the motivation is from the description, but there are 2 major problems with this patch:

  1. Disabling a transform does not solve the problem completely. If the source was written in the form that you do not want, you would want the reverse transform to produce the desired IR/asm.
  2. By design, instcombine is a target-independent canonicalization pass. Adding the use of TTI defeats that goal.

There are (at least) 2 possible alternatives if you are determined to change this in the compiler rather than change the source code:

  1. Reverse the existing transform for all targets (this will be very difficult to justify if the transform that you are hoping to disable results in fewer IR instructions).
  2. Reverse the existing transform for BPF in a later pass (most likely candidates are DAGCombiner or CodeGenPrepare).
This revision now requires changes to proceed.Nov 19 2019, 5:28 AM

@spatel Thanks for the comments. I agree with your assessment. Looks like the best place is BPF backend to undo this optimization. I will look into implementation in that direction.

yonghong-song abandoned this revision.Nov 19 2019, 9:23 PM

Look into the implementation in BPF backend.

Look into the implementation in BPF backend.

I was going to suggest that, but the problem description read to me
as pretty hopeless situation with no guaranteed way out.
Hopefully i'm just misreading that.