
Add a new tool for parallel safe bisection, "llvm-bisectd".
Needs Revision · Public

Authored by aemerson on Nov 2 2021, 10:17 AM.

Details

Summary

Excerpt from Bisection.md document:

Introduction

The llvm-bisectd tool allows LLVM developers to rapidly bisect miscompiles in
clang or other tools running in parallel.

Bisection as a general debugging technique can be done in multiple ways. We can
bisect across the *time* dimension, which usually means that we're bisecting
commits made to LLVM. We could instead bisect across the dimension of the LLVM
codebase itself, disabling some optimizations and leaving others enabled, to
narrow down the configuration that reproduces the issue. We can also bisect in
the dimension of the target program being compiled, e.g. compiling some parts
with a known good configuration to narrow down the problematic location in the
program. The llvm-bisectd tool is intended to help with this last approach to
debugging: finding the place where a bug is introduced. It does so with the aim
of being minimally intrusive to the build system of the target program.

High level design

The bisection process with llvm-bisectd uses a client/server model, where all
the state about the bisection is maintained by the llvm-bisectd daemon. The
compilation tools (e.g. clang) send requests and get responses back telling
them what to do. As a developer, debugging using this methodology is intended
to be simple, with the daemon taking care of most of the complexity.
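The request/response flow above can be modeled in a few lines. The following is an illustrative in-process sketch, not the actual daemon (which communicates over sockets); all class and method names are hypothetical:

```python
# Hypothetical in-process model of the llvm-bisectd flow: all bisection state
# lives in the daemon, and compiler processes just ask what to do and obey.

class BisectDaemon:
    """Holds all bisection state; tracks decision points in discovery order."""

    def __init__(self):
        self.keys = []             # unique decision points, in discovery order
        self.enabled_limit = None  # None => enable everything (baseline run)

    def decide(self, key):
        """Answer a compiler's request: should the transform at `key` run?"""
        if key not in self.keys:
            self.keys.append(key)
        if self.enabled_limit is None:
            return True
        return self.keys.index(key) < self.enabled_limit


class CompilerProcess:
    """Stands in for one of many parallel clang invocations."""

    def __init__(self, daemon, tu):
        self.daemon, self.tu = daemon, tu

    def maybe_optimize(self, function):
        # The key combines TU and function so parallel compiles don't collide.
        return self.daemon.decide(f"{self.tu}:{function}")
```

A driver would then rerun the build with successively narrower `enabled_limit` values until the failing decision point is isolated.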

[End excerpt]

Diff Detail

Event Timeline

aemerson created this revision. Nov 2 2021, 10:17 AM
aemerson requested review of this revision. Nov 2 2021, 10:17 AM

D113031 contains an example of a client for this in GlobalISel.

paquette added inline comments. Nov 2 2021, 10:43 AM
llvm/docs/Bisection.md
41

In the GISel example patch, you disambiguate further using the target. Should you mention that here?

77

"into a vector" seems like an implementation detail

119

Should bisector support be only included in assert/debug builds?

Hi Amara,

I am kind of repeating what I said in https://reviews.llvm.org/D113031, but putting it here for better visibility.

I think the thing that drives the bisection shouldn't trickle down into the optimizations themselves; instead I would rather have this information encoded in the IR itself (like a generalization of optnone).

For instance, for us a daemon based approach doesn't work at all because we are running a JIT compiler that runs in its own sandbox and we cannot query an external process from it.

Internally, we developed a bisect tool that annotates the IR upfront (to make an analogy with clang, this is as if clang added a bunch of "bisect" attributes to the IR) before sending it for compilation, instead of having the backend query a bisection client.
Note: technically we didn't modify clang; we instead had a specific pass at the beginning of the LLVM pipeline that would make all the bisection decisions, but also perform some transformations to isolate the bugs (like creating functions out of basic blocks, splitting the blocks to make the functions smaller, and blocking inlining.)
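A minimal sketch of that annotate-upfront scheme (all names hypothetical; the real tool works on LLVM IR attributes, this just models the decision step):

```python
# Sketch: a step at the head of the pipeline bakes all bisection decisions
# into the IR as attributes, so the backend never queries an external process.

def annotate_module(functions, disable_plan):
    """Map each function to the attributes the current bisection plan wants:
    'optnone'/'noinline' for functions left unoptimized, none otherwise."""
    return {
        name: ["optnone", "noinline"] if name in disable_plan else []
        for name in functions
    }
```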

Cheers,
-Quentin

How does that work when you have parallel builds? When multiple clang processes are running simultaneously, and you want to bisect to a specific translation unit, and then within that TU to a specific point in the module, don't you need some co-ordination?

Let me just step back a little bit and say that now that I think about what we did, having something that answers "should I run in this instance" is desirable, the implementation doesn't really matter. We did it with function attributes, but having a bisect client API like you're introducing is fine.
My only complaint is that the client interface should not have remote in the name :P.

From an abstraction level, we need two things:

  1. Something that tells if an optimization needs to run (the remote bisect client here)
  2. Something that drives the on/off of the optimizations based on the previous state (here your daemon)

The way we did that was:
For #1 we added annotations in the IR
For #2 we implemented the driver directly in our JIT daemon

Essentially, that boils down to something that formulates a plan and something that executes the plan. At one point I was thinking that formulating the plan could be changing the pass pipeline (don't insert what you don't want to run), but that looked like too much work :).

How does that work when you have parallel builds?

Each module is assigned an ID and the bisect plan and previous state are mapped to this ID.
The ID is saved in the module metadata, but for now we didn't use it, since all we needed was added to the IR via annotations (i.e., we didn't need to come up with a key to ask for information about a specific pass). In that regard, your approach is more general.
For the ID, we used a hash of the module before the bisect annotations were added, i.e., as long as you don't change the front-end, the IDs are stable between runs.

To summarize:
Compute the module ID -> add annotation based on past information -> run the backend (at this point, the backend runs by itself.)
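That summary might be sketched as follows; the exact hashing scheme is an assumption (the text only says the ID is a hash of the pre-annotation module):

```python
# Sketch of the module-ID step: hash the IR *before* bisect annotations are
# added, so IDs are stable across runs as long as the front-end is unchanged.

import hashlib

def module_id(pre_annotation_ir):
    """Stable per-module key derived from the unannotated module text."""
    return hashlib.sha256(pre_annotation_ir.encode()).hexdigest()[:16]
```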

When multiple clang processes are running simultaneously, and you want to bisect to a specific translation unit, and then within that TU to a specific point in the module, don't you need some co-ordination?

At a high level here is what the driver was doing:

  • Bisect optnone on each module
  • Find the module(s) that create the problem (the minimal set of modules that needs optimizations turned on)
  • Do the same on each function (the minimal set may involve more than one function)
  • Try to "outline" the basic blocks of each problematic function and do the same process on the newly created functions
  • Split the problematic basic blocks to make them smaller and continue
  • When you're happy with the size of the basic blocks, start bisecting the optimizations on the problematic functions (possibly basic-block extracted). Right now we were only bisecting a handful of optimizations because the final diff with the basic block splitting usually made the faulty optimization easy to find by hand.
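One narrowing step from the outline above can be sketched as a classic prefix bisection; `is_bad` stands in for the user-supplied verdict script, and both names are hypothetical:

```python
# Sketch of one driver step: assuming a single culprit, find the item whose
# optimization triggers the bug. is_bad(enabled) is True when the build with
# only `enabled` optimized reproduces the failure (like a git-bisect script).

def bisect_one(items, is_bad):
    lo, hi = 0, len(items)
    # Invariant: enabling items[:lo] is good, enabling items[:hi] is bad.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(set(items[:mid])):
            hi = mid
        else:
            lo = mid
    return items[hi - 1]  # first item whose inclusion turns the build bad
```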

The way it worked is all that state was saved in a file <shaderID>-bisect-info. You could bootstrap the process by populating the file by hand, i.e., by telling the JIT process which module you want to bisect.

As far as bisecting to a specific point in the TU, we were always going all the way down to the executable, then you had to supply a script that tells whether or not the program is working. That's similar to what git bisect is doing (if the script returns 0, the program works; if it returns 1, it doesn't). In your script you could check for whatever you want (a specific sequence of asm, the executable producing some results, etc.)

Note: The pass I was talking about in my previous reply, which we insert in the LLVM pipeline, generates all the information to start the bisect process (e.g., the shader ID, the list of all the functions, the list of all basic blocks). The driver was then using this information to tell that pass to add some annotations on some functions (e.g., optnone, noinline, etc.), but also to split some basic blocks and outline them (and attach some annotations on them).

Cheers,
-Quentin

Ok I think I sort of understand your flow now. I agree that it doesn't sound like our approaches are really conflicting. The remote bisection client code could certainly be hidden behind a more generic interface, and for your approach we could select an implementation that just queries the function attributes instead. For the bisection co-ordination with the files I'm not sure how that impacts this tooling, if it all. One reason I went for a daemon was that for some build systems, persistent files across builds are difficult to keep due to build sandboxing (sockets themselves need some workarounds to work with sandboxes).

I'll see what kind of abstraction I can come up with for the client in this patch, but it won't be tested since your tooling isn't upstream. I'm also guessing that you'd want to avoid using string keys for the function-attribute implementation?

I would like to note that there's some rudimentary support
for a previous generation of this scattered throughout the codebase,
namely DebugCounters for bugpoint.

I don't really have an opinion on the proposal at large,
but I think it may be important to not just introduce yet another variant
of dealing with the same issue, but only have a single good modern way.

Yes, I've seen DebugCounter. It's useful when you've already identified the file where something is going wrong, and you can enable the counters and pass counter values to opt. It doesn't, however, work across multiple TUs or support parallel debugging, so it's not solving the same problem.

I'm also guessing that you'd want to avoid using string keys for the function-attribute implementation?

Yeah, the string keys are not ideal since it's difficult to know what we're dealing with at that point. If I were to switch to this API, I would need to know the module ID (that we insert beforehand), the function, and the instance of the optimization (e.g., is it the first invocation of InstCombine, the second, etc.) to read the right annotation.
That may make it difficult to have something general here.

Perhaps we could give the bisect client the context of what is being optimized (e.g., the module for module passes, the module and the function for function passes (or just the function since the module can be found from the function), the module, the function and the loop for loop passes, etc.) and where in the pipeline we are (e.g., the pass ID) and hide the formation of a string key within the bisect client API.

The bonus side effect, is that the bisect client could handle the multi instances of passes by itself (e.g., assuming the module as a unique ID set just once, we could use this ID and the passID to know which instance of the pass is running: the bisect client could keep a count of how many times it saw that pair.)
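That instance-counting idea might look like this sketch (all names hypothetical):

```python
# Sketch: the bisect client derives which invocation of a pass it is seeing
# by counting (module ID, pass ID) pairs itself, so no string key needs to
# encode the instance number.

from collections import Counter

class BisectClient:
    def __init__(self):
        self.seen = Counter()

    def key_for(self, module_id, pass_id):
        """Return a unique (module, pass, instance) key for this query."""
        instance = self.seen[(module_id, pass_id)]
        self.seen[(module_id, pass_id)] += 1
        return (module_id, pass_id, instance)
```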

Note: For context, for us the path of a file is not a good discriminant because we compile directly in memory (the front-end generates the IR in memory and doesn't write it to a file), so all modules have the same (empty) path. Therefore, in that context, the string keys as shown in the GISel pass (https://reviews.llvm.org/D113031) would just yield the same key over and over across different modules for all functions with the same name (like main).

Cheers,
-Quentin

Perhaps it'd be worth considering the overlap here, and maybe avoiding it: What would it be like if this new thing /only/ bisected at the granularity of a whole compilation action (ie: one invocation of clang), rather than specific optimizations? (possibly even only at the "whole compiler" granularity - using two compilers - and choosing between one or the other for each compilation)

Then once the specific file has been identified, use the existing sub-file granularity tools (a wrapper script could help cover both of these for the user so it wasn't a bunch more manual work).

I think that might help prevent overlap of functionality/re-implementing similar/the same functionality through different mechanisms for reducing specific optimization applications?
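The wrapper idea could be sketched as follows; this is an illustration of the suggestion, not anything in the patch, and all compiler names are made up:

```python
# Sketch: a compiler-agnostic wrapper keyed on the output file that routes
# each compilation action to a known-good or suspect compiler, with no
# changes to the compilers themselves.

def wrap(argv, suspect_outputs, good="clang-good", bad="clang-suspect"):
    """Pick a compiler for this action based on its -o argument."""
    out = argv[argv.index("-o") + 1] if "-o" in argv else None
    compiler = bad if out in suspect_outputs else good
    return [compiler] + argv
```

A bisection driver would shrink `suspect_outputs` step by step until the fewest actions needing the bad compiler are found.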

(also: probably an add-on for the future: Have you considered not requiring a clean build for each bisection step? Would it be feasible to have the bisect tool delete (or produce instructions the user can copy/paste, if preferred) specific output files it knows it's going to change - then letting the user rerun an incremental build that will cause those files to be regenerated and allowing the bisect tool to intercept their building to adjust how they're built - it seems like this only requires a reliable build system (so it could be optional, so it could be turned off in cases where a build system doesn't reliably check for existing outputs) and could significantly improve performance of such a tool?)

I think that restricting it to only allow bisection across translation units would be artificially constraining the functionality to not step on the toes of other features. That in itself doesn't seem the right approach, because the simplest thing right now is to support arbitrary bisection granularities. If we constrained it to only allow bisection down to TUs, then we haven't really simplified or re-used any code, all we've done is to make the user experience worse.

I'll re-iterate that the purpose of using a daemon and not the existing file-level granularity tools like DebugCounter is to support parallelism and minimal intrusion into the build system. That's the guiding principle behind this patch. If other tools use file-granularity bisection using simple counters, then they should stick with those features, they're solving different problems.

(also: probably an add-on for the future: Have you considered not requiring a clean build for each bisection step? Would it be feasible to have the bisect tool delete (or produce instructions the user can copy/paste, if preferred) specific output files it knows it's going to change - then letting the user rerun an incremental build that will cause those files to be regenerated and allowing the bisect tool to intercept their building to adjust how they're built - it seems like this only requires a reliable build system (so it could be optional, so it could be turned off in cases where a build system doesn't reliably check for existing outputs) and could significantly improve performance of such a tool?)

This is a really nice idea that I hadn't considered. If we can improve the API so that bisection daemon knows the semantics of the keys, then it should in theory be simple to have the daemon touch those translation unit sources to trigger the build system to rebuild. Deleting the object files seems a bit harder, since we'd have to propagate the argument to -o through to the bisection point, whereas the TU input is available from the IR module.
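The touch-to-rebuild idea might be sketched as follows (assuming a build system that rebuilds a TU whenever its source mtime is newer than the object file; the function name is hypothetical):

```python
# Sketch: bump mtimes on the TUs whose bisection decisions changed so an
# incremental build regenerates only their objects, instead of deleting
# output files (which would require knowing each action's -o argument).

import os

def retrigger_rebuild(sources):
    """Set atime/mtime to 'now' on each changed translation unit source."""
    for path in sources:
        os.utime(path, None)
```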

Part of the direction is that it could influence/change the design of the feature, possibly to simplify it. For instance, this could be implemented as a compiler wrapper (it could then even be compiler agnostic, maybe even tool agnostic/reusable for other tooling investigations), for instance doing an A/B compiler comparison: the wrapper could do the network query to check the key (keys would be either input or output files, probably output files, since input files could be ambiguous: the same input file might be rebuilt into multiple output targets with different arguments) and then run either one compiler or the other, or otherwise massage the compiler arguments (disable an optimization, etc).

I'd worry the input could be ambiguous (same file built with different -D flags, admittedly not /really/ common), but the output is /probably/ less ambiguous (the build system's unlikely to rebuild/replace a given output).

Tyker added a subscriber: Tyker. Nov 26 2021, 2:00 PM
compnerd requested changes to this revision. Dec 4 2021, 11:54 AM
compnerd added inline comments.
llvm/lib/Support/RemoteBisectorClient.cpp
23

This doesn't cover the non-Unix path, where you need to add an include for WinSock2.h, Windows.h, and possibly WinSock.h.

37

Can you please introduce some wrappers for all the BSD socket functions? On Windows, GetAddrInfoW would be preferable over getaddrinfo. Additionally, where do you initialize the sockets library? (Yes, that is not a thing on Linux/macOS, but on Windows, before you can use any socket function, you need to invoke WSAStartup.) This needs a proper hook point.

llvm/tools/llvm-bisectd/CMakeLists.txt
20

Please add a case for Windows, where you need to link against Ws2_32.

llvm/tools/llvm-bisectd/llvm-bisectd.cpp
119

I don't know if there is a struct linger available on Windows; it may need to be spelt LINGER.

187

GetAddrInfoW should be preferred over getaddrinfo on Windows IIRC

This revision now requires changes to proceed. Dec 4 2021, 11:54 AM
arsenm added inline comments. Dec 4 2021, 12:24 PM
llvm/lib/Support/Bisector.cpp
51–53

Merge these into one LLVM_DEBUG() block?

llvm/lib/Support/RemoteBisectorClient.cpp
40–42

Can you directly use Twine to produce the full error message for report_fatal_error?

59–62

Ditto

65–67

Ditto

ychen added a subscriber: ychen. Dec 4 2021, 2:01 PM

Thanks for the feedback folks. To be honest I don't have the time right now to discuss and redesign the whole thing with David (have some parental leave coming up as well). If anyone else wants to pick this up and continue it feel free to do so, I published the patches to help other people with their debugging problems, but unless someone else picks this up and works to reach a consensus on design, it will have to lie unmaintained as a patch for Q1/Q2 next year at least.

How does this differ from opt-bisect?

Herald added a project: Restricted Project. Nov 16 2022, 3:36 PM

It's cross-process/whole-build. Imagine opt-bisect, but the action tracking is for a whole build. My hope was that maybe we could do something more coarse-grained on the build level (without the need for the compiler itself to be communicating with a service) in a wrapper - eg: good/bad compiler (or flag set) and each action gets one or the other, bisect down to the fewest actions that need the bad compiler, then use opt-bisect within a single action, holding others constant, etc. So we didn't have two different ways of doing fine-grained bisection.

Right, there are two schools of thought on this. I think they're both valuable: the coarse-grained wrapper approach for when you don't want to modify the compiler and just want to find a failing object file, and this one, which relies on existing clients coded into the compiler.

I've been using an internal version of this tool over the past year, and it's proven invaluable in rapidly bisecting miscompiles. It was also useful for finding a miscompile in the entirety of Chromium, but it could do with improvements to avoid having to do an entire rebuild on each bisection step. Maybe there's a more holistic, higher-level design that would subsume other bisection techniques, but I don't see the effort being worth it myself.

Matt added a subscriber: Matt. Tue, Jan 17, 11:01 AM