diff --git a/llvm/docs/Bisection.md b/llvm/docs/Bisection.md new file mode 100644 --- /dev/null +++ b/llvm/docs/Bisection.md @@ -0,0 +1,129 @@ +# Bisection with llvm-bisectd + +## Introduction + +The `llvm-bisectd` tool allows LLVM developers to rapidly bisect miscompiles in +clang or other tools running in parallel. This document explains how the tool +works and how to leverage it for bisecting your own specific issues. + +Bisection as a general debugging technique can be done in multiple ways. We can +bisect across the *time* dimension, which usually means that we're bisecting +commits made to LLVM. We could instead bisect across the dimension of the LLVM +codebase itself, disabling some optimizations and leaving others enabled, to +narrow down the configuration that reproduces the issue. We can also bisect in +the dimension of the target program being compiled, e.g. compiling some parts +with a known good configuration to narrow down the problematic location in the +program. The `llvm-bisectd` tool is intended to help with this last approach to +debugging: finding the place where a bug is introduced. It does so with the aim +of being minimally intrusive to the build system of the target program. + +## High level design + +The bisection process with `llvm-bisectd` uses a client/server model, where all +the state about the bisection is maintained by the `llvm-bisectd` daemon. The +compilation tools (e.g. clang) send requests and get responses back telling +them what to do. As a developer, debugging using this methodology is intended +to be simple, with the daemon taking care of most of the complexity. + +### Bisection keys + +This process relies on a user-defined key that's used to represent a particular +action being done at a unique place in the target program's build. The key is a +string to allow the most flexibility of data representation. `llvm-bisectd` +doesn't care what the meaning of the key is, as long as has the following +properties: + 1. The key maps onto a specific place in the source program in a stable manner. + Even if the software is being built with multiple compilers running + concurrently, the key should not be affected. + 2. Between one build of the target software and the next (clean) build, the + same set of keys should be generated exactly. + +For our example of bisecting a novel optimization pass, a good choice of key +would be the module + function name of the target program being compiled. The +function name meets requirement 1. because each module + function string refers +to a unique place in the target program. (A module may not have two functions +with the same symbol name). The inclusion of the module name in the key helps +to disambiguate two local linkage functions with the same name in two different +translation units. The key also satisfies requirement 2. because the function +names are static between one build and the next (e.g. no random auto-generation +going on). + +## Bisection workflow + +The bisection process has two stages. The first is called the *learning* stage, +and the second is the main *bisection* stage. The purpose of the learning +stage is for the bisection daemon to *learn* about all the keys that will be +bisected through during each bisection round. + +The first thing that needs to be done is that `llvm-bisectd` needs to be +started as a daemon. + +```console +$ llvm-bisectd +bisectd > _ +``` + +On start, `llvm-bisectd` initializes into the learning phase, so nothing else +needs to be done. + +Then, the software project being debugged is built with the client tools like +clang having the bisection mode enabled. This can be a compiler flag or some +other mechanism. For example, to bisect GlobalISel across target functions, +we can pass `-mllvm -gisel-bisect-selection` to clang. + +During the first build of the project, the client tools are sending a bisection +request to `llvm-bisectd` for each key. `llvm-bisectd` in the learning phase +just replies to the clients with the answer "YES". In the background, it's +storing each unique key it receives into a vector for later. + +### Bisection phase + +After the first build is done, the learning phase is over, and `llvm-bisectd` +should know about all the keys that will be requested in future builds. +We can start the bisection phase now by using the `start-bisection` command in +the `llvm-bisectd` command interpreter. + +``` +bisectd > start-bisect +Starting bisection with 17306 total keys to search +bisectd > _ +``` + +We're now in the bisection phase. Now, we perform the following actions in a +repeatedly until `llvm-bisectd` terminates with an answer. + 1. Do a clean build of the project (with the bisection flags as before) + 2. Test the resulting build to see if it still exhibits the bug. + 3. If the bug remains, then we type the command `bad` into the `llvm-bisectd` + interpreter. If the bug has disappeared, we type the `good` command instead. + +And that's it! Eventually the bisection will finish and `llvm-bisectd` will +print the *key* that, when enabled, triggers the bug. + +``` console +Bisection completed. Failing key was: /work/testing/llvm-test-suite/CTMark/tramp3d-v4/tramp3d-v4.cpp _ZN17MultiArgEvaluatorI16MainEvaluatorTagE13createIterateI9MultiArg3I5FieldI22UniformRectilinearMeshI10MeshTraitsILi3Ed21UniformRectilinearTag12CartesianTagLi3EEEd10BrickViewUESC_SC_EN4Adv51Z13MomentumfluxZILi3EEELi3E15EvaluateLocLoopISH_Li3EEEEvRKT_RKT0_RK8IntervalIXT1_EER14ScalarCodeInfoIXT1_EXsrSK_4sizeEERKT2_ +Exiting... +``` + +## Adding bisection support in clients + +Adding support for bisecting a new type of action is simple. The client only +needs to generate a key at the point where bisection is needed, and then use +client utilities in `lib/Support/RemoteBisectorClient.cpp` to talk to the +daemon. For example, if the bisection is to done for a `FunctionPass` +optimization, then one place to add the code would be to the `runOnFunction()` +method, using the function name as a key. + +```C++ +bool runOnFunction(Function &F) { + // ... + if (EnableBisectForNewOptimization) { + std::string Key = F.getParent()->getSourceFileName() + " " + + F.getName().str(); + RemoteBisectClient BisectClient; + if (!BisectClient.shouldPerformAction(Key)) + return false; // Bisector daemon told us to skip this action. + } + // Continue with the optimization + // ... +} +``` diff --git a/llvm/include/llvm/Support/Bisector.h b/llvm/include/llvm/Support/Bisector.h new file mode 100644 --- /dev/null +++ b/llvm/include/llvm/Support/Bisector.h @@ -0,0 +1,96 @@ +//===- Bisector.h - Bisector implementation for llvm-bisectd ----*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// A generic bisector implementation that bisects across user-defined keys. +// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_SUPPORT_BISECTOR_H +#define LLVM_SUPPORT_BISECTOR_H + +#include "llvm/ADT/MapVector.h" +#include "llvm/ADT/Optional.h" +#include "llvm/Support/ErrorHandling.h" +#include "llvm/Support/raw_ostream.h" +#include +#include +namespace llvm { + +/// Main class to manage a bisection process. +template class Bisector { +public: + Bisector() = default; + Bisector(const Bisector &) = delete; + + void resetAndStartLearning() { + KeyStateMap.clear(); + BisectHistory.clear(); + LearningMode = true; + } + + /// End the learning and start the bisection process. + void startBisect(); + + /// In learning mode, add the key \p K to the database of expected future + /// keys in a bisection run. + void learnKey(const KeyT &K) { + assert(LearningMode && "Bisector should be in learning mode!"); + KeyStateMap.insert({K, true}); + } + + /// If during bisection the given key \p K should have the bisection action + /// done, then return true. + bool shouldPerformActionOnKey(KeyT K) { + assert(!LearningMode && "Should not be in learning mode for queries!"); + return KeyStateMap.lookup(K); + } + + /// Finish the current bisection round, with a pass/fail status. If finishing + /// this round concludes the bisection process, then the key that causes the + /// failure is returned. Otherwise, returns None. + Optional finishBisectionRound(bool Passed); + + unsigned getNumKeys() const { return KeyStateMap.size(); } + KeyT &getCurrentCounterKey() { + assert(isBisecting() && "Should be bisecting!"); + assert(!BisectHistory.empty() && "Expected a current key"); + auto It = KeyStateMap.begin() + (BisectHistory.back() - 1); + return It->first; + } + int getLastFailCounter() const { return LastFailCounter; } + int getLastPassCounter() const { return LastPassCounter; } + + /// \return true if in bisecting mode, not learning mode. + bool isBisecting() const { return !LearningMode; } + +private: + /// Map between the keys and a bool representing whether or not that key + /// should have a bisected action done for it. It represents the choices for + /// the keys at a particular point in the bisection process. + MapVector> KeyStateMap; + + /// A vector with all of the counters used in the bisection process. + /// A counter is index which marks the last bisection action to be done, + /// before the rest are skipped. + SmallVector BisectHistory; + /// The counter value of the last known failing round. + int LastFailCounter; + /// Same for passing round. + int LastPassCounter; + + /// The mode of the bisector. We should not be having any queries done in + /// learning mode. Inversely, we should not be learning any keys in bisect + /// mode. + bool LearningMode = true; + + void updateMapForNewCounter(int NewCounter); +}; + +} // end namespace llvm + +#endif // LLVM_SUPPORT_BISECTOR_H diff --git a/llvm/include/llvm/Support/RemoteBisectorClient.h b/llvm/include/llvm/Support/RemoteBisectorClient.h new file mode 100644 --- /dev/null +++ b/llvm/include/llvm/Support/RemoteBisectorClient.h @@ -0,0 +1,46 @@ +//===- RemoteBisectorClient.h - Client remote bisection ---------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file contains a remote bisector client that allows tools to connect to +// a bisector service, send queries and get responses back. It's intended to be +// used in conjunction with llvm-bisectd. +// +//===----------------------------------------------------------------------===// + +#ifndef LLVM_SUPPORT_REMOTEBISECTORCLIENT_H +#define LLVM_SUPPORT_REMOTEBISECTORCLIENT_H + +#include "llvm/ADT/Optional.h" +#include +namespace llvm { + +class RemoteBisectClient { +public: + /// Construct a client to connect to localhost:7777. + RemoteBisectClient() : RemoteBisectClient("", 7777) {} + RemoteBisectClient(std::string Hostname, unsigned PortNumber) + : Hostname(Hostname), PortNumber(PortNumber) {} + + /// Connect to the bisection service and send a bisection with with key \p + /// Key. \returns true if the bisection action should be performed. + /// If there was any problem with communicating with the service, this method + /// calls report_fatal_error(), as the entire bisection process will be + /// compromised anyway. + bool shouldPerformBisectAction(std::string &Key); + +private: + /// Set up the socket connection to the server and return the socket descriptor. + int setupConnection(); + std::string Hostname; + unsigned PortNumber; +}; + + +} // end namespace llvm + +#endif // LLVM_SUPPORT_REMOTEBISECTORCLIENT_H diff --git a/llvm/lib/Support/Bisector.cpp b/llvm/lib/Support/Bisector.cpp new file mode 100644 --- /dev/null +++ b/llvm/lib/Support/Bisector.cpp @@ -0,0 +1,76 @@ +//===---- Bisector.cpp - Bisector implementation -------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "llvm/Support/Bisector.h" +#include "llvm/Support/Debug.h" + +#define DEBUG_TYPE "bisector" + +namespace llvm { + +template +Optional Bisector::finishBisectionRound(bool Passed) { + assert(!LearningMode && "Should not be in learning mode!"); + // If the current status is Passed, then we bump the counter to the midpoint + // between the current one and the last known fail counter. + // If the current status is Fail, then we halve the current counter. + + int CurrentCounter = BisectHistory.back(); + assert( + !(Passed && static_cast(CurrentCounter) == KeyStateMap.size())); + int NewCounter; + if (Passed) + LastPassCounter = CurrentCounter; + else + LastFailCounter = CurrentCounter; + + if (LastPassCounter + 1 == LastFailCounter) { + // We know the first failing counter. The counters are 1 based. + auto It = KeyStateMap.begin() + (LastFailCounter - 1); + return It->first; + } + + // There are more configs to try. Pick the next midpoint. + NewCounter = (LastPassCounter + LastFailCounter) / 2; + BisectHistory.push_back(NewCounter); + updateMapForNewCounter(NewCounter); + + return None; +} + +template void Bisector::startBisect() { + LearningMode = false; + assert(KeyStateMap.size() > 1 && "Bisection started with a single key!"); + + unsigned Counter = KeyStateMap.size() / 2; + LLVM_DEBUG(dbgs() << "Starting bisect with a key map of size " + << KeyStateMap.size() << "\n"); + LLVM_DEBUG(dbgs() << "Initial bisect index = " << Counter << "\n"); + BisectHistory.push_back(Counter); + updateMapForNewCounter(Counter); + LastPassCounter = 0; + LastFailCounter = KeyStateMap.size(); +} + +template +void Bisector::updateMapForNewCounter(int NewCounter) { + assert(!LearningMode && "Should be in bisection mode!"); + assert(NewCounter < static_cast(KeyStateMap.size()) && + "Invalid new counter!"); + + // Update the state map to reflect the new counter value. + for (int64_t Idx = 0, E = KeyStateMap.size(); Idx < E; ++Idx) { + auto It = KeyStateMap.begin() + Idx; + assert(It != KeyStateMap.end()); + It->second = Idx < NewCounter; + } +} + +template class Bisector; + +} // namespace llvm diff --git a/llvm/lib/Support/CMakeLists.txt b/llvm/lib/Support/CMakeLists.txt --- a/llvm/lib/Support/CMakeLists.txt +++ b/llvm/lib/Support/CMakeLists.txt @@ -119,6 +119,7 @@ BinaryStreamReader.cpp BinaryStreamRef.cpp BinaryStreamWriter.cpp + Bisector.cpp BlockFrequency.cpp BranchProbability.cpp BuryPointer.cpp @@ -183,6 +184,7 @@ PrettyStackTrace.cpp RandomNumberGenerator.cpp Regex.cpp + RemoteBisectorClient.cpp RISCVAttributes.cpp RISCVAttributeParser.cpp RISCVISAInfo.cpp diff --git a/llvm/lib/Support/RemoteBisectorClient.cpp b/llvm/lib/Support/RemoteBisectorClient.cpp new file mode 100644 --- /dev/null +++ b/llvm/lib/Support/RemoteBisectorClient.cpp @@ -0,0 +1,100 @@ +//===---- RemoteBisectorClient.cpp - Client remote bisection --------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// + +#include "llvm/Support/RemoteBisectorClient.h" +#include "llvm/Config/llvm-config.h" +#include "llvm/Support/Debug.h" +#include "llvm/Support/Endian.h" +#include "llvm/Support/ErrorHandling.h" +#include "llvm/Support/raw_ostream.h" +#include +#include + +#ifdef LLVM_ON_UNIX +#include +#include +#include +#include +#endif // LLVM_ON_UNIX + +#define DEBUG_TYPE "bisector" + +namespace llvm { + +int RemoteBisectClient::setupConnection() { + struct addrinfo hints; + struct addrinfo *servinfo; + + memset(&hints, 0, sizeof(hints)); + hints.ai_family = AF_UNSPEC; + hints.ai_socktype = SOCK_STREAM; + hints.ai_flags = AI_PASSIVE; + int Stat = ::getaddrinfo(NULL, std::to_string(PortNumber).c_str(), &hints, + &servinfo); + if (Stat) { + errs() << "RemoteBisectClient: getaddrinfo() failed: " + << std::strerror(errno) << "\n"; + report_fatal_error("Fatal error."); + } + + int Socket = ::socket(servinfo->ai_family, servinfo->ai_socktype, + servinfo->ai_protocol); + int yes = 1; + if (setsockopt(Socket, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes)) + report_fatal_error("RemoteBisectClient: setsockopt() failed"); + + Stat = connect(Socket, servinfo->ai_addr, servinfo->ai_addrlen); + if (Stat == -1) { + if (errno == EADDRINUSE) { + // If we run into address already in use, try waiting a short time before + // retrying. + sleep(1); + Stat = connect(Socket, servinfo->ai_addr, servinfo->ai_addrlen); + if (Stat == -1) { + errs() << "RemoteBisectClient: could not bind() to the socket after " + "waiting: " + << std::strerror(errno) << "\n"; + report_fatal_error("Fatal error."); + } + } + errs() << "RemoteBisectClient: could not bind() to the socket: " + << std::strerror(errno) << "\n"; + report_fatal_error("Fatal error."); + } + return Socket; +} + +bool RemoteBisectClient::shouldPerformBisectAction(std::string &Key) { + int Socket = setupConnection(); + SmallVector Sendbuf = {0, 0, 0, 0, 0, 0}; + if (Key.empty()) + report_fatal_error("Empty key given to bisection query!"); + + uint32_t Len = Key.size(); + Len += 2; + support::endian::write32le(Sendbuf.data(), Len); + Sendbuf[4] = 'Q'; + Sendbuf[5] = ' '; + Sendbuf.insert(Sendbuf.end(), Key.begin(), Key.end()); + + int BytesSent = ::send(Socket, Sendbuf.data(), Sendbuf.size(), 0); + if (static_cast(BytesSent) != Sendbuf.size()) + report_fatal_error("RemoteBisectClient: couldn't send query"); + + const int RECVBUF_LEN = 1; + char Recvbuf[RECVBUF_LEN]; + int BytesRecv = ::recv(Socket, Recvbuf, RECVBUF_LEN, 0); + if (BytesRecv != 1) + report_fatal_error( + "RemoteBisectClient: didn't receive response from bisect service"); + + close(Socket); + return Recvbuf[0]; +} + +} // namespace llvm diff --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt --- a/llvm/test/CMakeLists.txt +++ b/llvm/test/CMakeLists.txt @@ -62,6 +62,7 @@ llvm-ar llvm-as llvm-bcanalyzer + llvm-bisectd llvm-bitcode-strip llvm-c-test llvm-cat diff --git a/llvm/tools/llvm-bisectd/CMakeLists.txt b/llvm/tools/llvm-bisectd/CMakeLists.txt new file mode 100644 --- /dev/null +++ b/llvm/tools/llvm-bisectd/CMakeLists.txt @@ -0,0 +1,20 @@ +set(LLVM_LINK_COMPONENTS + Core + LineEditor + Support + ) + +add_llvm_tool(llvm-bisectd + llvm-bisectd.cpp + + DEPENDS + intrinsics_gen + ) + +if(${CMAKE_SYSTEM_NAME} MATCHES "Haiku") + target_link_libraries(llvm-bisectd PRIVATE network) +endif() + +if(${CMAKE_SYSTEM_NAME} MATCHES "SunOS") + target_link_libraries(llvm-bisectd PRIVATE socket) +endif() diff --git a/llvm/tools/llvm-bisectd/llvm-bisectd.cpp b/llvm/tools/llvm-bisectd/llvm-bisectd.cpp new file mode 100644 --- /dev/null +++ b/llvm/tools/llvm-bisectd/llvm-bisectd.cpp @@ -0,0 +1,345 @@ +//===- llvm-bisectd.cpp - LLVM function extraction utility ----------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This tool starts a daemon that co-ordinates bisection of actions of +// compilation clients like clang, or other tools, which may be running in +// parallel. +// +// The idea is that this daemon service accepts connections from the client +// tool to answer requests. In the first part of bisection, llvm-bisectd is put +// into "learning" mode. In this mode, clients request to do a bisected action +// of a given key, which represents a particular action at a point in the code +// being compiled. The daemon always answers with "yes", because in learning +// mode the goal is not to change the behaviour of the clients, but simply to +// collect all the different keys that will be checked in the actual bisection +// rounds. Once the clients are finished building, llvm-bisectd is put into +// "bisection" mode. +// +// In bisection mode, the build is then performed again but this time +// llvm-bisectd will be answering "yes" or "no" to the queries, depending on +// whether the key is on the left or right of the bisection midpoint. +// +// By doing this with a daemon service, with an appopriately chosen key, the +// client tools can be run in parallel easily. +// +// For more detailed documentation, see docs/Bisection.md +// +// Client request payload format: +// * Bytes 0-3: a little-endian 32 bit integer, representing the number of +// bytes that will be sent after this. +// * Byte 4: The ASCII character 'Q'. +// * Byte 5: The ASCII space character ' '. +// * The rest of the byte byte stream is the key being sent. +// If the length specified in the prefix and the length of "Q " + key does +// not match, then llvm-bisectd will exit with an error. +// +// +// In future, it would be good to have the following additional features: +// * Have the process become a full daemon that runs in the background. +// Use subsequent calls to llvm-bisectd to send commands to the daemon. +// This would allow this tool to be used in automation systems. +// * Save the bisection state to a file to allow loading & resuming. +// * Undo a bisect good/bad command to go back to a previous state. +// +//===----------------------------------------------------------------------===// + +#include "llvm/ADT/StringExtras.h" +#include "llvm/IR/LLVMContext.h" +#include "llvm/LineEditor/LineEditor.h" +#include "llvm/Support/Bisector.h" +#include "llvm/Support/CommandLine.h" +#include "llvm/Support/Endian.h" +#include "llvm/Support/Error.h" +#include "llvm/Support/ErrorHandling.h" +#include "llvm/Support/InitLLVM.h" +#include "llvm/Support/Mutex.h" +#include "llvm/Support/ScopedPrinter.h" +#include "llvm/Support/raw_ostream.h" +#include "llvm/Support/thread.h" +#include + +#ifdef LLVM_ON_UNIX +#include +#include +#include +#include +#endif // LLVM_ON_UNIX + +#define DEBUG_TYPE "llvm-bisectd" + +using namespace llvm; + +llvm::sys::Mutex Lock; + +cl::OptionCategory BisectCat("llvm-extract Options"); +cl::opt ListenPort("listen-port", + cl::desc("Port to listen for connections"), + cl::init("7777"), cl::cat(BisectCat)); + +const size_t RECV_BUFLEN = 2048; +/// The maximum message prefix. Clients shouldn't need to send huge messages +/// anyway. +const uint32_t MAX_PREFIX_LEN = 4096; + +/// Setup the initial socket for accepting connections and return it. +int setupConnection(); +/// Return a single byte 0 or 1, as a response to the query from a client. +char getResponse(Bisector &BisectService, + std::vector &Payload); + +/// Starts the socket to listen for connections from compiler clients. +void startService(Bisector *BisectService) { + LLVM_DEBUG(dbgs() << "llvm-bisectd listening for connections on localhost:" + << ListenPort << "\n"); + int MainSocket = setupConnection(); + + // Now we've successfully set up a listening socket. Wait for connections. + char recvbuf[RECV_BUFLEN]; + + // The payload data without the prefix. + std::vector Payload; + + while (true) { + // Accept new connections (blocking). + struct sockaddr_storage their_addr; + socklen_t addr_size; + int NewSocket = + accept(MainSocket, (struct sockaddr *)&their_addr, &addr_size); + if (NewSocket == -1) { + errs() << "Error: could not accept() connections at socket\n"; + errs() << "errno = " << std::strerror(errno) << "\n"; + std::exit(1); + } + // Linger on close. + struct linger Linger; + Linger.l_onoff = 1; + Linger.l_linger = 2; // Linger for 2 seconds. + if (setsockopt(NewSocket, SOL_SOCKET, SO_LINGER, &Linger, sizeof(Linger))) { + errs() << "Error: setsockopt() for SO_LINGERfailed\n"; + std::exit(1); + } + + // Do an initial read. + int BytesRead = ::recv(NewSocket, recvbuf, RECV_BUFLEN, 0); + if (!BytesRead) { + // Connection was closed by client. + close(NewSocket); + } + // Our protocol has a 4 byte little-endian length prefix, so minimum + // message size is 5 bytes. If we got less than something bad probably + // happened. + if (BytesRead < 5) { + errs() << "Received data was too small!\n"; + std::exit(1); + } + + uint32_t PrefixLen = support::endian::read32le(recvbuf); + if (PrefixLen > MAX_PREFIX_LEN) { + errs() << "Client sent a prefix length larger than the max of " + << MAX_PREFIX_LEN << ": " << PrefixLen << "\n"; + std::exit(1); + } + + for (int I = 4; I < BytesRead; ++I) + Payload.push_back(recvbuf[I]); + assert(static_cast(Payload.size()) == BytesRead - 4); + // Loop recv() until we read the full message. + while (Payload.size() < PrefixLen) { + BytesRead = ::recv(NewSocket, recvbuf, RECV_BUFLEN, 0); + if (!BytesRead) { + // Connection was closed by client. + errs() << "Client closed connection unexpectedly.\n"; + std::exit(1); + } + for (int I = 0; I < BytesRead; ++I) + Payload.push_back(recvbuf[I]); + } + if (Payload.size() != PrefixLen) { + errs() << "Payload received did not match prefix length!\n"; + std::exit(1); + } + + char Response = getResponse(*BisectService, Payload); + char sendbuf[1] = {Response}; + int BytesSent = ::send(NewSocket, sendbuf, 1, 0); + if (BytesSent != 1) { + errs() << "Error sending response to client!\n"; + std::exit(1); + } + Payload.clear(); + close(NewSocket); + } +} + +int setupConnection() { + struct addrinfo hints; + struct addrinfo *servinfo; + + memset(&hints, 0, sizeof hints); + hints.ai_family = AF_UNSPEC; + hints.ai_socktype = SOCK_STREAM; + hints.ai_flags = AI_PASSIVE; + int Stat = ::getaddrinfo(NULL, ListenPort.c_str(), &hints, &servinfo); + if (Stat) { + errs() << "Error: could not set up connection on port " << ListenPort + << "\n"; + errs() << "getaddrinfo() failed\n"; + std::exit(1); + } + + int Socket = ::socket(servinfo->ai_family, servinfo->ai_socktype, + servinfo->ai_protocol); + int yes = 1; + if (setsockopt(Socket, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes))) { + errs() << "Error: setsockopt() failed\n"; + std::exit(1); + } + + Stat = bind(Socket, servinfo->ai_addr, servinfo->ai_addrlen); + if (Stat) { + errs() << "Error: could not bind() to the socket\n"; + errs() << "errno = " << errno << "\n"; + std::exit(1); + } + + Stat = ::listen(Socket, /* backlog */ 128); + if (Stat) { + errs() << "Error: could not listen() to the socket\n"; + errs() << "errno = " << std::strerror(errno) << "\n"; + std::exit(1); + } + ::freeaddrinfo(servinfo); + return Socket; +} + +char getResponse(Bisector &BisectService, + std::vector &Payload) { + // Parse the payload. We expect the payload to be just a "Q", a space, and + // then the string representing the key. + if (Payload.size() < 3) { + errs() << "Invalid payload. Too short.\n"; + std::exit(1); + } + if (Payload[0] != 'Q' || Payload[1] != ' ') { + errs() << "Invalid payload. First two characters should be {'Q', ' '}\n"; + std::exit(1); + } + std::string Key(Payload.data() + 2, Payload.size() - 2); + { + sys::ScopedLock ScopeLock(Lock); + if (BisectService.isBisecting()) + return BisectService.shouldPerformActionOnKey(Key) ? 1 : 0; + BisectService.learnKey(Key); + return 1; // Learning mode always returns 1; + } + + Payload.insert(Payload.end(), Key.begin(), Key.end()); +} + +void printHelp() { + outs() << "Available commands: \n"; + outs() << "\texit - exit llvm-bisectd\n"; + outs() << "\treset - reset state and start learning mode\n"; + outs() << "\tstart-bisect - end learning mode and start bisection\n"; + outs() << "\tgood - if in bisect mode, signal the last build was good\n"; + outs() << "\tbad - if in bisect mode, signal the last build was bad\n"; + outs() << "\tstatus - print bisection state\n"; +} + +void dumpStatus(Bisector &BisectService) { + outs() << "Bisect status: " + << (BisectService.isBisecting() ? "bisecting\n" : "learning\n"); + outs() << "Total keys: " << BisectService.getNumKeys() << "\n"; + if (!BisectService.isBisecting()) + return; + int LastPass = BisectService.getLastPassCounter(); + outs() << "Counter for last known passed key: "; + if (LastPass == 0) + outs() << "No known passed key\n"; + else + outs() << LastPass << "\n"; + outs() << "Counter for last known failed key: " + << BisectService.getLastFailCounter() << "\n"; + outs() << "Current counter (mid-point) key: " + << BisectService.getCurrentCounterKey() << "\n"; +} + +int main(int argc, char **argv) { + InitLLVM X(argc, argv); + + LLVMContext Context; + cl::HideUnrelatedOptions(BisectCat); + cl::ParseCommandLineOptions(argc, argv, "llvm bisection daemon\n"); + + // For now we'll just support std::string as the keys for bisection. + // Could expand this to other datatypes in future, but I'm not sure it's + // necessary given that we could serialize them all to strings anyway. + Bisector BisectService; + // One thread will be started to handle initial connection set up and + // management. + llvm::thread ServerThread(startService, &BisectService); + ServerThread.detach(); + + // Accept user commands. + LineEditor LE("bisectd "); + while (true) { + auto MaybeLine = LE.readLine(); + // Intercept ^D to make sure the user actually wants to kill bisectd. Doing + // so will probably cause any running builds to hang or die. + if (!MaybeLine) { + outs() << "\nUse ctrl-c or type \"exit\" to quit. This may cause any " + "running builds to fail.\n"; + continue; + } + std::string Input = *MaybeLine; + if (Input == "") + continue; + if (Input == "exit") + return 0; + if (Input == "help") { + printHelp(); + } else if (Input == "reset") { + sys::ScopedLock SL(Lock); + BisectService.resetAndStartLearning(); + } else if (Input == "good" || Input == "bad") { + bool Passed = Input == "good"; + sys::ScopedLock SL(Lock); + if (!BisectService.isBisecting()) { + errs() << "Error: command not valid in learning mode. Use " + "\"start-bisect\" to begin bisect mode.\n"; + continue; + } + auto MaybeKey = BisectService.finishBisectionRound(Passed); + if (MaybeKey) { + outs() << "Bisection completed. Failing key was: " << *MaybeKey << "\n"; + outs() << "Exiting...\n"; + std::exit(0); + } + outs() << "New bisection round started\n"; + } else if (Input == "start-bisect") { + sys::ScopedLock SL(Lock); + if (BisectService.isBisecting()) { + errs() << "Bisect mode is already enabled.\n"; + continue; + } + if (BisectService.getNumKeys() < 2) { + errs() << "Error: not enough keys to bisect across.\n"; + continue; + } + outs() << "Starting bisection with " << BisectService.getNumKeys() + << " total keys to search\n"; + BisectService.startBisect(); + } else if (Input == "status") { + sys::ScopedLock SL(Lock); + dumpStatus(BisectService); + } else { + outs() << "Unknown command. Type \"help\" for possible commands\n"; + } + } + return 0; +} diff --git a/llvm/unittests/Support/BisectTest.cpp b/llvm/unittests/Support/BisectTest.cpp new file mode 100644 --- /dev/null +++ b/llvm/unittests/Support/BisectTest.cpp @@ -0,0 +1,115 @@ +//===- llvm/unittest/Support/BisectTest.cpp - Bisector tests --------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file implements unit tests for the Bisector class. +// +//===----------------------------------------------------------------------===// + +#include "llvm/ADT/ArrayRef.h" +#include "llvm/Support/Bisector.h" +#include "llvm/Support/Debug.h" +#include "gtest/gtest.h" + +using namespace llvm; + +namespace { + +template +void learnKeys(Bisector &Bisector, ArrayRef Keys) { + for (auto &K : Keys) + Bisector.learnKey(K); +} + +TEST(BisectTest, TestLearn) { + Bisector TestBisect; + + learnKeys(TestBisect, {"foo", "bar", "zoo"}); + EXPECT_EQ(TestBisect.getNumKeys(), 3U); +} + +TEST(BisectTest, TestSimpleBisect) { + Bisector TestBisect; + + learnKeys(TestBisect, {"a", "b", "c", "d", "e", "f"}); + EXPECT_EQ(TestBisect.getNumKeys(), 6U); + + TestBisect.startBisect(); + // Midpoint == idx 6 / 2 == 3, 3rd key == "c". + EXPECT_TRUE(TestBisect.shouldPerformActionOnKey("c")); + EXPECT_FALSE(TestBisect.shouldPerformActionOnKey("d")); + + // From now on use the counters as a shorter way to test. + // We're simulating a failure on "b". + EXPECT_FALSE(TestBisect.finishBisectionRound(false)); + EXPECT_EQ(TestBisect.getCurrentCounterKey(), "a"); + EXPECT_FALSE(TestBisect.finishBisectionRound(true)); + EXPECT_EQ(TestBisect.getCurrentCounterKey(), "b"); + auto MaybeKey = TestBisect.finishBisectionRound(false); + EXPECT_TRUE(MaybeKey.hasValue()); + if (!MaybeKey) + return; + EXPECT_EQ(*MaybeKey, "b"); +} + +TEST(BisectTest, TestFinalKeyIsFail) { + Bisector TestBisect; + + learnKeys(TestBisect, {"a", "b", "c", "d", "e", "f", "g"}); + EXPECT_EQ(TestBisect.getNumKeys(), 7U); + + TestBisect.startBisect(); + // We're simulating a failure on "g". + EXPECT_EQ(TestBisect.getCurrentCounterKey(), "c"); + EXPECT_FALSE(TestBisect.finishBisectionRound(true)); + EXPECT_EQ(TestBisect.getCurrentCounterKey(), "e"); + EXPECT_FALSE(TestBisect.finishBisectionRound(true)); + EXPECT_EQ(TestBisect.getCurrentCounterKey(), "f"); + // We already know that choosing all the keys is a failure, so a pass on "f" + // let's us terminate. + auto MaybeKey = TestBisect.finishBisectionRound(true); + EXPECT_TRUE(MaybeKey.hasValue()); + if (!MaybeKey) + return; + EXPECT_EQ(*MaybeKey, "g"); +} + +TEST(BisectTest, TestFirstKeyIsFail) { + Bisector TestBisect; + + learnKeys(TestBisect, {"a", "b", "c", "d", "e", "f"}); + EXPECT_EQ(TestBisect.getNumKeys(), 6U); + + TestBisect.startBisect(); + // We're simulating a failure on "a". + EXPECT_EQ(TestBisect.getCurrentCounterKey(), "c"); + EXPECT_FALSE(TestBisect.finishBisectionRound(false)); + EXPECT_EQ(TestBisect.getCurrentCounterKey(), "a"); + auto MaybeKey = TestBisect.finishBisectionRound(false); + EXPECT_TRUE(MaybeKey.hasValue()); + if (!MaybeKey) + return; + EXPECT_EQ(*MaybeKey, "a"); +} + +TEST(BisectTest, TestTrivialTwoKeys) { + Bisector TestBisect; + + learnKeys(TestBisect, {"a", "b"}); + EXPECT_EQ(TestBisect.getNumKeys(), 2U); + + TestBisect.startBisect(); + // We're simulating a failure on "b". + EXPECT_EQ(TestBisect.getCurrentCounterKey(), "a"); + auto MaybeKey = TestBisect.finishBisectionRound(true); + EXPECT_TRUE(MaybeKey.hasValue()); + if (!MaybeKey) + return; + EXPECT_EQ(*MaybeKey, "b"); +} + +} // namespace diff --git a/llvm/unittests/Support/CMakeLists.txt b/llvm/unittests/Support/CMakeLists.txt --- a/llvm/unittests/Support/CMakeLists.txt +++ b/llvm/unittests/Support/CMakeLists.txt @@ -11,6 +11,7 @@ ArrayRecyclerTest.cpp Base64Test.cpp BinaryStreamTest.cpp + BisectTest.cpp BlockFrequencyTest.cpp BranchProbabilityTest.cpp CachePruningTest.cpp