This is an archive of the discontinued LLVM Phabricator instance.

Differential D111252

[llvm] [Support] [Debuginfo] Add http and debuginfod client libraries and llvm-debuginfod-find tool
Abandoned, Public

Authored by noajshu on Oct 6 2021, 11:27 AM.

Details

Reviewers

phosek
haowei
leonardchan
gulfem
dblaikie
labath
mcgrathr

Summary

This patch implements debuginfod and http client libraries with local caching. A few unit and lit tests are included.
The client libraries are only built if the cmake configuration flag is passed:
cmake [...] -DLLVM_ENABLE_DEBUGINFOD_CLIENT=1
LibCURL is required to build the clients.
The standalone tool llvm-debuginfod-find is also provided; it wraps the debuginfod client library.
The lit tests specific to this tool can be run e.g. with:
LIT_FILTER='.*llvm-debuginfod.*' ninja -C build check-llvm
Thanks in advance for comments and critiques!

This has been split into the following sub-diffs:

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden.

phosek added inline comments.Oct 14 2021, 1:39 AM

llvm/lib/Debuginfod/Debuginfod.cpp
39	Nit: no empty line.
43–52	This could be a `switch`.
81	Nit: no empty line.
llvm/lib/Support/HTTPClient.cpp
25	This is still not addressed.
45	This is still not addressed.
53	This is still not addressed.

Refactor HTTP Client to use std::vector & improve code formatting.

Harbormaster completed remote builds in B128912: Diff 379782.Oct 14 2021, 11:11 AM

noajshu marked 10 inline comments as done.Oct 14 2021, 11:18 AM

noajshu added inline comments.

llvm/lib/Support/HTTPClient.cpp
25	Thanks, I have refactored to use std::vector and eliminated casts entirely.

rebase against main

Harbormaster completed remote builds in B128915: Diff 379785.Oct 14 2021, 11:21 AM

phosek added inline comments.Oct 14 2021, 11:27 AM

llvm/lib/Support/HTTPClient.cpp
32	One potential downside of this approach is that you may have to resize the buffer multiple times as the data comes in, which would be inefficient. I was reading about alternatives; one possibility would be to check a priori the size of the resource you're fetching by performing a HEAD request (via `CURLOPT_NOBODY` and `CURLOPT_HEADER` set to 1), asking for the HTTP headers to be written via `CURLOPT_WRITEHEADER` and `CURLOPT_WRITEFUNCTION`, and parsing the `Content-Length` value. Then you can allocate an appropriately sized buffer and fetch the actual content. You might even use `WritableMemoryBuffer::getNewUninitMemBuffer` instead of `std::vector` since you don't need to resize.

noajshu marked 3 inline comments as done.Oct 14 2021, 11:33 AM

Replaced lit.util.pythonize_bool with llvm_canonicalize_cmake_booleans in lit.site.cfg.py.in.

Harbormaster completed remote builds in B128963: Diff 379853.Oct 14 2021, 2:49 PM

noajshu marked an inline comment as done.Oct 14 2021, 2:54 PM

noajshu added inline comments.Oct 14 2021, 4:13 PM

llvm/lib/Support/HTTPClient.cpp
32	Thanks, this is a very interesting idea! I will try to do the same thing using just a single request, by triggering the resize in CURLOPT_HEADERFUNCTION when the content-length is available.

Parse Content-Length header to avoid HTTP response buffer reallocations, convert HTTP Body to MemoryBuffer.

Harbormaster completed remote builds in B129108: Diff 380075.Oct 15 2021, 11:58 AM

Cherry-pick to fix extraneous delete in patch, minor style / formatting changes.

Harbormaster completed remote builds in B129113: Diff 380087.Oct 15 2021, 1:07 PM

noajshu marked an inline comment as done.Oct 15 2021, 1:12 PM

noajshu added inline comments.

llvm/lib/Support/HTTPClient.cpp
32	Thanks for this suggestion. I have implemented this single-allocation scheme and switched the HTTP buffer type to a LLVM MemoryBuffer. Please let me know if you have any comments on this. Thanks!
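
To make the single-allocation scheme discussed in this thread concrete, here is a minimal standalone sketch. It is not the patch's code: `ResponseBuffer`, `reserveBody`, and `writeChunk` are hypothetical names, and plain standard-library types stand in for LLVM's `MemoryBuffer`. The assumption is that a header callback calls `reserveBody` once after parsing `Content-Length`, and that each body callback then copies its chunk at a running offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for the response buffer described in the review.
class ResponseBuffer {
public:
  // Called once, after the Content-Length header has been parsed.
  void reserveBody(std::size_t ContentLength) {
    assert(!Data && "body buffer must be allocated only once");
    Data = std::make_unique<char[]>(ContentLength);
    Capacity = ContentLength;
  }

  // Shaped like a write callback: returns the number of bytes consumed.
  // Returning fewer bytes than offered tells the caller to abort.
  std::size_t writeChunk(const char *Chunk, std::size_t ChunkSize) {
    if (!Data || ChunkSize > Capacity - Offset)
      return 0; // No Content-Length was seen, or the server sent too much.
    std::memcpy(Data.get() + Offset, Chunk, ChunkSize);
    Offset += ChunkSize;
    return ChunkSize;
  }

  // True once exactly Content-Length bytes have arrived.
  bool complete() const { return Data && Offset == Capacity; }

  // Copies out the bytes received so far.
  std::string str() const {
    return Data ? std::string(Data.get(), Offset) : std::string();
  }

private:
  std::unique_ptr<char[]> Data;
  std::size_t Capacity = 0;
  std::size_t Offset = 0;
};
```

Because the buffer is sized once, no chunk triggers a reallocation. A write that arrives before `reserveBody` is rejected, which corresponds to the missing-header case raised later in the review.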

phosek added inline comments.Oct 15 2021, 3:09 PM

llvm/lib/Support/HTTPClient.cpp
46	We usually also try to provide a message explaining what went wrong: `assert(Size == 1 && "...")`.
51–53	I'd avoid the macros which we don't usually use in C++ and instead use the `StringRef` methods, so you can do something like:

StringRef Header(Contents, NMemb);
if (Header.starts_with("Content-Length: ")) {
  StringRef Length = Header.substr(16);
  size_t ContentLength;
  if (!Length.getAsInteger(10, ContentLength)) {
    ...
  }
}

We should also make the code more resilient to handle potential issues, for example could the header look like `Content-Length : N` or is it guaranteed to always be `Content-Length: N`?
53	If the header doesn't contain `Content-Length`, would we fail in `writeMemoryCallback` because we haven't allocated the buffer? I think the code should be more resilient and handle that situation somehow (even if just by returning an error).

Perform Content-Length header parsing with StringRef's startswith_insensitive.
Also made minor formatting changes and remove C macros.

Harbormaster completed remote builds in B129250: Diff 380260.Oct 17 2021, 12:03 PM

noajshu added inline comments.Oct 17 2021, 12:21 PM

llvm/lib/Support/HTTPClient.cpp
51–53	Thanks for pointing out the variations in header formatting. I took a look at the HTTP/1.1 specification and changed the parser to use `startswith_insensitive` as header names are case insensitive as well. To handle extra whitespace perhaps we could use a case-insensitive regex match like `content-length\s?:\s?(\d+)`. This would also give us the digits of the length as a match group.

Use Regex to parse Content-Length header.
Comments for assertions. Gracefully produce an error if the HTTP response lacks Content-Length header.

noajshu marked 2 inline comments as done.Oct 18 2021, 10:35 AM

noajshu added inline comments.

llvm/lib/Support/HTTPClient.cpp
51–53	I have changed this to use a Regex instead of a `startswith` test.
53	Instead of asserting the buffer has been allocated, I changed it to terminate the download and return an Error. However I agree we ought to be more resilient in this case especially since Transfer-Encoding is a valid alternative to Content-Length. Perhaps we could fall back to repeated re-allocations in the case there is no Content-Length header?

Harbormaster completed remote builds in B129391: Diff 380464.Oct 18 2021, 10:59 AM

Gracefully fall back to std::vector buffer if Content-Length is not specified in HTTP response.
Note that this is ~24x slower for a 3GB download. libCURL executes the callback once per 1KB on average.
Also moved static callback functions and OffsetBuffer inside an anonymous namespace in HTTPClient.cpp.

Harbormaster completed remote builds in B129445: Diff 380544.Oct 18 2021, 4:38 PM

noajshu marked an inline comment as done.Oct 18 2021, 4:38 PM

noajshu added inline comments.

llvm/lib/Support/HTTPClient.cpp
53	The latest diff handles a missing `Content-Length` by falling back to a `std::vector` buffer. This has the disadvantage of a `resize` on each write callback and an additional allocation + copy to convert to the returned `MemoryBuffer`. In my test it downloads ~24x slower without the Content-Length header.

phosek added inline comments.Oct 18 2021, 10:59 PM

llvm/lib/Support/HTTPClient.cpp
52	According to the spec, `Content-Length` is the only acceptable spelling so I think we could be a little more strict here and require that particular case.
92	Is it possible to use `StringRef::getAsInteger` instead?
llvm/test/tools/llvm-debuginfod/find-debuginfod-asset.py
8–12 ↗	(On Diff #380544)	I'd consider using `argparse` and give the arguments more descriptive names.
33 ↗	(On Diff #380544)	The path of the tool should be passed as a variable to this script from the test (which should get it as a lit substitution from CMake). Otherwise, you might accidentally pick up the system version which is undesirable.
llvm/test/tools/llvm-debuginfod/llvm-debuginfod-find.test
3	This would ideally be a substitution.
llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
72–80	I think this should use `XDG_CACHE_HOME` when set instead (which typically defaults to `$HOME/.cache`), see https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html. We should also check what the macOS and Windows alternatives are. This implementation should only be used as a fallback.
llvm/unittests/Debuginfod/DebuginfodTests.cpp
14	I don't think this is a particularly useful test, but we should be able to improve this once we also have a server.
llvm/unittests/Support/HTTPClient.cpp
16	Ditto for this one.

In D111252#3071626, @noajshu wrote:

Gracefully fall back to std::vector buffer if Content-Length is not specified in HTTP response.
Note that this is ~24x slower for a 3GB download. libCURL executes the callback once per 1KB on average.
Also moved static callback functions and OffsetBuffer inside an anonymous namespace in HTTPClient.cpp.

Given that Content-Length is really required for efficient reads, I think we should consider it mandatory and treat its absence as a protocol error rather than providing an inefficient fallback.

This is what clangd does as well, see https://github.com/llvm/llvm-project/blob/8189c4eee74959882f4f31c6c5f969cec5cca7eb/clang-tools-extra/clangd/JSONTransport.cpp#L242. You could also use that implementation as an inspiration for how to parse the header using LLVM libraries and avoid regular expressions.

Generalize / rewrite lit testing script to handle multiple clients and servers and simplify content-length parsing + HTTP buffer allocation.

noajshu marked 3 inline comments as done.Oct 19 2021, 3:19 PM

noajshu added inline comments.

llvm/lib/Support/HTTPClient.cpp
52	Agreed, it seems a safe bet and is much simpler than using a regex. I've switched it out to use nearly the exact same Content-Length parsing as clangd.
92	Switched to use the clangd style parser which uses `llvm::getAsUnsignedInteger`.
llvm/test/tools/llvm-debuginfod/llvm-debuginfod-find.test
3	Thanks, I didn't realize this wasn't being expanded! I added the tool name in `llvm/test/lit.cfg.py` so that it gets substituted with the full path. If you'd like I can make this more explicit by using a % prefix and `ToolSubst`.
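
As an illustration of the strict, exact-case `Content-Length` parsing the thread converged on, here is a small standalone sketch. It is not the clangd parser or the patch's code: `parseContentLength` is a hypothetical helper, and `std::string_view` with `std::from_chars` stands in for `StringRef` and `llvm::getAsUnsignedInteger`. It assumes each header line arrives as one string, possibly with a trailing CRLF.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: returns the length if Line is exactly
// "Content-Length: <digits>" (optionally followed by CR/LF), else nullopt.
std::optional<std::size_t> parseContentLength(std::string_view Line) {
  constexpr std::string_view Prefix = "Content-Length: ";
  if (Line.substr(0, Prefix.size()) != Prefix)
    return std::nullopt; // Different header, or a different spelling.
  Line.remove_prefix(Prefix.size());

  // Strip the line terminator before parsing the digits.
  while (!Line.empty() && (Line.back() == '\r' || Line.back() == '\n'))
    Line.remove_suffix(1);

  std::size_t Length = 0;
  const char *End = Line.data() + Line.size();
  auto [Ptr, Ec] = std::from_chars(Line.data(), End, Length);
  if (Ec != std::errc() || Ptr != End)
    return std::nullopt; // Empty, non-numeric, trailing junk, or overflow.
  return Length;
}

// Usage example: only the exact spelling with a clean integer is accepted.
void exampleContentLengthParsing() {
  assert(parseContentLength("Content-Length: 42\r\n").value_or(0) == 42);
  assert(!parseContentLength("content-length: 42\r\n").has_value());
  assert(!parseContentLength("Content-Length: 4x2\r\n").has_value());
}
```

Returning an empty optional for anything unexpected lets the caller treat a missing or malformed header as an error, in line with the suggestion to consider `Content-Length` mandatory.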

Harbormaster completed remote builds in B129614: Diff 380786.Oct 19 2021, 3:23 PM

Use $XDG_DATA_HOME to set cache dir, fall back to $HOME and then current directory.

Harbormaster completed remote builds in B129632: Diff 380810.Oct 19 2021, 4:32 PM

Use sys::path::cache_directory to find where to put cached debuginfod assets.

noajshu added inline comments.Oct 19 2021, 4:49 PM

llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
72–80	Thanks, I've switched it out to use LLVM's platform-independent `cache_directory` function, which in turn uses `XDG_CACHE_HOME` where applicable.
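
The lookup order discussed here can be sketched in a few lines. This is an illustrative approximation of the Linux behavior described in the thread, not LLVM's `cache_directory` implementation, which also handles other platforms. `xdgCacheDirectory` and `xdgCacheDirectoryFromEnv` are hypothetical names, and treating an empty variable as unset is an assumption taken from the XDG Base Directory specification.

```cpp
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: picks a cache directory from the two environment
// values, which are passed in explicitly so the logic is easy to test.
std::optional<std::string> xdgCacheDirectory(const char *XdgCacheHome,
                                             const char *Home) {
  if (XdgCacheHome && *XdgCacheHome)
    return std::string(XdgCacheHome);
  if (Home && *Home)
    return std::string(Home) + "/.cache";
  // Neither variable is usable: report failure instead of silently
  // falling back to the current working directory.
  return std::nullopt;
}

// Convenience wrapper that reads the real environment.
std::optional<std::string> xdgCacheDirectoryFromEnv() {
  return xdgCacheDirectory(std::getenv("XDG_CACHE_HOME"), std::getenv("HOME"));
}

// Usage example covering the three cases.
void exampleCacheDirectoryLookup() {
  assert(xdgCacheDirectory("/tmp/xdg", "/home/user").value_or("") ==
         "/tmp/xdg");
  assert(xdgCacheDirectory(nullptr, "/home/user").value_or("") ==
         "/home/user/.cache");
  assert(!xdgCacheDirectory(nullptr, nullptr).has_value());
}
```

The empty result corresponds to the case raised later in the review, where neither variable is set and the tool is expected to fail rather than write into the current directory.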

Harbormaster completed remote builds in B129641: Diff 380822.Oct 19 2021, 4:59 PM

phosek added reviewers: dblaikie, labath.Oct 19 2021, 5:55 PM

phosek added inline comments.

llvm/test/tools/llvm-debuginfod/client-server-test.py
58–109	I'd move this to a `main` function which is more conventional.

Let me just say outright that I don't feel qualified to say whether this belongs in llvm (I think it does) or not, so I am not going to click approve no matter how well you respond to my comments. However, I do have some experience with sockets and funky tests, so I think I can say something useful about those aspects of the patch.

llvm/lib/Debuginfod/Debuginfod.cpp
2–3	wrap fail
llvm/lib/Support/CMakeLists.txt
79	Who sets `CURL_LIBRARIES` ?
llvm/lib/Support/HTTPClient.cpp
16–24	This isn't super important, but the best way to ensure you follow the rules for include ordering, is to put them in a single block (with no empty lines), as then clang-format will order them for you.
35	The anonymous namespace should end here, and the rest should be static functions, per https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
120–121	This isn't explicitly spelled out anywhere (and probably not used entirely consistently), but I believe the prevalent style, and one most consistent with the "make namespaces as small as possible" in the anonymous namespace section is to slap a `using namespace llvm;` at the start of the file, and then to explicitly prefix with `llvm::` the header functions that you're defining.
llvm/test/tools/llvm-debuginfod/client-server-test.py
24	Can we get rid of these sleeps? Like, maybe the server could somehow signal (create a file, write to stdout/other fd, ...) that it has initialized and is ready to serve. I can speak from experience that getting sleep-based tests to work correctly is very tricky. You need to set the timeout very high to ensure you cover the final 0.01 percentile, and when you do that, you end up needlessly slowing down the other 99.99% of the test runs.
llvm/test/tools/llvm-debuginfod/llvm-debuginfod-find.test
2	Since, presumably, every test in this folder is going to have this clause, it would be better to do this via a `lit.local.cfg` file.
5	There's absolutely no chance this will pass reliably with a hard-coded port. You'll need a mechanism to select a free port at runtime. You can do that by passing port 0 to `http.server.HTTPServer` and then fetching the actual port via `instance.server_port`. This can then be combined with the sleep comment, as you can take the notification of the listening port as a positive acknowledgement that the server is ready to accept connections.
6–14	I don't think this is a good way to design a test. I expect it will be very hard (or outright impossible) to get this working on windows because of the differences in quoting behavior. I'd recommend a pattern like:

RUN: python %S/client-server-test.py %s
// Or the server could be started automatically, possibly within the same process, as that makes it easier to pass the actual port number
SERVER: ???
// DEBUGINFOD_URLS can probably be set automatically
CLIENT: llvm-debuginfod-find fake_build_id --executable
CLIENT: llvm-debuginfod-find fake_build_id --source /directory/file.c
...

Only the RUN stanza would be parsed by lit. The rest would be processed by the python script. The thing you lose this way is the built-in lit substitutions (they'll only work on the RUN line), but it's not clear to me how useful would those actually be. OTOH, this means you can implement custom substitutions, tailored to your use case.

If you're going to have a lot of these tests, you could also consider implementing a custom test format, which would give you an even greater flexibility on how to write and run these tests, but that's probably premature at this point.

(BTW: The `#` in front of all the lines is completely redundant. The reason it's normally present is because test file is also a valid source file for some language, but that is not the case here.)
llvm/tools/llvm-debuginfod/CMakeLists.txt
10	What's the purpose of this?
llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
77	s/!size()/empty()
79	When does the current directory fallback kick in? Is it actually useful? Should you exit instead?
93	llvm_unreachable is not appropriate for user error (wrong command line arguments).
llvm/unittests/Support/HTTPClient.cpp
16	This would be better done as `EXPECT_THAT_EXPECTED(httpGet(...), Failed<StringError>())` or even `FailedWithMessage("whatever")`, though I agree that this is not very useful. I think the interesting question is whether we're ok with not having any coverage for the lower level apis, and I'm not really sure what the answer to that is.
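
The port-0 suggestion in the comments above refers to Python's `http.server.HTTPServer`. The same operating-system mechanism can be shown directly with POSIX sockets, which is what the sketch below does. It is a swapped-in C++ illustration, not the test script: `listenOnFreePort` is a hypothetical helper, and it assumes a POSIX system with IPv4 loopback available.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: binds a listening TCP socket to 127.0.0.1 on a port
// chosen by the kernel. Returns the descriptor (or -1 on failure) and
// stores the chosen port in Port.
int listenOnFreePort(std::uint16_t &Port) {
  int Fd = socket(AF_INET, SOCK_STREAM, 0);
  if (Fd < 0)
    return -1;

  sockaddr_in Addr{};
  Addr.sin_family = AF_INET;
  Addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  Addr.sin_port = htons(0); // Port 0 asks the kernel for any free port.

  socklen_t Len = sizeof(Addr);
  if (bind(Fd, reinterpret_cast<sockaddr *>(&Addr), sizeof(Addr)) != 0 ||
      listen(Fd, 1) != 0 ||
      getsockname(Fd, reinterpret_cast<sockaddr *>(&Addr), &Len) != 0) {
    close(Fd);
    return -1;
  }
  assert(Addr.sin_family == AF_INET && "expected an IPv4 address back");

  // The socket is already listening here, so publishing the port doubles
  // as the readiness signal and no sleep is needed.
  Port = ntohs(Addr.sin_port);
  return Fd;
}
```

A server started this way can print or write out the port once `listenOnFreePort` returns. A client that waits for that value knows both where to connect and that connections will be accepted.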

noajshu marked 4 inline comments as done.Oct 20 2021, 11:00 AM

noajshu added inline comments.

llvm/lib/Support/CMakeLists.txt
79	When `LLVM_ENABLE_CURL` is on, `llvm/cmake/config-ix.cmake` calls `find_package(CURL)` which sets the `CURL_LIBRARIES`. This was added in D111238.
llvm/lib/Support/HTTPClient.cpp
16–24	That's a great tip, thanks!
35	Thanks!
llvm/tools/llvm-debuginfod/CMakeLists.txt
10	Good catch, this is no longer required.
llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
79	The fallback would kick in when `cache_directory` comes up empty-handed. This can happen on linux if neither `$XDG_CACHE_HOME` nor `$HOME` are in the environment. I would have no problem with removing the fallback and failing in this case as the user can always specify the current directory using `--cache-dir` anyways.

noajshu added inline comments.Oct 20 2021, 11:00 AM

llvm/unittests/Support/HTTPClient.cpp
16	Thanks, updated! I agree with the comments that these unit tests are not adding much value. One option until we have a server in llvm would be to GET http://llvm.org, although that does require an internet connection. By the lower level APIs, are you referring to the static callback functions used to implement `httpGet`, or to the CURL library itself?

Remove cache dir fallback to current directory in debuginfod-find.
Use EXPECT_THAT_EXPECTED in unit tests, switch functions out of anonymous namespaces (use static instead)

Harbormaster completed remote builds in B129782: Diff 381026.Oct 20 2021, 11:12 AM

Make namespaces smaller by switching library implementations to use using namespace llvm; and llvm:: prefixing.

Harbormaster completed remote builds in B129810: Diff 381065.Oct 20 2021, 12:47 PM

labath added inline comments.Oct 21 2021, 1:12 AM

llvm/CMakeLists.txt
975–977 ↗	(On Diff #381065)	Why isn't this next to the other llvm_canonicalize_cmake_booleans calls? It would also help if you uploaded the full context with your patch (`arcanist` does it automatically, and you can achieve it with `git diff -U9999` if you do it manually).
llvm/include/llvm/Debuginfod/Debuginfod.h
37	Small strings are typically not returned by value, and I don't think that the performance of this is particularly critical (if it was, you'd probably use a `SmallVectorImpl<char>&` by-ref argument), so I'd probably use a std::string here.
llvm/lib/Support/CMakeLists.txt
79	I see. In that case, I think it would be good to also add `set(LLVM_ENABLE_CURL ... CACHE)` somewhere, so that this appears as a user-settable value in UIs and such.
llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
79	An unset HOME variable is most likely going to be an accident, and I think the usage of CWD would be surprising in that case. So, I'd leave the fallback out, but this is your tool, so I'm going to leave that up to you.
111	It seems `#undef DEBUG_TYPE` is rarely used in cpp files (headers are a different story), and when it is, it's usually because it's `#define`d to a different value immediately afterwards. Undefining at the end of a file seems pointless.
llvm/unittests/Support/HTTPClient.cpp
16	I mean the `httpGet` function itself -- we generally don't test private implementation details (directly). In an ideal world I'd split this patch into two (or even three, with the first part being the introduction of httpGet), and each would come with its own tests. Testing the error message is nice to have, but it just scratches the surface. In httpGet, the content-length handling seems interesting to test, for example. But yes, you'd need some way to create/mock the server connection for that to work...

Add debuginfod-find.test which is simpler and less general-purpose, and does not require special command-line quoting behavior or JSON configuration.
This test incorporates Pavel's suggestion to assign the port to the first available, and avoids using timing to orchestrate the client and server. This would replace the llvm-debuginfod-find.test + client-server-test.py scheme.

Only use llvm_canonicalize_cmake_boolean where it's needed, in llvm/test/CMakeLists.txt with the other invocations.

noajshu marked an inline comment as done.Oct 21 2021, 10:24 AM

noajshu added inline comments.

llvm/CMakeLists.txt
975–977 ↗	(On Diff #381065)	Thanks and just moved this!

Harbormaster completed remote builds in B129976: Diff 381312.Oct 21 2021, 10:32 AM

Remove old end-to-end tests and add LLVM_ENABLE_CURL cmake variable cache string.

Various style and usability updates.
Remove use of llvm_unreachable in handling user error, extraneous #undef DEBUG_TYPE, add Cmake cache vars, remove CWD fallback for cache dir.

fetchDebuginfo returns a std::string instead of small string.

Harbormaster completed remote builds in B130022: Diff 381368.Oct 21 2021, 1:39 PM

labath added inline comments.Oct 22 2021, 1:25 AM

llvm/test/tools/llvm-debuginfod/debuginfod-find.test
2–3	This seems fine for now, though if you start having more of these, it would definitely be good to factor some of this out. You can consider renaming the file to .py to get syntax highlighting.
22	Where will this be storing the downloaded files? It should probably be `%t` or some derivative. And you should clean the folder before the test.
29–30	The way this is written now, the test would pass if I replaced `debuginfod-find` with `/bin/true`. Is there anything else you can check here?
llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
95	You're only checking that the user specified _at least_ one of the arguments.

Refactor HTTP Client (Curl wrapper).
This refactor enabled writing more meaningful unit tests and fixes several bugs in the client.

Harbormaster completed remote builds in B130850: Diff 382514.Oct 26 2021, 8:49 PM

Workaround for unit tests, to avoid need for parent's destructor to invoke child's override.
There is a more general explanation of why this does not work here: more details here

Harbormaster completed remote builds in B130852: Diff 382517.Oct 26 2021, 9:07 PM

noajshu added inline comments.Oct 26 2021, 9:20 PM

llvm/unittests/Support/HTTPClient.cpp
16	Hi Pavel, first off thanks for all of your comments. I refactored the HTTP client to enable meaningful unit tests. It's now split into two new class hierarchies rooted at `HTTPRequest` and `HTTPResponseHandler`. This way CURL can be swapped out with a different HTTP backend, including a simulated backend during unit testing. Similarly, the buffered response handler can be unit tested in isolation without data going through a socket. This allows some nontrivial tests of how it handles content-lengths. I am very interested in your thoughts on this refactor, and if you have ideas for other unit tests of the HTTP client or feedback on the ones here. I wonder if it makes sense to refactor the Debuginfod client library as well to enable more meaningful unit tests there. And please feel free to suggest a different approach if you feel the refactor moves us in the wrong direction. For now I will work on addressing your comments on the other areas of the diff. Since it is even larger now, it could certainly make sense to split off into a few diffs. One way to do this could be:
[diff 0] HTTP Client + unit tests
[diff 1] Debuginfod Client library + unit tests
[diff 2] Debuginfod-Find tool + end-to-end (lit) tests

phosek added inline comments.Oct 27 2021, 12:40 AM

llvm/include/llvm/Support/HTTPClient.h
42	This is unnecessary.
71–94	I'd move the cURL-based implementation into a separate header and source file, for example `Support/CURLClient.{h,cpp}`. Otherwise, anyone who includes `HTTPClient.h` automatically pulls in `<curl/curl.h>` which is undesirable.
72	This is unnecessary.
80–82	Do these need to be public?
84–87	Do these need to be public?
96	This function seems unnecessary as its functionality is now effectively provided by `HTTPRequest` and `HTTPRequestConfig`, which is more flexible (for example, they can potentially support methods other than Get).
llvm/lib/Support/HTTPClient.cpp
78	Rather than printing the error to stderr which may be undesirable for example when using LLVM as a library, I think it'd be better to store the error inside the class (it might be possible to use `ErrorList`) and then return it from `curlPerform`.
90	The same here.
104–114	Could this method be inlined directly into `CurlHTTPRequest::performRequest`?

Well... I hate to say it, but I found this version very easy to get lost in. It seems to me you're exposing a lot more implementation details than is necessary (including the entire curl.h header, which is something we try to avoid). And the fact that the tests don't even go through the public interface makes it hard to keep track of what is being tested. If we are going to do this, maybe we could use a pattern like this:

class CurlInterface {
public:
  virtual ~CurlInterface() = default;
  virtual Error init() = 0;
  virtual Error cleanup() = 0;
  virtual Error perform() = 0;
  virtual void setWriteFunction(...) = 0;
  // etc.
  // It doesn't have to match the curl interface verbatim. Feel free
  // to perform basic simplifications and c++-ifications of the
  // interface. You'll need to do some of those to avoid dependence
  // on curl.h anyway.
};

Expected<HTTPResponseBuffer> httpGet(const Twine &Url, CurlInterface &curl);
Expected<HTTPResponseBuffer> httpGet(const Twine &Url); // calls the first function with the default/real curl implementation

This shouldn't expose too many implementation details, and this interface could even conceivably be useful to someone who wants to use a custom http-ish library/transport/thingy.

Then, in the test, you could use gmock to program a fake curl implementation:

class CurlMock : public CurlInterface {
public:
  MOCK_METHOD0(init, Error());
  ...
};

TEST(httpGet, whatever) {
  CurlMock curl;
  EXPECT_CALL(curl, setWriteFunction).WillOnce(SaveArg<0>(&writeFunc));
  EXPECT_CALL(curl, perform).WillOnce([&] {
    EXPECT_THAT_ERROR(writeFunc(...), llvm::Succeeded());
    EXPECT_THAT_ERROR(writeFunc(...), llvm::Failed());
    ...
    return llvm::Error::success();
  });
  EXPECT_THAT_EXPECTED(httpGet("http://foo", curl), llvm::Succeeded());
}

How does that sound?
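
The dependency-injection pattern proposed above can also be illustrated without gmock or LLVM. The following standalone sketch uses a hand-written fake in place of a mock. Every name in it (`Transport`, `FakeTransport`, this simplified `httpGet`) is hypothetical, and `std::optional<std::string>` stands in for `Expected<HTTPResponseBuffer>`.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical transport interface; it hides the HTTP library behind a
// small surface so that no third-party header leaks into callers.
class Transport {
public:
  using WriteFn = std::function<bool(const std::string &Chunk)>;
  virtual ~Transport() = default;
  // Delivers the response body chunk by chunk; returns false on failure.
  virtual bool perform(const std::string &Url, const WriteFn &Write) = 0;
};

// Code under test: depends only on the interface.
std::optional<std::string> httpGet(const std::string &Url, Transport &T) {
  std::string Body;
  bool Ok = T.perform(Url, [&Body](const std::string &Chunk) {
    Body += Chunk;
    return true;
  });
  if (!Ok)
    return std::nullopt;
  return Body;
}

// Test double that replays canned chunks instead of touching the network.
class FakeTransport : public Transport {
public:
  std::vector<std::string> Chunks;
  bool Fail = false;

  bool perform(const std::string &, const WriteFn &Write) override {
    for (const std::string &C : Chunks)
      if (!Write(C))
        return false;
    return !Fail;
  }
};

// Usage example, in the spirit of the TEST() sketched in the review.
void exampleHttpGetWithFakeTransport() {
  FakeTransport T;
  T.Chunks = {"he", "llo"};
  assert(httpGet("http://foo", T).value_or("") == "hello");
  T.Fail = true;
  assert(!httpGet("http://foo", T).has_value());
}
```

The test goes through the public `httpGet` entry point and only the transport is substituted, which is the property the review asks for.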

llvm/lib/Support/HTTPClient.cpp
69	the llvm:: qualifications are not necessary here
139	drop .str().str(), it's cleaner
llvm/unittests/Support/HTTPClient.cpp
16	Yeah, if we're going to have tests for each of these components, I'd definitely recommend splitting them out into separate patches.

Add CurlHTTPRequest::ErrorState using ErrorList to track errors during curl http request lifecycle.
Also miscellaneous style updates, redundant namespace qualifiers.

Harbormaster completed remote builds in B130997: Diff 382731.Oct 27 2021, 11:40 AM

phosek added a reviewer: mcgrathr.Oct 27 2021, 12:16 PM

noajshu marked 3 inline comments as done.Oct 27 2021, 12:56 PM

noajshu added inline comments.

llvm/lib/Support/HTTPClient.cpp
78	Great idea, I've added this and removed the prints to stderr. The `ErrorState` will also surface for other `Error`-returning methods of `CurlHTTPRequest`.

Updates to CurlHTTPRequest access specifiers.

Harbormaster completed remote builds in B131021: Diff 382765.Oct 27 2021, 1:39 PM

Remove extraneous curlPerform method from CurlHTTPRequest

Harbormaster completed remote builds in B131039: Diff 382793.Oct 27 2021, 2:31 PM

Remove httpGet function.

Harbormaster completed remote builds in B131042: Diff 382799.Oct 27 2021, 2:52 PM

Improvements to regression (lit) tests.
Use %t as the asset cache directory and clean up directory before each test. Verify local file contents match expected contents.

Harbormaster completed remote builds in B131085: Diff 382863.Oct 27 2021, 5:50 PM

Split out CurlHTTPRequest into CURLClient.{h,cpp}.

Harbormaster completed remote builds in B131091: Diff 382871.Oct 27 2021, 6:30 PM

Rebase against main.

Harbormaster completed remote builds in B131272: Diff 383132.Oct 28 2021, 1:01 PM

noajshu mentioned this in D112751: [llvm] [Support] Add HTTP Client Support library..Oct 28 2021, 1:17 PM

noajshu edited the summary of this revision. (Show Details)Oct 28 2021, 1:26 PM

noajshu mentioned this in D112753: [llvm] [Support] Add CURL HTTP Client..Oct 28 2021, 1:42 PM

noajshu mentioned this in D112758: [llvm] [Debuginfo] Debuginfod client library..Oct 28 2021, 2:15 PM

noajshu edited the summary of this revision. (Show Details)

noajshu mentioned this in D112759: [llvm] [Debuginfo] Add llvm-debuginfod-find tool and end-to-end-tests..Oct 28 2021, 2:31 PM

noajshu edited the summary of this revision. (Show Details)Oct 29 2021, 11:03 AM

@labath Thank you for your helpful feedback on this diff!
I have split into 4 sub-diffs. The HTTP client/request/response handler architecture is in D112751 and the curl client is in D112753. Could we continue the discussion there?

llvm/include/llvm/Support/HTTPClient.h
80–82	Nope -- updated most to private, and `curlInit` to protected (for unit testing).
84–87	The `curlHandle{HeaderLine,BodyChunk}` method is invoked by the static `curl{HeaderLine,BodyChunk}MethodHook` function, which then needs access. I tried instead having curl directly call back to a private method, but then the implicit `this` argument is replaced by the first parameter passed by curl, which is the pointer to the header/body contents.
96	Agreed that this could be removed and on the flexibility point I bet another use case might be a `StreamingHTTPResponseHandler`. This would be especially nice if the `localCache` is refactored to allow caching objects chunk-by-chunk -- as then large debuginfo would not hog memory (as with the `BufferedHTTPResponseHandler`).

Abandoning due to split into separate patches.

noajshu mentioned this in rGaf69947e7028: [llvm] [Debuginfo] Debuginfod client library..Dec 5 2021, 8:45 PM

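Revision Contents

Path	Size
llvm/include/llvm/Debuginfod/Debuginfod.h	44 lines
llvm/include/llvm/Support/HTTPClient.h	33 lines
llvm/lib/CMakeLists.txt	1 line
llvm/lib/Debuginfod/CMakeLists.txt	11 lines
llvm/lib/Debuginfod/Debuginfod.cpp	104 lines
llvm/lib/Support/CMakeLists.txt	6 lines
llvm/lib/Support/HTTPClient.cpp	116 lines
llvm/test/CMakeLists.txt	1 line
llvm/test/lit.cfg.py	9 lines
llvm/test/lit.site.cfg.py.in	1 line
llvm/test/tools/llvm-debuginfod/Inputs/buildid/fake_build_id/debuginfo	1 line
llvm/test/tools/llvm-debuginfod/Inputs/buildid/fake_build_id/executable	1 line
llvm/test/tools/llvm-debuginfod/Inputs/buildid/fake_build_id/source/directory/file.c	1 line
llvm/test/tools/llvm-debuginfod/client-server-test.py	113 lines
llvm/test/tools/llvm-debuginfod/debuginfod-find.test	49 lines
llvm/test/tools/llvm-debuginfod/llvm-debuginfod-find.test	14 lines
llvm/tools/llvm-debuginfod/CMakeLists.txt	12 lines
llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp	110 lines
llvm/unittests/CMakeLists.txt	1 line
llvm/unittests/Debuginfod/CMakeLists.txt	11 lines
llvm/unittests/Debuginfod/DebuginfodTests.cpp	18 lines
llvm/unittests/Support/CMakeLists.txt	1 line
llvm/unittests/Support/HTTPClient.cpp	21 lines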

Diff 381312

llvm/include/llvm/Debuginfod/Debuginfod.h

This file was added.

				//===-- llvm/Debuginfod/Debuginfod.h - Debuginfod client --------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				///
				/// \file
				/// This file contains the declarations of the debuginfod::fetchInfo
				/// function and the debuginfod::AssetType enum class
				///
				///
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_SUPPORT_DEBUGINFOD_H
				#define LLVM_SUPPORT_DEBUGINFOD_H

				#include "llvm/ADT/StringRef.h"
				#include "llvm/Support/Error.h"
				#include "llvm/Support/MemoryBuffer.h"

				namespace llvm {

				/// DebuginfodAsset - This enum class specifies the types of
				/// debugging information that can be requested from a debuginfod
				/// server.
				enum class DebuginfodAssetType { Executable, Debuginfo, Source };

				/// Fetch a debuginfod asset to a file in a local cache and return the cached
				/// file path. First queries the local cache in the given directory path,
				/// followed by the debuginfod servers at the given urls for the specified
				/// type of information about the given build ID. Source files are specified
				/// by their absolute path provided in the Description. A callback function
				/// may optionally be provided for further processing of the fetched asset.
				Expected<SmallString<64>> fetchDebuginfo(
				    StringRef CacheDirectoryPath, ArrayRef<StringRef> DebuginfodUrls,
				    StringRef BuildID, DebuginfodAssetType Type, StringRef Description = "",
				    std::function<void(size_t, std::unique_ptr<MemoryBuffer>)> AddBuffer =
				        [](size_t Task, std::unique_ptr<MemoryBuffer> MB) {});
				labath (Done): Small strings are typically not returned by value, and I don't think that the performance of this is particularly critical (if it was, you'd probably use a `SmallVectorImpl<char>&` by-ref argument), so I'd probably use a std::string here.

				} // end namespace llvm

				#endif

llvm/include/llvm/Support/HTTPClient.h

This file was added.

//===-- llvm/Support/HTTPClient.h - HTTP client library ---*- C++ -------*-===//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

///

/// \file

/// This file contains the declarations of the HTTPResponse struct

/// and httpGet function.

///

//===----------------------------------------------------------------------===//

phosek (Done): Can you visually separate things with newlines?

#ifndef LLVM_SUPPORT_HTTP_CLIENT_H

#define LLVM_SUPPORT_HTTP_CLIENT_H

#include "llvm/Support/Error.h"

#include "llvm/Support/MemoryBuffer.h"

namespace llvm {

struct HTTPResponse {

long Code = 0;

std::unique_ptr<MemoryBuffer> Body;

phosek (Done): The same here, this should be in namespace `llvm`.

};

phosek (Done):

#include "llvm/Support/Error.h"

namespace llvm {

struct HTTPResponse {
  long Code = 0;
  std::string Body;
};

Expected<HTTPResponse> httpGet(const Twine &Url);

} // end namespace llvm

#endif // LLVM_SUPPORT_HTTP_CLIENT_H

Expected<HTTPResponse> httpGet(const Twine &Url);

} // end namespace llvm

#endif // LLVM_SUPPORT_HTTP_CLIENT_H

phosek (Not Done): I'd move the cURL-based implementation into a separate header and source file, for example Support/CURLClient.{h,cpp}. Otherwise, anyone who includes HTTPClient.h automatically pulls in <curl/curl.h>, which is undesirable.

phosek (Done): Do these need to be public?

noajshu (Author, Done): Nope -- updated most to private, and `curlInit` to protected (for unit testing).

phosek (Not Done): Do these need to be public?

noajshu (Author, Done): The `curlHandle{HeaderLine,BodyChunk}` method is invoked by the static `curl{HeaderLine,BodyChunk}MethodHook` function, which then needs access.

I tried instead having curl directly call back to a private method, but then the implicit `this` argument is replaced by the first parameter passed by curl, which is the pointer to the header/body contents.
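The constraint noajshu describes follows from libcurl taking plain C function pointers as callbacks, so a non-static member function cannot be registered directly. Below is a minimal sketch of the static-hook pattern that does not use libcurl at all. `WriteCallback`, `fakeTransfer`, and `ResponseSink` are hypothetical names that do not appear in the patch; only the (contents, size, nmemb, userdata) callback shape is borrowed from libcurl's write callback. Under these assumptions, making the hook a static member (rather than a free function) is one way to let the per-chunk handler stay private.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a C API that accepts only a plain function
// pointer plus an opaque user-data pointer.
using WriteCallback = size_t (*)(char *Contents, size_t Size, size_t NMemb,
                                 void *UserData);

// Simulates the library delivering a body in chunks of at most ChunkSize
// bytes. Returns false if the callback reports a short write.
inline bool fakeTransfer(const std::string &Body, size_t ChunkSize,
                         WriteCallback Callback, void *UserData) {
  if (ChunkSize == 0)
    return false;
  std::string Copy = Body; // the callback receives a mutable pointer
  for (size_t Offset = 0; Offset < Copy.size(); Offset += ChunkSize) {
    size_t Len = std::min(ChunkSize, Copy.size() - Offset);
    if (Callback(&Copy[Offset], 1, Len, UserData) != Len)
      return false;
  }
  return true;
}

class ResponseSink {
public:
  // Static hook: it has no implicit `this`, so it matches WriteCallback.
  // The object is recovered from the user-data pointer instead.
  static size_t bodyChunkHook(char *Contents, size_t Size, size_t NMemb,
                              void *UserData) {
    return static_cast<ResponseSink *>(UserData)->handleBodyChunk(
        Contents, Size * NMemb);
  }

  const std::string &body() const { return Body; }

private:
  // Private handler, reachable from the static member hook above.
  size_t handleBodyChunk(const char *Contents, size_t Len) {
    Body.append(Contents, Len);
    return Len;
  }

  std::string Body;
};
```

A caller would register `&ResponseSink::bodyChunkHook` as the callback and pass the address of a `ResponseSink` as the user data, which is the role `CURLOPT_WRITEFUNCTION` and `CURLOPT_WRITEDATA` play in the patch.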

phosek (Done):

class BufferedHTTPResponseHandler : public HTTPResponseHandler {
- private:
  size_t BufferOffset = 0;

This is unnecessary.

phosek (Done):

class CurlHTTPRequest : public HTTPRequest {
- private:
  CURL *Curl = nullptr;

This is unnecessary.

phosek (Done): This function seems unnecessary as its functionality is now effectively provided by HTTPRequest and HTTPRequestConfig, which is more flexible (for example, they can potentially support methods other than Get).

noajshu (Author, Done): Agreed that this could be removed. On the flexibility point, I bet another use case might be a StreamingHTTPResponseHandler. This would be especially nice if the localCache is refactored to allow caching objects chunk-by-chunk -- as then large debuginfo would not hog memory (as with the BufferedHTTPResponseHandler).
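To make the buffered versus streaming distinction concrete, here is a rough sketch using only the standard library. The interface and class names (`ChunkHandler`, `BufferedChunkHandler`, `StreamingChunkHandler`, `deliver`) are hypothetical and are not the patch's `HTTPResponseHandler` API, whose signatures are not shown on this page. The point is only that a streaming handler forwards each chunk to a sink and keeps a byte count, so memory use does not grow with the size of the response.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical handler interface: one call per received body chunk.
class ChunkHandler {
public:
  virtual ~ChunkHandler() = default;
  // Returns false to abort the transfer.
  virtual bool handleBodyChunk(const char *Data, size_t Len) = 0;
};

// Keeps the whole body in memory.
class BufferedChunkHandler : public ChunkHandler {
public:
  bool handleBodyChunk(const char *Data, size_t Len) override {
    Body.append(Data, Len);
    return true;
  }
  const std::string &body() const { return Body; }

private:
  std::string Body;
};

// Forwards each chunk to an output stream; only a counter is retained.
class StreamingChunkHandler : public ChunkHandler {
public:
  explicit StreamingChunkHandler(std::ostream &OS) : OS(OS) {}
  bool handleBodyChunk(const char *Data, size_t Len) override {
    OS.write(Data, static_cast<std::streamsize>(Len));
    if (!OS.good())
      return false;
    BytesWritten += Len;
    return true;
  }
  size_t bytesWritten() const { return BytesWritten; }

private:
  std::ostream &OS;
  size_t BytesWritten = 0;
};

// Feeds a body to a handler in chunks, as a transfer loop would.
inline bool deliver(const std::string &Body, size_t ChunkSize,
                    ChunkHandler &Handler) {
  if (ChunkSize == 0)
    return false;
  for (size_t Offset = 0; Offset < Body.size(); Offset += ChunkSize) {
    size_t Len = std::min(ChunkSize, Body.size() - Offset);
    if (!Handler.handleBodyChunk(Body.data() + Offset, Len))
      return false;
  }
  return true;
}
```

In the scenario noajshu mentions, the stream passed to the streaming handler would be a cache file, assuming the cache allowed chunk-by-chunk writes.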

llvm/lib/CMakeLists.txt

	include(LLVM-Build)			include(LLVM-Build)

	# `Demangle', `Support' and `TableGen' libraries are added on the top-level			# `Demangle', `Support' and `TableGen' libraries are added on the top-level
	# CMakeLists.txt			# CMakeLists.txt

	add_subdirectory(IR)			add_subdirectory(IR)
	add_subdirectory(FuzzMutate)			add_subdirectory(FuzzMutate)
	add_subdirectory(FileCheck)			add_subdirectory(FileCheck)
	add_subdirectory(InterfaceStub)			add_subdirectory(InterfaceStub)
	add_subdirectory(IRReader)			add_subdirectory(IRReader)
	add_subdirectory(CodeGen)			add_subdirectory(CodeGen)
	add_subdirectory(BinaryFormat)			add_subdirectory(BinaryFormat)
	add_subdirectory(Bitcode)			add_subdirectory(Bitcode)
	add_subdirectory(Bitstream)			add_subdirectory(Bitstream)
				add_subdirectory(Debuginfod)
	add_subdirectory(DWARFLinker)			add_subdirectory(DWARFLinker)
	add_subdirectory(Extensions)			add_subdirectory(Extensions)
	add_subdirectory(Frontend)			add_subdirectory(Frontend)
	add_subdirectory(Transforms)			add_subdirectory(Transforms)
	add_subdirectory(Linker)			add_subdirectory(Linker)
	add_subdirectory(Analysis)			add_subdirectory(Analysis)
	add_subdirectory(LTO)			add_subdirectory(LTO)
	add_subdirectory(MC)			add_subdirectory(MC)
	Show All 40 Lines

llvm/lib/Debuginfod/CMakeLists.txt

This file was added.

				if(LLVM_ENABLE_CURL)
				add_llvm_component_library(LLVMDebuginfod
				Debuginfod.cpp

				ADDITIONAL_HEADER_DIRS
				${LLVM_MAIN_INCLUDE_DIR}/llvm/Debuginfod

				LINK_COMPONENTS
				Support
				)
				endif()

llvm/lib/Debuginfod/Debuginfod.cpp

This file was added.

//===-- llvm/Debuginfod/Debuginfod.cpp - Debuginfod client library --------===//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

labath (Done): wrap fail

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

///

/// \file

///

/// This file defines the fetchInfo function, which retrieves

/// any of the three supported asset types: (executable, debuginfo, source file)

/// associated with a build-id from debuginfod servers. If a source file is to

/// be fetched, its absolute path must be specified in the Description argument

/// to fetchInfo.

///

//===----------------------------------------------------------------------===//

#include "llvm/Debuginfod/Debuginfod.h"

#include "llvm/ADT/StringRef.h"

#include "llvm/Support/CachePruning.h"

#include "llvm/Support/Caching.h"

#include "llvm/Support/Error.h"

#include "llvm/Support/FileUtilities.h"

#include "llvm/Support/HTTPClient.h"

#include "llvm/Support/xxhash.h"

using namespace llvm;

#define DEBUG_TYPE "DEBUGINFOD"

phosek (Done):

#include "llvm/Support/xxhash.h"

- using namespace llvm;
- using namespace sys::fs;
+ namespace llvm {

#define DEBUG_TYPE "DEBUGINFOD"

Expected<SmallString<64>> llvm::fetchDebuginfo(

StringRef CacheDirectoryPath, ArrayRef<StringRef> DebuginfodUrls,

StringRef BuildID, DebuginfodAssetType Type, StringRef Description,

std::function<void(size_t, std::unique_ptr<MemoryBuffer>)> AddBuffer) {

LLVM_DEBUG(dbgs() << "fetching info, debuginfod urls size = "

<< DebuginfodUrls.size() << "\n";);

std::string Suffix;

switch (Type) {

phosek (Done): Nit: no empty line.

case DebuginfodAssetType::Executable:

Suffix = "executable";

break;

case DebuginfodAssetType::Debuginfo:

Suffix = "debuginfo";

break;

case DebuginfodAssetType::Source:

// Description is the absolute source path

Suffix = "source" + Description.str();

break;

}

std::string UniqueKey = utostr(xxHash64((BuildID + Suffix).str()));

phosek (Done): This could be a `switch`.

Expected<NativeObjectCache> CacheOrErr = localCache(

"Debuginfod-client", ".debuginfod-client", CacheDirectoryPath, AddBuffer);

if (Error Err = CacheOrErr.takeError())

return Err;

NativeObjectCache &Cache = *CacheOrErr;

// We choose an arbitrary Task parameter as we do not make use of it.

unsigned Task = 0;

AddStreamFn CacheAddStream = Cache(Task, UniqueKey);

SmallString<64> AbsCachedAssetPath;

sys::path::append(AbsCachedAssetPath, CacheDirectoryPath,

"llvmcache-" + UniqueKey);

if (!CacheAddStream) {

LLVM_DEBUG(dbgs() << "cache hit\n";);

return AbsCachedAssetPath;

}

LLVM_DEBUG(dbgs() << "cache miss, UniqueKey = " << UniqueKey << "\n";);

// The asset was not found in the local cache, so we must query

// the debuginfod servers.

for (const StringRef &ServerUrl : DebuginfodUrls) {

SmallString<64> AssetUrl;

sys::path::append(AssetUrl, ServerUrl, "buildid", BuildID, Suffix);

phosek (Done): Nit: no empty line.

Expected<HTTPResponse> ResponseOrErr = httpGet(AssetUrl);

if (Error Err = ResponseOrErr.takeError())

return Err;

HTTPResponse &Resp = *ResponseOrErr;

if (Resp.Code != 200)

continue;

// We have retrieved the asset from this server,

// and now add it to the file cache.

auto Stream = CacheAddStream(Task);

assert(Resp.Body && "Unallocated MemoryBuffer returned from httpGet.");

*Stream->OS << Resp.Body->getBuffer();

// Return the path to the asset on disk.

return AbsCachedAssetPath;

}

return createStringError(errc::argument_out_of_domain, "build id not found");

}

#undef DEBUG_TYPE

phosek (Done):

return createStringError(errc::argument_out_of_domain, "build id not found");
}

#undef DEBUG_TYPE

+ } // end namespace llvm
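For readers following the lookup logic in fetchDebuginfo, the request URL has the shape <server>/buildid/<build id>/<suffix>, where the suffix depends on the asset type. The sketch below reproduces just that part with the standard library. `AssetType`, `assetSuffix`, and `assetUrl` are hypothetical names, and the sketch joins URL components with '/' explicitly instead of going through `sys::path::append` as the patch does; the cache lookup and hashing steps are left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of DebuginfodAssetType from the patch.
enum class AssetType { Executable, Debuginfo, Source };

// Mirrors the switch in fetchDebuginfo. For sources, Description is the
// absolute source path, so the result looks like "source/directory/file.c".
inline std::string assetSuffix(AssetType Type,
                               const std::string &Description = "") {
  switch (Type) {
  case AssetType::Executable:
    return "executable";
  case AssetType::Debuginfo:
    return "debuginfo";
  case AssetType::Source:
    return "source" + Description;
  }
  return "";
}

// Builds <server>/buildid/<build id>/<suffix>, dropping any trailing '/'
// on the server URL so the result has no doubled separator there.
inline std::string assetUrl(std::string ServerUrl, const std::string &BuildID,
                            const std::string &Suffix) {
  while (!ServerUrl.empty() && ServerUrl.back() == '/')
    ServerUrl.pop_back();
  return ServerUrl + "/buildid/" + BuildID + "/" + Suffix;
}
```

With the test inputs added by this patch, a source request for build ID fake_build_id and description /directory/file.c maps onto Inputs/buildid/fake_build_id/source/directory/file.c. The server address used in any call to `assetUrl` would be supplied by the caller; none is assumed here.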

llvm/lib/Support/CMakeLists.txt

Show First 20 Lines • Show All 68 Lines • ▼ Show 20 Lines	if (MSVC)
set (delayload_flags delayimp -delayload:shell32.dll -delayload:ole32.dll)		set (delayload_flags delayimp -delayload:shell32.dll -delayload:ole32.dll)
endif()		endif()

# Link Z3 if the user wants to build it.		# Link Z3 if the user wants to build it.
if(LLVM_WITH_Z3)		if(LLVM_WITH_Z3)
set(system_libs ${system_libs} ${Z3_LIBRARIES})		set(system_libs ${system_libs} ${Z3_LIBRARIES})
endif()		endif()

		# Link LibCURL if the user wants it
		if (LLVM_ENABLE_CURL)
		set(system_libs ${system_libs} ${CURL_LIBRARIES})
		labath (Done): Who sets `CURL_LIBRARIES`?
		noajshu (Author, Done): When `LLVM_ENABLE_CURL` is on, `llvm/cmake/config-ix.cmake` calls `find_package(CURL)`, which sets `CURL_LIBRARIES`. This was added in D111238.
		labath (Done): I see. In that case, I think it would be good to also add `set(LLVM_ENABLE_CURL ... CACHE)` somewhere, so that this appears as a user-settable value in UIs and such.
		endif()

# Override the C runtime allocator on Windows and embed it into LLVM tools & libraries		# Override the C runtime allocator on Windows and embed it into LLVM tools & libraries
if(LLVM_INTEGRATED_CRT_ALLOC)		if(LLVM_INTEGRATED_CRT_ALLOC)
if (CMAKE_BUILD_TYPE AND NOT ${LLVM_USE_CRT_${uppercase_CMAKE_BUILD_TYPE}} MATCHES "^(MT\|MTd)$")		if (CMAKE_BUILD_TYPE AND NOT ${LLVM_USE_CRT_${uppercase_CMAKE_BUILD_TYPE}} MATCHES "^(MT\|MTd)$")
message(FATAL_ERROR "LLVM_INTEGRATED_CRT_ALLOC only works with /MT or /MTd. Use LLVM_USE_CRT_${uppercase_CMAKE_BUILD_TYPE} to set the appropriate option.")		message(FATAL_ERROR "LLVM_INTEGRATED_CRT_ALLOC only works with /MT or /MTd. Use LLVM_USE_CRT_${uppercase_CMAKE_BUILD_TYPE} to set the appropriate option.")
endif()		endif()

string(REGEX REPLACE "(/\|\\\\)$" "" LLVM_INTEGRATED_CRT_ALLOC "${LLVM_INTEGRATED_CRT_ALLOC}")		string(REGEX REPLACE "(/\|\\\\)$" "" LLVM_INTEGRATED_CRT_ALLOC "${LLVM_INTEGRATED_CRT_ALLOC}")

▲ Show 20 Lines • Show All 47 Lines • ▼ Show 20 Lines	add_llvm_component_library(LLVMSupport
Compression.cpp		Compression.cpp
CRC.cpp		CRC.cpp
ConvertUTF.cpp		ConvertUTF.cpp
ConvertUTFWrapper.cpp		ConvertUTFWrapper.cpp
CrashRecoveryContext.cpp		CrashRecoveryContext.cpp
DataExtractor.cpp		DataExtractor.cpp
Debug.cpp		Debug.cpp
DebugCounter.cpp		DebugCounter.cpp
DeltaAlgorithm.cpp		DeltaAlgorithm.cpp
		phosek (Done): I think that Debuginfod should be a separate library rather than part of Support.
DivisionByConstantInfo.cpp		DivisionByConstantInfo.cpp
DAGDeltaAlgorithm.cpp		DAGDeltaAlgorithm.cpp
DJB.cpp		DJB.cpp
ELFAttributeParser.cpp		ELFAttributeParser.cpp
ELFAttributes.cpp		ELFAttributes.cpp
Error.cpp		Error.cpp
ErrorHandling.cpp		ErrorHandling.cpp
ExtensibleRTTI.cpp		ExtensibleRTTI.cpp
FileCollector.cpp		FileCollector.cpp
FileUtilities.cpp		FileUtilities.cpp
FileOutputBuffer.cpp		FileOutputBuffer.cpp
FoldingSet.cpp		FoldingSet.cpp
FormattedStream.cpp		FormattedStream.cpp
FormatVariadic.cpp		FormatVariadic.cpp
GlobPattern.cpp		GlobPattern.cpp
GraphWriter.cpp		GraphWriter.cpp
Hashing.cpp		Hashing.cpp
		HTTPClient.cpp
InitLLVM.cpp		InitLLVM.cpp
InstructionCost.cpp		InstructionCost.cpp
IntEqClasses.cpp		IntEqClasses.cpp
IntervalMap.cpp		IntervalMap.cpp
ItaniumManglingCanonicalizer.cpp		ItaniumManglingCanonicalizer.cpp
JSON.cpp		JSON.cpp
KnownBits.cpp		KnownBits.cpp
LEB128.cpp		LEB128.cpp
▲ Show 20 Lines • Show All 142 Lines • Show Last 20 Lines

llvm/lib/Support/HTTPClient.cpp

This file was added.

				//===-- llvm/Support/HTTPClient.cpp - HTTP client library -------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				///
				/// \file
				///
				/// This file defines the httpGet function, implemented using libcurl.
				///
				//===----------------------------------------------------------------------===//

				#include "llvm/Support/HTTPClient.h"
				#include "llvm/ADT/APInt.h"
				#include "llvm/ADT/StringRef.h"
				#include "llvm/Support/Errc.h"
				#include "llvm/Support/Error.h"
				#include "llvm/Support/MemoryBuffer.h"
				#include <curl/curl.h>

				using namespace llvm;

				labath (Done): This isn't super important, but the best way to ensure you follow the rules for include ordering is to put them in a single block (with no empty lines), as then clang-format will order them for you.
				noajshu (Author, Done): That's a great tip, thanks!
				#define DEBUG_TYPE "HTTPClient"
				phosek (Done): This code should be using C++ casts, not C casts.
				phosek (Done): This is still not addressed.
				noajshu (Author, Done): Thanks, I have refactored to use std::vector and eliminated casts entirely.

				namespace {

				struct OffsetBuffer {
				size_t Offset = 0;
				std::unique_ptr<WritableMemoryBuffer> Buffer;
				};
				phosek (Done): One potential downside of this approach is that you may have to resize the buffer multiple times as the data comes in, which would be inefficient. I was reading about alternatives; one possibility would be to check a priori the size of the resource you're fetching by performing a HEAD request (via `CURLOPT_NOBODY` and `CURLOPT_HEADER` set to 1), asking for the HTTP headers to be written via `CURLOPT_WRITEHEADER` and `CURLOPT_WRITEFUNCTION`, and parsing the `Content-Length` value. Then you can allocate an appropriately sized buffer and fetch the actual content. You might even use `WritableMemoryBuffer::getNewUninitMemBuffer` instead of `std::vector` since you don't need to resize.
				noajshu (Author, Done): Thanks, this is a very interesting idea! I will try to do the same thing using just a single request, by triggering the resize in CURLOPT_HEADERFUNCTION when the content-length is available.
				noajshu (Author, Done): Thanks for this suggestion. I have implemented this single-allocation scheme and switched the HTTP buffer type to an LLVM MemoryBuffer. Please let me know if you have any comments on this. Thanks!

				} // end anonymous namespace

				labath (Done): The anonymous namespace should end here, and the rest should be static functions, per https://llvm.org/docs/CodingStandards.html#anonymous-namespaces
				noajshu (Author, Done): Thanks!
				static size_t writeMemoryCallback(char *Contents, size_t Size, size_t NMemb,
				OffsetBuffer *Buf) {
				if (!Buf->Buffer) {
				// No buffer has been allocated, so cause curl
				// to return an error code.
				return 0;
				}

				size_t ChunkSize = Size * NMemb;
				memcpy(Buf->Buffer->getBufferStart() + Buf->Offset, Contents, ChunkSize);
				phosek (Done): I'd use a `std::vector` here and resize it inside the callback as needed. That's not only more ergonomic, but also automatically releases memory.
				phosek (Done): This is still not addressed.
				Buf->Offset += ChunkSize;
				phosek (Done): We usually also try to provide a message explaining what went wrong: `assert(Size == 1 && "...")`.
				return ChunkSize;
				}

				static bool parseHTTPContentLength(StringRef LineRef,
				unsigned long long &ContentLength) {
				// Content-Length is a mandatory header, and the only one we handle.
				phosek (Done): According to the spec, `Content-Length` is the only acceptable spelling, so I think we could be a little more strict here and require that particular case.
				noajshu (Author, Done): Agreed, it seems a safe bet and is much simpler than using a regex. I've switched it out to use nearly the exact same Content-Length parsing as clangd.
				if (LineRef.consume_front("Content-Length: ")) {
				phosek (Done): This leaks `chunk.memory`.
				phosek (Done): This is still not addressed.
				phosek (Done): If the header doesn't contain `Content-Length`, would we fail in `writeMemoryCallback` because we haven't allocated the buffer? I think this should be more resilient and handle that situation somehow (even if just by returning an error).
				noajshu (Author, Done): Instead of asserting the buffer has been allocated, I changed it to terminate the download and return an Error. However, I agree we ought to be more resilient in this case, especially since Transfer-Encoding is a valid alternative to Content-Length. Perhaps we could fall back to repeated re-allocations in the case there is no Content-Length header?
				noajshu (Author, Done): The latest diff handles a missing `Content-Length` by falling back to a `std::vector` buffer. This has the disadvantage of a `resize` on each write callback and an additional allocation + copy to convert to the returned `MemoryBuffer`. In my test it downloads ~24x slower without the Content-Length header.
				phosek (Done): I'd avoid the macros, which we don't usually use in C++, and instead use the `StringRef` methods, so you can do something like:
				    StringRef Header(Contents, NMemb);
				    if (Header.starts_with("Content-Length: ")) {
				      StringRef Length = Header.substr(16);
				      size_t ContentLength;
				      if (!Length.getAsInteger(10, ContentLength)) {
				        ...
				      }
				    }
				We should also make the code more resilient to handle potential issues; for example, could the header look like `Content-Length : N` or is it guaranteed to always be `Content-Length: N`?
				noajshu (Author, Done): Thanks for pointing out the variations in header formatting. I took a look at the HTTP/1.1 specification and changed the parser to use `startswith_insensitive`, as header names are case insensitive as well. To handle extra whitespace, perhaps we could use a case-insensitive regex match like `content-length\s?:\s?(\d+)`. This would also give us the digits of the length as a match group.
				noajshu (Author, Done): I have changed this to use a Regex instead of a `startswith` test.
				llvm::getAsUnsignedInteger(LineRef.trim(), 0, ContentLength);
				return true;
				}
				return false;
				}

				static size_t allocateMemoryCallback(char *Contents, size_t Size, size_t NMemb,
				OffsetBuffer *Buf) {
				assert(Size == 1 && "The Size passed by libCURL to CURLOPT_HEADERFUNCTION "
				"should always be 1.");

				if (Buf->Buffer)
				return NMemb;

				StringRef Header(Contents, NMemb);
				unsigned long long ContentLength;
				labath (Done): the llvm:: qualifications are not necessary here
				if (!parseHTTPContentLength(Header, ContentLength))
				return NMemb;

				LLVM_DEBUG(dbgs() << "Received Content-Length header value " << ContentLength
				<< "\n";);

				Buf->Buffer = WritableMemoryBuffer::getNewUninitMemBuffer(ContentLength);

				return NMemb;
				phosek (Done): Rather than printing the error to stderr, which may be undesirable (for example when using LLVM as a library), I think it'd be better to store the error inside the class (it might be possible to use `ErrorList`) and then return it from `curlPerform`.
				noajshu (Author, Done): Great idea, I've added this and removed the prints to stderr. The `ErrorState` will also surface for other `Error`-returning methods of `CurlHTTPRequest`.
				}

				Expected<HTTPResponse> llvm::httpGet(const Twine &Url) {
				LLVM_DEBUG(dbgs() << "getting Url " << Url << "\n";);

				CURL *Curl = curl_easy_init();
				CURLcode CurlRes;
				if (!Curl)
				return createStringError(errc::io_error, "http library error");

				curl_easy_setopt(Curl, CURLOPT_URL, Url.str().c_str());
				curl_easy_setopt(Curl, CURLOPT_FOLLOWLOCATION, 1);
				phosek (Done): The same here.

				OffsetBuffer Buf;
				phosek (Done): Is it possible to use `StringRef::getAsInteger` instead?
				noajshu (Author, Done): Switched to use the clangd style parser which uses `llvm::getAsUnsignedInteger`.

				curl_easy_setopt(Curl, CURLOPT_WRITEFUNCTION, writeMemoryCallback);
				curl_easy_setopt(Curl, CURLOPT_WRITEDATA, &Buf);

				curl_easy_setopt(Curl, CURLOPT_HEADERFUNCTION, allocateMemoryCallback);
				curl_easy_setopt(Curl, CURLOPT_HEADERDATA, &Buf);

				LLVM_DEBUG(dbgs() << "performing the curl\n";);
				CurlRes = curl_easy_perform(Curl);
				LLVM_DEBUG(dbgs() << "got the CurlRes\n";);

				curl_easy_cleanup(Curl);

				if (CurlRes != CURLE_OK)
				return createStringError(errc::io_error, "curl_easy_perform() failed: %s\n",
				curl_easy_strerror(CurlRes));

				HTTPResponse Resp;
				Resp.Body = std::move(Buf.Buffer);
				curl_easy_getinfo(Curl, CURLINFO_RESPONSE_CODE, &Resp.Code);
				return Resp;
				}
				phosek (Done): Could this method be inlined directly into `CurlHTTPRequest::performRequest`?

				#undef DEBUG_TYPE
				labath (Done): This isn't explicitly spelled out anywhere (and probably not used entirely consistently), but I believe the prevalent style, and the one most consistent with the "make namespaces as small as possible" advice in the anonymous namespace section, is to slap a `using namespace llvm;` at the start of the file, and then to explicitly prefix with `llvm::` the header functions that you're defining.
				labath (Done): drop .str().str(), it's cleaner
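The review thread on this file converges on two ideas: read `Content-Length` from the header callback so the body buffer can be sized once, and keep working when that header is absent. The following is a minimal standard-library sketch of both, not the patch's implementation. `parseContentLength` and `BodyBuffer` are hypothetical names, and `std::string::reserve` stands in for `WritableMemoryBuffer::getNewUninitMemBuffer`. The header name is matched case-insensitively, as noajshu notes the HTTP/1.1 specification requires, and optional whitespace around the value is skipped; overflow of the parsed number is not checked.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns true and sets Length if Line is a Content-Length header with a
// decimal value. Trailing whitespace (including CRLF) is tolerated.
inline bool parseContentLength(const std::string &Line, size_t &Length) {
  const std::string Name = "content-length";
  size_t Colon = Line.find(':');
  if (Colon != Name.size())
    return false; // also covers "no colon", since npos != Name.size()
  for (size_t I = 0; I < Colon; ++I)
    if (std::tolower(static_cast<unsigned char>(Line[I])) != Name[I])
      return false;

  size_t Pos = Colon + 1;
  while (Pos < Line.size() && (Line[Pos] == ' ' || Line[Pos] == '\t'))
    ++Pos;
  size_t Start = Pos;
  size_t Value = 0;
  while (Pos < Line.size() &&
         std::isdigit(static_cast<unsigned char>(Line[Pos]))) {
    Value = Value * 10 + static_cast<size_t>(Line[Pos] - '0');
    ++Pos;
  }
  if (Pos == Start)
    return false; // no digits
  while (Pos < Line.size() &&
         std::isspace(static_cast<unsigned char>(Line[Pos])))
    ++Pos;
  if (Pos != Line.size())
    return false; // trailing garbage after the number
  Length = Value;
  return true;
}

class BodyBuffer {
public:
  // Header callback: reserve the full size once if it is announced.
  void onHeaderLine(const std::string &Line) {
    size_t Length = 0;
    if (parseContentLength(Line, Length)) {
      Body.reserve(Length);
      Presized = true;
    }
  }

  // Body callback: append works either way; without a Content-Length the
  // string simply grows as chunks arrive.
  size_t onBodyChunk(const char *Data, size_t Len) {
    Body.append(Data, Len);
    return Len;
  }

  bool presized() const { return Presized; }
  const std::string &body() const { return Body; }

private:
  std::string Body;
  bool Presized = false;
};
```

In the patch these two entry points correspond to the functions registered through `CURLOPT_HEADERFUNCTION` and `CURLOPT_WRITEFUNCTION`.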

llvm/test/CMakeLists.txt

	llvm_canonicalize_cmake_booleans(			llvm_canonicalize_cmake_booleans(
	BUILD_SHARED_LIBS			BUILD_SHARED_LIBS
	LLVM_HAVE_LIBXAR			LLVM_HAVE_LIBXAR
	HAVE_OCAMLOPT			HAVE_OCAMLOPT
	HAVE_OCAML_OUNIT			HAVE_OCAML_OUNIT
	LLVM_ENABLE_DIA_SDK			LLVM_ENABLE_DIA_SDK
	LLVM_ENABLE_FFI			LLVM_ENABLE_FFI
	LLVM_ENABLE_THREADS			LLVM_ENABLE_THREADS
				LLVM_ENABLE_CURL
	LLVM_ENABLE_ZLIB			LLVM_ENABLE_ZLIB
	LLVM_ENABLE_LIBXML2			LLVM_ENABLE_LIBXML2
	LLVM_INCLUDE_GO_TESTS			LLVM_INCLUDE_GO_TESTS
	LLVM_LINK_LLVM_DYLIB			LLVM_LINK_LLVM_DYLIB
	LLVM_TOOL_LTO_BUILD			LLVM_TOOL_LTO_BUILD
	LLVM_USE_INTEL_JITEVENTS			LLVM_USE_INTEL_JITEVENTS
	LLVM_BUILD_EXAMPLES			LLVM_BUILD_EXAMPLES
	LLVM_ENABLE_PLUGINS			LLVM_ENABLE_PLUGINS
	▲ Show 20 Lines • Show All 217 Lines • Show Last 20 Lines

llvm/test/lit.cfg.py

Show First 20 Lines • Show All 152 Lines • ▼ Show 20 Lines	tools = [
ToolSubst('%llvm-bitcode-strip', FindTool('llvm-bitcode-strip')),		ToolSubst('%llvm-bitcode-strip', FindTool('llvm-bitcode-strip')),
ToolSubst('%split-file', FindTool('split-file')),		ToolSubst('%split-file', FindTool('split-file')),
]		]

# FIXME: Why do we have both `lli` and `%lli` that do slightly different things?		# FIXME: Why do we have both `lli` and `%lli` that do slightly different things?
tools.extend([		tools.extend([
'dsymutil', 'lli', 'lli-child-target', 'llvm-ar', 'llvm-as',		'dsymutil', 'lli', 'lli-child-target', 'llvm-ar', 'llvm-as',
'llvm-addr2line', 'llvm-bcanalyzer', 'llvm-bitcode-strip', 'llvm-config',		'llvm-addr2line', 'llvm-bcanalyzer', 'llvm-bitcode-strip', 'llvm-config',
'llvm-cov', 'llvm-cxxdump', 'llvm-cvtres', 'llvm-diff', 'llvm-dis',		'llvm-cov', 'llvm-cxxdump', 'llvm-cvtres', 'llvm-debuginfod-find',
'llvm-dwarfdump', 'llvm-dlltool', 'llvm-exegesis', 'llvm-extract',		'llvm-diff', 'llvm-dis', 'llvm-dwarfdump', 'llvm-dlltool',
'llvm-isel-fuzzer', 'llvm-ifs',		'llvm-exegesis', 'llvm-extract', 'llvm-isel-fuzzer', 'llvm-ifs',
'llvm-install-name-tool', 'llvm-jitlink', 'llvm-opt-fuzzer', 'llvm-lib',		'llvm-install-name-tool', 'llvm-jitlink', 'llvm-opt-fuzzer', 'llvm-lib',
'llvm-link', 'llvm-lto', 'llvm-lto2', 'llvm-mc', 'llvm-mca',		'llvm-link', 'llvm-lto', 'llvm-lto2', 'llvm-mc', 'llvm-mca',
'llvm-modextract', 'llvm-nm', 'llvm-objcopy', 'llvm-objdump', 'llvm-otool',		'llvm-modextract', 'llvm-nm', 'llvm-objcopy', 'llvm-objdump', 'llvm-otool',
'llvm-pdbutil', 'llvm-profdata', 'llvm-profgen', 'llvm-ranlib', 'llvm-rc', 'llvm-readelf',		'llvm-pdbutil', 'llvm-profdata', 'llvm-profgen', 'llvm-ranlib', 'llvm-rc', 'llvm-readelf',
'llvm-readobj', 'llvm-rtdyld', 'llvm-sim', 'llvm-size', 'llvm-split',		'llvm-readobj', 'llvm-rtdyld', 'llvm-sim', 'llvm-size', 'llvm-split',
'llvm-stress', 'llvm-strings', 'llvm-strip', 'llvm-tblgen', 'llvm-tapi-diff',		'llvm-stress', 'llvm-strings', 'llvm-strip', 'llvm-tblgen', 'llvm-tapi-diff',
'llvm-undname', 'llvm-windres', 'llvm-c-test', 'llvm-cxxfilt',		'llvm-undname', 'llvm-windres', 'llvm-c-test', 'llvm-cxxfilt',
'llvm-xray', 'yaml2obj', 'obj2yaml', 'yaml-bench', 'verify-uselistorder',		'llvm-xray', 'yaml2obj', 'obj2yaml', 'yaml-bench', 'verify-uselistorder',
▲ Show 20 Lines • Show All 222 Lines • ▼ Show 20 Lines
if config.have_libxml2:		if config.have_libxml2:
config.available_features.add('libxml2')		config.available_features.add('libxml2')

if config.have_opt_viewer_modules:		if config.have_opt_viewer_modules:
config.available_features.add('have_opt_viewer_modules')		config.available_features.add('have_opt_viewer_modules')

if config.expensive_checks:		if config.expensive_checks:
config.available_features.add('expensive_checks')		config.available_features.add('expensive_checks')

		if getattr(config, 'enable_debuginfod_client', False):
		config.available_features.add('debuginfod_client')

llvm/test/lit.site.cfg.py.in

	Show All 31 Lines
	config.host_os = "@HOST_OS@"			config.host_os = "@HOST_OS@"
	config.host_cc = "@HOST_CC@"			config.host_cc = "@HOST_CC@"
	config.host_cxx = "@HOST_CXX@"			config.host_cxx = "@HOST_CXX@"
	# Note: ldflags can contain double-quoted paths, so must use single quotes here.			# Note: ldflags can contain double-quoted paths, so must use single quotes here.
	config.host_ldflags = '@HOST_LDFLAGS@'			config.host_ldflags = '@HOST_LDFLAGS@'
	config.llvm_use_intel_jitevents = @LLVM_USE_INTEL_JITEVENTS@			config.llvm_use_intel_jitevents = @LLVM_USE_INTEL_JITEVENTS@
	config.llvm_use_sanitizer = "@LLVM_USE_SANITIZER@"			config.llvm_use_sanitizer = "@LLVM_USE_SANITIZER@"
	config.have_zlib = @LLVM_ENABLE_ZLIB@			config.have_zlib = @LLVM_ENABLE_ZLIB@
				config.enable_debuginfod_client = @LLVM_ENABLE_CURL@
				Inline comment (phosek, Done): This kind of logic should live in CMake; this file should only have simple assignments, as is done for the other values.
	config.have_libxar = @LLVM_HAVE_LIBXAR@			config.have_libxar = @LLVM_HAVE_LIBXAR@
	config.have_libxml2 = @LLVM_ENABLE_LIBXML2@			config.have_libxml2 = @LLVM_ENABLE_LIBXML2@
	config.have_dia_sdk = @LLVM_ENABLE_DIA_SDK@			config.have_dia_sdk = @LLVM_ENABLE_DIA_SDK@
	config.enable_ffi = @LLVM_ENABLE_FFI@			config.enable_ffi = @LLVM_ENABLE_FFI@
	config.build_examples = @LLVM_BUILD_EXAMPLES@			config.build_examples = @LLVM_BUILD_EXAMPLES@
	config.enable_threads = @LLVM_ENABLE_THREADS@			config.enable_threads = @LLVM_ENABLE_THREADS@
	config.build_shared_libs = @BUILD_SHARED_LIBS@			config.build_shared_libs = @BUILD_SHARED_LIBS@
	config.link_llvm_dylib = @LLVM_LINK_LLVM_DYLIB@			config.link_llvm_dylib = @LLVM_LINK_LLVM_DYLIB@
	[29 lines not shown]

llvm/test/tools/llvm-debuginfod/Inputs/buildid/fake_build_id/debuginfo

This file was added.

fake_debuginfo

llvm/test/tools/llvm-debuginfod/Inputs/buildid/fake_build_id/executable

This file was added.

fake_executable

llvm/test/tools/llvm-debuginfod/Inputs/buildid/fake_build_id/source/directory/file.c

This file was added.

int foo = 0;

llvm/test/tools/llvm-debuginfod/client-server-test.py

This file was added.

				import argparse
				import json
				import os
				import subprocess
				import sys
				import threading
				import time


				def run_test(clients, servers, client_delay, timeout) -> int:
				    timeout_failure = False
				    def kill_subprocess_and_fail_test(process):
				        nonlocal timeout_failure
				        process.kill()
				        timeout_failure = True

				    groups = ['servers', 'clients']
				    group_delays = {'servers': 0, 'clients': client_delay}
				    processes = {group: [] for group in groups}
				    watchdogs = {group: [] for group in groups}
				    specs = {'clients': clients, 'servers': servers}
				    for group in groups:
				        time.sleep(group_delays[group])
				        print(group)
				Inline comment (labath, Not Done): Can we get rid of these sleeps? Maybe the server could somehow signal (create a file, write to stdout or another fd, ...) that it has initialized and is ready to serve. I can speak from experience that getting sleep-based tests to work correctly is very tricky. You need to set the timeout very high to ensure you cover the final 0.01 percentile, and when you do that, you end up needlessly slowing down the other 99.99% of the test runs.
				        for process_spec in specs[group]:
				            print(process_spec)
				            process = subprocess.Popen(
				                process_spec['args'],
				                env={**os.environ, **process_spec.get('env', {})})
				            processes[group].append(process)

				            watchdog = threading.Timer(
				                timeout, kill_subprocess_and_fail_test,
				                args=[process])
				            watchdog.start()
				            watchdogs[group].append(watchdog)

				    all_clients_exited_normally = all(client.wait()==0
				                                      for client in processes['clients'])
				    print(f'all_clients_exited_normally = {all_clients_exited_normally}')
				    test_passed = all_clients_exited_normally and not timeout_failure

				    # kill watchdogs and processes
				    for group in groups:
				        for watchdog in watchdogs[group]:
				            watchdog.cancel()
				        for process in processes[group]:
				            process.kill()

				    if test_passed:
				        return 0

				    return 1


				def main():
				    parser = argparse.ArgumentParser(
				        description='Test client(s) and server(s) against each other. '
				                    'Start servers first in the specified order, then '
				                    'wait for client-delay before starting clients in '
				                    'the specified order. Fails the test if any client'
				                    ' has a nonzero exit code.')

				    parser.add_argument(
				        '--clients',
				        type=str,
				        nargs='*',
				        default=[],
				        help='Specification of a client process.\n'
				             'Must contain:\n'
				             ' `args`: [array of str]\n'
				             'May optionally include:\n'
				             ' `env` : {str: str}\n'
				    )

				    parser.add_argument(
				        '--servers',
				        type=str,
				        nargs='*',
				        default=[],
				        help='Specification of a server process.\n'
				             'Must contain:\n'
				             ' `args`: [array of str]\n'
				             'May optionally include:\n'
				             ' `env` : {str: str}\n'
				    )

				    parser.add_argument(
				        '--client-delay',
				        type=float,
				        default=2,
				        help='How long to wait after starting the '
				             'server(s) before starting the client(s).'
				    )

				    parser.add_argument(
				        '--timeout',
				        type=float,
				        default=20,
				        help='How long to wait until forcefully exiting all '
				             'processes and failing the test.'
				    )

				    args = parser.parse_args()
				    result = run_test([json.loads(client) for client in args.clients],
				                      [json.loads(server) for server in args.servers],
				                      args.client_delay, args.timeout)
				    print(f'result = {result}')
				    os._exit(result)
				Inline comment (phosek, Done): I'd move this to a `main` function, which is more conventional.


				if __name__ == '__main__':
				    main()
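
One way to act on labath's request to drop the fixed sleep is to have the server announce its port on stdout and let the launcher block on that line. The following is a minimal sketch, not code from the patch: `SERVER_SCRIPT` and `start_server_and_wait` are hypothetical names, and it assumes the server is a small Python `http.server` wrapper rather than `python -m http.server`. It has no read timeout of its own; the watchdog timers in `run_test` would still be needed for that.

```python
import subprocess
import sys

# Hypothetical server wrapper: binds to port 0 so the OS picks a free port,
# then prints the chosen port once the socket is bound and listening.
SERVER_SCRIPT = r'''
import http.server
httpd = http.server.ThreadingHTTPServer(
    ('localhost', 0), http.server.SimpleHTTPRequestHandler)
print(httpd.server_port, flush=True)  # readiness signal for the launcher
httpd.serve_forever()
'''


def start_server_and_wait():
    """Start the server and block until it reports its listening port."""
    process = subprocess.Popen([sys.executable, '-c', SERVER_SCRIPT],
                               stdout=subprocess.PIPE, text=True)
    line = process.stdout.readline()  # returns once the server has printed
    if not line:
        # An empty read means the child closed stdout without announcing.
        process.kill()
        process.wait()
        raise RuntimeError('server exited before becoming ready')
    return process, int(line)
```

Because the port arrives together with the readiness signal, the same mechanism also removes the need for a hard-coded port in `DEBUGINFOD_URLS`.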

llvm/test/tools/llvm-debuginfod/debuginfod-find.test

This file was added.

				# RUN: python %S/debuginfod-find.test %S/Inputs llvm-debuginfod-find
				import threading
				import http.server
				Inline comment (labath, Not Done): This seems fine for now, though if you start having more of these, it would definitely be good to factor some of this out. You can consider renaming the file to .py to get syntax highlighting.
				import functools
				import subprocess
				import os


				def test_tool(inputs_path, tool_path) -> int:
				    httpd = http.server.ThreadingHTTPServer(
				        ('',0), functools.partial(
				            http.server.SimpleHTTPRequestHandler,
				            directory=inputs_path))
				    port = httpd.server_port
				    thread = threading.Thread(target=httpd.serve_forever)

				    try:
				        thread.start()

				        for args in [
				                [tool_path, 'fake_build_id', '--executable'],
				                [tool_path, 'fake_build_id', '--source=/directory/file.c'],
				Inline comment (labath, Not Done): Where will this be storing the downloaded files? It should probably be `%t` or some derivative. And you should clean the folder before the test.
				                [tool_path, 'fake_build_id', '--debuginfo']
				        ]:
				            process = subprocess.Popen(
				                args, env={**os.environ,
				                           'DEBUGINFOD_URLS': f'http://localhost:{port}'})
				            if process.wait() != 0:
				                return 1

				Inline comment (labath, Not Done): The way this is written now, the test would pass if I replaced `debuginfod-find` with `/bin/true`. Is there anything else you can check here?
				    finally:
				        httpd.shutdown()
				        thread.join()

				    return 0


				def main():
				    import argparse
				    parser = argparse.ArgumentParser()
				    parser.add_argument('inputs_path')
				    parser.add_argument('tool_path')
				    args = parser.parse_args()
				    result = test_tool(args.inputs_path, args.tool_path)
				    os._exit(result)


				if __name__ == '__main__':
				    main()
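
labath's two points on this file (the test passes with `/bin/true`, and the download location is uncontrolled) could be addressed by checking the tool's stdout and giving each run a fresh cache directory. The sketch below is an illustration only: `output_matches` and `fetch_and_check` are hypothetical helpers, and the `kind` labels are invented here. It relies on behaviour visible in `llvm-debuginfod-find.cpp` in this diff: `--source` prints the file contents, the other modes print the path of the cached file, and `--cache-dir` selects the cache location.

```python
import os
import pathlib
import subprocess
import tempfile


def output_matches(kind: str, stdout: str, served_file: pathlib.Path) -> bool:
    """Return True if the tool's stdout is consistent with the served asset.

    kind is 'source', 'executable' or 'debuginfo' (labels assumed here).
    """
    expected = served_file.read_bytes()
    if kind == 'source':
        # For --source the tool prints the file contents themselves.
        return stdout.encode() == expected
    # Otherwise the tool prints the path of the cached copy on one line.
    cached = pathlib.Path(stdout.strip())
    return cached.is_file() and cached.read_bytes() == expected


def fetch_and_check(tool_path, flag, kind, url, served_file) -> bool:
    """Run the tool against url with an empty, temporary cache directory."""
    with tempfile.TemporaryDirectory() as cache_dir:
        result = subprocess.run(
            [tool_path, 'fake_build_id', flag, f'--cache-dir={cache_dir}'],
            env={**os.environ, 'DEBUGINFOD_URLS': url},
            capture_output=True, text=True)
        # Check inside the `with` block: the cached file lives in cache_dir.
        return (result.returncode == 0 and
                output_matches(kind, result.stdout, served_file))
```

With this check, a stand-in such as `/bin/true` fails the test because it prints nothing. In a lit test the temporary directory would more naturally be derived from `%t`, as labath suggests.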

llvm/test/tools/llvm-debuginfod/llvm-debuginfod-find.test

This file was added.

				# REQUIRES: debuginfod_client
				# RUN: python %S/client-server-test.py \
				Inline comment (labath, Not Done): Since, presumably, every test in this folder is going to have this clause, it would be better to do this via a `lit.local.cfg` file.
				# RUN: --servers '{"args": ["python", "-m", "http.server", \
				Inline comment (phosek, Done): This would ideally be a substitution.
				Inline comment (noajshu, author, Done): Thanks, I didn't realize this wasn't being expanded! I added the tool name in `llvm/test/lit.cfg.py` so that it gets substituted with the full path. If you'd like, I can make this more explicit by using a % prefix and `ToolSubst`.
				# RUN: "--directory", "%S/Inputs", "13576"]}' \
				# RUN: --clients '{"args": ["llvm-debuginfod-find", \
				Inline comment (labath, Not Done): There's absolutely no chance this will pass reliably with a hard-coded port. You'll need a mechanism to select a free port at runtime. You can do that by passing port 0 to `http.server.HTTPServer` and then fetching the actual port via `instance.server_port`. This can then be combined with the sleep comment, as you can take the notification of the listening port as a positive acknowledgement that the server is ready to accept connections.
				# RUN: "fake_build_id", "--executable"],\
				# RUN: "env": {"DEBUGINFOD_URLS":"http://localhost:13576"}}'\
				# RUN: '{"args": ["llvm-debuginfod-find", \
				# RUN: "fake_build_id", "--debuginfo"],\
				# RUN: "env": {"DEBUGINFOD_URLS":"http://localhost:13576"}}'\
				# RUN: '{"args": ["llvm-debuginfod-find", \
				# RUN: "fake_build_id", "--source", "/directory/file.c"],\
				# RUN: "env": {"DEBUGINFOD_URLS":"http://localhost:13576"}}'\
				# RUN: --client-delay 2 --timeout 10
				Inline comment (labath, Not Done): I don't think this is a good way to design a test. I expect it will be very hard (or outright impossible) to get this working on Windows because of the differences in quoting behavior. I'd recommend a pattern like:
				    RUN: python %S/client-server-test.py %s
				    // Or the server could be started automatically, possibly within the same process, as that makes it easier to pass the actual port number
				    SERVER: ???
				    // DEBUGINFOD_URLS can probably be set automatically
				    CLIENT: llvm-debuginfod-find fake_build_id --executable
				    CLIENT: llvm-debuginfod-find fake_build_id --source /directory/file.c
				    ...
				Only the RUN stanza would be parsed by lit. The rest would be processed by the Python script. The thing you lose this way is the built-in lit substitutions (they'll only work on the RUN line), but it's not clear to me how useful those would actually be. On the other hand, this means you can implement custom substitutions tailored to your use case. If you're going to have a lot of these tests, you could also consider implementing a custom test format, which would give you even greater flexibility in how to write and run these tests, but that's probably premature at this point. (BTW: the `#` in front of all the lines is completely redundant. It's normally present because the test file is also a valid source file for some language, but that is not the case here.)
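
The directive-based layout labath proposes could be driven by a script along the following lines. This is a rough sketch under stated assumptions, not part of the patch: the `CLIENT:` prefix follows his example, `parse_clients` and `run_clients` are hypothetical names, the server runs in-process on an OS-assigned port, and `shlex.split` applies POSIX quoting rules, so Windows paths would need separate handling.

```python
import functools
import http.server
import os
import shlex
import subprocess
import threading


def parse_clients(test_text: str) -> list:
    """Collect the argv of every `CLIENT:` line in the test file."""
    prefix = 'CLIENT:'
    return [shlex.split(line[len(prefix):])
            for line in map(str.strip, test_text.splitlines())
            if line.startswith(prefix)]


def run_clients(test_text: str, inputs_dir: str) -> int:
    """Serve inputs_dir on a free port and run each client against it."""
    handler = functools.partial(http.server.SimpleHTTPRequestHandler,
                                directory=inputs_dir)
    # Port 0 asks the OS for any free port; the socket is already bound and
    # listening when the constructor returns, so no sleep is required.
    httpd = http.server.ThreadingHTTPServer(('localhost', 0), handler)
    thread = threading.Thread(target=httpd.serve_forever, daemon=True)
    thread.start()
    env = {**os.environ,
           'DEBUGINFOD_URLS': f'http://localhost:{httpd.server_port}'}
    try:
        for argv in parse_clients(test_text):
            if subprocess.run(argv, env=env).returncode != 0:
                return 1
        return 0
    finally:
        httpd.shutdown()
        thread.join()
        httpd.server_close()
```

Setting `DEBUGINFOD_URLS` inside the script keeps the port number out of the test file entirely, which is the main benefit of starting the server in the same process.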

llvm/tools/llvm-debuginfod/CMakeLists.txt

This file was added.

				if (LLVM_ENABLE_CURL)
				set(LLVM_LINK_COMPONENTS
				Debuginfod
				Support
				)
				add_llvm_tool(llvm-debuginfod-find
				llvm-debuginfod-find.cpp
				)
				if(LLVM_INSTALL_BINUTILS_SYMLINKS)
				add_llvm_tool_symlink(debuginfod-find llvm-debuginfod-find)
				Inline comment (labath, Done): What's the purpose of this?
				Inline comment (noajshu, author, Done): Good catch, this is no longer required.
				endif()
				endif()

llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp

This file was added.

				//===-- llvm-debuginfod-find.cpp - Simple CLI for libdebuginfod-client ----===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				///
				/// \file
				/// This file contains the llvm-debuginfod-find tool. This tool
				/// queries the debuginfod servers in the DEBUGINFOD_URLS environment
				/// variable (delimited by space (" ")) for the executable,
				/// debuginfo, or specified source file of the binary matching the
				/// given build-id.
				///
				//===----------------------------------------------------------------------===//

				#include "llvm/ADT/StringRef.h"
				#include "llvm/Config/config.h"
				#include "llvm/Debuginfod/Debuginfod.h"
				#include "llvm/Option/Arg.h"
				#include "llvm/Option/ArgList.h"
				#include "llvm/Option/Option.h"
				#include "llvm/Support/COM.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/Errc.h"
				#include "llvm/Support/FileSystem.h"
				#include "llvm/Support/InitLLVM.h"
				#include "llvm/Support/Path.h"
				#include "llvm/Support/StringSaver.h"
				#include "llvm/Support/raw_ostream.h"
				#include <algorithm>
				#include <cstdio>
				#include <cstring>
				#include <string>

				#define DEBUG_TYPE "llvm-debuginfod-find"

				using namespace llvm;

				cl::opt<std::string> InputBuildID(cl::Positional, cl::Required,
				cl::desc("<input build_id>"), cl::init("-"));

				static cl::opt<bool>
				FetchExecutable("executable", cl::init(false),
				cl::desc("fetch the associated executable"));

				static cl::opt<bool> FetchDebuginfo("debuginfo", cl::init(false),
				cl::desc("fetch associated debuginfo"));

				static cl::opt<std::string> FetchSource("source", cl::init(""),
				cl::desc("/filename"));

				static cl::opt<std::string> CacheDir("cache-dir", cl::init(""),
				cl::desc("Cache Directory"),
				cl::value_desc("directory"));

				ExitOnError ExitOnErr;

				int main(int argc, char **argv) {
				InitLLVM X(argc, argv);

				cl::ParseCommandLineOptions(argc, argv);

				const char *DebuginfodUrlsEnv = std::getenv("DEBUGINFOD_URLS");
				if (DebuginfodUrlsEnv == NULL) {
				errs() << "DEBUGINFOD_URLS not set\n";
				return 1;
				}

				SmallVector<StringRef> DebuginfodUrls;
				StringRef(DebuginfodUrlsEnv).split(DebuginfodUrls, " ");

				SmallString<64> CacheDirectoryPath = StringRef(CacheDir);
				if (CacheDirectoryPath.empty() &&
				!sys::path::cache_directory(CacheDirectoryPath))
				Inline comment (labath, Done): s/!size()/empty()
				CacheDirectoryPath = ".";

				Inline comment (labath, Done): When does the current directory fallback kick in? Is it actually useful? Should you exit instead?
				Inline comment (noajshu, author, Done): The fallback would kick in when `cache_directory` comes up empty-handed. This can happen on Linux if neither `$XDG_CACHE_HOME` nor `$HOME` is in the environment. I would have no problem with removing the fallback and failing in this case, as the user can always specify the current directory using `--cache-dir` anyway.
				Inline comment (labath, Done): An unset HOME variable is most likely going to be an accident, and I think the usage of CWD would be surprising in that case. So I'd leave the fallback out, but this is your tool, so I'm going to leave that up to you.
				assert(CacheDirectoryPath.size() && "CacheDirectoryPath should be nonempty");
				Inline comment (phosek, Done): I think this should use `XDG_CACHE_HOME` when set instead (which typically defaults to `$HOME/.cache`), see https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html. We should also check what the macOS and Windows alternatives are. This implementation should only be used as a fallback.
				Inline comment (noajshu, author, Done): Thanks, I've switched it out to use LLVM's platform-independent `cache_directory` function, which in turn uses `XDG_CACHE_HOME` where applicable.

				DebuginfodAssetType Type;
				StringRef Description;
				if (FetchExecutable) {
				Type = DebuginfodAssetType::Executable;
				} else if (FetchDebuginfo) {
				Type = DebuginfodAssetType::Debuginfo;
				} else if (FetchSource != "") {
				Type = DebuginfodAssetType::Source;
				Description = FetchSource;
				} else {
				llvm_unreachable("invalid asset request");
				}
				Inline comment (labath, Done): llvm_unreachable is not appropriate for user error (wrong command line arguments).

				if (Type == DebuginfodAssetType::Source) {
				Inline comment (labath, Done): You're only checking that the user specified _at least_ one of the arguments.
				// Print the contents of the source file
				ExitOnErr(fetchDebuginfo(CacheDirectoryPath, DebuginfodUrls, InputBuildID,
				Type, Description,
				[](size_t Task, std::unique_ptr<MemoryBuffer> MB) {
				outs() << MB->getBuffer();
				}));
				} else {
				// Print the path to the cached binary file on disk
				outs() << ExitOnErr(fetchDebuginfo(CacheDirectoryPath, DebuginfodUrls,
				InputBuildID, Type, Description))
				<< "\n";
				}
				}

				#undef DEBUG_TYPE
				Inline comment (labath, Done): It seems `#undef DEBUG_TYPE` is rarely used in cpp files (headers are a different story), and when it is, it's usually because it's `#define`d to a different value immediately afterwards. Undefining at the end of a file seems pointless.
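
The cache-directory lookup order discussed in the thread on `CacheDirectoryPath` can be written down as a small function. This sketch only mirrors what the comments describe for Linux (`XDG_CACHE_HOME` if set, otherwise `$HOME/.cache`, otherwise nothing); it is not LLVM's `sys::path::cache_directory` implementation, and `default_cache_dir` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def default_cache_dir(env: Mapping[str, str]) -> Optional[str]:
    """Return the cache directory implied by env, or None if there is none."""
    xdg = env.get('XDG_CACHE_HOME')
    if xdg:
        # An explicitly configured cache location takes priority.
        return xdg
    home = env.get('HOME')
    if home:
        # Conventional default when XDG_CACHE_HOME is unset.
        return os.path.join(home, '.cache')
    # Neither variable is available: the caller decides what to do.
    return None
```

Following labath's preference, a caller that receives `None` would report an error and ask for `--cache-dir` rather than silently falling back to the current directory.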

llvm/unittests/CMakeLists.txt

	[16 lines not shown]
	add_subdirectory(ADT)			add_subdirectory(ADT)
	add_subdirectory(Analysis)			add_subdirectory(Analysis)
	add_subdirectory(AsmParser)			add_subdirectory(AsmParser)
	add_subdirectory(BinaryFormat)			add_subdirectory(BinaryFormat)
	add_subdirectory(Bitcode)			add_subdirectory(Bitcode)
	add_subdirectory(Bitstream)			add_subdirectory(Bitstream)
	add_subdirectory(CodeGen)			add_subdirectory(CodeGen)
	add_subdirectory(DebugInfo)			add_subdirectory(DebugInfo)
				add_subdirectory(Debuginfod)
	add_subdirectory(Demangle)			add_subdirectory(Demangle)
	add_subdirectory(ExecutionEngine)			add_subdirectory(ExecutionEngine)
	add_subdirectory(FileCheck)			add_subdirectory(FileCheck)
	add_subdirectory(Frontend)			add_subdirectory(Frontend)
	add_subdirectory(FuzzMutate)			add_subdirectory(FuzzMutate)
	add_subdirectory(InterfaceStub)			add_subdirectory(InterfaceStub)
	add_subdirectory(IR)			add_subdirectory(IR)
	add_subdirectory(LineEditor)			add_subdirectory(LineEditor)
	[17 lines not shown]

llvm/unittests/Debuginfod/CMakeLists.txt

This file was added.

				if (LLVM_ENABLE_CURL)
				set(LLVM_LINK_COMPONENTS
				Debuginfod
				)

				add_llvm_unittest(DebuginfodTests
				DebuginfodTests.cpp
				)

				target_link_libraries(DebuginfodTests PRIVATE LLVMTestingSupport)
				endif()

llvm/unittests/Debuginfod/DebuginfodTests.cpp

This file was added.

				//===-- llvm/unittest/Support/DebuginfodTests.cpp - unit tests --- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Debuginfod/Debuginfod.h"
				#include "llvm/Testing/Support/Error.h"
				#include "gtest/gtest.h"

				TEST(DebuginfodTests, noDebuginfodUrlsFetchInfoTest) {
				EXPECT_THAT_EXPECTED(fetchDebuginfo("./", {}, "fakeBuildId",
				Inline comment (phosek, Done): I don't think this is a particularly useful test, but we should be able to improve this once we also have a server.
				llvm::DebuginfodAssetType::Executable,
				""),
				llvm::Failed<llvm::StringError>());
				}

llvm/unittests/Support/CMakeLists.txt

[35 lines not shown; hunk context: add_llvm_unittest(SupportTests]
FileCollectorTest.cpp		FileCollectorTest.cpp
FileOutputBufferTest.cpp		FileOutputBufferTest.cpp
FileUtilitiesTest.cpp		FileUtilitiesTest.cpp
FormatVariadicTest.cpp		FormatVariadicTest.cpp
FSUniqueIDTest.cpp		FSUniqueIDTest.cpp
GlobPatternTest.cpp		GlobPatternTest.cpp
HashBuilderTest.cpp		HashBuilderTest.cpp
Host.cpp		Host.cpp
		HTTPClient.cpp
IndexedAccessorTest.cpp		IndexedAccessorTest.cpp
InstructionCostTest.cpp		InstructionCostTest.cpp
ItaniumManglingCanonicalizerTest.cpp		ItaniumManglingCanonicalizerTest.cpp
JSONTest.cpp		JSONTest.cpp
KnownBitsTest.cpp		KnownBitsTest.cpp
LEB128Test.cpp		LEB128Test.cpp
LinearPolyBaseTest.cpp		LinearPolyBaseTest.cpp
LineIteratorTest.cpp		LineIteratorTest.cpp
[81 lines not shown]

llvm/unittests/Support/HTTPClient.cpp

This file was added.

				//===-- llvm/unittest/Support/HTTPClient.cpp - unit tests -------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifdef LLVM_WITH_CURL

				#include "llvm/Support/HTTPClient.h"
				#include "llvm/Testing/Support/Error.h"
				#include "gtest/gtest.h"

				TEST(HTTPClientTests, invalidUrlTest) {
				std::string invalidUrl = "this is not a valid url";
				Inline comment (phosek, Done): Ditto for this one.
				Inline comment (labath, Not Done): This would be better done as `EXPECT_THAT_EXPECTED(httpGet(...), Failed<StringError>())` or even `FailedWithMessage("whatever")`, though I agree that this is not very useful. I think the interesting question is whether we're ok with not having any coverage for the lower level APIs, and I'm not really sure what the answer to that is.
				Inline comment (noajshu, author, Done): Thanks, updated! I agree with the comments that these unit tests are not adding much value. One option until we have a server in LLVM would be to GET http://llvm.org, although that does require an internet connection. By the lower level APIs, are you referring to the static callback functions used to implement `httpGet`, or to the CURL library itself?
				Inline comment (labath, Not Done): I mean the `httpGet` function itself -- we generally don't test private implementation details (directly). In an ideal world I'd split this patch into two (or even three, with the first part being the introduction of httpGet), and each would come with its own tests. Testing the error message is nice to have, but it just scratches the surface. In httpGet, the content-length handling seems interesting to test, for example. But yes, you'd need some way to create/mock the server connection for that to work...
				Inline comment (noajshu, author, Done): Hi Pavel, first off thanks for all of your comments. I refactored the HTTP client to enable meaningful unit tests. It's now split into two new class hierarchies rooted at `HTTPRequest` and `HTTPResponseHandler`. This way CURL can be swapped out with a different HTTP backend, including a simulated backend during unit testing. Similarly, the buffered response handler can be unit tested in isolation without data going through a socket. This allows some nontrivial tests of how it handles content-lengths. I am very interested in your thoughts on this refactor, and whether you have ideas for other unit tests of the HTTP client or feedback on the ones here. I wonder if it makes sense to refactor the Debuginfod client library as well to enable more meaningful unit tests there. And please feel free to suggest a different approach if you feel the refactor moves us in the wrong direction. For now I will work on addressing your comments on the other areas of the diff. Since it is even larger now, it could certainly make sense to split it into a few diffs. One way to do this could be:
				    [diff 0] HTTP Client + unit tests
				    [diff 1] Debuginfod Client library + unit tests
				    [diff 2] Debuginfod-Find tool + end-to-end (lit) tests
				Inline comment (labath, Not Done): Yeah, if we're going to have tests for each of these components, I'd definitely recommend splitting them out into separate patches.
				EXPECT_THAT_EXPECTED(llvm::httpGet(invalidUrl),
				llvm::Failed<llvm::StringError>());
				}

				#endif
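
The testing idea in noajshu's last comment (a buffered response handler that can be exercised without a socket, including its Content-Length handling) can be illustrated in a few lines. This is a Python analogue for illustration only, not the C++ `HTTPResponseHandler` from the patch; the class name, method names, and error behaviour are all assumptions made here.

```python
class BufferedResponseHandler:
    """Accumulates a response body and validates it against Content-Length."""

    def __init__(self):
        self.content_length = None  # unknown until the header is seen
        self.body = bytearray()

    def handle_header_line(self, line: str) -> None:
        """Record Content-Length if this header line carries it."""
        name, sep, value = line.partition(':')
        if sep and name.strip().lower() == 'content-length':
            self.content_length = int(value.strip())

    def handle_body_chunk(self, chunk: bytes) -> None:
        """Append a chunk, rejecting data beyond the declared length."""
        if (self.content_length is not None and
                len(self.body) + len(chunk) > self.content_length):
            raise ValueError('body exceeds Content-Length')
        self.body += chunk

    def is_complete(self) -> bool:
        """True once exactly Content-Length bytes have been received."""
        return (self.content_length is not None and
                len(self.body) == self.content_length)
```

A simulated backend in a unit test would simply call `handle_header_line` and `handle_body_chunk` with canned data, which is the kind of isolation the refactor is meant to allow.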

Revision Contents

Diff 381312
