This is an archive of the discontinued LLVM Phabricator instance.

[llvm] [Support] [Debuginfo] Add http and debuginfod client libraries and llvm-debuginfod-find tool
AbandonedPublic

Authored by noajshu on Oct 6 2021, 11:27 AM.

Details

Summary

This patch implements debuginfod and http client libraries with local caching. A few unit and lit tests are included.
The client libraries are only built if the cmake configuration flag is passed:
cmake [...] -DLLVM_ENABLE_DEBUGINFOD_CLIENT=1
LibCURL is required to build the clients.
The standalone tool llvm-debuginfod-find is also provided; it wraps the debuginfod client library.
The lit tests specific to this tool can be run e.g. with:
LIT_FILTER='.*llvm-debuginfod.*' ninja -C build check-llvm
Thanks in advance for comments and critiques!

This has been split into the following sub-diffs:

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
phosek added inline comments.Oct 14 2021, 1:39 AM
llvm/lib/Debuginfod/Debuginfod.cpp
38 ↗(On Diff #379601)

Nit: no empty line.

42–51 ↗(On Diff #379601)

This could be a switch.

80 ↗(On Diff #379601)

Nit: no empty line.

llvm/lib/Support/HTTPClient.cpp
25

This is still not addressed.

45

This is still not addressed.

53

This is still not addressed.

noajshu updated this revision to Diff 379782.Oct 14 2021, 11:10 AM

Refactor HTTP Client to use std::vector & improve code formatting.

noajshu marked 10 inline comments as done.Oct 14 2021, 11:18 AM
noajshu added inline comments.
llvm/lib/Support/HTTPClient.cpp
25

Thanks, I have refactored to use std::vector and eliminated casts entirely.

noajshu updated this revision to Diff 379785.Oct 14 2021, 11:20 AM
noajshu marked an inline comment as done.

rebase against main

phosek added inline comments.Oct 14 2021, 11:27 AM
llvm/lib/Support/HTTPClient.cpp
32

One potential downside of this approach is you may have to resize the buffer multiple times as the data comes in which would be inefficient.

I was reading about alternatives, one possibility would be check a priori what's the size of the resource you're fetching is by performing a HEAD request (via CURLOPT_NOBODY and CURLOPT_HEADER set to 1) asking for the HTTP headers to be written via CURLOPT_WRITEHEADER and CURLOPT_WRITEFUNCTION and parsing the Content-Length value. Then you can allocate an appropriately sized buffer and fetch the actual content. You might even use WritableMemoryBuffer::getNewUninitMemBuffer instead of std::vector since you don't need to resize.

noajshu marked 3 inline comments as done.Oct 14 2021, 11:33 AM
noajshu updated this revision to Diff 379853.Oct 14 2021, 2:48 PM

Replaced lit.util.pythonize_bool with llvm_canonicalize_cmake_booleans in lit.site.cfg.py.in.

noajshu marked an inline comment as done.Oct 14 2021, 2:54 PM
noajshu added inline comments.Oct 14 2021, 4:13 PM
llvm/lib/Support/HTTPClient.cpp
32

Thanks, this is a very interesting idea! I will try to do the same thing using just a single request, by triggering the resize in CURLOPT_HEADERFUNCTION when the content-length is available.

noajshu updated this revision to Diff 380075.Oct 15 2021, 11:57 AM

Parse Content-Length header to avoid HTTP response buffer reallocations, convert HTTP Body to MemoryBuffer.

noajshu updated this revision to Diff 380087.Oct 15 2021, 12:58 PM

Cherry-pick to fix extraneous delete in patch, minor style / formatting changes.

noajshu marked an inline comment as done.Oct 15 2021, 1:12 PM
noajshu added inline comments.
llvm/lib/Support/HTTPClient.cpp
32

Thanks for this suggestion. I have implemented this single-allocation scheme and switched the HTTP buffer type to a LLVM MemoryBuffer. Please let me know if you have any comments on this. Thanks!

phosek added inline comments.Oct 15 2021, 3:09 PM
llvm/lib/Support/HTTPClient.cpp
46

We usually also try to provide a message explaining what went wrong: assert(Size == 1 && "...").

51–53

I'd avoid the macros which we don't usually use in C++ and instead use the StringRef methods, so you can do something like:

StringRef Header(Contents, NMemb);
if (Header.starts_with("Content-Length: ")) {
  StringRef Length = Header.substr(16);
  size_t ContentLength;
  if (!Length.getAsInteger(10, ContentLength)) {
    ...
  }
}

We should also make the code more resilient to handle potential issues, for example could the header look like Content-Length : N or is it guaranteed to always be Content-Length: N?

53

If the header doesn't contain Content-Length, would we fail in writeMemoryCallback because we haven't allocated the buffer? I think the should be more resilient to handle that situation somehow (even if just by returning an error).

noajshu updated this revision to Diff 380260.Oct 17 2021, 11:54 AM

Perform Content-Length header parsing with StringRef's startswith_insensitive.
Also made minor formatting changes and remove C macros.

noajshu added inline comments.Oct 17 2021, 12:21 PM
llvm/lib/Support/HTTPClient.cpp
51–53

Thanks for pointing out the variations in header formatting. I took a look at the HTTP/1.1 specification and changed the parser to use startswith_insensitive as header names are case insensitive as well.
To handle extra whitespace perhaps we could use a case-insensitive regex match like content-length\s*?:\s*?(\d+). This would also give us the digits of the length as a match group.

noajshu updated this revision to Diff 380464.Oct 18 2021, 10:28 AM

Use Regex to parse Content-Length header.
Comments for assertions. Gracefully produce an error if the HTTP response lacks Content-Length header.

noajshu marked 2 inline comments as done.Oct 18 2021, 10:35 AM
noajshu added inline comments.
llvm/lib/Support/HTTPClient.cpp
51–53

I have changed this to use a Regex instead of a startswith test.

53

Instead of asserting the buffer has been allocated, I changed it to terminate the download and return an Error. However I agree we ought to be more resilient in this case especially since Transfer-Encoding is a valid alternative to Content-Length.
Perhaps we could fall back to repeated re-allocations in the case there is no Content-Length header?

noajshu updated this revision to Diff 380544.Oct 18 2021, 4:32 PM

Gracefully fall back to std::vector buffer if Content-Length is not specified in HTTP response.
Note that this is ~24x slower for a 3GB download. libCURL executes the callback once per 1KB on average.
Also moved static callback functions and OffsetBuffer inside an anonymous namespace in HTTPClient.cpp.

noajshu marked an inline comment as done.Oct 18 2021, 4:38 PM
noajshu added inline comments.
llvm/lib/Support/HTTPClient.cpp
53

The latest diff handles a missing Content-Length by falling back to a std::vector buffer. This has the disadvantage of a resize on each write callback and an additional allocation + copy to convert to the returned MemoryBuffer. In my test it downloads ~24x slower without the Content-Length header.

phosek added inline comments.Oct 18 2021, 10:59 PM
llvm/lib/Support/HTTPClient.cpp
52

According to the spec, Content-Length is the only acceptable spelling so I think we could be a little more strict here and require that particular case.

92

Is it possible to use StringRef::getAsInteger instead?

llvm/test/tools/llvm-debuginfod/find-debuginfod-asset.py
8–12 ↗(On Diff #380544)

I'd consider using argparse and give the arguments more descriptive names.

33 ↗(On Diff #380544)

The path of the tool should be passed as a variable to this script from the test (which should get it as a lit substitution from CMake). Otherwise, you might accidentally pick up the system version which is undesirable.

llvm/test/tools/llvm-debuginfod/llvm-debuginfod-find.test
3

This would ideally be a substitution.

llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
72–80

I think this should use XDG_CACHE_HOME when set instead (which typically defaults to $HOME/.cache), see https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html. We should also check what the macOS and Windows alternatives are. This implementation should only be used as a fallback.

llvm/unittests/Debuginfod/DebuginfodTests.cpp
13 ↗(On Diff #380544)

I don't think this is a particularly useful test, but we should be able to improve this once we also have a server.

llvm/unittests/Support/HTTPClient.cpp
16

Ditto for this one.

Gracefully fall back to std::vector buffer if Content-Length is not specified in HTTP response.
Note that this is ~24x slower for a 3GB download. libCURL executes the callback once per 1KB on average.
Also moved static callback functions and OffsetBuffer inside an anonymous namespace in HTTPClient.cpp.

Given that Content-Length is really required for efficient reads, I think we should consider it mandatory and treat its absence as a protocol error rather than providing an inefficient fallback.

This is what clangd does as well, see https://github.com/llvm/llvm-project/blob/8189c4eee74959882f4f31c6c5f969cec5cca7eb/clang-tools-extra/clangd/JSONTransport.cpp#L242. You could also use that implementation as an inspiration for how to parse the header using LLVM libraries and avoid regular expressions.

noajshu updated this revision to Diff 380786.Oct 19 2021, 3:13 PM

Generalize / rewrite lit testing script to handle multiple clients and servers and simplify content-length parsing + HTTP buffer allocation.

noajshu marked 3 inline comments as done.Oct 19 2021, 3:19 PM
noajshu added inline comments.
llvm/lib/Support/HTTPClient.cpp
52

Agreed, it seems a safe bet and is much simpler than using a regex. I've switched it out to use nearly the exact same Content-Length parsing as clangd.

92

Switched to use the clangd style parser which uses llvm::getAsUnsignedInteger.

llvm/test/tools/llvm-debuginfod/llvm-debuginfod-find.test
3

Thanks, I didn't realize this wasn't being expanded! I added the tool name in llvm/test/lit.cfg.py so that it gets substituted with the full path. If you'd like I can make this more explicit by using a % prefix and ToolSubst.

noajshu updated this revision to Diff 380810.Oct 19 2021, 4:18 PM
noajshu marked 3 inline comments as done.

Use $XDG_DATA_HOME to set cache dir, fall back to $HOME and then current directory.

noajshu updated this revision to Diff 380822.Oct 19 2021, 4:46 PM
noajshu marked an inline comment as done.

Use sys::path::cache_directory to find where to put cached debuginfod assets.

noajshu added inline comments.Oct 19 2021, 4:49 PM
llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
72–80

Thanks, I've switched it out to use LLVM's platform-independent cache_directory function, which in turn uses XDG_CACHE_HOME where applicable.

phosek added inline comments.
llvm/test/tools/llvm-debuginfod/client-server-test.py
57–108 ↗(On Diff #380822)

I'd move this to a main function which is more conventional.

Let me just say outright that I don't feel qualified to say whether this belongs in llvm (I think it does) or not, so I am not going to click approve no matter how well you respond to my comments. However, I do have some experience with sockets and funky tests, so I think I can say something useful about those aspects of the patch.

llvm/lib/Debuginfod/Debuginfod.cpp
1–2 ↗(On Diff #380822)

wrap fail

llvm/lib/Support/CMakeLists.txt
79

Who sets CURL_LIBRARIES ?

llvm/lib/Support/HTTPClient.cpp
16–24

This isn't super important, but the best way to ensure you follow the rules for include ordering, is to put them in a single block (with no empty lines), as then clang-format will order them for you.

35

The anonymous namespace should end here, and the rest should be static functions, per https://llvm.org/docs/CodingStandards.html#anonymous-namespaces

120–121

This isn't explicitly spelled out anywhere (and probably not used entirely consistently), but I believe the prevalent style, and one most consistent with the "make namespaces as small as possible" in the anonymous namespace section is to slap a using namespace llvm; at the start of the file, and then to explicitly prefix with llvm:: the header functions that you're defining.

llvm/test/tools/llvm-debuginfod/client-server-test.py
23 ↗(On Diff #380822)

Can we get rid of these sleeps? Like, maybe the server could somehow signal (create a file, write to stdout/other fd, ...) that it has initialized and is ready to serve)..

I can speak from experience that getting sleep-based tests to work correctly is very tricky. You need to set the timeout very high to ensure you cover the final 0.01 percentile, and when you do that, you end up needlessly slowing down the other 99.99% of the test runs.

llvm/test/tools/llvm-debuginfod/llvm-debuginfod-find.test
2

Since, presumably, every test in this folder is going to have this clause, it would be better to do this via a lit.local.cfg file.

5

There's absolutely no chance this will pass reliably with a hard-coded port. You'll need a mechanism to select a free port at runtime. You can do that by passing port 0 to http.server.HTTPServer and then fetching the actual port via instance.server_port.

This can then be combined with the sleep comment, as you can take the notification of the listening port as a positive acknowledgement that the server is ready to accept connections.

6–14

I don't think this is a good way to design a test. I expect it will be very hard (or outright impossible) to get this working on windows because of the differences in quoting behavior. I'd recommend a pattern like:

RUN: python %S/client-server-test.py %s

// Or the server could be started automatically, possibly within the same process, as that makes it easier to pass the actual port number
SERVER: ???

// DEBUGINFOD_URLS can probably be set automatically
CLIENT: llvm-debuginfod-find fake_build_id --executable
CLIENT: llvm-debuginfod-find fake_build_id --source /directory/file.c
...

Only the RUN stanza would be parsed by lit. The rest would be processed by the python script. The thing you lose this way is the built-in lit substitutions (they'll only work on the RUN line), but it's not clear to me how useful would those actually be. OTOH, this means you can implement custom substitutions, tailored to your use case.

If you're going to have a lot of these tests, you could also consider implementing a custom test format, which would give you an even greater flexibility on how to write and run these tests, but that's probably premature at this point.

(BTW: The # in front of all the lines is completely redundant. The reason it's normally present is because test file is also a valid source file for some language, but that is not the case here.)

llvm/tools/llvm-debuginfod/CMakeLists.txt
10

What's the purpose of this?

llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
77

s/!size()/empty()

79

When does the current directory fallback kick in? Is it actually useful? Should you exit instead?

93

llvm_unreachable is not appropriate for user error (wrong command line arguments).

llvm/unittests/Support/HTTPClient.cpp
16

This would be better done as EXPECT_THAT_EXPECTED(httpGet(...), Failed<StringError>()) or even FailedWithMessage("whatever"), though I agree that this is not very useful.

I think the interesting question is whether we're ok with not having any coverage for the lower level apis, and I'm not really sure what the answer to that is.

noajshu marked 4 inline comments as done.Oct 20 2021, 11:00 AM
noajshu added inline comments.
llvm/lib/Support/CMakeLists.txt
79

When LLVM_ENABLE_CURL is on, llvm/cmake/config-ix.cmake calls find_package(CURL) which sets the CURL_LIBRARIES. This was added in D111238.

llvm/lib/Support/HTTPClient.cpp
16–24

That's a great tip, thanks!

35

Thanks!

llvm/tools/llvm-debuginfod/CMakeLists.txt
10

Good catch, this is no longer required.

llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
79

The fallback would kick in when cache_directory comes up empty-handed. This can happen on linux if neither $XDG_CACHE_HOME nor $HOME are in the environment. I would have no problem with removing the fallback and failing in this case as the user can always specify the current directory using --cache-dir anyways.

noajshu added inline comments.Oct 20 2021, 11:00 AM
llvm/unittests/Support/HTTPClient.cpp
16

Thanks, updated!
I agree with the comments that these unit tests are not adding much value. One option until we have a server in llvm would be to GET http://llvm.org, although that does require an internet connection. By the lower level APIs, are you referring to the static callback functions used to implement httpGet, or to the CURL library itself?

noajshu updated this revision to Diff 381026.Oct 20 2021, 11:02 AM
noajshu marked 2 inline comments as done.

Remove cache dir fallback to current directory in debuginfod-find.
Use EXPECT_THAT_EXPECTED in unit tests, switch functions out of anonymous namespaces (use static instead)

noajshu updated this revision to Diff 381065.Oct 20 2021, 12:36 PM
noajshu marked 3 inline comments as done.

Make namespaces smaller by switching library implementations to use using namespace llvm; and llvm:: prefixing.

labath added inline comments.Oct 21 2021, 1:12 AM
llvm/CMakeLists.txt
975–977

Why isn't this next to the other llvm_canonicalize_cmake_booleans calls?

(It would also help if you upload the full context with your patch (arcanist does it automatically, and you can achieve it with git diff -U9999 if you do it manually).

llvm/include/llvm/Debuginfod/Debuginfod.h
36 ↗(On Diff #381065)

Small strings are typically not returned by value, and I don't think that the performance of this is particularly critical (if it was, you'd probably use a SmallVectorImpl<char>& by-ref argument), so I'd probably use a std::string here.

llvm/lib/Support/CMakeLists.txt
79

I see. In that case, I think it would be good to also add set(LLVM_ENABLE_CURL ... CACHE) somewhere, so that this appears as a user-settable value in UIs and such.

llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
79

an unset HOME variable is going to be most likely an accident, and i think the usage of CWD would be surprising in that case. So, I'd leave the fallback out, but this is your tool, so I'm going to leave that up to you.

111

it seems #undef DEBUG_TYPE is rarely used in cpp files (headers are a different story), and when it is, it's usually because it's #defined to a different files immediately afterwards. Undefining at the end of a file seems pointless.

llvm/unittests/Support/HTTPClient.cpp
16

I mean the httpGet function itself -- we generally don't test private implementation details (directly).
In an ideal world I'd split this patch into two (or even three, with the first part being the introduction of httpGet), and each would come with it's own tests. Testing the error message is nice to have, but it just scratches the surface. In httpGet, the content-length handling seems interesting to test, for example.

But yes, you'd need some way to create/mock the server connection for that to work...

noajshu updated this revision to Diff 381311.Oct 21 2021, 10:18 AM

Add debuginfod-find.test which is simpler and less general-purpose, and does not require special command-line quoting behavior or JSON configuration.
This test incorporates Pavel's suggestion to assign the port to the first available, and avoids using timing to orchestrate the client and server. This would replace the llvm-debuginfod-find.test + client-server-test.py scheme.

noajshu updated this revision to Diff 381312.Oct 21 2021, 10:23 AM

Only use llvm_canonicalize_cmake_boolean where it's needed, in llvm/test/CMakeLists.txt with the other invocations.

noajshu marked an inline comment as done.Oct 21 2021, 10:24 AM
noajshu added inline comments.
llvm/CMakeLists.txt
975–977

Thanks and just moved this!

noajshu updated this revision to Diff 381364.Oct 21 2021, 12:45 PM
noajshu marked 2 inline comments as done.

Remove old end-to-end tests and add LLVM_ENABLE_CURL cmake variable cache string.

noajshu updated this revision to Diff 381367.Oct 21 2021, 12:58 PM
noajshu marked 5 inline comments as done.

Various style and usability updates.
Remove use of llvm_unreachable in handling user error, extraneous #undef DEBUG_TYPE, add Cmake cache vars, remove CWD fallback for cache dir.

noajshu updated this revision to Diff 381368.Oct 21 2021, 1:01 PM

fetchDebuginfo returns a std::string instead of small string.

labath added inline comments.Oct 22 2021, 1:25 AM
llvm/test/tools/llvm-debuginfod/debuginfod-find.test
1–2 ↗(On Diff #381368)

This seems fine for now, though if you start having more of these, it would definitely be good to factor some of this out.

You can consider renaming the file to .py to get syntax highlighting.

21 ↗(On Diff #381368)

Where will this be storing the downloaded files? It should probably be %t or some derivative. And you should clean the folder before the test.

28–29 ↗(On Diff #381368)

The way this is written now, the test would pass if I replaced debuginfod-find with /bin/true. Is there anything else you can check here?

llvm/tools/llvm-debuginfod/llvm-debuginfod-find.cpp
95

You're only checking that the user specified _at least_ one of the arguments.

noajshu updated this revision to Diff 382514.Oct 26 2021, 8:39 PM
noajshu marked an inline comment as done.

Refactor HTTP Client (Curl wrapper).
This refactor enabled writing more meaningful unit tests and fixes several bugs in the client.

noajshu updated this revision to Diff 382517.Oct 26 2021, 9:01 PM

Workaround for unit tests, to avoid need for parent's destructor to invoke child's override.
There is a more general explanation of why this does not work here: more details here

noajshu added inline comments.Oct 26 2021, 9:20 PM
llvm/unittests/Support/HTTPClient.cpp
16

Hi Pavel, first off thanks for all of your comments. I refactored the HTTP client to enable meaningful unit tests. It's now split into two new class hierarchies rooted at HTTPRequest and HTTPResponseHandler. This way CURL can be swapped out with a a different HTTP backend, including a simulated backend during unit testing.
Similarly, the buffered response handler can be unit tested in isolation without data going through a socket. This allows some nontrivial tests of how it handles content-lengths.
I am very interested in your thoughts on this refactor, and if you have ideas for other unit tests of the HTTP client or feedback on the ones here. I wonder if it makes sense to refactor the Debuginfod client library as well to enable more meaningful unit tests there. And please feel free to suggest a different approach if you feel the refactor moves us in the wrong direction.

For now I will work on addressing your comments on the other areas of the diff.
Since it is even larger now, it could certainly make sense to split off into a few diffs. One way to do this could be:

[diff 0 ] HTTP Client + unit tests
[diff 1] Debuginfod Client library + unit tests
[diff 2] Debuginfod-Find tool + end-to-end (lit) tests

phosek added inline comments.Oct 27 2021, 12:40 AM
llvm/include/llvm/Support/HTTPClient.h
42

This is unnecessary.

71–94

I'd move the cURL-based implementation into a separate header and source file, for example Support/CURLClient.{h,cpp}. Otherwise, anyone who includes HTTPClient.h automatically pulls in <curl/curl.h> which is undesirable.

72

This is unnecessary.

80–82

Do these need to public?

84–87

Do these need to be public?

96

This function seems unnecessary as it's functionality is now effectively provided by HTTPRequest and HTTPRequestConfig which is more flexible (for example they can potentially support methods other than Get).

llvm/lib/Support/HTTPClient.cpp
78

Rather than printing the error to stderr which may be undesirable for example when using LLVM as a library, I think it'd be better to store the error inside the class (it might be possible to use ErrorList) and then return it from curlPerform.

90

The same here.

104–114

Could this method be inlined directly into CurlHTTPRequest::performRequest?

Well... I hate to say it, but I found the version very easy to get lost in. It seems to me you're exposing a lot more implementation details than it is necessary (including the entire curl.h header, which is something we try to avoid. And the fact that tests don't even go through the public interface make it hard to keep track of what is being tested. If we are going to do this, maybe we could use a pattern like this:

class CurlInterface {
  virtual Error init()  = 0;
  virtual Error cleanup() = 0;
  virtual Error perform() = 0;
  virtual void setWriteFunction(...) = 0;
  // etc. 
  // It doesn't have to match the curl interface verbatim. Feel free 
  // to perform basic simplifications and c++-ifications of the 
  // interface. You'll need to do some of those to avoid dependence 
  // on curl.h anyway.
};

Expected<HTTPResponseBuffer> httpGet(const Twine &Url, CurlInterface &curl);
Expected<HTTPResponseBuffer> httpGet(const Twine &Url); // calls the first function with the default/real curl implementation

This shouldn't expose too much implementation details, and this interface could even conceivably by useful by someone who wants to use a custom http-ish library/transport/thingy.

Then, in the test, you could use gmock to program a fake curl implementation:

class CurlMock: public CurlInterface {
  MOCK_METHOD0(init, Error());
  ...
};

TEST(httpGet, whatever) {
  CurlMock curl;
  EXPECT_CALL(curl, setWriteFunction).WillOnce(SaveArg<0>(&writeFunc));
  EXPECT_CALL(curl, perform).WillOnce([&] {
    EXPECT_THAT_ERROR(writeFunc(...), llvm::Succeeded());
    EXPECT_THAT_ERROR(writeFunc(...), llvm::Failed());
    ...
    return llvm::Error::success();
  });
  EXPECT_THAT_ERROR(httpGet("http://foo", curl), llvm::Succeeded())
}

How does that sound?

llvm/lib/Support/HTTPClient.cpp
69

the llvm:: qualifications are not necessary here

139

drop .str().str(), it's cleaner

llvm/unittests/Support/HTTPClient.cpp
16

Yeah, if we're going to have tests for each of these components, I'd definitely recommend splitting them out into separate patches.

noajshu updated this revision to Diff 382731.Oct 27 2021, 11:34 AM

Add CurlHTTPRequest::ErrorState using ErrorList to track errors during curl http request lifecycle.
Also miscellaneous style updates, redundant namespace qualifiers.

noajshu marked 3 inline comments as done.Oct 27 2021, 12:56 PM
noajshu added inline comments.
llvm/lib/Support/HTTPClient.cpp
78

Great idea, I've added this and removed the prints to stderr. The ErrorState will also surface for other Error-returning methods of CurlHTTPRequest.

noajshu updated this revision to Diff 382765.Oct 27 2021, 1:22 PM
noajshu marked an inline comment as done.

Updates to CurlHTTPRequest access specifiers.

noajshu updated this revision to Diff 382793.Oct 27 2021, 2:21 PM
noajshu marked 5 inline comments as done.

Remove extraneous curlPerform method from CurlHTTPRequest

noajshu updated this revision to Diff 382799.Oct 27 2021, 2:32 PM

Remove httpGet function.

noajshu updated this revision to Diff 382863.Oct 27 2021, 5:40 PM
noajshu marked 2 inline comments as done.

Improvements to regression (lit) tests.
Use %t as the asset cache directory and clean up directory before each test. Verify local file contents match expected contents.

noajshu updated this revision to Diff 382871.Oct 27 2021, 6:20 PM

Split out CurlHTTPRequest into CURLClient.{h,cpp}.

noajshu updated this revision to Diff 383132.Oct 28 2021, 12:51 PM

Rebase against main.

noajshu edited the summary of this revision. (Show Details)Oct 28 2021, 1:26 PM
noajshu edited the summary of this revision. (Show Details)Oct 29 2021, 11:03 AM

@labath Thank you for your helpful feedback on this diff!
I have split into 4 sub-diffs. The HTTP client/request/response handler architecture is in D112751 and the curl client is in D112753. Could we continue the discussion there?

llvm/include/llvm/Support/HTTPClient.h
80–82

Nope -- updated most to private, and curlInit to protected (for unit testing).

84–87

The curlHandle{HeaderLine,BodyChunk} method is invoked by the static curl{HeaderLine,BodyChunk}MethodHook function, which then needs access.

I tried instead having curl directly call back to a private method, but then the implicit this argument is replaced by the first parameter passed by curl, which is the pointer to the header/body contents.

96

Agreed that this could be removed and on the flexibility point I bet another use case might be a StreamingHTTPResponseHandler. This would be especially nice if the localCache is refactored to allow caching objects chunk-by-chunk -- as then large debuginfo would not hog memory (as with the BufferedHTTPResponseHandler).

noajshu abandoned this revision.Nov 24 2021, 12:38 PM

Abandoning due to split into separate patches.