This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Fix TCPSocket::Connect when getaddrinfo returns multiple addrs
ClosedPublic

Authored by ddiproietto on May 31 2022, 7:44 AM.

Details

Summary

TCPSocket::Connect() calls SocketAddress::GetAddressInfo() and tries to
connect any of them (in a for loop).

This used to work before commit 4f6d3a376c9f("[LLDB] Fix setting of
success in Socket::Close()") https://reviews.llvm.org/D116768.

As a side effect of that commit, TCPSocket can only connect to the first
address returned by SocketAddress::GetAddressInfo().

  1. If the attempt to connect to the first address fails, TCPSocket::Connect(), calls CLOSE_SOCKET(GetNativeSocket()), which closes the fd, but DOES NOT set m_socket to kInvalidSocketValue.
  2. On the second attempt, TCPSocket::CreateSocket() calls Socket::Close().
  3. Socket::Close() proceeds, because IsValid() is true (m_socket was not reset on step 1).
  4. Socket::Close() calls ::close(m_socket), which fails
  5. Since commit 4f6d3a376c9f("[LLDB] Fix setting of success in Socket::Close()"), this is error is detected. Socket::Close() returns an error.
  6. TCPSocket::CreateSocket() therefore returns an error.
  7. TCPSocket::Connect() detects the error and continues, skipping the second (and the third, fourth...) address.

This commit fixes the problem by changing step 1: by calling
Socket::Close, instead of directly calling close(m_socket), m_socket is
also se to kInvalidSocketValue. On step 3, Socket::Close() is going to
return immediately and, on step 6, TCPSocket::CreateSocket() does not
fail.

How to reproduce this problem:

On my system, getaddrinfo() resolves "localhost" to "::1" (first) and to
"127.0.0.1" (second).

Start a gdbserver that only listens on 127.0.0.1:

gdbserver 127.0.0.1:2159 /bin/cat
Process /bin/cat created; pid = 2146709
Listening on port 2159

Start lldb and make it connect to "localhost:2159"

./bin/lldb
(lldb) gdb-remote localhost:2159

Before 4f6d3a376c9f("[LLDB] Fix setting of success in Socket::Close()"),
this used to work. After that commit, it stopped working. This commit
fixes the problem.

Diff Detail

Event Timeline

ddiproietto created this revision.May 31 2022, 7:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2022, 7:44 AM
ddiproietto requested review of this revision.May 31 2022, 7:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2022, 7:44 AM
srhines added a subscriber: srhines.
labath accepted this revision.Jun 7 2022, 2:12 AM
This revision is now accepted and ready to land.Jun 7 2022, 2:12 AM

I do not have commit access, can someone commit this for me, please?

Also, would it be possible for this to be cherry picked somewhere (apologies, I'm not familiar with the LLVM branching model), for an eventual 14.0.1 release?

Thanks!

Committed now, thanks for the patch. I'm not sure if there will be another 14.x release, but you can file a bug to track this, if there happens to be one.