This is an archive of the discontinued LLVM Phabricator instance.

[demangler] Rust demangler buffer reuse
ClosedPublic

Authored by urnathan on Apr 8 2022, 12:13 PM.

Details

Summary

The rust demangler has some odd buffer handling code, which will copy the demangled string into the provided buffer, if it will fit. Otherwise it uses the allocated buffer it made. But the length of the incoming buffer will have come from a previous call, which was the length of the demangled string -- not the buffer size. And of course, we're unconditionally allocating a temporary buffer in the first place. So we don't actually get buffer reuse, and we get a memcpy in somecases.

However, nothing in LLVM ever passes in a non-null pointer. Neither does anything pass in a status pointer, that is then made use of. The only exercise these have is in the test suite.

So let's just make the rust demangler have the same API as the dlang demangler.

[this is part of a series trying to clean up the demangler's APIs]

Diff Detail

Event Timeline

urnathan created this revision.Apr 8 2022, 12:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2022, 12:13 PM
urnathan requested review of this revision.Apr 8 2022, 12:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2022, 12:13 PM
tmiasko added a comment.EditedApr 8 2022, 5:02 PM

The API thus far was intended to match that of __cxa_demangle, where the caller retains the ownership of the provided buffer when demangling fails. This is no longer the case in the new implementation. I would rather avoid diverging from __cxa_demangle in such a subtle way.

The Itanium demangler gets away with the direct use of the buffer because it separates fallible parsing from infallible printing.

the length of the incoming buffer will have come from a previous call, which was the length of the demangled string

I think this is a bug in the implementation. The *N should have been updated to reflect allocated memory size. From Demangler API in Itanium C++ ABI:

In either case, the new buffer size will be stored in *n.

The API thus far was intended to match that of __cxa_demangle, where the caller retains the ownership of the provided buffer when demangling fails. This is no longer the case in the new implementation. I would rather avoid diverging from __cxa_demangle in such a subtle way.

Ah, it took me a while to figure out where that's happening, I see it now. Awkward. I suppose the rust demangler could always return the new buffer (and free any incoming buffer) in the success case? Something like if (Buf) std::free(Buf); instead of that memcpy/free dance?

The Itanium demangler gets away with the direct use of the buffer because it separates fallible parsing from infallible printing.

Any thoughts about doing similar in the Rust demangler -- passing an output buffer reference around or something?

the length of the incoming buffer will have come from a previous call, which was the length of the demangled string

I think this is a bug in the implementation. The *N should have been updated to reflect allocated memory size. From Demangler API in Itanium C++ ABI:

In either case, the new buffer size will be stored in *n.

Indeed, that's the direction I'm trying to go in :)

FWIW, this is part of a series trying to make buffer ownership transfer clearer (and other issues)

The API thus far was intended to match that of __cxa_demangle, where the caller retains the ownership of the provided buffer when demangling fails. This is no longer the case in the new implementation. I would rather avoid diverging from __cxa_demangle in such a subtle way.

Ah, it took me a while to figure out where that's happening, I see it now. Awkward. I suppose the rust demangler could always return the new buffer (and free any incoming buffer) in the success case? Something like if (Buf) std::free(Buf); instead of that memcpy/free dance?

Sounds good. At that point we might remove N and Buf parameters altogether? This functionality is not used anywhere in LLVM.

urnathan updated this revision to Diff 422463.Apr 13 2022, 4:01 AM
urnathan edited the summary of this revision. (Show Details)

Sounds good. At that point we might remove N and Buf parameters altogether? This functionality is not used anywhere in LLVM.

good point, updated to do exactly that.

tmiasko accepted this revision.Apr 13 2022, 5:00 AM

LGTM. Thanks!

This revision is now accepted and ready to land.Apr 13 2022, 5:00 AM
This revision was landed with ongoing or failed builds.Apr 13 2022, 8:50 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 13 2022, 8:50 AM