Page MenuHomePhabricator

[scudo][standalone] Add string utility functions

Authored by cryptoad on Mar 12 2019, 9:22 AM.



Add some string utility functions, notably to format strings, get
lengths, convert a string to a number. Those functions will be
used in reports and flags (coming up next). They were mostly borrowed
from sanitizer_common.

Make use of the string length function in a couple places in the
platform code that was checked in with inlined version of it.

Add some tests.

Diff Detail


Event Timeline

cryptoad created this revision.Mar 12 2019, 9:22 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMar 12 2019, 9:22 AM
Herald added subscribers: Restricted Project, jdoerfert, delcypher, mgorny. · View Herald Transcript
hctim added inline comments.Mar 12 2019, 10:54 AM
33 ↗(On Diff #190279)

Why unsigned? Shouldn't this just be char?

56 ↗(On Diff #190279)

Personally, input params before output params, EndPtr should be after Base.

63 ↗(On Diff #190279)

const char *OldNPtr = NPtr

65 ↗(On Diff #190279)

Sign is initialized to 1 in declaration, can remove this.

73 ↗(On Diff #190279)

Why uptr? char or int should be sufficient. Also const not needed here.

78 ↗(On Diff #190279)

Can remove brackets

79 ↗(On Diff #190279)

*EndPtr = (HaveDigits) ? NPtr : OldNPtr;

81 ↗(On Diff #190279)

Can remove brackets. Same with the else of this clause.

82 ↗(On Diff #190279)

Is return Min(static_cast<u64>(INT64_MAX), Res); sufficient here?

96 ↗(On Diff #190279)

Nit: Capitalize appends at start of sentence.

9 ↗(On Diff #190279)


79 ↗(On Diff #190279)

Could you add some UINT_MAX and INT_MAX tests here?

Also "+0", "-0", "-", "+", however I'm not sure how likely these are to be used in the reports/flags.

vitalybuka added inline comments.Mar 12 2019, 11:08 AM
24 ↗(On Diff #190279)

do you need internal implementation of these functions? Why can't you use libc?
I would expect that scudo has no the same restrictions as sanitizer_common

24 ↗(On Diff #190279)

Could you please switch this to aggregation instead of inheritance?

cryptoad updated this revision to Diff 190304.Mar 12 2019, 11:28 AM
cryptoad marked 12 inline comments as done.

Addressing Mitch's comments.

cryptoad added inline comments.Mar 12 2019, 11:28 AM
56 ↗(On Diff #190279)

This one was mostly to mirror the behavior & parameters of
I am open to changing it if it makes sense to everyone.

73 ↗(On Diff #190279)

Changed to int. char errors with -Werror=conversion.
As for const, I'd rather keep for consistency.

cryptoad marked an inline comment as done.Mar 12 2019, 11:32 AM
cryptoad added inline comments.
24 ↗(On Diff #190279)

With those, I am slightly worried about localization in the libc, and how strings might end up being processed, with possibly at some point heap allocations.
Looking around it looks like strlen/strnlen/strcmp/strncmp are generally safe (glibc/bionic) but it's not an exhaustive assessment.
strtol appears to be far more complicated.
You likely know better about those intricacies, let me know what you think!

cryptoad updated this revision to Diff 190310.Mar 12 2019, 12:34 PM
cryptoad marked an inline comment as done and an inline comment as not done.

Changing ScopedString to aggregation instead of inheritance.

vitalybuka added inline comments.Mar 12 2019, 12:36 PM
24 ↗(On Diff #190279)

I would go with as little as possible of custom code.
Add them when you hit an issue?

cryptoad updated this revision to Diff 190312.Mar 12 2019, 12:42 PM

Removing the custom strlen & strncmp in favor of the libc ones.

cryptoad marked 2 inline comments as done.Mar 12 2019, 12:42 PM
hctim marked an inline comment as done.Mar 12 2019, 12:45 PM
hctim added inline comments.
56 ↗(On Diff #190279)

Agreed, probably best to leave as-is then.

hctim added inline comments.Mar 12 2019, 12:50 PM
24 ↗(On Diff #190279)

Similar to this, is there any particular reason why we're not using the current sanitizer printf commons and basically copying the code? Are we trying to avoid bringing compiler-rt along with us wherever scuda goes? If this is the case, it doesn't look like the printf commons have much that we're not using, and doesn't have any large dependencies on other things in compiler-rt, could we use the commons instead?

cryptoad marked 3 inline comments as done.Mar 12 2019, 12:57 PM
cryptoad added inline comments.
24 ↗(On Diff #190279)

The reason for the rewrite as a standalone component is to not depend on sanitizer_common anymore.
One of the reasons being that Fuchsia wants Scudo in their libc, and we can't afford to pull in extraneous code.
Some benefits is that Scudo doesn't have the same requirements as a lot of sanitizer (wrt interception for example) and so we can use some standard function that they can't (like strlen as Vitaly pointed out).

morehouse added inline comments.Mar 14 2019, 10:11 AM
17 ↗(On Diff #190312)

Can we use isspace from libc?

22 ↗(On Diff #190312)

Can we use isdigit from libc?

24 ↗(On Diff #190312)

Can we use tolower from libc?

28 ↗(On Diff #190312)

strtoll should be safe here. I can't think of a good reason for it to malloc anything.

32 ↗(On Diff #190312)

Why do we need this append with va_list?

13 ↗(On Diff #190312)

Can we add a test that the ScopedString buffer won't be overflowed by append?

cryptoad updated this revision to Diff 191106.Mar 18 2019, 9:11 AM
cryptoad marked 5 inline comments as done.

Following Matt's comments, I had a look at various libc implementations
and the ctype function isspace, isdigit, tolower all appear to be
safe, so use them instead of our own.
strtol & co. should also do (Windows might be using some multibyte to
widechar conversion function which would use dynamic allocation, but we
don't support Windows so we should be good).
Add a test that attempts to overflow a ScopedString with different
variations of append.

cryptoad updated this revision to Diff 191108.Mar 18 2019, 9:23 AM

Correct some spacing.

morehouse accepted this revision.Mar 18 2019, 12:15 PM

Still want to know if we need append with a va_list.

This revision is now accepted and ready to land.Mar 18 2019, 12:15 PM
cryptoad added inline comments.Mar 18 2019, 12:44 PM
32 ↗(On Diff #190312)

Printf for example uses a va_list for append. The variadic version is a wrapper to this one as well.

morehouse added inline comments.Mar 18 2019, 12:47 PM
32 ↗(On Diff #190312)

It looks like we only use void append(const char *Format, ...).

Are there cases where void append(const char *Format, va_list Args) is used?

morehouse marked an inline comment as done.Mar 18 2019, 12:48 PM
morehouse added inline comments.
32 ↗(On Diff #190312)

Never mind, I see what you were saying about Printf now.

vitalybuka accepted this revision.Mar 18 2019, 1:11 PM
vitalybuka added inline comments.
130 ↗(On Diff #191108)

I expect that returning the position where it stops is more convenient.

static int appendChar(char *Buffer, const char *BufferEnd, char C) {
  if (Buffer < BufferEnd)
    *Buffer = C;
  return Buffer  + 1;

remove Res and return (Buffer - BufferIn)

But up to you if you prefer the current approach.

34 ↗(On Diff #191108)

If you use LLVM style, members go before methods

cryptoad marked an inline comment as done.Mar 18 2019, 1:45 PM
cryptoad added inline comments.
34 ↗(On Diff #191108)

Ack. I will revisit that later on.
I'd like to land what is there for now, as Phabricator is quickly making this unreadable, if that's ok with you.

vitalybuka added inline comments.Mar 18 2019, 1:46 PM
34 ↗(On Diff #191108)


This revision was automatically updated to reflect the committed changes.