This is an archive of the discontinued LLVM Phabricator instance.

[libFuzzer] Use octal instead of hex escape sequences in PrintASCII
Closed · Public

Authored by hans on Oct 1 2021, 4:53 AM.

Details

Summary

Previously, PrintASCII would print the string "\ta" as "\x09a". However, in C/C++ those two literals are not the same: a hexadecimal escape sequence has no length limit, so the trailing 'a' is consumed as part of the escape and the result is equivalent to "\x9a". This is an annoying quirk of the standard. (See https://eel.is/c++draft/lex.ccon#nt:hexadecimal-escape-sequence)
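The quirk can be checked at compile time. The following is a minimal sketch, not part of the patch; the constant names are made up for illustration. It compares literal lengths, relying on the fact that sizeof on a string literal includes the terminating NUL.

```cpp
#include <cstddef>

// Number of characters in each literal, excluding the terminating NUL.
// All names here are hypothetical and exist only for this illustration.
constexpr std::size_t kOriginalLen = sizeof("\ta") - 1;     // tab, 'a'
constexpr std::size_t kHexLen = sizeof("\x09a") - 1;        // one char: 0x9a
constexpr std::size_t kOctalLen = sizeof("\011a") - 1;      // tab, 'a'
constexpr std::size_t kSplitHexLen = sizeof("\x09" "a") - 1; // tab, 'a'

static_assert(kOriginalLen == 2, "tab followed by 'a'");
static_assert(kHexLen == 1, "the hex escape swallows the trailing 'a'");
static_assert(kOctalLen == 2, "octal escapes stop after three digits");
static_assert(kSplitHexLen == 2, "splitting the literal ends the escape");
```

Splitting the literal, as in the last constant, is the usual manual workaround, but it is awkward to generate in printed output.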

To fix this, output three-digit octal escape sequences instead. Since an octal escape is limited to at most three digits, subsequent characters can never unintentionally become part of the escape sequence.
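As a rough sketch of the idea, assuming a helper that returns a string rather than printing (EscapeOctal is a hypothetical name, and this is not the actual diff): printable bytes pass through, backslash and double quote are backslash-escaped, and everything else becomes a zero-padded three-digit octal escape.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper for illustration; the real PrintASCII prints
// directly instead of building a string.
std::string EscapeOctal(const std::string &In) {
  std::string Out;
  for (unsigned char Byte : In) {
    if (Byte == '\\') {
      Out += "\\\\";
    } else if (Byte == '"') {
      Out += "\\\"";
    } else if (Byte >= 32 && Byte < 127) {
      Out += static_cast<char>(Byte); // printable ASCII, emit as-is
    } else {
      // Backslash + exactly three octal digits + NUL fits in 5 bytes,
      // since the largest byte value is 0377.
      char Buf[5];
      std::snprintf(Buf, sizeof(Buf), "\\%03o", static_cast<unsigned>(Byte));
      Out += Buf;
    }
  }
  return Out;
}
```

Padding to three digits matters: a NUL byte followed by '1' is emitted as \0001, which a C/C++ compiler reads as \000 followed by '1', whereas an unpadded \01 would be read as a single byte with value 1.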

Dictionary files still use the non-C-compatible hex escapes. I believe we can't change that format since it comes from AFL. libFuzzer never writes such files, it only has to read them, so they are not affected by this change.

(One alternative, which might be nice since hex sequences are more readable, would be to use hex sequences whenever the following character is not a valid hex digit. It would add some complexity though, and we'd still need the octals as a fallback, so I'm not sure it would be worth it.)
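For comparison, the alternative could look roughly like the sketch below. It is only an illustration of the idea, was not implemented in this revision, and EscapeHexWhenSafe is a hypothetical name. It looks ahead one byte and falls back to octal only when the next character would be printed as a literal hex digit.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical sketch of the alternative; not part of the patch.
std::string EscapeHexWhenSafe(const std::string &In) {
  std::string Out;
  for (std::size_t I = 0; I < In.size(); ++I) {
    unsigned char Byte = static_cast<unsigned char>(In[I]);
    if (Byte == '\\') { Out += "\\\\"; continue; }
    if (Byte == '"') { Out += "\\\""; continue; }
    if (Byte >= 32 && Byte < 127) { Out += static_cast<char>(Byte); continue; }

    // A hex escape is only safe if the next emitted character cannot
    // extend it. Escaped output always starts with a backslash, so only
    // a literal hex digit in the input is a problem.
    bool NextIsHexDigit =
        I + 1 < In.size() &&
        std::isxdigit(static_cast<unsigned char>(In[I + 1])) != 0;

    char Buf[5]; // "\ooo" or "\xhh" plus NUL
    if (NextIsHexDigit)
      std::snprintf(Buf, sizeof(Buf), "\\%03o", static_cast<unsigned>(Byte));
    else
      std::snprintf(Buf, sizeof(Buf), "\\x%02x", static_cast<unsigned>(Byte));
    Out += Buf;
  }
  return Out;
}
```

The lookahead and the two output formats are the extra complexity mentioned above; the octal-only approach needs neither.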

Event Timeline

hans requested review of this revision. · Oct 1 2021, 4:53 AM
hans created this revision.
Herald added a project: Restricted Project. · Oct 1 2021, 4:53 AM
Herald added a subscriber: Restricted Project.
This revision is now accepted and ready to land. · Oct 1 2021, 7:31 AM