This patch ensures the output of source minimizer is never larger than the input. This property will be handy in a follow up patch that makes it possible to pad the minimized output to the size of the original input.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This patch ensures the output of source minimizer is never larger than the input. This property will be handy in a follow up patch that makes it possible to pad the minimized output to the size of the original input.
I suppose when I wrote first wrote this I was thinking of a secondary purpose of canonicalizing the sources; in which case, adding a trailing newline (for example) seems like feature rather than a bug. I'm not too attached to that, but I am curious for more context about why it's important to be able to pad the file to the same size, and whether that indicates a bug that should be fixed somewhere else instead. (I'll read the follow-up patches in the stack...)
clang/unittests/Lex/DependencyDirectivesSourceMinimizerTest.cpp | ||
---|---|---|
149 | I'm not comfortable with this change... the idea of the (/*invalid*/ was to guarantee that an invalid macro argument list didn't get minimized to something valid. Perhaps the result could be "#define MACRO(\n", and just strip the comment? |
Another thing to consider: a client that wants the minimized source to be "no bigger than" the original source can use the original source if it ends up growing. For example, https://reviews.llvm.org/D104462 could check the resulting size, and just return the original source in the (extremely unlikely?) corner case where the minimized sources are bigger.
In that case, we wouldn't need this patch's assertion.
I bet some of these changes are good to do anyway (like dropping the /*invalid*/ comment, which was useful as a canary-in-the-coal-mine when testing the prototype on large bodies of sources (and maybe has outlived its usefulness)), but you could skip the fiddly changes here that complicate the logic and make the output less canonical and harder to read, like "maintain 0 whitespace after macro names" and "maintain missing newline at EOF".
I'm not comfortable with this change... the idea of the (/*invalid*/ was to guarantee that an invalid macro argument list didn't get minimized to something valid.
Perhaps the result could be "#define MACRO(\n", and just strip the comment?