Page MenuHomePhabricator

[clang] Make negative getLocWithOffset widening-safe.
Needs ReviewPublic

Authored by simon_tatham on Jul 6 2021, 9:27 AM.

Details

Summary

This is part of a patch series working towards the ability to make
SourceLocation into a 64-bit type to handle larger translation units.

In many places in the code, SourceLocation::getLocWithOffset is used
with an argument of type unsigned, holding a value that's
conceptually negative (to find a location shortly before the current
one). If SourceLocation becomes a 64-bit type, this won't work any
more, because the negation will happen before widening to 64 bits, so
that instead of an offset of (say) -3, you'll get 2^32-3.

This patch fixes that problem in the short term, by adding casts to
SourceLocation::IntType at all the current call sites. I haven't
thought of a reliable way to protect against the introduction of new
call sites that similarly get it wrong, though.

Diff Detail

Event Timeline

simon_tatham created this revision.Jul 6 2021, 9:27 AM
simon_tatham requested review of this revision.Jul 6 2021, 9:27 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 6 2021, 9:27 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
miyuki added a subscriber: miyuki.Jul 6 2021, 10:12 AM

I'm not a huge fan of this as an API; it is not obvious what the function and parameters do without reading the comment or implementation (e.g. whether Offset is ignored if NegOffset is given, and what it means to pass offsets to both). It moves the subtraction logic from the call site, where it is meaningful, to inside the function where it is not. I also think it's the caller's responsibility to make sure what they pass in is correctly signed. I've put suggestions for how each call site could be updated without casting while keeping getLocWithOffset unchanged.

clang/lib/AST/SelectorLocationsKind.cpp
41

These are probably fine as they were

clang/lib/CodeGen/CoverageMappingGen.cpp
235–238

Seems like getFileOffset should now return ui64.

clang/lib/Format/FormatTokenLexer.cpp
835–836

This probably would need a cast; widening TrailingWhitespace seems overkill.

clang/lib/Lex/Lexer.cpp
535

ditto getDecomposedLoc

568–570

Shouldn't the return type of getDecomposedLoc need updating at some stage? If so that would take care of this case

clang/lib/Parse/ParseStmtAsm.cpp
174–179
clang/lib/StaticAnalyzer/Core/HTMLDiagnostics.cpp
1073–1074
simon_tatham edited the summary of this revision. (Show Details)

I also think it's the caller's responsibility to make sure what they pass in is correctly signed.

I don't disagree with that in principle, but in a case like this, where the error is extremely easy to make, not warned about at compile time (so you have to exercise the failing code path to spot it), and won't even show up at runtime in the default build configuration (even once 64-bit SourceLocations are actually implemented, the current plan is for them to be off by default), I think it's worth at least trying to think of ways to help the caller get it right.

Perhaps an alternative API might be to replace SourceLocation::getLocWithOffset with an operator+, and add an operator- to go with it? Then you could write NewLoc = OldLoc + ThisInteger - ThatInteger and each integer would be implicitly widened if it was the wrong size.

But in this version of the patch, I haven't done that; I've just reverted the introduction of dyadic getLocWithOffset and added cumbersome casts or intermediate variables at the call sites where I'd previously used it.

clang/lib/CodeGen/CoverageMappingGen.cpp
235–238

That seems sensible from first principles, but I tried tugging on that thread and it turned out to have a lot of extra work behind it, because file offsets are absolutely all over the place (perhaps even more so that SourceLocations proper), and in particular, they're often used as indexes into in-memory buffers, which gets awkward if you want 64-bit SourceLocations and 64-bit builds of clang to be independent choices.

Perhaps you're right that expanding file offsets to 64-bit would also be a useful piece of work. But doing it as part of this work would double the size of the job, so I think it would make more sense to keep it separate.

martong removed a subscriber: martong.Jul 28 2021, 9:16 AM