I feel like I found a bug, but I'm not entirely sure and I'm running out of ideas. I have two tests that I assume we want passing, but they are currently failing.
I found what I believe is an issue where text replacements (using the Rewriter exposed via the libclang API I am working on) were off by one character at the end. After some digging, I'd say it comes down to how range lengths are calculated. Specifically, we have two representations of ranges: char ranges, which are a pair of offsets into a buffer with the usual semantics (start of the range, one past the end of the range), and token ranges, where the semantics are different: the end points at the *start* of the last token, not one past its end. I feel we convert between them, or calculate lengths from them, in a not-completely-consistent fashion. Alternatively, it could be due to some kind of range conversion in libclang.
I tried a couple of different approaches but haven't found one that both feels sensible AND doesn't fail a significant number of tests.
I tried removing this block in `int Rewriter::getRangeSize(const CharSourceRange &Range, RewriteOptions opts) const`, which felt like the wrong thing to do, but surprisingly no test failed:
```cpp
// Adjust the end offset to the end of the last token, instead of being the
// start of the last token if this is a token range.
if (Range.isTokenRange())
  EndOff += Lexer::MeasureTokenLength(Range.getEnd(), *SourceMgr, *LangOpts);
```
I also tried this, which I felt was the right thing to do, but 70 failing tests disagreed:
```diff
 int Rewriter::getRangeSize(SourceRange Range, RewriteOptions opts) const {
-  return getRangeSize(CharSourceRange::getTokenRange(Range), opts);
+  return getRangeSize(CharSourceRange::getCharRange(Range), opts);
 }
```
I guess I still need to digest this comment in the tests:
```cpp
CharSourceRange CRange; // covers exact char range
CharSourceRange TRange; // extends CRange to whole tokens
SourceRange SRange;     // different type but behaves like TRange
```
I will try to wrap my head around this after the weekend.