This is an archive of the discontinued LLVM Phabricator instance.

Differential D55484

ComputeLineNumbers: delete SSE2 vectorization
ClosedPublic

Authored by MaskRay on Dec 8 2018, 11:33 PM.

Details

Reviewers

bkramer

Group Reviewers

Restricted Project

Commits

rGd906e731ece9: ComputeLineNumbers: delete SSE2 vectorization
rC348777: ComputeLineNumbers: delete SSE2 vectorization
rL348777: ComputeLineNumbers: delete SSE2 vectorization

Summary

SSE2 vectorization was added in 2012, but it is 2018 now and I can't
observe any performance boost with the existing _mm_movemask_epi8 or the following SSE4.2 (compiling with -msse4.2):

__m128i C = _mm_setr_epi8('\r','\n',0,0,0,0,0,0,0,0,0,0,0,0,0,0);
_mm_cmpestri(C, 2, Chunk, 16, _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY | _SIDD_POSITIVE_POLARITY | _SIDD_LEAST_SIGNIFICANT)

Delete the vectorization to simplify the code.

Also stop treating the sequence \n\r as a single line ending; only \r\n is still skipped as a pair.
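To make the line-ending change concrete, here is a minimal standalone sketch of the simplified scalar loop. It is not the Clang function itself: `computeLineOffsets` is a hypothetical helper that takes a `std::string` and relies on the NUL terminator that `c_str()` guarantees, standing in for the NUL-terminated buffer the real code reads.

```cpp
#include <string>
#include <vector>

// Hypothetical standalone helper (an assumption for illustration, not Clang API).
// Returns the offset at which each line starts. Relies on the NUL terminator
// that std::string::c_str() guarantees at index size().
std::vector<unsigned> computeLineOffsets(const std::string &Text) {
  const char *Buf = Text.c_str();
  const char *End = Buf + Text.size();
  std::vector<unsigned> LineOffsets;
  LineOffsets.push_back(0); // Line #1 starts at char 0.

  unsigned I = 0;
  while (true) {
    // Skip over the contents of the line.
    while (Buf[I] != '\n' && Buf[I] != '\r' && Buf[I] != '\0')
      ++I;

    if (Buf[I] == '\n' || Buf[I] == '\r') {
      // Only \r\n is treated as a single line break.
      if (Buf[I] == '\r' && Buf[I + 1] == '\n')
        ++I;
      ++I;
      LineOffsets.push_back(I);
    } else {
      // A NUL: stop at the end of the buffer, otherwise skip an embedded NUL.
      if (Buf + I == End)
        break;
      ++I;
    }
  }
  return LineOffsets;
}
```

With this sketch, the input "a\r\nb\n\rc" yields line starts at offsets 0, 3, 5 and 6: the \r\n pair counts once, while \n followed by \r now counts as two line breaks.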

Diff Detail

Repository: rL LLVM

Event Timeline

MaskRay created this revision. Dec 8 2018, 11:33 PM

Herald added a subscriber: cfe-commits. Dec 8 2018, 11:33 PM

Harbormaster completed remote builds in B25863: Diff 177405. Dec 8 2018, 11:34 PM

MaskRay added a reviewer: Restricted Project. Dec 8 2018, 11:35 PM

The performance difference on preprocessing huge files was tiny back then, doesn't surprise me that it disappeared. What did you test this on?

Dropping it is fine with me.

This revision is now accepted and ready to land. Dec 9 2018, 9:55 PM

In D55484#1324983, @bkramer wrote:

The performance difference on preprocessing huge files was tiny back then, doesn't surprise me that it disappeared. What did you test this on?

I tested it on

cat lib/Sema/*.cpp lib/CodeGen/*.cpp > /tmp/all.cpp
perf stat -r 10 clang -E /tmp/all.cpp [-I extracted from build.ninja]

and /tmp/all.cpp (13M) repeated 3 and 9 times.

Closed by commit rL348777: ComputeLineNumbers: delete SSE2 vectorization (authored by MaskRay). Dec 10 2018, 10:16 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. Dec 10 2018, 10:16 AM

I'm hitting a crash in this code:

exception thrown: RuntimeError: unreachable,RuntimeError: unreachable
    at ComputeLineNumbers(clang::DiagnosticsEngine&, clang::SrcMgr::ContentCache*, llvm::BumpPtrAllocatorImpl<llvm::MallocAllocator, 4096ul, 4096ul>&, clang::SourceManager const&, bool&) (wasm-function[21781]:719)
    at clang::SourceManager::getLineNumber(clang::FileID, unsigned int, bool*) const (wasm-function[21780]:187)
    at clang::SourceManager::getPresumedLoc(clang::SourceLocation, bool) const (wasm-function[21779]:733)
    at clang::Preprocessor::HandlePragmaSystemHeader(clang::Token&) (wasm-function[21284]:297)
    at (anonymous namespace)::PragmaSystemHeaderHandler::HandlePragma(clang::Preprocessor&, clang::PragmaIntroducer, clang::Token&) (wasm-function[21317]:5)
    at clang::PragmaNamespace::HandlePragma(clang::Preprocessor&, clang::PragmaIntroducer, clang::Token&) (wasm-function[21278]:565)
    at clang::PragmaNamespace::HandlePragma(clang::Preprocessor&, clang::PragmaIntroducer, clang::Token&) (wasm-function[21278]:565)
    at clang::Preprocessor::HandlePragmaDirective(clang::PragmaIntroducer) (wasm-function[21279]:142)
    at clang::Preprocessor::HandleDirective(clang::Token&) (wasm-function[21157]:1721)
    at clang::Lexer::LexTokenInternal(clang::Token&, bool) (wasm-function[20897]:14349)

The reason seems to be that the code expects Buf to be zero-terminated, but that doesn't seem to be the case. The last character of Buf was actually \n.

Herald added a project: Restricted Project. Nov 26 2019, 9:32 AM

In D55484#1760432, @shi-yan wrote:

The reason seems to be that the code expects Buf to be zero-terminated, but that doesn't seem to be the case. The last character of Buf was actually \n.

ComputeLineNumbers assumes that the buffer has a NUL terminator (see RequiresNullTerminator in lib/Support/MemoryBuffer.cpp). The function made this assumption both before and after the change, so I don't think this patch is to blame. That said, detailed reproduction instructions would be useful, though I think the most likely problem is that your code does not set RequiresNullTerminator when calling one of the MemoryBuffer creation routines.
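For readers following the crash report, the sketch below shows why the terminator matters. The sentinel-style loop reads Buf[I] until it sees a '\0', so a buffer whose last byte is \n with no NUL after it is read past its end. A bounded variant that never touches Buf[Size] avoids that dependency. This is only an illustration under stated assumptions: `computeLineOffsetsBounded` is a hypothetical name, it is not what Clang does, and it is not a change proposed in this review.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bounded variant (an assumption for illustration, not Clang code).
// It never reads Buf[Size], so it does not depend on a NUL terminator
// following the data. Embedded NUL bytes are treated as ordinary characters.
std::vector<unsigned> computeLineOffsetsBounded(const char *Buf,
                                                std::size_t Size) {
  std::vector<unsigned> LineOffsets;
  LineOffsets.push_back(0); // Line #1 starts at char 0.

  std::size_t I = 0;
  while (I < Size) {
    char C = Buf[I++];
    // \r\n counts as one line break; the bounds check guards the lookahead.
    if (C == '\r' && I < Size && Buf[I] == '\n')
      ++I;
    if (C == '\n' || C == '\r')
      LineOffsets.push_back(static_cast<unsigned>(I));
  }
  return LineOffsets;
}
```

The trade-off is an extra bounds comparison per character, which the sentinel-based loop avoids by requiring the caller to supply a NUL-terminated buffer.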

Revision Contents

Path: cfe/trunk/lib/Basic/SourceManager.cpp
Size: 67 lines

Diff 177552

cfe/trunk/lib/Basic/SourceManager.cpp

@@ static void ComputeLineNumbers(DiagnosticsEngine &Diag, ContentCache *FI,
   // not look at trigraphs, escaped newlines, or anything else tricky.
   SmallVector<unsigned, 256> LineOffsets;

   // Line #1 starts at char 0.
   LineOffsets.push_back(0);

   const unsigned char *Buf = (const unsigned char *)Buffer->getBufferStart();
   const unsigned char *End = (const unsigned char *)Buffer->getBufferEnd();
-  unsigned Offs = 0;
+  unsigned I = 0;
   while (true) {
     // Skip over the contents of the line.
-    const unsigned char *NextBuf = (const unsigned char *)Buf;
-
-#ifdef __SSE2__
-    // Try to skip to the next newline using SSE instructions. This is very
-    // performance sensitive for programs with lots of diagnostics and in -E
-    // mode.
-    __m128i CRs = _mm_set1_epi8('\r');
-    __m128i LFs = _mm_set1_epi8('\n');
-
-    // First fix up the alignment to 16 bytes.
-    while (((uintptr_t)NextBuf & 0xF) != 0) {
-      if (*NextBuf == '\n' || *NextBuf == '\r' || *NextBuf == '\0')
-        goto FoundSpecialChar;
-      ++NextBuf;
-    }
-
-    // Scan 16 byte chunks for '\r' and '\n'. Ignore '\0'.
-    while (NextBuf+16 <= End) {
-      const __m128i Chunk = *(const __m128i*)NextBuf;
-      __m128i Cmp = _mm_or_si128(_mm_cmpeq_epi8(Chunk, CRs),
-                                 _mm_cmpeq_epi8(Chunk, LFs));
-      unsigned Mask = _mm_movemask_epi8(Cmp);
-
-      // If we found a newline, adjust the pointer and jump to the handling code.
-      if (Mask != 0) {
-        NextBuf += llvm::countTrailingZeros(Mask);
-        goto FoundSpecialChar;
-      }
-      NextBuf += 16;
-    }
-#endif
-
-    while (*NextBuf != '\n' && *NextBuf != '\r' && *NextBuf != '\0')
-      ++NextBuf;
-
-#ifdef __SSE2__
-FoundSpecialChar:
-#endif
-    Offs += NextBuf-Buf;
-    Buf = NextBuf;
-
-    if (Buf[0] == '\n' || Buf[0] == '\r') {
-      // If this is \n\r or \r\n, skip both characters.
-      if ((Buf[1] == '\n' || Buf[1] == '\r') && Buf[0] != Buf[1]) {
-        ++Offs;
-        ++Buf;
-      }
-      ++Offs;
-      ++Buf;
-      LineOffsets.push_back(Offs);
+    while (Buf[I] != '\n' && Buf[I] != '\r' && Buf[I] != '\0')
+      ++I;
+
+    if (Buf[I] == '\n' || Buf[I] == '\r') {
+      // If this is \r\n, skip both characters.
+      if (Buf[I] == '\r' && Buf[I+1] == '\n')
+        ++I;
+      ++I;
+      LineOffsets.push_back(I);
     } else {
-      // Otherwise, this is a null. If end of file, exit.
-      if (Buf == End) break;
-      // Otherwise, skip the null.
-      ++Offs;
-      ++Buf;
+      // Otherwise, this is a NUL. If end of file, exit.
+      if (Buf+I == End) break;
+      ++I;
     }
   }

   // Copy the offsets into the FileInfo structure.
   FI->NumLines = LineOffsets.size();
   FI->SourceLineCache = Alloc.Allocate<unsigned>(LineOffsets.size());
   std::copy(LineOffsets.begin(), LineOffsets.end(), FI->SourceLineCache);
 }