This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Alternate MD5 fix, NFC
ClosedPublic

Authored by probinson on Jul 28 2023, 1:23 PM.

Diff Detail

Event Timeline

probinson created this revision.Jul 28 2023, 1:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 28 2023, 1:23 PM
Herald added a subscriber: StephenFan. · View Herald Transcript

From the other review:

Could/should we do the lookup on the CU filename before it goes into the DI metadata, and store that FileID somewhere for later use?

There's a note in issue 63955 to the effect that I can't find an API to turn a filename into a FileID. If there is one that I didn't find, we could use FileIDs instead of pointers to name strings.

I was thinking more along the lines of keeping the same code we have, string pointer equality (though perhaps FileID is equivalent in cases where the pointer euqality is valid, and the FileID approach would be more self documenting/less subtle). I was thinking - for other files we get filenames correctly that are pointer identical because they're in the FileID, right? But for this one because we stash it in/get it from the DICompileUnit, that identity gets lost. But presumably the identity is true for the string filename used to build the DICompileUnit? So if, at that point, we stored the pointer, or FileID aside and used that instead of retrieving it from the DICompileUnit, perhaps that'd make this name consistent with the other names in being able to use pointer identity?

The CU's DIFile is conjured up in CGDebugInfo::CreateCompileUnit(), and the name is derived from -main-file-name rather than anything SourceManager provides. Although there is a comment wondering about that. CreateCompileUnit() computes a checksum for that DIFile, but apparently the DIFile never makes it into the DIFileCache.

DIFileCache is generated by CGDebugInfo::createFile(); it looks like there are some subtle differences in the path handling between what it does and what CreateCompileUnit() does, so making CreateCompileUnit() go through createFile() might not the the right thing. But it does look like CreateCompileUnit() might just be missing a one-liner to add the CU's DIFile to DIFileCache.

I'll try that instead.

probinson updated this revision to Diff 546115.Aug 1 2023, 10:12 AM

Use the main FileID instead of expensive string compares.

Figured this out after staring at CreateCompileUnit for long enough. Seeding the DIFileCache with the DIFile created there made another test unhappy (difile_entry.cpp) but this fix doesn't.

Of course this means we're rerunning MD5 on the main source file; probably can capture that from TheCU and save that cost as well.

probinson updated this revision to Diff 546144.Aug 1 2023, 10:55 AM

Reuse the main file's checksum instead of recalculating it.

dblaikie accepted this revision.Aug 16 2023, 12:29 PM

Sounds OK to me - thanks!

bit more detail in the actual patch description/commit message (from some of the comments in this review, perhaps) would be handy.

This revision is now accepted and ready to land.Aug 16 2023, 12:29 PM
This revision was landed with ongoing or failed builds.Aug 17 2023, 7:04 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptAug 17 2023, 7:04 AM

Detail added in the commit message, good idea!

nikic added a comment.Aug 17 2023, 8:16 AM

Thanks! Can confirm that this recovers the compile-time regression.

This patch is possibly a suspect in at least some bot failures although I'm at a loss to understand why. Perhaps I can't just blithely call getChecksum() and copy what it sends back? The ways of metadata remain mysterious to me. I'll be poking at this more as time permits, unless someone can tell me my obvious mistake.

Failing on one of our 2 stage AArch64 bots https://lab.llvm.org/buildbot/#/builders/176/builds/4267 after the reland (that bot has been red for various reasons today, so no emails got sent).

Maybe it helps to know that it doesn't fail on the same config with single stage https://lab.llvm.org/buildbot/#/builders/197/builds/8998. So it's at least able to build simple test cases (the test suite is failing to build for a separate reason).

If this is somehow AArch64 only we (Linaro) can help investigate, I see that it was ok on PowerPC for example https://lab.llvm.org/buildbot/#/builders/36/builds/36728.

Thanks @DavidSpickett the patch is currently reverted. I have a revised patch coming soon and I will keep a close eye on the bots.
I believe it's a string-lifetime issue and so whether it manifests is unpredictable, but we have enough different bots in the farm that it did get caught.

probinson added inline comments.Aug 18 2023, 11:21 AM
clang/lib/CodeGen/CGDebugInfo.cpp
423

In the final commit, Checksum is outside the if so that its lifetime persists to the end of the function (specifically, past the createFile call). CSInfo holds a StringRef to this string, not the string itself.