This is an archive of the discontinued LLVM Phabricator instance.

Strip invalid TBAA when reading bitcode
ClosedPublic

Authored by mehdi_amini on Dec 15 2016, 6:55 PM.

Details

Summary

This ensure backward compatibility on bitcode loading.

Diff Detail

Repository
rL LLVM

Event Timeline

mehdi_amini retitled this revision from to Strip invalid TBAA when reading bitcode.
mehdi_amini updated this object.
mehdi_amini added reviewers: sanjoy, pcc.
sanjoy requested changes to this revision.Dec 16 2016, 9:52 AM
sanjoy edited edge metadata.
sanjoy added inline comments.
llvm/lib/Bitcode/Reader/BitcodeReader.cpp
51 ↗(On Diff #81704)

I'd have mildly preferred doing the NFC header sorting in a separate change.

157 ↗(On Diff #81704)

Why are you skipping these? Would be nice to add a one-liner, unless it is patently obvious.

476 ↗(On Diff #81704)

I'd have named this as TBAAStripped since it really indicates if we've already stripped out TBAA metadata (and hence there is no need to check this in for newer functions). Right now this seems imperative, and it sounds like setting this to true will cause TBAA to be stripped.

llvm/lib/Bitcode/Reader/MetadataLoader.cpp
1347 ↗(On Diff #81704)

Should this not continue (otherwise you'll drop all metadata once you've seen !tbaa)?

Actually, I'd rather write this out separately as:

if (I->second == LLVMContext::MD_tbaa && StripTBAA)
  continue;

right after the if (!MD) check.

llvm/lib/Bitcode/Reader/MetadataLoader.h
58 ↗(On Diff #81704)

End comment with period. Also, why not /// for doxygen?

This revision now requires changes to proceed.Dec 16 2016, 9:52 AM
mehdi_amini added inline comments.Dec 16 2016, 10:12 AM
llvm/lib/Bitcode/Reader/BitcodeReader.cpp
51 ↗(On Diff #81704)

That was not intentional, clang-format did it for me.

157 ↗(On Diff #81704)

A function that is Materializable hasn't been materialized, so it does not have a body. It is not a declaration either as its body may be parsed later. Trying to iterate it results in an assertion.

476 ↗(On Diff #81704)

Actually it will also cause TBAA to be stripped :)
The first time you encounter an invalid TBAA, you set this boolean to request that from now on when parsing the body of a function, it has to ignore TBAA MD.
It is also used line 4469 to see if we're in a mode where we strip TBAA during reading, so we don't need to check every instruction in the function.
Make sense?

llvm/lib/Bitcode/Reader/MetadataLoader.cpp
1347 ↗(On Diff #81704)

It is breaking out of the switch. I'll move the check above.

sanjoy added inline comments.Dec 16 2016, 10:46 AM
llvm/lib/Bitcode/Reader/BitcodeReader.cpp
157 ↗(On Diff #81704)

I think this use of dropUnknownNonDebugMetadata is wrong (and so is the usage in CodeGenPrepare::stripInvariantGroupMetadata).

The argument to dropUnknownNonDebugMetadata is the set of metadata IDs that are "known" and should not be dropped. All non-debug metadata *not* in that set is dropped.

The reason why this works is that TBAA->getMetadataID() is an entirely different thing. It is the IDs that are used to distinguish between MDString, MDNode etc. (the subclass id, basically, which also feeds into dyn_cast etc.). Since that ID is not equal to LLVMContext::MD_tbaa, we end up dropping MD_tbaa.

The right usage here is: I.setMetadata(LLVMContext::MD_tbaa, nullptr).

476 ↗(On Diff #81704)

The first time you encounter an invalid TBAA, you set this boolean to request that from now on when parsing the body of a function, it has to ignore TBAA MD.

But that is not this flag is it? I thought you were dropping future TBAA by setting the flag on MetadataLoaderImpl.

From what I can gather locally, if I manually bool StripTBAA = false; to bool StripTBAA = true; then I would not drop TBAA at all, and you're using this flag solely to remember if you've asked MDLoader to not load new TBAA, and so you don't have to walk instructions(F). In that sense, it isn't imperative, it just "remembers" if a certain action was done in the past.

However, I'm not cognizant of the inter-relationships between these classes that you changed, so I may be missing the forest for the trees.

llvm/lib/Bitcode/Reader/MetadataLoader.cpp
1347 ↗(On Diff #81704)

Right, but isn't it also breaking out of the innermost for?

sanjoy added inline comments.Dec 16 2016, 11:04 AM
llvm/lib/Bitcode/Reader/BitcodeReader.cpp
157 ↗(On Diff #81704)

I've fixed the bad usage in CGP in https://reviews.llvm.org/rL289973

mehdi_amini edited edge metadata.

Address comments!

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
157 ↗(On Diff #81704)

Uh... I took indeed example on the usage in CodeGenPrepare!
I'll switch to I.setMetadata(LLVMContext::MD_tbaa, nullptr).

476 ↗(On Diff #81704)

Indeed it is redundant with the state on the MetadataLoader. I think I can remove it from here and add a getter on the MetadataLoader. You'll tell me with the next update if it helps :)

llvm/lib/Bitcode/Reader/MetadataLoader.cpp
1347 ↗(On Diff #81704)

Right, I missed the inner for! I was looking at the break that follows a few lines below, but missed the fact that we had an enclosing for.

mehdi_amini marked 16 inline comments as done.Dec 16 2016, 11:12 AM
sanjoy accepted this revision.Dec 16 2016, 11:18 AM
sanjoy edited edge metadata.

lgtm

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
159 ↗(On Diff #81777)

The checking is redundant -- you can just do I.setMetadata(LLVMContext::MD_tbaa, nullptr) unconditionally (that should not do anything if MD_tbaa was not present on the instruction).

This revision is now accepted and ready to land.Dec 16 2016, 11:18 AM
mehdi_amini marked an inline comment as done.Dec 16 2016, 11:25 AM
mehdi_amini added inline comments.
llvm/lib/Bitcode/Reader/BitcodeReader.cpp
159 ↗(On Diff #81777)

Good point.

I also modified the test to check that we only strip the TBAA attachment on instructions.

This revision was automatically updated to reflect the committed changes.
mehdi_amini marked an inline comment as done.