This is an archive of the discontinued LLVM Phabricator instance.

Fix: Distinguish CFI Metadata Checks in MergeFunctions Pass
Changes PlannedPublic

Authored by oskarwirga on Jun 29 2023, 10:46 AM.

Details

Summary

This diff fixes an issue in the MergeFunctions pass where two different Control Flow Integrity (CFI) metadata checks were incorrectly considered identical. These merges would lead to runtime violations down the line as two separate objects contained a single destructor which itself contained checks for only one of the objects.

Here I update the comparison logic to take into account the metadata at llvm.type.test checks. Now, only truly identical checks will be considered for merging, thus preserving the integrity of each check.

Diff Detail

Event Timeline

oskarwirga created this revision.Jun 29 2023, 10:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2023, 10:46 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
oskarwirga requested review of this revision.Jun 29 2023, 10:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2023, 10:46 AM
nikic added a subscriber: nikic.Jun 29 2023, 12:17 PM
nikic added inline comments.
llvm/lib/Transforms/Utils/FunctionComparator.cpp
665

This doesn't looks like a problem specific to the type.test intrinsic. Other calls may have metadata operands as well.

oskarwirga added inline comments.Jun 29 2023, 1:33 PM
llvm/lib/Transforms/Utils/FunctionComparator.cpp
665

Should I remove the intrinsic check and instead loop through the arg operands to check that they are the same?

nikic added inline comments.Jun 30 2023, 7:23 AM
llvm/lib/Transforms/Utils/FunctionComparator.cpp
665

The caller cmpBasicBlocks() already does a loop over the operands and compares them via cmpValues(). So that's probably the place where MetadataAsValue needs to be handled.

oskarwirga added a reviewer: nikic.

Generalize the metadata checks

oskarwirga marked 2 inline comments as done.Jun 30 2023, 3:13 PM
smeenai added inline comments.Jun 30 2023, 4:15 PM
llvm/lib/Transforms/Utils/FunctionComparator.cpp
879

Should this be inside cmpValues instead, since that has a bunch of other callers too?

881

Is it correct to just check for pointer equality here? https://llvm.org/docs/LangRef.html#metadata mentions !unique for avoiding content-baed metadata merging, but it wasn't clear if that merging would otherwise always take place.

oskarwirga added inline comments.Jun 30 2023, 5:40 PM
llvm/lib/Transforms/Utils/FunctionComparator.cpp
881

Yeah you know what maybe what we need to do is add !unique to type test metadata checks because this is too broad and its failing to merge functions with differing debug metadata. We need some way to prevent the CFI metadata as part of the cfi checks from getting merged but we need identical debug metadata (and perhaps other metadata) to get merged.

llvm/test/Transforms/MergeFunc/cfi-function-merging.ll
41

@smeenai @nikic This metadata is pretty generic and I am not entirely sure how CFI knows to populate it later with the correct vtable entry

Added checks for distinct metadata in operands to prevent merging unless it's just debug info. This ensures that distinct metadata, which should be treated as different, is not merged.

This looks reasonable to me in the sense that it treats distinct metadata as being different but bypasses distinct debug info. CC @dexonsmith

Update to use cmpValues() instead of cmpBasicBlock

kyulee added a subscriber: kyulee.Aug 24 2023, 8:58 PM

Thanks for fixing this! This seems a good case to cover.

llvm/lib/Transforms/Utils/FunctionComparator.cpp
802

Like the below Constant case below, can you run dyn_cast more explicit to ensure they're not null?.

auto *MDL = cast<MetadataAsValue>(L);
auto *MDR = cast<MetadataAsValue>(R);
if (MDL && MDR) {
878

This space seems unrelated.

kyulee added inline comments.Aug 24 2023, 8:59 PM
llvm/lib/Transforms/Utils/FunctionComparator.cpp
802

Sorry. I meant dyn_cast not cast.

kyulee added inline comments.Aug 24 2023, 9:12 PM
llvm/test/Transforms/MergeFunc/cfi-function-merging.ll
35

Can you also simplify the test?
I think you can delete attributes like unnamed_addr #3, #6, align 8, etc.,, to focus on only relevant metadata..

kyulee added inline comments.Aug 24 2023, 9:26 PM
llvm/lib/Transforms/Utils/FunctionComparator.cpp
808

I think you can drop llvm: as with others.

dexonsmith requested changes to this revision.Aug 28 2023, 9:46 AM
This revision now requires changes to proceed.Aug 28 2023, 9:46 AM

Address @kyulee 's comments

Sorry, Phabricator lost my comment somehow. Adding back a version of it now.

I don't think it's viable to have the two functions arbitrarily compare "less than" if they both have distinct metadata. That leads to assertions like this:

assert(!(V1<V2 && V2<V1) && "Both values are less than the other?");

firing.

It also makes it undefined behaviour to use the comparator in "sort" algorithms. Have a look at:
https://en.cppreference.com/w/cpp/named_req/LessThanComparable
https://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings

And I assume this is used for sorting somehow?

I suggest changing function merging to check for distinct metadata as a second step, after functions compare equal. But there might be another way.

llvm/lib/Transforms/Utils/FunctionComparator.cpp
810

This breaks strict weak ordering, which means compareValues can never be used for sorting, which I assume is the whole point.

oskarwirga marked 6 inline comments as done.Aug 28 2023, 9:55 AM

Remove more unnecessary metadata from test

oskarwirga marked an inline comment as done.Aug 28 2023, 9:58 AM

ping

Looks like you're still breaking strict-weak ordering. Is there a reason that's not a problem?

In particular, is it somehow guaranteed that the comparator is never used in a sorting function (where it would lead to undefined behaviour)? If so, how?

oskarwirga added inline comments.Sep 5 2023, 2:48 PM
llvm/lib/Transforms/Utils/FunctionComparator.cpp
810

I had to consult with ChatGPT about what strict weak ordering is, but IIUC what I am doing wrong here is that I am returning -1 no matter what rather than 1 vs -1 depending on L vs R and R vs L, is my understanding correct here?

dexonsmith added inline comments.Sep 5 2023, 3:03 PM
llvm/lib/Transforms/Utils/FunctionComparator.cpp
810

Correct. As I suggested in my longer comment, it I suspect it’s better to fix this outside of the compare function.

oskarwirga planned changes to this revision.Sep 6 2023, 5:47 AM