This is an archive of the discontinued LLVM Phabricator instance.

[NFC] Methods to compare IR added in each IR subclass
Needs ReviewPublic

Authored by vish99 on Jun 30 2020, 10:32 AM.

Details

Summary

When there is a change to IR, the comparators also need to be updated which if not done can leads to mis-compiles. The function haveSameSpecialState() in line 407 of Instruction.cpp in sync with the IR so that the comparators can call this function to compare the Instruction specific properties. The ideal thing is when IR is defined, the author should also tell how two instructions of that specific IR type can be compared. This can be done by defining a method in that subclass of IR which compares two instructions. To achieve this, a method hasSamePropertiesAs() has been added in every subclass of IR which compares the instruction specific properties of that class, like for in AllocaInst class we added the method hasSamePropertiesAs() which compares the allocated type and alignment. This can solve the following issues.

  1. If we add a new instruction.

In the haveSameSpecialState() in Instruction.cpp checks have been added for every known IR subclass. If the current instructions to compare don't fit into the case of any known instructions, an llvm_unreachable() call is executed. This will lead to an error when a new instruction is added but the haveSameSpecialState() function is not updated.

  1. If a new operand is added.

If a new operand is added to an IR instruction, there will be a need to do a lot of changes in the subclass of that IR instruction like changing the constructors. Along with these changes, an additional check need to be added in the hasSamePropertiesAs() method of that same subclass if the operand being added matters for comparison. This can be easily checked.

This minimizes the number of files to touch for a change in the IR while maintaining sync of IR with comparators.

Diff Detail

Event Timeline

vish99 created this revision.Jun 30 2020, 10:32 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2020, 10:32 AM
nikic added a subscriber: nikic.Jun 30 2020, 11:52 AM

At this point, I would recommend writing this as a switch over the instruction opcode.

nikic added inline comments.Jun 30 2020, 11:56 AM
llvm/lib/IR/Instruction.cpp
420

I don't think this belongs here, as it is comparison operands. That's handled separately, e.g. in isSameOperationAs.

423

This can be an assertion, because operand count equality implies it.

426

Similar to the return case, I don't think this belongs, as it's checking operands.

431

Same here.

447

inbounds is subclass optional data, it should not be compared here. (It is legal to drop inbounds.)

vish99 updated this revision to Diff 274825.Jul 1 2020, 8:32 AM

Added comparision for all types in haveSameSpecialState() of Instruction.cpp and changed the if-else branching to switch over the Instruction Opcode

vish99 marked 5 inline comments as done.Jul 1 2020, 8:36 AM

At this point, I would recommend writing this as a switch over the instruction opcode.

I do agree that a switch over instruction opcode would make more sense. I updated the revision to do the same.

vish99 updated this revision to Diff 274838.Jul 1 2020, 8:58 AM

Added case for Freeze instruction.

vish99 updated this revision to Diff 275443.Jul 3 2020, 11:32 AM

Added hasSamePropertiesAs() method in each subclass of the IR which should say how two IR instructions of that subclass should be compared.
Changed haveSameSpecialState() method to make use of the hasSamePropertiesAs() methods.
When there is a change to IR, only the hasSamePropertiesAs() method needs of that subclass need to be updated.

hiraditya added inline comments.Jul 3 2020, 2:50 PM
llvm/lib/IR/Instruction.cpp
416

Good idea. this should ensure that any new added instruction will fail compilation unless an entry is added to this switch statement. Maybe we should put a comment here.

Is this a bug fix? If it is NFC (No Functional Change) please specify that in the title, otherwise needs a test case.

vish99 retitled this revision from Added comparision for all types in haveSameSpecialState() of Instruction.cpp to [NFC] Added comparision for all types in haveSameSpecialState() of Instruction.cpp.Jul 8 2020, 2:27 AM
vish99 edited the summary of this revision. (Show Details)

Is this a bug fix? If it is NFC (No Functional Change) please specify that in the title, otherwise needs a test case.

Thanks. The title has been edited :)

vish99 updated this revision to Diff 277097.Jul 10 2020, 10:36 AM

Comment added

vish99 marked an inline comment as done.Jul 14 2020, 7:36 AM

@jfb @tejohnson, just reminding you of the review as the patch has been lying dormant since a week.

Why not just make hasSamePropertiesAs a pure virtual on Instruction, then you don't need any switch statement in haveSameSpecialState? You'd need to do the cast on the Instruction operand within the implementations, but that would be cleaner IMO. For the ones that currently return true, you wouldn't need the cast just an opcode comparison.

llvm/include/llvm/IR/Instructions.h
1323

Stray ';'

llvm/lib/IR/Instruction.cpp
406

Does anything similar need to be done to the above function (which appears to have been moved to its own file, lib/Transforms/Utils/FunctionComparator.cpp)?

Why not just make hasSamePropertiesAs a pure virtual on Instruction, then you don't need any switch statement in haveSameSpecialState? You'd need to do the cast on the Instruction operand within the implementations, but that would be cleaner IMO. For the ones that currently return true, you wouldn't need the cast just an opcode comparison.

I think adding a virtual method to Instruction class would fail a static_assert in OperandTraits which says a subclass of User shouldn't be polymorphic and if virtual methods are added to subclasses of User, it breaks use lists.

Why not just make hasSamePropertiesAs a pure virtual on Instruction, then you don't need any switch statement in haveSameSpecialState? You'd need to do the cast on the Instruction operand within the implementations, but that would be cleaner IMO. For the ones that currently return true, you wouldn't need the cast just an opcode comparison.

I think adding a virtual method to Instruction class would fail a static_assert in OperandTraits which says a subclass of User shouldn't be polymorphic and if virtual methods are added to subclasses of User, it breaks use lists.

Ah got it, no wonder there are no virtual methods in Instruction.h right now. Nevermind this suggestion then!

vish99 updated this revision to Diff 279022.Jul 18 2020, 12:58 PM

Changed hasSamePropertiesAs() method in each subclass of IR to introduce total ordering between two instructions of that subclass
and modified FunctionComparator to use haveSameSpecialState() method for IR comparision.

vish99 updated this revision to Diff 279057.Jul 18 2020, 10:29 PM
vish99 marked 3 inline comments as done.

Comment added

nikic added inline comments.Jul 19 2020, 6:43 AM
llvm/include/llvm/IR/InstrTypes.h
183

Now that this implements a comparison rather than an equality, the name is a bit odd. Maybe compareProperties() or so?

vish99 updated this revision to Diff 279104.Jul 19 2020, 12:35 PM

Changed named of comparator method in each subclass to compareInstSpecificProperties()

vish99 updated this revision to Diff 280039.Jul 23 2020, 12:15 AM
vish99 marked an inline comment as done.

Changed name from haveSameSpecialState() to compareSpecialState()

Can we build and run llvm testsuite to make sure there are no miscompiles?

Ran the llvm test-suite. All the tests were going fine.

This comment was removed by vish99.

Quoting the observed text segment sizes for few object files in MySQL, by enabling MergeFunctions by default and comparing them with baseline.

FILENAMEBaselineWith-mergefuncSize reduction %
none.cpp.o(ex libbinlogevents.a)91377115.5%
mysqlrouter26,81124,4548.7%
load_data_events.cpp.o (ex libbinlogstandalone.a)3,4593,1718.3%
rows_event.cpp.o (ex libbinlogstandalone.a)18,26416,7568.2%
control_events.cpp.o (ex libbinlogstandalone.a)13,82013,0185.8%
iterator.cpp.o (ex libbinlogevents.a)3,7753,5874.9%
mysqlrouter_passwd56,87554,1294.8%
mysqld55,329,09552,690,0494.7%
binlog_event.cpp.o (ex libbinlogstandalone.a)2,5252,4303.7%
rows_event.cpp.o (ex libbinlogevents.a)15,44214,8883.5%
control_events.cpp.o (ex libbinlogevents.a)12,47012,0503.3%

@tejohnson @nikic do we have comments to be addressed?

Please take a look at how Instruction::clone() is implemented, using a protected cloneImpl() method on each instruction type. I would suggest to follow the same pattern, and make these type-specific methods protected (and drop the redundant doc comment on each one of them).

It would be nice to also use the HANDLE_INST pattern to avoid the long list of instructions, but I guess that might be a bit complicated by the IgnoreAlignment/IgnoreMetaData flags. Maybe you just want to keep these as flags passed down to all the functions, regardless of whether they need them or not. I'm not sure on this.

nikic added inline comments.Jul 31 2020, 11:41 AM
llvm/lib/IR/Instructions.cpp
161 ↗(On Diff #280039)

The compareTypes() function looks a bit problematic to me. The issue is that MergeFunctions considers integer types and all pointer types of the same size equal. Other places in LLVM require that the type must be exactly identical. The function as implemented here seems to go some middle way, where pointers are not considered equal to integers, but all pointer types are the same. That's in the spirit of opaque pointers, but we're not there yet.

hiraditya added inline comments.Aug 2 2020, 7:27 PM
llvm/lib/IR/Instructions.cpp
161 ↗(On Diff #280039)

Other places in LLVM require that the type must be exactly identical

Could you provide an instance where this is required. That will help formulate the right check here. The current check is replicating what MergeFunction did. If there's a bug, it'll be good to fix it.

vish99 updated this revision to Diff 282929.Aug 4 2020, 8:41 AM

Changed compareInstSpecificProperies() methods to protected and added comparision of elemnet types of pointers in compareTypes()

hiraditya added inline comments.Aug 10 2020, 7:19 AM
llvm/lib/IR/Instructions.cpp
176 ↗(On Diff #282929)

s/i/I
s/e/E
Same for other clang-tidy warnings as well.

jfb added a comment.Aug 10 2020, 3:00 PM

I didn't check that all of these compare all the required state for all opcodes.

That being said, this still seems somewhat brittle, if less so than before. Are there sufficient tests to find future divergences?

llvm/include/llvm/IR/Instruction.h
655 ↗(On Diff #282929)

Can you document flags?

llvm/include/llvm/IR/Instructions.h
69

I prefer having an enum class instead of bool parameters. i.e.

enum class IgnoreAlignment { No, Yes };
int compareInstSpecificProperties(const AllocaInst *AI,
                                    IgnoreAlignment ignoreAlignment = IgnoreAlignment::No) const;
188

Same here on flags.

323

Ditto.

1500

Ditto.

3822

Ditto.

4040

Ditto.

llvm/lib/IR/Instructions.cpp
82 ↗(On Diff #282929)

Is it expected that APInts with different bit width but the same value won't be equal?

vish99 updated this revision to Diff 284758.Aug 11 2020, 8:37 AM

Changed IgnoreAlignment and IgnoreMetaData to enum class instead of bool type and fixed clang-tidy warnings.

In D82892#2208245, @jfb wrote:

I didn't check that all of these compare all the required state for all opcodes.

That being said, this still seems somewhat brittle, if less so than before. Are there sufficient tests to find future divergences?

When a new instruction has been added and the compareSpecialState() method has not been updated, it leads to failure. But when there is an addition of a new operand, it's the responsibility of the author to change the compareInstSpecificProperties() method of that method(which can be done easily as it'll be in the same file) and this can update all the comparators. Currently, I don't think there are enough tests.

llvm/lib/IR/Instructions.cpp
161 ↗(On Diff #280039)

How about we also compare the element types of the pointers through a recursive call to compareTypes() along with the Address space comparison?

vish99 updated this revision to Diff 285661.Aug 14 2020, 8:25 AM
vish99 marked 9 inline comments as done.

Added TestCases

hiraditya added inline comments.Aug 14 2020, 10:27 AM
llvm/lib/IR/Instructions.cpp
58 ↗(On Diff #285661)

Can we do:

if (L==R)
  return 0;
if (L)
  return 1;
return -1;
66 ↗(On Diff #285661)

SInce AtomicOrdering is unsinged. We should be able to just return their difference.

vish99 retitled this revision from [NFC] Added comparision for all types in haveSameSpecialState() of Instruction.cpp to [NFC] Added comparison for all types in haveSameSpecialState() of Instruction.cpp.Aug 14 2020, 11:15 AM
vish99 updated this revision to Diff 285850.Aug 15 2020, 11:17 AM

cmpAPInts fix

please run clang-format

vish99 updated this revision to Diff 286074.Aug 17 2020, 10:40 AM
vish99 marked an inline comment as done.

Clang-format fix

vish99 updated this revision to Diff 286478.Aug 18 2020, 10:34 PM

Comment Added

vish99 retitled this revision from [NFC] Added comparison for all types in haveSameSpecialState() of Instruction.cpp to [NFC] Methods to compare IR added in each IR subclass.Aug 28 2020, 11:10 AM
vish99 added a reviewer: nikic.
vish99 added a comment.Oct 9 2020, 8:52 AM

Are they any more improvements needed?