This is an archive of the discontinued LLVM Phabricator instance.

[IRSim] Adding IR Instruction Mapper
ClosedPublic

Authored by AndrewLitteken on Sep 1 2020, 1:09 PM.

Details

Summary

This introduces the IRInstructionMapper, and the associated wrapper for instructions, IRInstructionData, that maps IR level Instructions to unsigned integers.

Mapping is done mainly by using the "isSameOperationAs" comparison between two instructions. If they return true, the opcode, result type, and operand types of the instruction are used to hash the instruction with an unsigned integer. The mapper accepts instruction ranges, and adds each resulting integer to a list, and each wrapped instruction to a separate list.

At present, branches, phi nodes are not mapping and exception handling is illegal. Debug instructions are not considered.

The different mapping schemes are tested in unittests/Analysis/IRSimilarityIdentifierTest.cpp

Diff Detail

Event Timeline

AndrewLitteken created this revision.Sep 1 2020, 1:09 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 1 2020, 1:09 PM
AndrewLitteken requested review of this revision.Sep 1 2020, 1:09 PM
AndrewLitteken added a reviewer: paquette.

Updating for clang-format.

paquette added inline comments.Sep 8 2020, 10:19 AM
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
109

Is this related to InstrType?

228

This is pretty much the same as the MachineOutliner's instruction mapper. Can you add a TODO saying that the mappers should be merged?

232

Don't need \p here

296

Prefer static_cast over C-style casts for greppability.

298

Prefer static_cast over C-style casts for greppability.

302

Do you need both constructors?

Would something like the suggested edit above work?

353

add newline

llvm/lib/Analysis/IRSimilarityIdentifier.cpp
165

add newline

AndrewLitteken added inline comments.Sep 8 2020, 12:40 PM
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
109

It is informed by the result yes, it is used during comparison to make sure that two illegal instructions are considered different.

AndrewLitteken added inline comments.Sep 8 2020, 12:53 PM
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
302

I don't even think we need the nullptr. In the next patch, I think the empty constructor is removed.

ping @efriedma @craig.topper @MatzeB as potential reviewers for IR Similarity/Outliner passes.

jroelofs added inline comments.
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
58
267

The UnsignedVec name here conveys what it is, but that's the same as its type... maybe there's a better name that says more about what it does? ValueNumbering? InstNumbering?

paquette added inline comments.Sep 15 2020, 10:01 AM
llvm/lib/Analysis/IRSimilarityIdentifier.cpp
44

I think this entire function can be:

return A.Legal && A.Inst->isSameOperationAs(B.Inst);

Because if A is legal, and is the same operation as B, then B must also be legal.

plotfi added inline comments.Sep 15 2020, 10:31 AM
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
106

How are OperTypes different from OperVals getTypes() ?

111

Any reason Legal is true by default? Did I miss a comment here?

118

Can Instruction &I be const Instruction &I ?

173

Can you make these a pass by reference to avoid the need for assert?

183

Same, pass by reference?

plotfi added inline comments.Sep 15 2020, 10:51 AM
llvm/lib/Analysis/IRSimilarityIdentifier.cpp
120

Can you use llvm's error handling features for this by chance? Could this make sense here?

https://llvm.org/docs/ProgrammersManual.html#error-handling

llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
173

Since the DenseMap accepts an IRInstructionData pointer rather than a value, these have to be a pointer rather than pass by reference.

llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
106

There isn't any difference, it is just easier to access them since the types need to be accessed often, and in different ways than OperVals.

plotfi added inline comments.Sep 15 2020, 11:41 AM
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
106

There isn't any difference, it is just easier to access them since the types need to be accessed often, and in different ways than OperVals.

Can these be dropped then?

AndrewLitteken marked an inline comment as done.Sep 15 2020, 11:45 AM
AndrewLitteken added inline comments.
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
106

I'll rework it to get the types as needed.

plotfi added inline comments.Sep 15 2020, 11:48 AM
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
173

Cant you just take the address of the reference when adding to the DenseMap??

llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
173

I'm not sure what you're asking for. These function are for the DenseMapInfo struct that makes sure that the pointers aren't just mapped by the address, but rather, by the instruction itself. As I understand it, since the DenseMap is set to accept IRInstructionData pointers, the DenseMapInfo functions have to handle pointers, not values, and so can't be by reference. But, I might be missing something, I haven't had to work with a lot of these data structures before.

AndrewLitteken added inline comments.Sep 15 2020, 1:13 PM
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
118

Not really. While that would generally work for IRSimilarity, for tools that want to optimize based on the information, the const-ness would become a bit of a restriction.

Updating for simplicity, typos, and reducing redundant data structures.

Sorry for second diff so fast, forgot to update some of the parameter names in the comments.

jroelofs added inline comments.Sep 15 2020, 2:52 PM
llvm/lib/Analysis/IRSimilarityIdentifier.cpp
26

(since it's no longer collecting both)

Updating comments.

paquette accepted this revision.Sep 16 2020, 3:57 PM

I pretty much only have comment nits at this point, so I think this LGTM.

llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
18

Can you add \code and \endcode around these?

53

These comments can be slightly more succinct. Talking about validity doesn't add any information here. It would also be good to be clearer in what role these play.

Legal Instructions are considered when looking for similarity between Instructions.
Illegal Instructions cannot be considered when looking for similarity between Instructions. They act as boundaries between similarity regions.
Invisible Instructions are skipped over during similarity analysis.

etc

89

Missing ///?

llvm/lib/Analysis/IRSimilarityIdentifier.cpp
27

The operand use?

This revision is now accepted and ready to land.Sep 16 2020, 3:57 PM
This revision was automatically updated to reflect the committed changes.

Hi @AndrewLitteken,
sorry, but this commit fails the cross toolchain builders on Windows platform with the following errors:

FAILED: lib/Analysis/CMakeFiles/LLVMAnalysis.dir/IRSimilarityIdentifier.cpp.obj 
C:\PROGRA~2\MICROS~1\2017\COMMUN~1\VC\Tools\MSVC\1416~1.270\bin\Hostx64\x64\cl.exe  /nologo /TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib\Analysis -IC:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\lib\Analysis -Iinclude -IC:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\include -D__OPTIMIZE__ /Zc:inline /Zc:__cplusplus /Zc:strictStrings /Oi /Zc:rvalueCast /bigobj /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2    /EHs-c- /GR- -UNDEBUG -std:c++14 /showIncludes /Folib\Analysis\CMakeFiles\LLVMAnalysis.dir\IRSimilarityIdentifier.cpp.obj /Fdlib\Analysis\CMakeFiles\LLVMAnalysis.dir\LLVMAnalysis.pdb /FS -c C:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\lib\Analysis\IRSimilarityIdentifier.cpp
C:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\include\llvm/Analysis/IRSimilarityIdentifier.h(154): error C3861: 'hash_value': identifier not found
C:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\include\llvm/Analysis/IRSimilarityIdentifier.h(154): error C2672: 'llvm::IRSimilarity::hash_value': no matching overloaded function found
C:\buildbot\as-builder-2\x-aarch64\llvm-project\llvm\include\llvm/Analysis/IRSimilarityIdentifier.h(153): error C2672: 'hash_combine': no matching overloaded function found

see more details here:
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/3016
http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/1265

This broke some bots:

http://lab.llvm.org:8011/builders/mlir-windows/builds/7787
http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/18977

You should have received notifications from them about the failure, but maybe you missed them. I reverted the change so the bots can go back to green.

Updating hash_value, hash_combine calls with namespace specification for Windows buildbots, and explicitly deleting OperVals from IRInstructionData to fix a memory leak.

paquette added inline comments.Sep 17 2020, 9:09 AM
llvm/include/llvm/Analysis/IRSimilarityIdentifier.h
122

I think this will destroy OperVals twice.

https://ideone.com/5m8Meq

Updating to use SpecificBumpAllocator instead, to get rid of memory leaks.

Updated diff LGTM

jfi for updated diff: the cross toolchain build on Windows has passed. check-llvm/check-clang also looks ok.