This is an archive of the discontinued LLVM Phabricator instance.

[mlir][LLVMIR] Do not update instMap via assignments to map entry references
ClosedPublic

Authored by myhsu on Apr 28 2022, 11:26 AM.

Details

Summary

Inside processInstruction, we assign the translated mlir::Value to a reference previously taken from the corresponding entry in instMap. However, instMap (a DenseMap) might resize after the entry reference was taken, rendering the assignment useless since it's assigning to a dangling reference. Here is a (pseudo) snippet that shows the concept:

// inst has type llvm::Instruction *
Value &v = instMap[inst];
...
// op is one of the operands of inst, has type llvm::Value *
processValue(op);
// instMap resizes inside processValue
...
translatedValue = b.createOp<Foo>(...);
// v is already a dangling reference at this point!
// The following assignment is bogus.
v = translatedValue;

Nevertheless, after we stop caching llvm::Constant into instMap, there is only one case that can cause processValue to resize instMap: If the operand is a llvm::ConstantExpr. In which case we will insert the derived llvm::Instruction into instMap. To trigger instMap to resize, which is a DenseMap, the threshold depends on the ratio between # of map entries and # of (hash) buckets. More specifically, it resizes if (# of map entries / # of buckets) >= 0.75. In this case # of map entries is equal to # of LLVM instructions, and # of buckets is the power-of two upperbound of # of map entries. Thus, eventually in the attaching test case (test/Target/LLVMIR/Import/incorrect-instmap-assignment.ll), we picked 96 and 128 for the # of map entries and # of buckets, respectively. (We can't pick numbers that are too small since DenseMap used inlined storage for small number of entries). Therefore, the ConstantExpr in the
said test case (i.e. a GEP) is the 96-th llvm::Value cached into the instMap, triggering the issue we're discussing here on its enclosing instruction (i.e. a load).

This patch fixes this issue by calling operator[] everytime we need to
update an entry.

Diff Detail

Event Timeline

myhsu created this revision.Apr 28 2022, 11:26 AM
myhsu requested review of this revision.Apr 28 2022, 11:26 AM
Mogball accepted this revision.Apr 28 2022, 11:53 AM
Mogball added a subscriber: Mogball.
Mogball added inline comments.
mlir/lib/Target/LLVMIR/ConvertFromLLVMIR.cpp
737–738

Can the variable itself just be deleted?

This revision is now accepted and ready to land.Apr 28 2022, 11:53 AM
myhsu updated this revision to Diff 425902.Apr 28 2022, 2:14 PM
myhsu marked an inline comment as done.

Remove unused variable v and update the assertion check.

myhsu added inline comments.Apr 28 2022, 2:16 PM
mlir/lib/Target/LLVMIR/ConvertFromLLVMIR.cpp
737–738

Yes, I'd removed this line and replaced the check in next line with !instMap.count(inst)

Mogball accepted this revision.Apr 28 2022, 2:49 PM

lgtm but you might want to wait if anyone else has complaints

The complexity of the test is scary. Is it just a matter of having a sufficient amount of instructions and/or constants in a vector?

myhsu added a comment.May 2 2022, 3:36 PM

The complexity of the test is scary. Is it just a matter of having a sufficient amount of instructions and/or constants in a vector?

Yeah I agree. I think it's just having sufficient number of instructions (since we're not caching constants into instMap anymore) to saturate instMap and force it to resize. I'll try to simplify the test, primarily removing some scary global variables/types.

myhsu updated this revision to Diff 426875.May 3 2022, 4:06 PM
myhsu edited the summary of this revision. (Show Details)

Update the test case.

myhsu added a comment.EditedMay 3 2022, 4:09 PM

The complexity of the test is scary. Is it just a matter of having a sufficient amount of instructions and/or constants in a vector?

It turns out it's much more tricky to trigger the issue. Please checkout the updated summary above for the rationale.

myhsu edited the summary of this revision. (Show Details)May 3 2022, 4:11 PM