This is an archive of the discontinued LLVM Phabricator instance.

[BranchFolding] Merge MMOs during tail merge
AbandonedPublic

Authored by flyingforyou on Dec 4 2015, 6:50 AM.

Details

Summary

Recommit r254694, it was reverted in r254700 to investigate a bot failure.

If we remove the MMOs from Load/Store instructions,
they are treated as volatile. This makes other optimization passes unhappy.
eg. Load/Store Optimization

So, it looks better to merge, not remove.
But if the product of two MMOs count is higher than threshold,
we just remove them for preventing any pathological cases from
causing compile time blowups or overflow.

Diff Detail

Event Timeline

flyingforyou retitled this revision from to [BranchFolding] Merge MMOs during tail merge.
flyingforyou updated this object.
flyingforyou added reviewers: gberry, mcrosier.
flyingforyou added subscribers: rafael, llvm-commits.

Hi, Chad, Geoff.

In D14797, @gberry wrote:
Alternatively, could you just add a bail out case that clears all memops if M1->getMemOperands().size() * M2->getMemOperands().size() is above a certain threshold? That would > > prevent any pathological cases from causing compile time blowups.
@mcrosier wrote:
This sounds very reasonable, IMO.
@flyingforyou wrote:
I will make an another patch. We can discuss about threshold value.

Your opinion is not optinal. It's essential. So, I added threshold and set it 255. It's maximum value of MMOs count.
After setting threshold, there is no error likes below.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21531/steps/annotate/logs/stdio

Could you review this commit again, please?

I am really sorry for hasty decisions.

Junmo.

mcrosier edited edge metadata.Dec 4 2015, 7:29 AM

Your opinion is not optinal. It's essential. So, I added threshold and set it 255. It's maximum value of MMOs count.
After setting threshold, there is no error likes below.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21531/steps/annotate/logs/stdio

Hi Junmo,
Would you mind explaining what exactly caused the bootstrap to fail? I just want to make sure this is in fact the correct fix. Also, I'm wondering how you arrived at the 255 threshold; I would have probably picked something much smaller (e.g., 16).

Chad

p.s., We should always do the necessary due diligence to prevent breakages, but please don't lose sleep over it. :) We do appreciate your efforts on this patch!

Would you mind explaining what exactly caused the bootstrap to fail?

The error message is below.

void llvm::MachineInstr::setMemRefs(llvm::MachineInstr::mmo_iterator, llvm::MachineInstr::mmo_iterator): Assertion `NumMemRefs == NewMemRefsEnd - NewMemRefs && "Too many memrefs"' failed.

MachineInstr.h

   uint8_t NumMemRefs;                   // Information on memory references.
   mmo_iterator MemRefs;

NumMemRefs type is uint8_t. So the maximum value is 255. Previous commit doesn't care about the NewMemrefs's count. When modified clang build clang, there are several cases that sum of two MI's MMOs count is over 255.
So, it has a problem. I just set the threshold value maximum. No matter MI1, MI2's count is, it could be safe.
eg. MI1 MMOs count == 1, MI2 MMOs count == 255, it will be occurred overflow. Because Sum of two MMOs count is 256. (Product of two MMOs count is 255. This is the maximum value of NumMemRefs.)

I don't want to set threshold value to 255. 16 is also good choice. (If the threshold value is 16, most of MMOs can be merged.)

p.s., We should always do the necessary due diligence to prevent breakages, but please don't lose sleep over it. :) We do appreciate your efforts on this patch!

I really appreciate your kind words.

Junmo.

flyingforyou edited edge metadata.

Addressed Chad's comments. (Reduce threshold value)

Thanks Chad.

gberry edited edge metadata.Dec 11 2015, 8:18 AM

A few comments below.

lib/CodeGen/BranchFolding.cpp
760

You might want to change this comment to refer to the threshold check done in the caller.

784

This comment formatting should be cleaned up. Don't capitalize "Merge" on this line and move it up to the above line and re-wrap it.

820

It seems to me that adding MachineInstr::getNumMemOperands() and using it here would be cleaner.

flyingforyou edited edge metadata.
flyingforyou marked 3 inline comments as done.

Addressed Geoff's comments.
Thanks.

gberry added a subscriber: hfinkel.
gberry added inline comments.
include/llvm/CodeGen/MachineInstr.h
350

I don't think you want to have the uint8_t implementation detail leak out of the interface here. Perhaps this should be 'unsigned'?
I'd like to hear from @hfinkel on this detail before I can approve it.

flyingforyou added inline comments.Dec 16 2015, 3:10 AM
include/llvm/CodeGen/MachineInstr.h
350

Thanks Geoff. Before waiting review from @hfinkel, I want to talk something.

We choose the way to check threshold MI1 MMO's count * MI2's MMO's count. If we change uint8_t to unsigned, it seems that can return over 255. This means we may take care overflow. Because the type of merge-mmos-threshold is unsigned. it's not uint64_t.

If @hfinkel also think change the return type to unsigned, we need to add assert before multipication.

And MachineInstr has two more uint8_t variables Flags, AsmPrinterFlags. Their getter & setter also have same return type.
So I think this function may be ok also.

Junmo.

gberry added inline comments.Dec 16 2015, 10:23 AM
include/llvm/CodeGen/MachineInstr.h
350

Junmo, I agree with your points above, other than a small point that we shouldn't add an assert for overflow, but rather a check that handles it appropriately (i.e. by not doing the merge).

I think all that's left is for @hfinkel to chime in.

Ping: Hal, Could you review this commit, please?

reames added a subscriber: reames.Dec 23 2015, 9:36 AM

Do you have test cases to motivate this change? I ask for two reasons:

  1. They'll be absolutely required before submission.
  2. I'm trying to understand when the merging is useful in practice.
include/llvm/CodeGen/MachineInstr.h
350

I disagree on using uint8_t. The maximum storage size of the mem operand array is an implementation detail and should not leak outside the owning class.

p.s. I have out of tree changes which change the size of this field. I know of at least one other group that has a similar change. It'd be nice if we abstracted over the storage size. :)

lib/CodeGen/BranchFolding.cpp
69

Given how many bugs we have around treating cleared MMOs correctly, I *strongly* recommend preserving them if at all possible.

764

This is an order N^2 algorithm. Not acceptable particular since a sort/unique based one is clearly feasible.

I'll also comment that the semantics of merging two memoperand lists are unclear at the moment. This code might be semantically correct, but I don't know enough yet to say for sure.

816

And what happens if I set merge threshold to 270? There needs to be an assert/bounds check on the merge threshold.

reames requested changes to this revision.Dec 23 2015, 9:37 AM
reames added a reviewer: reames.
This revision now requires changes to proceed.Dec 23 2015, 9:37 AM

I have a review up (http://reviews.llvm.org/D15757) which cleans up the API this is building on. I suggest that you wait until that patch has landed, then rebase your changes there. It'll simplify the changes needed and remove the need for the interface extensions. Once my change is in, moving the isIdenticalMMO as a special case of the new conservative merge would also be useful.

Thanks, Philip.

This commit is bug fix of D14797. Several discussions were already done in D14797.

I will rebase the code soon.

include/llvm/CodeGen/MachineInstr.h
350

OK. If someone use NumMemRefs uint16_t, we could use unsigned for return type.

lib/CodeGen/BranchFolding.cpp
69

Most of cases, I think threshold value(16) is enough. (Also this is Chad's idea.) If you need practical data, I will make it. please let me know.

764

please refer below commit.
http://reviews.llvm.org/D14797

Most of cases(98%), MI1's MMO count is 0 or 1. so It takes less time than we think. If we make the code using Smallset, we should use function call, compare function, using more room for saving MMOs. I don't think this is betther than now.

816

I will add comment that the threshold limit should be 254. Then, we don't need to insert assert check. right?

flyingforyou edited edge metadata.

Addressed Philip's comments.

junbuml added a subscriber: junbuml.Jan 5 2016, 7:15 AM

I've landed two changes (r256966 and r256909) which clarify the semantics of MMO merging. This patch should be rebased over the generic merging infrastructure now in place so that other parts of the backend can pick them up.

I still think the merging algorithm used here is unnecessarily inefficient. There's no reason this merge step shouldn't scale to the maximum number of memrefs we allow. I suggested in an offlist email that we use a std::sort idiom. That is slightly complicated by the need to sort the pointers based on the contents of objects reachable through pointers, but that still seems like the right approach. A custom comparator function or lambda would be perfect in this context.

If you'd rather, I'm happy to write up a patch which implements that. The only thing I would need is a couple of test cases which show the need for it so that I can tell if the code is actually correct. (The tests would be mandatory for the rebased version of this change as well.)

flyingforyou abandoned this revision.Jan 7 2016, 3:29 AM

Hi Philip.

This patch's main idea is merging MMOs. But you already did that on D15913.
So I think rebasing this commit is not necesaary any more.

Thanks.

gberry added a comment.Jan 7 2016, 7:00 AM

I don't think that is correct. If you look at the body of MachineInstr::mergeMemRefsWith() after Philip's change, it is only merging in the case where the MMO lists are the same. I believe his intention was for you to refactor this change by making a similar change in mergeMemRefsWith() instead of in TryTailMergeBlocks()

Hi Geoff.

I checked triple time after seeing your comment. I think I am not wrong with my comment.

In mergeMemRefsWith()

if (hasIdenticalMMOs(*this, Other))
  return std::make_pair(MemRefs, NumMemRefs);
-> If MMO lists are same, just return itself's MMOs.

-> If MMO lists are not same, it starts merging likes below.

// TODO: consider uniquing elements within the operand lists to reduce
// space usage and fall back to conservative information less often.
size_t CombinedNumMemRefs = NumMemRefs + Other.NumMemRefs;

After applying philip's commit, if the MMO lists are not same, we preserve MMOs.

Anyway, what I want to check is below part.

if (MBBICommon->mayLoad() || MBBICommon->mayStore())
  MBBICommon->setMemRefs(MBBI->mergeMemRefsWith(*MBBI));

MBBI->mergeMemRefsWith(*MBBI) might be
MBBICommon->mergeMemRefsWith(*MBBI) or MBBI->mergeMemRefsWith(*MBBICommon).

I will upload new commit for this.

-Junmo.

gberry added a comment.Jan 8 2016, 7:35 AM

The other difference from your original change is that duplicates aren't removed from the newly merged list, but I guess we'd need to see a case where that mattered to justify the cost of fixing it.