This is an archive of the discontinued LLVM Phabricator instance.

[IR] Allow destruction of symbol table entries regardless of DiscardValueNames
ClosedPublic

Authored by yrouban on Feb 7 2023, 4:03 AM.

Details

Summary

Value::setNameImpl() is used both to set and reset name of the value.
In destructor of Function all arguments get reset their names (see Function::clearArguments()).
If the arguments had their names set (e.g. when the function was created with LLVMContex::DiscardValueNames == true) then their ValueName entries referred by the function's symbol table must be destructed. They are not destructed if LLVMContex::DiscardValueNames is set to false because of the fast path in Value::setNameImpl(). See the new test cases that demonstrate the problem. Without the fix they both crash in the function's destructor.

In Value::setNameImpl() this patch narrows down the fast path return for DiscardValueNames == true to allow destruction of ValueName entries if any.

Diff Detail

Event Timeline

yrouban created this revision.Feb 7 2023, 4:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2023, 4:03 AM
yrouban requested review of this revision.Feb 7 2023, 4:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2023, 4:03 AM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm added a comment.Feb 7 2023, 4:43 AM

This is only an issue if shouldDiscardValueNames() changes mid-compilation? What’s the use case for that?

llvm/unittests/IR/BasicBlockTest.cpp
549

This needs to be ASSERT_

554

Ditto

This is only an issue if shouldDiscardValueNames() changes mid-compilation? What’s the use case for that?

We use one context for many subsequent compilations. Modules are removed and created as needed. In product mode DiscardValueNames are set to false. Sometimes we inject code using parseAssemblyString(), for this we have to set DiscardValueNames to true.

@efriedma, @jsilvanus, @tejohnson, @arsenm, please review or suggest someone who can.

If I'm understanding correctly, the issue here is that you create a context with shouldDiscardValueNames() disabled, parse some code, then enable shouldDiscardValueNames()? That's a little exotic, but I guess we can accommodate if there isn't a measurable penalty for normal frontends.

As an alternative to this patch, I guess we could add a "clearValueName()" method that doesn't check shouldDiscardValueNames() at all, and use it in the necessary places. That's mostly equivalent to this patch, but saves a bunch of conditional checks. That should be a performance improvement whether or not shouldDiscardValueNames() is enabled... although I'm not sure the improvement would be measurable.

llvm/lib/IR/Value.cpp
320

I would guess it's important that we preserve the fast-path here (although I don't have numbers). Amending it to getContext().shouldDiscardValueNames() && !isa<GlobalValue>(this) && !hasName()) or something like that is probably trivial enough that it doesn't really matter, but the current patch runs a lot of code that doesn't do anything.

331

Can we just make this StringRef NameRef = NeedNewName ? NewName.toStringRef(NameData) : "";,instead of messing with the rest of the function?

If I'm understanding correctly, the issue here is that you create a context with shouldDiscardValueNames() disabled, parse some code, then enable shouldDiscardValueNames()? That's a little exotic, but I guess we can accommodate if there isn't a measurable penalty for normal frontends.

I believe you get it right. It is a functional issue and I do not expect any performance impact.

As an alternative to this patch, I guess we could add a "clearValueName()" method that doesn't check shouldDiscardValueNames() at all, and use it in the necessary places.

To not complicate things further I would suggest trying the clearValueName() approach only if we see any performance impact.

llvm/lib/IR/Value.cpp
320

I agree. shouldDiscardValueNames should be used only to check if we need the new name, but the old name (if any) we have to discard unconditionally.

331

I agree that we have to avoid NewName.toStringRef(NameData). I have another patch that refactors Value::setNameImpl() further in this way - D143550. Please take a look and choose one patch that I should finalize.

efriedma added inline comments.Mar 21 2023, 1:21 PM
llvm/lib/IR/Value.cpp
331

Yes, D143550 does that... but it also rearranges the whole function in a bunch of other ways that I'm not sure make sense. I'd prefer the minimal change for the bugfix, then considering the full refactoring separately.

yrouban updated this revision to Diff 507281.Mar 22 2023, 2:40 AM
yrouban edited the summary of this revision. (Show Details)

Returned back the first fast pass with hasName() added to the condition as suggested.

yrouban marked 4 inline comments as done.Mar 22 2023, 2:40 AM
efriedma added inline comments.Mar 22 2023, 1:02 PM
llvm/lib/IR/Value.cpp
358

This change shouldn't be necessary, I think? If NeedNewName is false, then NameRef is the empty string, so we can't hit this codepath.

yrouban updated this revision to Diff 507577.Mar 22 2023, 7:41 PM

nothing new. just added the full context lines

yrouban added inline comments.Mar 22 2023, 7:57 PM
llvm/lib/IR/Value.cpp
358

Do you suggest that we remove this condition on NeedNewName and set the name unconditionally here?
I agree that If NeedNewName is false then NameRef is empty. But ValueName::create() still allocates memory (it is StringMapEntry<Value*>::create()) and sets HasName = true.

In other words, we need this additional condition because we narrowed down the fast path in the beginning, so we can reach this part with NeedNewName == false.

efriedma added inline comments.Mar 22 2023, 8:25 PM
llvm/lib/IR/Value.cpp
358

A few lines earlier, we have:

if (NameRef.empty()) {
  // Free the name for this value.
  destroyValueName();
  return;
}

If NameRef is the empty string, we return early. If NameRef isn't the empty string, NeedNewName must be true. Either way, the additional check here does nothing.

yrouban added inline comments.Mar 22 2023, 8:36 PM
llvm/lib/IR/Value.cpp
358

now I see. Let me restructure the code a bit further ...

yrouban updated this revision to Diff 507583.Mar 22 2023, 8:40 PM

made the (!ST) case simpler

efriedma added inline comments.Mar 23 2023, 10:17 AM
llvm/lib/IR/Value.cpp
374

Similarly here... given NameRef is the empty string, if hasName() is true, we hit:

if (NameRef.empty())
  return;

if hasName() is false, we hit:

if (getName() == NameRef)
  return;

So NeedNewName is always true here.

yrouban added inline comments.Mar 23 2023, 10:00 PM
llvm/lib/IR/Value.cpp
374

Here at the line 373 we can have:

hasName && !NameRef.empty() && getName != NameRef

given that hasName and getName are calculated as hasName() and getName() before the name destruction.

In other words, the current name and the new name are not empty and not equal. It is the main path for changing an existing name. In the previous block we have just destroyed the old name and is about to set the new one. I agree that for !hasName we have NeedNewName == true but for hasName == true NeedNewName can be false.

I'm not sure I understand your point. Do you propose a deeper refactoring like the following:

  ...
  if (!hasName()) {
    assert(NeedNewName);
    setValueName(ST->createValueName(NameRef, this));
    return;
  }

  // Remove old name.
  // NOTE: Could optimize for the case the name is shrinking to not deallocate
  // then reallocated.
  ST->removeValueName(getValueName());
  destroyValueName();

  if (!NameRef.empty()) {
    assert(NeedNewName);
    setValueName(ST->createValueName(NameRef, this));
  }
}

or a simple one

  ...
  if (hasName()) {
    // Remove old name.
    // NOTE: Could optimize for the case the name is shrinking to not deallocate
    // then reallocated.
    ST->removeValueName(getValueName());
    destroyValueName();
  }

  if (NeedNewName && !NameRef.empty())
    setValueName(ST->createValueName(NameRef, this));
}

Please suggest an exact code, if possible.

efriedma added inline comments.Mar 24 2023, 10:44 AM
llvm/lib/IR/Value.cpp
374

but for hasName == true NeedNewName can be false.

If hasName is true and NeedNewName is false, NameRef is the empty string, so we hit the check on 368, and return early:

if (NameRef.empty())
  return;

So I'm suggesting the following:

assert(NeedNewName);
setValueName(ST->createValueName(NameRef, this));
yrouban added inline comments.Mar 26 2023, 10:29 PM
llvm/lib/IR/Value.cpp
374

I still cannot agree.
At line 373 we have the following conditions satisfied:

!(!NeedNewName && !hasName()) // came from line 323
!(NewName.isTriviallyEmpty() && !hasName()) // came from 327
!(getName() == NameRef) // came from 336
!(hasName() && NewRef.empty()) // came from 368

Simplified:

323: NeedNewName || hasName()
327: !NewName.isTriviallyEmpty() || hasName()
336: getName() != NameRef
368: !hasName() || !NewRef.empty()

Observe two cases: hasName() == true and hasName() == false

hasName() == false further simplifies to the following:

323: NeedNewName
327: !NewName.isTriviallyEmpty()
336: getName() != NameRef

hasName() == true further simplifies to the following:

336: getName() != NameRef
368: !NewRef.empty()

So, for the case hasName() == true we do not have any restriction on NeedNewName. Thus it can be false and wee need if (NeedNewName) at line 373.

efriedma added inline comments.Mar 27 2023, 12:58 PM
llvm/lib/IR/Value.cpp
374

!NameRef.empty() means NameRef isn't the empty string, which means NeedNewName is true (because if NeedNewName were false, we would have set NameRef to the empty string). So no, it can't be false.

yrouban added inline comments.Mar 27 2023, 6:38 PM
llvm/lib/IR/Value.cpp
374

Ok, I see now that I missed the name at line 368: it is about NameRef rather than NewRef.

yrouban updated this revision to Diff 508890.EditedMar 27 2023, 10:01 PM

replaced

if (NeedNewName)

with

assert(NeedNewName);

at line 373.

@efriedma, please take a look at the latest change you suggested.

This revision is now accepted and ready to land.Mar 30 2023, 12:24 PM