This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Fix for MachineBasicBlock::rfindDebugLoc(instr_rend())
ClosedPublic

Authored by bjope on May 15 2023, 7:50 AM.

Details

Summary
  • Add some simple unittests for the findDebugLoc, rfindDebugLoc, findPrevDebugLoc and rfindPrevDebugLoc helpers in MachineBasicBlock.
  • Make sure we do not crash in rfindDebugLoc when starting at instr_rend(). Solution is to see it as we start one MI before the first MI, so we can start searching forward at instr_begin() instead.
  • Clean up code comments and code formatting related to the functions mentioned above.

Diff Detail

Event Timeline

bjope created this revision.May 15 2023, 7:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 15 2023, 7:50 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
bjope requested review of this revision.May 15 2023, 7:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 15 2023, 7:50 AM

Hi @bjope

As I understand it, this change means the behaviour of rfindDebugLoc when given instr_rend() is now similar to the existing behaviour of findPrevDebugLoc when given instr_end(). findPrevDebugLoc calls prev_nodbg (code) which decrements the iterator before calling skipDebugInstructionsBackward. I.e. In both cases an end iterator is valid input and will not be dereferenced. (SGTM)

Is there a reason that findDebugLoc and rfindDebugLoc call skipDebugInstructionsForward / skipDebugInstructionsBackward rather than next_nodbg / prev_nodbg. It looks like calling the nodbg functions could solve the problem (the end-iterator-dereference) too, with no new code, but I am not certain of that without testing it.

llvm/unittests/CodeGen/MachineBasicBlockTest.cpp
64

tiny nit: double full stop here

71

misplaced : in comment?

bjope added a comment.May 16 2023, 6:21 AM

Hi @bjope

As I understand it, this change means the behaviour of rfindDebugLoc when given instr_rend() is now similar to the existing behaviour of findPrevDebugLoc when given instr_end(). findPrevDebugLoc calls prev_nodbg (code) which decrements the iterator before calling skipDebugInstructionsBackward. I.e. In both cases an end iterator is valid input and will not be dereferenced. (SGTM)

Right. This patch is making the end iterators valid input. Basically treating end()/rend() as pointing to an extra MI with an undefined DebugLoc after/before the last/first MI.

Is there a reason that findDebugLoc and rfindDebugLoc call skipDebugInstructionsForward / skipDebugInstructionsBackward rather than next_nodbg / prev_nodbg. It looks like calling the nodbg functions could solve the problem (the end-iterator-dereference) too, with no new code, but I am not certain of that without testing it.

The findDebugLoc and rfindDebugLoc functions are taking the DebugLoc from the MI pointed to by the iterator, if that MI fulfil the requirements. So they kind of do "skip-debug + examine MI + step-iterator", while the findPrev variants do "step-iterator + skip-debug + examine MI". The next_nodbg / prev_nodbg is a short form for doing "step + skip", so in order to use them one would need to rewrite the code a bit more.

It had perhaps been more interesting to have helpers for scanning forward/backward, and doing pre/post increment. That would give four different variants. Now we got two variants that scan forward doing post-increment. And two variants that scan backwards doing pre-increment. I got a bit confused about that for awhile (and it is not that easy to understand what prev/next, first/last, forward/backward etc means when having variants using both iterators and reverse iterators). One idea with the unittests I added is to actuallly show what happens when using those helpers (even if one still would need to add some debug instructions to make those tests even more complete).

bjope updated this revision to Diff 522589.May 16 2023, 6:40 AM

Fix comments in the test case after review feedback.

Hi @bjope

Is there a reason that findDebugLoc and rfindDebugLoc call skipDebugInstructionsForward / skipDebugInstructionsBackward rather than next_nodbg / prev_nodbg. It looks like calling the nodbg functions could solve the problem (the end-iterator-dereference) too, with no new code, but I am not certain of that without testing it.

The findDebugLoc and rfindDebugLoc functions are taking the DebugLoc from the MI pointed to by the iterator, if that MI fulfil the requirements. So they kind of do "skip-debug + examine MI + step-iterator", while the findPrev variants do "step-iterator + skip-debug + examine MI". The next_nodbg / prev_nodbg is a short form for doing "step + skip", so in order to use them one would need to rewrite the code a bit more.

It had perhaps been more interesting to have helpers for scanning forward/backward, and doing pre/post increment. That would give four different variants. Now we got two variants that scan forward doing post-increment. And two variants that scan backwards doing pre-increment. I got a bit confused about that for awhile (and it is not that easy to understand what prev/next, first/last, forward/backward etc means when having variants using both iterators and reverse iterators). One idea with the unittests I added is to actuallly show what happens when using those helpers (even if one still would need to add some debug instructions to make those tests even more complete).

Aha, yeah that makes sense. Thanks for the explanation!

Since rfindDebugLoc wants to get the DebugLoc of the passed-in iterator/instruction, as you explained, it feels like maybe it should be the caller's responsibility to filter out instr_rend(). However, findDebugLoc gracefully handles being handled instr_end() (by returning an empty DebugLoc), so handling instr_rend in rfindDebugLoc seems like the least surprising option.

Whether it's better to return an empty DebugLoc (because the iterator is invalid) or scan forward for the next one (because rend points to somewhere before the start, and rfindDebugLoc finds the "this _or next_" filtered DebugLoc) I am not entirely sure as I think both make sense. What are your thoughts on this? I am happy to accept this as it is, but I want to add some other debug-info reviewers in case anyone has a stronger opinion.

bjope updated this revision to Diff 523040.May 17 2023, 7:05 AM

Rebased and added some DBG_VALUE instructions to the unittest to make it a bit
more complete.

bjope added a comment.May 24 2023, 8:14 AM

Whether it's better to return an empty DebugLoc (because the iterator is invalid) or scan forward for the next one (because rend points to somewhere before the start, and rfindDebugLoc finds the "this _or next_" filtered DebugLoc) I am not entirely sure as I think both make sense. What are your thoughts on this? I am happy to accept this as it is, but I want to add some other debug-info reviewers in case anyone has a stronger opinion.

I think I'd prefer one of these two alternatives:

  1. Allow passing an end/rend iterator as currently proposed in this patch.
  2. Add code comments explaining that the iterators must point to a MachineInst (end()/rend() is not supported). And then add asserts to verify that.

The reason why I went for proposing alternative 1 was to make it possible to find the last DebugLoc in the MBB using findPrevDebugLoc(instr_end()). If we disallow passing the end iterator, or if consider a third alternative to return an empty DebugLoc in such situations, then we'd need to add a new findDebugLocBackward helper to support that use case (or the user would need to explicitly check the last MI before using findPrevDebugLoc).

Orlando accepted this revision.May 24 2023, 8:43 AM

That's reasonable, and option 1 still looks good to me given what we've discussed, so LGTM. Thanks!

This revision is now accepted and ready to land.May 24 2023, 8:43 AM
dblaikie added inline comments.
llvm/lib/CodeGen/MachineBasicBlock.cpp
1489–1491

This looks like unrelated reformatting? Could you remove changes like this from this patch?

bjope added inline comments.May 25 2023, 2:46 AM
llvm/lib/CodeGen/MachineBasicBlock.cpp
1489–1491

It's not unrelated to "Clean up code comments and code formatting related to the functions mentioned above.", but maybe you want all those things in a separate commit?

bjope added inline comments.May 25 2023, 4:03 AM
llvm/lib/CodeGen/MachineBasicBlock.cpp
1489–1491

For info: I'll split this up in two commits when landing. One that adds test cases and cleans up code comments etc. And one that modifies MachineBasicBlock::rfindDebugLoc to avoid crashing on the rend iterator..

@dblaikie, I think that makes sense. Thanks!

https://reviews.llvm.org/rGa23f9846163956b74ab578bc972415c015022d10
added llvm/unittests/CodeGen/MachineBasicBlockTest.cpp
which is leading to uninitialized memory errors:

https://lab.llvm.org/buildbot/#/builders/239/builds/2416

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0xaaaad2ec0254 in operator new(unsigned long) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:95:3
    #1 0xaaaad821b360 in operator new /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/IR/Metadata.cpp:531:40
    #2 0xaaaad821b360 in llvm::MDTuple::getImpl(llvm::LLVMContext&, llvm::ArrayRef<llvm::Metadata*>, llvm::Metadata::StorageType, bool) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/IR/Metadata.cpp:937:20
    #3 0xaaaad800a32c in getTemporary /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/IR/Metadata.h:1395:24
    #4 0xaaaad800a32c in llvm::DIBuilder::createFunction(llvm::DIScope*, llvm::StringRef, llvm::StringRef, llvm::DIFile*, unsigned int, llvm::DISubroutineType*, unsigned int, llvm::DINode::DIFlags, llvm::DISubprogram::DISPFlags, llvm::MDTupleTypedArrayWrapper<llvm::DITemplateParameter>, llvm::DISubprogram*, llvm::MDTupleTypedArrayWrapper<llvm::DIType>, llvm::MDTupleTypedArrayWrapper<llvm::DINode>, llvm::StringRef) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/IR/DIBuilder.cpp:854:7
    #5 0xaaaad30c3a8c in (anonymous namespace)::FindDebugLocTest_DifferentIterators_Test::TestBody() /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/unittests/CodeGen/MachineBasicBlockTest.cpp:47:11
    #6 0xaaaad90750e0 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc

(this was the closest Differential revision I could find for this file)

bjope added a comment.May 25 2023, 9:35 AM

https://reviews.llvm.org/rGa23f9846163956b74ab578bc972415c015022d10
added llvm/unittests/CodeGen/MachineBasicBlockTest.cpp
which is leading to uninitialized memory errors:

https://lab.llvm.org/buildbot/#/builders/239/builds/2416

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0xaaaad2ec0254 in operator new(unsigned long) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:95:3
    #1 0xaaaad821b360 in operator new /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/IR/Metadata.cpp:531:40
    #2 0xaaaad821b360 in llvm::MDTuple::getImpl(llvm::LLVMContext&, llvm::ArrayRef<llvm::Metadata*>, llvm::Metadata::StorageType, bool) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/IR/Metadata.cpp:937:20
    #3 0xaaaad800a32c in getTemporary /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/IR/Metadata.h:1395:24
    #4 0xaaaad800a32c in llvm::DIBuilder::createFunction(llvm::DIScope*, llvm::StringRef, llvm::StringRef, llvm::DIFile*, unsigned int, llvm::DISubroutineType*, unsigned int, llvm::DINode::DIFlags, llvm::DISubprogram::DISPFlags, llvm::MDTupleTypedArrayWrapper<llvm::DITemplateParameter>, llvm::DISubprogram*, llvm::MDTupleTypedArrayWrapper<llvm::DIType>, llvm::MDTupleTypedArrayWrapper<llvm::DINode>, llvm::StringRef) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/IR/DIBuilder.cpp:854:7
    #5 0xaaaad30c3a8c in (anonymous namespace)::FindDebugLocTest_DifferentIterators_Test::TestBody() /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/unittests/CodeGen/MachineBasicBlockTest.cpp:47:11
    #6 0xaaaad90750e0 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc

(this was the closest Differential revision I could find for this file)

Thanks!
I've actually been working on a fix (struggled a bit to verify it since I'm not so used to running single unittests with asan, so it took some time to rebuild/rerun all tests using asan).
But I've pushed a fix now that hopefully should work: https://reviews.llvm.org/rG28d418b16c6cd4836ad9d8bc8006d3dabe3c5559

https://reviews.llvm.org/rGa23f9846163956b74ab578bc972415c015022d10
added llvm/unittests/CodeGen/MachineBasicBlockTest.cpp
which is leading to uninitialized memory errors:

https://lab.llvm.org/buildbot/#/builders/239/builds/2416

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0xaaaad2ec0254 in operator new(unsigned long) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:95:3
    #1 0xaaaad821b360 in operator new /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/IR/Metadata.cpp:531:40
    #2 0xaaaad821b360 in llvm::MDTuple::getImpl(llvm::LLVMContext&, llvm::ArrayRef<llvm::Metadata*>, llvm::Metadata::StorageType, bool) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/IR/Metadata.cpp:937:20
    #3 0xaaaad800a32c in getTemporary /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/include/llvm/IR/Metadata.h:1395:24
    #4 0xaaaad800a32c in llvm::DIBuilder::createFunction(llvm::DIScope*, llvm::StringRef, llvm::StringRef, llvm::DIFile*, unsigned int, llvm::DISubroutineType*, unsigned int, llvm::DINode::DIFlags, llvm::DISubprogram::DISPFlags, llvm::MDTupleTypedArrayWrapper<llvm::DITemplateParameter>, llvm::DISubprogram*, llvm::MDTupleTypedArrayWrapper<llvm::DIType>, llvm::MDTupleTypedArrayWrapper<llvm::DINode>, llvm::StringRef) /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/lib/IR/DIBuilder.cpp:854:7
    #5 0xaaaad30c3a8c in (anonymous namespace)::FindDebugLocTest_DifferentIterators_Test::TestBody() /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/llvm/unittests/CodeGen/MachineBasicBlockTest.cpp:47:11
    #6 0xaaaad90750e0 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-aarch64-linux-bootstrap-asan/build/llvm-project/third-party/unittest/googletest/src/gtest.cc

(this was the closest Differential revision I could find for this file)

Thanks!
I've actually been working on a fix (struggled a bit to verify it since I'm not so used to running single unittests with asan, so it took some time to rebuild/rerun all tests using asan).
But I've pushed a fix now that hopefully should work: https://reviews.llvm.org/rG28d418b16c6cd4836ad9d8bc8006d3dabe3c5559

Awesome, thanks Bjorn for the quick fix and reply!