This is an archive of the discontinued LLVM Phabricator instance.

[NFC][SuffixTree] Add debug log and some comments for suffix tree
Needs ReviewPublic

Authored by dongAxis1944 on Dec 15 2021, 11:58 PM.

Details

Summary

This is an NFC change for the suffix tree.

We use MachineOutliner to reduce our binary size recently, so I just look into the code of the suffix tree.
But I think it is hard to debug, so I use this NFC change to enhance this.
The log just like:

Node[Root#0]: StartIdx=4294967295, EndIdx=4294967295, Childrens = [3,2,1,]
Node[Leaf#1]: StartIdx=2, EndIdx=5, Childrens = []
Node[Leaf#2]: StartIdx=0, EndIdx=5, Childrens = []
Node[Leaf#3]: StartIdx=1, EndIdx=5, Childrens = []

BTW, I found the suffix tree does not have a default terminator in the array.

Diff Detail

Unit TestsFailed

Event Timeline

dongAxis1944 created this revision.Dec 15 2021, 11:58 PM
dongAxis1944 requested review of this revision.Dec 15 2021, 11:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2021, 11:58 PM
dongAxis1944 retitled this revision from [NFC] Add debug log and some comments for suffixtree to [NFC][SuffixTree] Add debug log and some comments for suffix tree.Dec 15 2021, 11:59 PM
dongAxis1944 added a reviewer: AndrewLitteken.
dongAxis1944 edited the summary of this revision. (Show Details)Dec 16 2021, 12:02 AM
dongAxis1944 edited the summary of this revision. (Show Details)

This may greatly increase memory usage. You can use #ifndef NDEBUG to not affect production usage.

BTW, I found the suffix tree does not have a default terminator in the array.

A (generalized) suffix tree does not necessarily need the '\0' terminal character as some textbooks require. See https://codeforces.com/blog/entry/16780

address comment

This may greatly increase memory usage. You can use #ifndef NDEBUG to not affect production usage.

BTW, I found the suffix tree does not have a default terminator in the array.

A (generalized) suffix tree does not necessarily need the '\0' terminal character as some textbooks require. See https://codeforces.com/blog/entry/16780

thanks, I will check the page you send. But considering the ut in https://github.com/llvm/llvm-project/blob/main/llvm/unittests/Support/SuffixTreeTest.cpp#L108, the developer needs to set the unique terminator by hand or they might get the wrong result.

paquette added inline comments.Dec 17 2021, 10:56 AM
llvm/include/llvm/Support/SuffixTree.h
150

More concise:

The last element of Str is expected to be unique.

372

May be worth printing "<null>" or something here, if this is for debugging purposes.

374

Maybe it would make sense to use SmallString?

https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallstring-h

375
383

A little easier to read with a space?

384

More consistent spacing?

385

Grammar

387

Adding space since it's a little easier to read.

Also can we omit the extra ", " at the end?

E.g.

Children = [1, 2, 3]

rather than

Children = [1, 2, 3,]
llvm/lib/Support/SuffixTree.cpp
79

Comment explaining why this is here?