This is an archive of the discontinued LLVM Phabricator instance.

[Expression parser] Look up module symbols before hunting globally
ClosedPublic

Authored by spyffe on May 10 2017, 6:48 PM.

Details

Summary

When it resolved symbol-only variables, the expression parser currently looks only in the global module list. It should prefer the current module.

I've fixed that behavior by making it search the current module first, and only search globally if it finds nothing. I've also added a test case.

Diff Detail

Repository
rL LLVM

Event Timeline

spyffe created this revision.May 10 2017, 6:48 PM
clayborg edited edge metadata.May 10 2017, 7:12 PM

There is a SymbolContext function sorts types based on the context contained in a SymbolContext. Seems like we should do the same thing here? See SymbolContext::SortTypeList().

Yes, this seems obviously right.

Actually, I take that back. Why do you have to call FindGlobalDataSymbol twice? Shouldn't FindGlobalDataSymbol do that work for you. After all you passed in the module. It should itself prefer symbols in the module...

It also seems wrong that we're just picking the first one in the case where we find two symbols at the same level distant from the current module. Shouldn't that be an error?

Actually, I take that back. Why do you have to call FindGlobalDataSymbol twice? Shouldn't FindGlobalDataSymbol do that work for you. After all you passed in the module. It should itself prefer symbols in the module...

It also seems wrong that we're just picking the first one in the case where we find two symbols at the same level distant from the current module. Shouldn't that be an error?

Yes, that is why I suggested always doing a full search and then with a SymbolContextList that results by sorting things from the currently module (maybe more than 1), and then from other modules (maybe more than 1). Then the picking would start:
1- prefer exported symbols from the current module, if more than 1 match this criteria, then error
2- prefer exported symbols from other modules, if more than 1 match this criteria, then error
3 - prefer private symbols from current module, if more than 1 match this criteria, then error
4 - prefer private symbols from other modules, if more than 1 match this criteria, then error

Maybe 2 and 3 should be switched? I don't think so, but others might.

Bonus point for implementing a global symbol search that is aware of the above rules and can stop after step 1 if there are matches, and after step 2 if there are matches, etc.

If you just have symbols, you can't tell whether private symbols were visible to the current frame or not, but the likelihood is not 0 (I guess it's something like 1/number of CU's). OTOH, your expression could never have seen a private symbol from another module, and 1/nCU > 0. So I would weakly argue for flipping 2 & 3.

But note that the real solution to that problem is implementing some syntax for "symbol from CU", "symbol from Function" etc, as we've discussed in the past. Then what we do by default will be less important, since you have a way to override it.

I also would flip 2 and 3. I'll give this a shot.

spyffe updated this revision to Diff 98672.May 11 2017, 1:23 PM

Improved symbol lookup per Greg and Jim's suggestion, with flipped steps 2 and 3.
Also added testing for the error case.

The searching code is indeed better and what we now want, but I question of this code should live in ClangExpressionDeclMap::FindGlobalDataSymbol. A better home for this might be in SymbolContext::FindGlobalDataSymbol? Then others can use this functionality. Other clients must want this kind of functionality (other expression parsers). Seems a shame to have it live somewhere where someone would have to copy the code. Thoughts anyone?

I think Greg's right. I can't see anything expression parser specific to how you do this search. I was a little worried that a lot of the other searches will return multiple names (maybe sorting to indicate closeness.) So this one might be confusing. But the API name says you are looking for ONE data symbol, so the behavior when you can't tell amongst the matches actually makes sense.

Yeah, we need to document exactly what this is doing: looking for the best symbol as an expression parser would want it, preferring symbols from the current module in the SymbolContext, and then looking everywhere. Errors if multiple exist in current module, or in any other modules if not in current, bla bla.

labath added a subscriber: labath.May 15 2017, 3:42 AM

Is this feature really darwin specific? Isn't the __private_extern__ thingy equivalent to __attribute__(visibility("hidden"))), which is supported by gcc and clang alike?

packages/Python/lldbsuite/test/lang/c/conflicting-symbol/Makefile
1 ↗(On Diff #98672)

For a portable way to write a Makefile with multiple shared libraries, please look at testcases/lang/cpp/namespace_definitions. The solution is not very elegant, but at least it avoid hardcoding .dylib everywhere.

spyffe updated this revision to Diff 99084.May 15 2017, 5:11 PM

Two changes after Greg and Pavel's suggestions:

  • Moved the symbol lookup function into the global context; and
  • Rewrote the test case's build system to be more generic.
labath added inline comments.May 16 2017, 2:06 AM
packages/Python/lldbsuite/test/lang/c/conflicting-symbol/One.mk
1 ↗(On Diff #99084)

Thanks for the effort. It almost works for me :), except for the part where you clear out the CFLAGS. We cannot do that, as CFLAGS sometimes contains important flags that we cannot compile without. The "canonical" way to compile without debug info is to use the CFLAGS_NO_DEBUG flag and write the compile rule yourself.

This makefile worked for me, and I hope it will work for you as well as it was pieced together from existing tests:

LEVEL := ../../../make

DYLIB_NAME := Two
DYLIB_C_SOURCES := Two/Two.c Two/TwoConstant.c
DYLIB_ONLY := YES

include $(LEVEL)/Makefile.rules

CFLAGS += -fPIC

Two/TwoConstant.o: Two/TwoConstant.c
	$(CC) $(CFLAGS_NO_DEBUG) -c $< -o $@
packages/Python/lldbsuite/test/lang/c/conflicting-symbol/TestConflictingSymbol.py
16 ↗(On Diff #99084)

Since the test does not depend on debug info, you may want to consider adding
NO_DEBUG_INFO_TESTCASE = True here to avoid running it multiple times.

spyffe updated this revision to Diff 99186.May 16 2017, 1:31 PM

Updated to reflect Pavel Labath's suggestions:

  • Used CFLAGS_NO_DEBUG where appropriate
  • Marked the testcase as having no debug info variants
clayborg requested changes to this revision.May 16 2017, 1:37 PM

Just a few makefile changes where CFLAGS is being modified and this will be good to go. Much better location for this. Sean, please keep an eye out for this kind of thing in the future, I believe there is a lot of type searching code and function searching code that could have the same thing done in the future.

packages/Python/lldbsuite/test/lang/c/conflicting-symbol/Makefile
6 ↗(On Diff #99186)

CFLAGS_EXTRAS is the way to add CFLAGS:

CFLAGS_EXTRAS += "-g -O0"
packages/Python/lldbsuite/test/lang/c/conflicting-symbol/One.mk
9 ↗(On Diff #99186)

avoid modifying CFLAGS, modify CFLAGS_EXTRAS and then use it if needed

packages/Python/lldbsuite/test/lang/c/conflicting-symbol/Two.mk
9 ↗(On Diff #99186)

avoid modifying CFLAGS, modify CFLAGS_EXTRAS

This revision now requires changes to proceed.May 16 2017, 1:37 PM
spyffe updated this revision to Diff 99211.May 16 2017, 3:01 PM
spyffe edited edge metadata.

Used CFLAGS_EXTRAS instead of CFLAGS at Greg Clayton's suggestion.

clayborg accepted this revision.May 16 2017, 3:02 PM

Very nice.

This revision is now accepted and ready to land.May 16 2017, 3:02 PM
This revision was automatically updated to reflect the committed changes.