This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] [NFC] Rename SymbolRef to SymExprRef
Needs ReviewPublic

Authored by ASDenysPetrov on Oct 22 2020, 2:44 PM.

Details

Summary

Declaration of SymbolRef is confusing and embarrassing to percept the code easily, especially when working along with symbolic regions, values, expressions (SVal, SymbolVal, SymbolData, etc.). Every time it takes time to recall a correct aliasing in mind. Rename SymbolRef to SymExprRef would have a benefit.

Before:

using SymbolRef = const SymExpr *;

After:

using SymExprRef = const SymExpr *;

P.S. In some files you can see bigger chunks than one-word-renaming. It is just auto-formatting changes.

Diff Detail

Event Timeline

ASDenysPetrov created this revision.Oct 22 2020, 2:44 PM
ASDenysPetrov requested review of this revision.Oct 22 2020, 2:44 PM
ASDenysPetrov edited the summary of this revision. (Show Details)
ASDenysPetrov edited the summary of this revision. (Show Details)Oct 22 2020, 2:47 PM

Since SymbolRef is just a const SymExpr * in the current codebase, I'd prefer using const SymExpr * directly, just like how MemRegion is used, which would be clearer than both SymbolRef and SymExprRef as far as I am thinking.

Different from ProgramStateRef which is an alias to IntrusiveRefCntPtr, or StoreRef which is a wrapper object, an alias to a const SymExpr * makes no sense to me.

And this is also where I have been confused for a long while.

Or on the opposite, do we also need a similar alias for a const MemRegion *, maybe say MemRegionRef? After all, it is shorter. :-)

@ASDenysPetrov
Please grep for the SymbolRef and rename the other symbols/comments as well, especially the compound names.

Since SymbolRef is just a const SymExpr * in the current codebase, I'd prefer using const SymExpr * directly, just like how MemRegion is used, which would be clearer than both SymbolRef and SymExprRef as far as I am thinking.

IMO using SymExprRef is more readable. However, I don't have any strong opinion on this.

Different from ProgramStateRef which is an alias to IntrusiveRefCntPtr, or StoreRef which is a wrapper object, an alias to a const SymExpr * makes no sense to me.
And this is also where I have been confused for a long while.

Or on the opposite, do we also need a similar alias for a const MemRegion *, maybe say MemRegionRef? After all, it is shorter. :-)

Yes, that would make the codebase more consistant.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymExpr.h
111

SymbolRefSmallVectorTy -> SymExprRefSmallVectorTy

ASDenysPetrov added a comment.EditedOct 23 2020, 6:35 AM

@OikawaKirie

Different from ProgramStateRef which is an alias to IntrusiveRefCntPtr, or StoreRef which is a wrapper object, an alias to a const SymExpr * makes no sense to me.

Yes. I omit this, because in such case we should go further and rename all those which are not real Ref to Ptr or smth that would emphasise that it's just a pointer alias, not a wrapper or another class.
That's why I prefered to change the name a little in favor of complex approach of renaming all the rest.

And this is also where I have been confused for a long while.

So have been I. The patch is called to make it more clear :)

Thanks to @steakhal comment I investigated more in terms of what other names use Symbol but mean SymExpr. They are:

class SymbolManager;
using SymbolID;
using SymbolDependTy;
class SymbolData;
class SymbolMetadata;
class SymbolReaper;
enum SymbolStatus;
using SymbolSetTy;
using SymbolMapTy;
class SymbolCast;
class SymbolVal;
class symbol_iterator;
etc.

This is not a full list! I also didn't count methods and file names.
There is also a list of objects names which straightly use SymExpr. They are less spread. Mostly they are derived classes:

class BinarySymExprImpl;
class SymIntExpr;
class IntSymExpr;
class SymSymExpr;
some enums, several methods, etc.

As a result we should accurately define the difference between Symbol and SymExpr.
I see the next options:

  1. Symbol and SymExpr are different. Leave the names as are. Fix minor mismatches if presented. And follow that definitions.
  2. Symbol and SymExpr are the same. SymExpr is a dominant one. Change all the names from Symbol to SymExpr and get rid of Symbol.
  3. Symbol and SymExpr are the same. Symbol is a dominant one. Change all the names from SymExpr to Symbol and get rid of SymExpr.
  4. Ignore the naming due to loss of git blame (or any other reason) like in an llvm naming rules topic.
  5. For now just rename SymbolRef alias as the beggining of the story in scope of 2nd option.

What do you think?

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymExpr.h
111

Thanks for a reasonable remark!

@OikawaKirie

Different from ProgramStateRef which is an alias to IntrusiveRefCntPtr, or StoreRef which is a wrapper object, an alias to a const SymExpr * makes no sense to me.

Yes. I omit this, because in such case we should go further and rename all those which are not real Ref to Ptr or smth that would emphasise that it's just a pointer alias, not a wrapper or another class.
That's why I prefered to change the name a little in favor of complex approach of renaming all the rest.

And this is also where I have been confused for a long while.

So have been I. The patch is called to make it more clear :)

Thanks to @steakhal comment I investigated more in terms of what other names use Symbol but mean SymExpr. They are:

class SymbolManager;
using SymbolID;
using SymbolDependTy;
class SymbolData;
class SymbolMetadata;
class SymbolReaper;
enum SymbolStatus;
using SymbolSetTy;
using SymbolMapTy;
class SymbolCast;
class SymbolVal;
class symbol_iterator;
etc.

This is not a full list! I also didn't count methods and file names.
There is also a list of objects names which straightly use SymExpr. They are less spread. Mostly they are derived classes:

class BinarySymExprImpl;
class SymIntExpr;
class IntSymExpr;
class SymSymExpr;
some enums, several methods, etc.

As a result we should accurately define the difference between Symbol and SymExpr.

There are some cases, when Symbol refers to the base-class of the inheritance hierachy. In those cases the renaming would be appropiate.
Though, I'm sure there are cases, when the Symbol refers simply to SymbolData - in which cases we should rename accordingly.

At the previous LLVM SA round table @xazax.hun also mentioned that we could have chosen better names. Maybe it's the time to have consensus about this.

I see the next options:

  1. Symbol and SymExpr are different. Leave the names as are. Fix minor mismatches if presented. And follow that definitions.
  2. Symbol and SymExpr are the same. SymExpr is a dominant one. Change all the names from Symbol to SymExpr and get rid of Symbol.
  3. Symbol and SymExpr are the same. Symbol is a dominant one. Change all the names from SymExpr to Symbol and get rid of SymExpr.
  4. Ignore the naming due to loss of git blame (or any other reason) like in an llvm naming rules topic.

Losing git blame would have a signifficant impact in deed, however we always have the option to add the renaming commit to the .git-blame-ignore-revs.
To achieve that, we should probably have a better reason behind it besides simply renaming stuff.

  1. For now just rename SymbolRef alias as the beggining of the story in scope of 2nd option.

What do you think?

NoQ added a comment.Oct 23 2020, 10:29 AM

Honestly i'd rather eliminate SymExpr and go with Symbol everywhere. It's an overloaded term but appending "Expr" to it doesn't really make it significantly less overloaded. We're also properly namespaced so there wouldn't be any linking issues with the rest of LLVM. "Symbol" is catchy and it's one of the most important concepts to grasp in the static analyzer, i'd love to keep it.

In D89987#2350649, @NoQ wrote:

Honestly i'd rather eliminate SymExpr and go with Symbol everywhere. It's an overloaded term but appending "Expr" to it doesn't really make it significantly less overloaded.

Why? I think calling it Symbol would make the situation a lot worse. We have symbolic expressions, values, memory regions, why shouldn't we dedicate a name to each?

"Symbol" is catchy and it's one of the most important concepts to grasp in the static analyzer, i'd love to keep it.

Hmm, am I lacking some prerequisite knowledge that would make sense out of calling this Symbol?

ASDenysPetrov added a comment.EditedDec 2 2020, 5:41 AM

@steakhal

Losing git blame would have a signifficant impact in deed, however we always have the option to add the renaming commit to the .git-blame-ignore-revs.

Using this git feature I can see no more obstacles to implement the renaming. Moreover LLVM repo already has similar instances in .git-blame-ignore-revs.

To achieve that, we should probably have a better reason behind it besides simply renaming stuff.

Untill we start this process we will mix this up every time. So, I would start the renaming with the blame-ignore marking.

Let's answer the question: What bad things can occur if we rename SymbolRef to SymExprRef? The only thing I'm not sure is that the name SymExprRef is not suit enough.