This is an archive of the discontinued LLVM Phabricator instance.

Add an expression parser for Go
ClosedPublic

Authored by ribrdb on Sep 22 2015, 3:12 PM.

Details

Summary

Add a parser and interpreter for go expressions.

Diff Detail

Repository
rL LLVM

Event Timeline

ribrdb updated this revision to Diff 35424.Sep 22 2015, 3:12 PM
ribrdb retitled this revision from to Add an expression parser for Go.
ribrdb updated this object.
ribrdb set the repository for this revision to rL LLVM.
ribrdb added a subscriber: lldb-commits.
clayborg resigned from this revision.Sep 23 2015, 2:25 PM
clayborg removed a reviewer: clayborg.

We need to get Sean Callanan as a reviewer on this. Jim is out for a few weeks, so Sean will need to OK this.

Even though clang isn't done this way for historical reason, I would like to see the Go expression parser files (.cpp and .h) over into "source/Plugins/ExpressionParser/Go". The following files should be moved:

include/lldb/Expression/GoAST.h
include/lldb/Expression/GoLexer.h
include/lldb/Expression/GoParser.h
include/lldb/Expression/GoUserExpression.h
source/Expression/GoLexer.cpp
source/Expression/GoParser.cpp
source/Expression/GoUserExpression.cpp
source/Expression/gen_go_ast.py
unittests/Expression/GoParserTest.cpp
ribrdb updated this revision to Diff 35565.Sep 23 2015, 3:47 PM
ribrdb added a reviewer: spyffe.

Moved the new files into Plugins/ExpressionParser/Go.

I see you moved the JIT code over into ClangUserExpression. I was wondering if all languages might not want to generate llvm IR and then let the JIT make code from the llvm IR? This is the reason were originally left the IR in the UserExpression. IR is very generic and can be applied to just about anything. Any interest in using llvm IR for the Go expression parser? I am assuming you have made a expression tree that you evaluate. This would be very similar to what we do. When we evaluate an expression currently we generate the llvm IR and then we say "that is easy to emulate (just a bunch of loads and stores), so we don't actually need to JIT it up, we can just emulate the IR. If it is too complex, we pass this along to the JIT and let it generate code for us.

Hmm. I assumed you're using clang to generate llvm IR, is that not the case?
I don't really want to write a whole go compiler.
Right now I'm interpreting the AST using the ValueObject API. This also lets me reuse the error checking and things implemented in the type system. Reimplementing this in IR seems like it would be more complicated and harder to understand/maintain.
Another concern I have is that the go stack is not compatible with c. If we're going to execute jited code we'd probably also need to allocate a separate stack.

I suppose JIT might make more sense if we want to implement more complicated expressions like looping and calling in to go code.
There's lots of issues to solve before we can call into go code however. In addition to the stack I'm not sure what to do about the GC. Also I believe at least some parts of the go runtime are assuming knowledge of the complete call graph and may have issues if we're calling functions from an unexpected place.
So want to just start with something simple.

ribrdb updated this revision to Diff 36293.Oct 1 2015, 1:49 PM

Merged with head.

brucem added a subscriber: brucem.Oct 13 2015, 7:55 PM

Hopefully some helpful comments that will help keep this code in line with changes that we're making to other parts of the codebase in bulk ...

include/lldb/Symbol/GoASTContext.h
394

Don't need virtual here since it already has override (most of the code that has override doesn't have virtual as well).

source/Plugins/ExpressionParser/Go/GoAST.h
182

I think a lot of these destructors could be ~GoASTArrayType override = default; ? (for this and other classes here)

184

This should have override on it.

210

This should have override as well.

242

override here too and the rest of the GetKindName overrides.

2005

Could this default go away if all of the possible values for that enumeration are covered here? If so, we'd get a compile time warning here when someone adds something to the enumeration.

ribrdb updated this revision to Diff 37787.Oct 19 2015, 1:39 PM

Style updates

jingham requested changes to this revision.Oct 20 2015, 1:25 PM
jingham edited edge metadata.

So before getting into the details of the patch, there's a structural bit it would be nice to fix.

When I was separating out bits of the Expression machinery, I assumed that even though all expression parsers would use different front-ends to llvm, they would all use llvm & the llvm JIT to run the expressions. Your Go expression parser doesn't work that way. However, the correct thing to do, then, is the make UserExpression the empty base class, then what used to be in UserExpression should go in LLVMUserExpression, then ClangUserExpression derives from LLVMUserExpression, but GoUserExpression derives from UserExpression directly. That structure will make it easier to share the LLVM back end bits of expression running machinery, which really shouldn't go in ClangUserExpression since it isn't Clang specific.

Would you mind reworking it this way?

This revision now requires changes to proceed.Oct 20 2015, 1:25 PM
ribrdb updated this revision to Diff 37944.Oct 20 2015, 4:26 PM
ribrdb edited edge metadata.

Thanks for taking a look!
I moved the expression jit'ing code into LLVMUserExpression.

The generic parts of this change look fine to me, with a few inlined style comments.

I didn't read the Go specific parts of this in detail, I assume you're going to get those right. I mentioned a couple of style things inline, though I didn't mark everywhere that they occurred. Particularly, the construct:

Foo::Foo () :

m_ivar1
, m_ivar2

etc. looks very odd, and is not the way we do it anywhere else in lldb. Please don't write it that way.

Also, we try to always call shared pointers foo_sp and unique pointers foo_up. The life cycle of these guys is enough different from pointers that it is good to be able to see what they are in code without having to trace them back to the definition.

With those style changes this is okay by me.

source/Expression/LLVMUserExpression.cpp
44–61

Don't write initializers with the commas in front like this. We don't do it this way anywhere else in lldb, and it looks really weird...

source/Expression/UserExpression.cpp
48–55

Ditto, we don't write initializers this way. Move the commas to the end of the line.

source/Plugins/ExpressionParser/Go/GoAST.h
209–210

Same comment about formatting...

251–252

We usually postpend with _sp or _up variables that are shared or unique pointers. They have sufficiently interesting lifecycle that it is helpful to know at a glance what kind of thing they are.

263–264

Here too...

325–326

Maybe m_lhs_up, etc.

ribrdb updated this revision to Diff 38050.Oct 21 2015, 2:29 PM
ribrdb edited edge metadata.

More formatting fixes.
Also updated .clang-format to fix the settings for constructor initializers.

Is this ready to submit?

source/Expression/LLVMUserExpression.cpp
44–61

I've updated the clang-format config to fix this.

jingham accepted this revision.Oct 29 2015, 1:44 PM
jingham edited edge metadata.

Yes, that looks good.

This revision is now accepted and ready to land.Oct 29 2015, 1:44 PM
dawn closed this revision.Nov 9 2015, 1:19 PM
dawn added a subscriber: dawn.

This was committed in svn r251820.