This is an archive of the discontinued LLVM Phabricator instance.

Keep sqlalchemy session separate from database objects
ClosedPublic

Authored by MatzeB on Aug 31 2017, 3:33 PM.

Details

Summary

SQLAlchemy sessions should be handled independently of the lifetimes of
objects such as V4DB and TestSuiteDB (which basically represent sqlalchemy models).

See also the SQLAlchemy documentation:
http://docs.sqlalchemy.org/en/latest/orm/session_basics.html esp. the
"When do I construct a Session, when do I commit it, and when do I close
it?"; "As a general rule, keep the lifecycle of the session
separate and external from functions and objects that access and/or
manipulate database data. This will greatly help with achieving a
predictable and consistent transactional scope."

This requires a bunch of mechanical code changes to make sure the
session object is passed around separately and a session needs to be
created explicitely now rather than getting a long lived session as part
of V4DB.

Advantages:

  • We can construct the DB object and the testsuite table descriptions once when LNT starts. This avoids a bunch of database queries and sqlalchemy model building logic running again for each request.
  • We can switch session.rollback() to session.close() in an upcoming commit, avoid countless unnecessary database queries happening on rollback().
  • Avoids a bunch of obscure caching effects with long lived sessions helping to clean up LNT code (the cleanup already happened in r312059)
  • I hope this will help some of the server instabilities we experience (it's hard to pinpoint the cause though, it may be something different).

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB created this revision.Aug 31 2017, 3:33 PM

I have felt before that LNT's connection and interaction with the database is too obscure in the code.
Therefore, I like this proposal, even if it means passing around session objects more.

I've even wondered before if we shouldn't just get rid of SqlAlchemy and write sql directly to allow us to speed up some of the heavy queries.
That would make the interaction with the database in LNT's code clearer again, removing an obfuscating layer of indirection.
However, I do see that with now supporting 3 database engines (sqlite, postgres, mysql), we may not want to have to reinvent the wheel on handling the database differences that SqlAlchemy already handles.
Anyway, the SqlAlchemy thought is clearly independent of this particular change.

I've been out of the loop on python & database interactions for too many years to be able to properly review this, but as said above, I do like the direction this is taking LNT's design/code base.

grosser edited edge metadata.Aug 31 2017, 11:50 PM

I agree with Kristof. Even though I did not write DB code since a while, the reasoning sounds very good. Hope this fixes some of the errors we see!

MatzeB added a comment.Sep 1 2017, 1:38 PM

I have felt before that LNT's connection and interaction with the database is too obscure in the code.
Therefore, I like this proposal, even if it means passing around session objects more.

I've even wondered before if we shouldn't just get rid of SqlAlchemy and write sql directly to allow us to speed up some of the heavy queries.
That would make the interaction with the database in LNT's code clearer again, removing an obfuscating layer of indirection.
However, I do see that with now supporting 3 database engines (sqlite, postgres, mysql), we may not want to have to reinvent the wheel on handling the database differences that SqlAlchemy already handles.
Anyway, the SqlAlchemy thought is clearly independent of this particular change.

I've been out of the loop on python & database interactions for too many years to be able to properly review this, but as said above, I do like the direction this is taking LNT's design/code base.

Thanks for the review. Chris wanted to see this change too when I talked with him yesterday so I'm going ahead and commit it.

As for the rest: I feel the same way. I also think we would be better off with a set of helper functions to construct SQL queries rather than a full blown ORM layer. While doing simple things with sqlalchemy is simple enough, getting really accustomed to it and knowing how to debug and create performant queries takes weeks. This is unfortunate as most potential LNT developers will have a background in compiler technology and not in database programming.

That said I won't be the person rewriting all the code to use plain SQL (or some lower level abstraction as sqlalchemies DDL without using the full blown ORM); I'm also sure we can get the existing code working well so sqlalchemy is mainly a higher entry barriers for developing LNT (at least if someone wants to make fundamental changes).

This revision was automatically updated to reflect the committed changes.