So if my understanding is correct, the RethinkDB driver translates a Python function object into its "abstract interpretation" by invoking the function on a proxy object x that defines operators such that (simplifying) x.__op__() returns Op(x). Recursively, this yields the computation tree of the function. This looks very similar to what is done in C++ metaprogramming through "expression templates", a technique that is taken to its extremes in libraries such as Boost.Proto. C# solves this more cleanly by having in the language an explicit concept of "expression trees" that are given as arguments of LINQ queries.
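To make the proxy idea above concrete, here is a minimal Python sketch. It is not the RethinkDB driver's code: the `Term` class, the `trace` helper, and the operator names `add`, `mul`, and `gt` are all made up for illustration. Each overloaded operator returns a new node instead of computing a value, so calling the function once on a placeholder yields its computation tree.

```python
class Term:
    """A node in the expression tree: an operator name plus its arguments.

    Hypothetical class for illustration, not part of any real driver.
    """

    def __init__(self, op, *args):
        self.op = op
        self.args = args

    def __add__(self, other):
        return Term("add", self, other)

    def __radd__(self, other):
        # Reached for `1 + x`, where the left operand knows nothing about Term.
        return Term("add", other, self)

    def __mul__(self, other):
        return Term("mul", self, other)

    def __gt__(self, other):
        return Term("gt", self, other)

    def __repr__(self):
        return "%s(%s)" % (self.op, ", ".join(map(repr, self.args)))


def trace(fn):
    """Call fn once on a placeholder and return the tree it built."""
    return fn(Term("var", "x"))


tree = trace(lambda x: (x + 1) * 2 > 10)
print(tree)  # gt(mul(add(var('x'), 1), 2), 10)
```

The function body never sees real data. It only ever sees `Term` objects, which is why anything that is not an overloadable operator (such as `if` or `for`) falls outside what this trick can capture.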
I wonder if this imposes any limitations on the functions that can be written. For example, how are if and for statements handled? What happens if I do "lambda x: y + x" where "y" is an object that defines __add__? If these cases are not handled, is a well-defined exception raised?
We overload operators when we can (unfortunately no operator overloading for JavaScript folks) and it works really well.
You don't always get a great exception, but most of the time this turns out to be fine. Novices don't need to worry about this too much, and we found that as people get more into ReQL, they pick everything up extremely quickly, and they rarely confuse things or make mistakes.
I wonder if, for "if," you could just detect that your magical variable was being cast to a boolean, and run the function twice, once where it returns true, and once false, so you could exercise both branches. I guess this could get a little pathological if you had multiple conditionals and their nesting was really complicated, but I bet you could do simple conditionals...
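The "run it once per branch" idea can be sketched in Python roughly as follows. This is only an illustration of the suggestion above, not something any RethinkDB driver is known to do. `Cond`, `Tracer`, and `explore` are hypothetical names. Each placeholder reports a scripted truth value from `__bool__` (Python 3; Python 2 would use `__nonzero__`), and the function is re-run until every reachable combination of outcomes has been exercised.

```python
class Tracer:
    """Replays a scripted list of branch outcomes and records what was asked."""

    def __init__(self, forced):
        self.forced = forced  # outcomes to replay, in order
        self.seen = []        # (name, outcome) pairs observed on this run

    def decide(self, name):
        i = len(self.seen)
        # Beyond the script, default to True; the False case is scheduled later.
        outcome = self.forced[i] if i < len(self.forced) else True
        self.seen.append((name, outcome))
        return outcome


class Cond:
    """Placeholder whose truth value is unknown when the function is traced."""

    def __init__(self, name, tracer):
        self.name = name
        self.tracer = tracer

    def __bool__(self):
        return self.tracer.decide(self.name)


def explore(fn, names):
    """Run fn once per reachable path; return (decisions, result) pairs."""
    results = []
    pending = [[]]  # scripts of forced outcomes still to try
    while pending:
        tracer = Tracer(pending.pop())
        value = fn(*[Cond(n, tracer) for n in names])
        results.append((tuple(tracer.seen), value))
        # Every decision that defaulted to True also needs a False run.
        for i in range(len(tracer.forced), len(tracer.seen)):
            pending.append([o for _, o in tracer.seen[:i]] + [False])
    return results


def classify(a, b):
    if a:
        return "A"
    if b:
        return "B"
    return "neither"


for decisions, value in explore(classify, ["a", "b"]):
    print(decisions, "->", value)
```

With n independent conditionals this can need up to 2^n runs, which is the pathological growth mentioned above, and it assumes the function is deterministic and has no side effects.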
There are two alternatives to what RethinkDB does: parsing the function's associated source file (yucky and breaky when there are multiple lambdas per line) or disassembling its func_code.
I've looked at this problem before (and ran away screaming every time), but the approach I tended towards was func_code disassembly, pretty much for the reasons you mention. Was this ever considered?
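For reference, this is roughly what the disassembly route starts from in Python, using only the standard `dis` module. The `increment` function is just an example. In Python 3 the code object lives at `__code__` (`func_code` was the Python 2 name). The exact opcode names vary between interpreter versions, which hints at why this approach is hard to keep portable.

```python
import dis


def increment(x):
    return x + 1


# The compiled code object carries the argument names and the bytecode.
print(increment.__code__.co_varnames)

# Walk the bytecode instruction by instruction. A real translator would
# have to simulate the evaluation stack to rebuild the expression tree.
instructions = list(dis.get_instructions(increment))
for ins in instructions:
    print(ins.opname, ins.argrepr)
```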
We did look into disassembly, but decided that it's far too much work because it is not portable across languages and drivers, and would require an enormous engineering effort (both for ourselves and for community driver maintainers).
From a purely theoretical POV, I do like the idea of disassembly better, but unfortunately we couldn't make this work and remain pragmatic.
Well this is just lovely! It's so similar and such a good fit for what we're doing in Delver (we build very similar abstract queries out of natural language, but for compilation into SQL etc) that it makes me want to force everyone to use RethinkDB. Our lives are maybe a little easier cos we do it in Clojure so it's s-expressions in, s-expressions out.
Can't see much evidence of a Clojure driver yet. Is there an existing project we could contribute to?
I'd love to see a project bring RethinkDB to the Clojure community; we haven't seen one emerge yet. If you'd like to contribute (or better yet, start the project!) shoot me an email at mike [at] rethinkdb.com.
I don't believe there is one, but if you choose to contribute one, please shoot me an e-mail at slava@rethinkdb.com if you have any questions. I'd love to help you get started! (Given the current proliferation of drivers, you can probably create a reasonably complete one in a very short period of time.)
and thus we learn that RethinkDB is an elaborate scheme to get programmers to learn Higher Order Abstract Syntax techniques, not just have a neat db :)
HOAS and its sibling PHOAS are great ways of embedding a mini language inside another language. They're especially powerful in languages like Haskell, Scala, Idris, and the like.
I don't know. I have been working on a Java/Scala driver too and thinking about how you could translate a Scala lambda to this kind of representation. It would be really cool to write
db("tablename").filter( x => x.name == "foo" )
and be able to translate that to the relevant query, but I can't really think of a way without using macros or reflection.
I was thinking maybe a path-type syntax. Yes, you have to use strings for your field names, but this way you can wrap it in a 'get' node. It seems to have worked well with the Play! JSON classes.
slava @ rethink here. If you haven't already, you might want to join the rethinkdb-dev Google group (https://groups.google.com/forum/?fromgroups#!forum/rethinkdb...). We don't have Scala people, but if you post a question we'll do the best we can to help you out!
This looks a lot like LINQ. C# is probably one of those "less dynamic languages" but has had this for years. It powers a lot of ORMs and other frameworks that need a queryable expression language. It supports writing custom 'providers' so that you can write LINQ against almost anything.
I'm the author of a .NET RethinkDB client (https://github.com/mfenniak/rethinkdb-net), and I can add that RethinkDB's lambda functions were very easy to implement in .NET by using C#'s expression tree feature (http://msdn.microsoft.com/en-us/library/bb397951.aspx). With that feature, a lambda can automatically be converted to an AST by creating a function that takes an Expression<Func<...>> parameter, which can easily be traversed and converted into RethinkDB's expression tree.
Expression trees were introduced at the same time as LINQ, to support LINQ.
That's a good question. I've given some thought to how we might build drivers in languages that don't support lambda functions and unfortunately, there don't seem to be any good answers.
The actual underlying S-expression syntax for ReQL lambda functions is pretty ugly by itself, and I'm glad that so far we haven't had to expose it. The Python `lambda x,y: x + y` actually gets compiled to something like the following: `(func [1,2] (add (var 1) (var 2)))`. Variable references have to be constructed manually no matter how you do it. With native lambda functions, though, you can at least hide those constructions behind the scenes and bind them to the function's formal arguments.
Fortunately, though, both C++ and Java, the two "traditional" languages we would most like to support, have either just introduced lambda functions (C++11) or are about to (Java 8), rendering the point moot.
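A rough sketch of how the binding of formal arguments to numbered variables might look in Python. The `Term` class and `compile_lambda` helper are invented for this example and are not the driver's real internals. Only the output format is taken from the S-expression quoted above. The sketch supports just `+` and assumes the function body returns a `Term`.

```python
import inspect


class Term:
    """Expression node (hypothetical; only `+` is supported in this sketch)."""

    def __init__(self, op, *args):
        self.op = op
        self.args = args

    def __add__(self, other):
        return Term("add", self, other)

    def render(self):
        parts = [a.render() if isinstance(a, Term) else str(a) for a in self.args]
        return "(%s %s)" % (self.op, " ".join(parts))


def compile_lambda(fn):
    """Number fn's formal arguments, call it on (var n) nodes, render the result."""
    count = len(inspect.signature(fn).parameters)
    ids = list(range(1, count + 1))
    body = fn(*[Term("var", i) for i in ids])
    return "(func [%s] %s)" % (",".join(map(str, ids)), body.render())


print(compile_lambda(lambda x, y: x + y))
# (func [1,2] (add (var 1) (var 2)))
```

The user never writes `(var 1)` by hand. The variable nodes are created behind the scenes and handed to the lambda as its arguments.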
It wouldn't be hard to have a very shallow translation from a lisp-alike syntax - to borrow Clojure's: "(fn [a b] (+ a b))". Named parameters translate trivially to (var n).
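As a sketch of how shallow that translation could be, here is a small Python version for the `(fn [a b] (+ a b))` example. Everything in it is illustrative. The `OPS` table maps `+` to `add` as in the S-expression quoted earlier, while `mul` and `sub` are assumed names. There is no error handling for malformed input.

```python
# Operator table: "add" matches the quoted ReQL form; "mul" and "sub" are assumptions.
OPS = {"+": "add", "*": "mul", "-": "sub"}


def tokenize(src):
    """Split the source into brackets and atoms."""
    for ch in "()[]":
        src = src.replace(ch, " %s " % ch)
    return src.split()


def parse(tokens):
    """Turn a token list into nested Python lists (consumes tokens)."""
    tok = tokens.pop(0)
    if tok in ("(", "["):
        close = ")" if tok == "(" else "]"
        items = []
        while tokens[0] != close:
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing bracket
        return items
    return tok


def emit(node, env):
    """Render a parsed body, replacing parameter names with (var n)."""
    if isinstance(node, list):
        head, args = node[0], node[1:]
        return "(%s %s)" % (OPS[head], " ".join(emit(a, env) for a in args))
    if node in env:
        return "(var %d)" % env[node]
    return node  # literals pass through unchanged


def translate(src):
    head, params, body = parse(tokenize(src))
    if head != "fn":
        raise ValueError("expected (fn [params] body)")
    env = {name: i for i, name in enumerate(params, start=1)}
    ids = ",".join(str(i) for i in env.values())
    return "(func [%s] %s)" % (ids, emit(body, env))


print(translate("(fn [a b] (+ a b))"))
# (func [1,2] (add (var 1) (var 2)))
```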
The Go driver does basically what the JavaScript one does: it just takes an Expression object, with methods like .Add(<Expression>) and .Mul(<Expression>) that also return Expressions.
To be fair, this is an advanced article written by a computer scientist who designed the innards of the drivers, for other computer scientists who want to understand how the drivers work.
Novices can do most things without having to understand any of these issues, and even very advanced users don't ever have to learn this if they don't want to.
Disclaimer: I'm one of the people responsible for designing ReQL.
Maybe it needs to be clearer that there is a simple way and a hard way to work with RethinkDB, and that whenever someone finds themselves in the "hard way" documentation, you tell them clearly that this is the hard way and that the documentation for the easy way is somewhere else. The assumption for the novice developer is that "here is the documentation, there's often just one way to do it, so I'd better read this and learn it". It's not clear that "this is the hard way, you should avert your eyes".
Maybe try renaming "ReQL" to ThinkSimpleQuery and the other one to ThinkBigQuery. Something more self explanatory.
This post covers all aspects of lambda functions in ReQL from
concept to implementation and is meant for third party driver
developers and those interested in functional programming and
programming language design. For a more practical guide to using
ReQL please see our docs where you can find FAQs, screencasts, an
API reference, as well as various tutorials and examples.