Jamie McCarthy made an interesting point about type safety in embedded SQL on String-Plus:
SQL is a great example for this. Relational databases are more useful with strong typing, so EMPLOYEE_ID is incompatible with PRODUCT_ID even if they are both implemented as INT. It'd be a great idea to see those constraints implemented at the perl level, presumably by giving perl more knowledge of the database schema than even the database engine has.
Imagine that you have, or can write, a little language parser for a SQL-like language. My simple example was:
SQL {{
UPDATE users SET address = { Address $address } WHERE user = { User $user }
}}
This can decompose into several operations:
- Get the value of the
$address
variable. - Get the primary key of the
$user
variable. - Prepare a database query with a rewritten query string which uses placeholders for the
$address
and$user
variables to avoid SQL injection and other interpolation errors. - Execute the query.
That's a nice interface, but you can do better. As I suggested, you can add error checking if you know the structure of the database:
- Get the metadata which describes the
users
table. - Verify that the required fields (
address
anduser
exist). - Get the value of the
$address
variable. - Get the primary key of the
$user
variable. - Prepare a database query with a rewritten query string which uses placeholders for the
$address
and$user
variables to avoid SQL injection and other interpolation errors. - Execute the query.
You can take advantage of type checking too:
- Get the metadata which describes the
users
table. - Verify that the required fields (
address
anduser
exist). - Verify that the type of
$address
is compatible with the type of theaddress
field. Repeat for$user
anduser
. - Get the value of the
$address
variable. - Get the primary key of the
$user
variable. - Prepare a database query with a rewritten query string which uses placeholders for the
$address
and$user
variables to avoid SQL injection and other interpolation errors. - Execute the query.
If you know the structure of the database when the program starts, you can start to push some of this type checking to the point of compilation. (You may not be able to perform all of the type checking at compilation time, but you can do as much as possible as early as possible to prevent as many errors as possible.)
That's simple and easy. Now imagine something more interesting:
SQL {{
SELECT name, address FROM users, addresses GIVEN { User $user }
}}
It's obvious from the syntax of the query language that the database needs to perform a join operation, and it's obvious that the primary key of the $user
object is the important key of the operation. If the program knows the relationship of the users
and addresses
tables, it can join them effectively as well.
Don't get caught up in the syntax or the semantics of the remainder of examples here; they exist to demonstrate possibilities, not the final form of battle-tested code. Even so, imagine a dynamic query:
SQL {{
SELECT @fields FROM { Table $table_one }, {Table $table_two } }
}}
Again the structure and intent of the code is obvious. The operations are now:
- Find the primary keys for
$table_one
and$table_two
. - Verify that they're joinable.
- Verify that all members of
@fields
are present in either$table_one
or$table_two
. - Construct the query.
If I were to implement this, I'd make a join_tables
multimethod. It takes two arguments (generalizable to more, but follow along with two for now). Imagine that it looks something like this:
multi join_tables( Table $t1, Table $t2 ) { ... }
multi join_tables( Any, Any ) { fail() }
Given two Table
arguments, the first multi candidate matches and gets called. Given any other combination of arguments, the second candidate matches and produces an error.
Knowing that you have two Table
objects isn't enough, however. The tables might have no relationship to each other. Imagine if you somehow could verify that the tables have an appropriate relationship. If I were to implement this, I might check that the keys of the tables matched types, perhaps with a syntax something like:
multi join_tables ( Table $t1, Table $t2 where { $t1.primary_key eqv $t2.foreign_key( $t1 ) } ) { ... }
That is, the keys must be of equivalent types. If one key is a
user_id
and the other is an Integer
, the where clause
won't match for this candidate, so a different multi will get called.
Now imagine that for those embedded SQL minilanguage statements where table name is available at compilation time and sufficient type information exists to verify the statements themselves at compilation time:
SQL {{
SELECT name, address FROM { User users }, { Address addresses }
}}
... then everyone who uses this minilanguage (and has set up the table information appropriately) gets safety and correctness by default. Some of that can even occur before the program runs. The rest of it can occur as the program runs.
(A really, really good type checker and optimization system could infer that some errors are impossible even if it can't prove the use of a single type in every case.)
Now imagine that you have a language which allows you to build minilanguages like this, to build APIs which specify correct operations and fall back to good error reporting on incorrect operations, and which do so without interfering with other code and other extensions.
Welcome to Perl 6.