tern is a refreshingly lightweight tool for handling database schema changes over time.
The best way to show this is with an example.
But first, an aside on terminology. In some contexts, "database migration" describes transitioning from one physical database to another, such as from Oracle to Postgres. But in this context, a "database migration" is simply a change to an existing database's schema.
Now, on to the example.
Setting the Stage
Let's pretend we have a new, greenfield project.
Our project will need a directory to hold our migrations.
$ mkdir migrations
Our new project starts with an empty database that will need tables to support our app. Our first migration file, named 001_creates_initial_setup.sql, will create these initial items.
$ cat << EOF > ./migrations/001_creates_initial_setup.sql
create schema myapp;
create table myapp.users(
id bigint constraint users_pk primary key not null,
first_name text constraint first_name_required check (first_name != '') not null);
EOF
We've not even started using tern, and already there are a lot of assumptions baked into this setup that tern will need to work with:
- We want to keep all of our migration files in the same directory
- We want our migration files to be plain text so that we can check them into git
- We want our migration files to be SQL and not some vendor-specific language
Running the First Migration
Let's run tern and see if it can work with this setup.
$ tern migrate \
--migrations ./migrations \
--version-table myapp_schema_version
The above command ran fine!
Let's check to see if our application's schema and table exist in the app database:
$ psql -c '\dn myapp'
List of schemas
┌───────┬───────┐
│ Name │ Owner │
├───────┼───────┤
│ myapp │ app │
└───────┴───────┘
$ psql -c '\d myapp.users'
Table "myapp.users"
┌────────────┬────────┬───────────┬──────────┬─────────┐
│ Column │ Type │ Collation │ Nullable │ Default │
├────────────┼────────┼───────────┼──────────┼─────────┤
│ id │ bigint │ │ not null │ │
│ first_name │ text │ │ not null │ │
└────────────┴────────┴───────────┴──────────┴─────────┘
Note that we used tern's --version-table argument to tell tern what to name our project's version tracking table. The myapp_schema_version table should therefore exist in the public schema, and it should say our project's schema version is at "1".
$ psql -c '\d public.myapp_schema_version'
Table "public.myapp_schema_version"
┌─────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │ Type │ Collation │ Nullable │ Default │
├─────────┼─────────┼───────────┼──────────┼─────────┤
│ version │ integer │ │ not null │ │
└─────────┴─────────┴───────────┴──────────┴─────────┘
$ psql -c 'select version from public.myapp_schema_version'
┌─────────┐
│ version │
├─────────┤
│ 1 │
└─────────┘
tern is making it easy for us to follow a useful pattern: we put all of our project's tables in a namespace (schema) named myapp. Other apps can put tables in their own namespaces, allowing multiple apps to use the same database without tripping over each other.
Let's spell out two additional criteria that describe this pattern:
- We want our project/app to have its own namespace in the database
- We want our migration tool to keep schema version data per project in the same database
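To make the multi-app pattern concrete, here's a rough sketch of a second, hypothetical app (call it reporting) living in the same database. The app name, directory, table, and version table below are made up for illustration; the tern flags are the same ones we already used.
$ mkdir reporting-migrations
$ cat << EOF > ./reporting-migrations/001_creates_reporting_setup.sql
create schema reporting;
create table reporting.daily_totals(
day date constraint daily_totals_pk primary key not null,
total bigint not null);
EOF
$ tern migrate \
--migrations ./reporting-migrations \
--version-table reporting_schema_version
Each app keeps its tables and its schema version tracking in its own lane, so the two tern runs never step on each other.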
Evolving the Schema
Naturally, our project's schema needs to evolve over time. Let's add a last_name column to the myapp.users table.
$ cat << EOF > ./migrations/002_adds_last_name_to_users.sql
alter table myapp.users add column last_name text constraint last_name_required check (last_name != '') not null;
EOF
Now we have these migration files:
$ ls -1 migrations/
001_creates_initial_setup.sql
002_adds_last_name_to_users.sql
(Note that we have named migration files after what will happen if we apply them. Of course, you can adopt any naming scheme that works for your team.)
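As a further aside, tern can generate the next numbered migration file for us so we don't have to count prefixes by hand. I believe the subcommand looks roughly like the following, but treat the exact invocation as an assumption and check tern's built-in help:
$ tern new --migrations ./migrations some_descriptive_name
The expectation (again, an assumption) is that this creates the next file in sequence, e.g. ./migrations/003_some_descriptive_name.sql, ready for us to fill in with SQL.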
We run tern again, and check the state of the database.
$ tern migrate \
--migrations ./migrations \
--version-table myapp_schema_version
$ psql -c '\d myapp.users'
Table "myapp.users"
┌────────────┬────────┬───────────┬──────────┬─────────┐
│ Column │ Type │ Collation │ Nullable │ Default │
├────────────┼────────┼───────────┼──────────┼─────────┤
│ id │ bigint │ │ not null │ │
│ first_name │ text │ │ not null │ │
│ last_name │ text │ │ not null │ │
└────────────┴────────┴───────────┴──────────┴─────────┘
$ psql -c 'select version from public.myapp_schema_version'
┌─────────┐
│ version │
├─────────┤
│ 2 │
└─────────┘
Any developer who checks out the git repo for this project and runs these tern migrations will have a database schema in the same state as any other developer who does the same.
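For example, a teammate bringing up a fresh local database could do something like this (the repo URL is a placeholder, and the connection settings are assumed to already be configured, as described later in this post):
$ git clone https://example.com/our-team/myapp.git
$ cd myapp
$ tern migrate \
--migrations ./migrations \
--version-table myapp_schema_version
Running the same migrations against a fresh database produces the same schema, every time.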
Cautionary Advice
Let's introduce a scenario where we decide that we don't need a last name column after all. One possibility is that we could git rm ./migrations/002_adds_last_name_to_users.sql. That way, anyone who runs the migrations from start to finish on a fresh database will end up with a users table that only has the id and first_name columns.
However, downstream users and databases (such as production!) will already have the last_name column in their users table, and they will be unaware that the migration files were changed!
The Golden Rule of Database Migrations
- NEVER edit or delete an existing migration file.
- ALWAYS add a new migration file to make a schema change.
Always assume that downstream databases have applied every migration file as it was originally written and checked into git. If we decided that adding a last_name column was a mistake that needs to be corrected, we would make a third migration file to undo the change (003_drops_last_name_from_users.sql):
alter table myapp.users drop column last_name;
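Sticking with the style from earlier, a sketch of that fix looks like this:
$ cat << EOF > ./migrations/003_drops_last_name_from_users.sql
alter table myapp.users drop column last_name;
EOF
$ tern migrate \
--migrations ./migrations \
--version-table myapp_schema_version
Databases that already applied migration 002 simply apply 003 and move to version 3, while fresh databases play through all three files and land in exactly the same place.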
Simplifications Exposed
Obviously, the example above is kept simple to show how straightforward it is to use tern. In this next section, let's cover a few details that were glossed over. By the end, tern will still be simple.
Env Vars
You will notice that in all of the above examples, we can just run tern and psql, and both somehow magically connect to the database without providing a single connection parameter! Fun fact: all of the above examples will run if the following env vars are set in your environment:
PGHOST
PGPORT
PGDATABASE
PGUSER
PGPASSWORD
psql works because it uses libpq, which honors every environment variable listed in the docs.
tern, despite not relying on libpq, still recognizes the basic connection env vars used by libpq!
If you set the above env vars in your bash session, both psql and tern use them.
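For example, a bash session might set them like this (every value below is a placeholder for your own environment):
$ export PGHOST=localhost
$ export PGPORT=5432
$ export PGDATABASE=app
$ export PGUSER=app
$ export PGPASSWORD=secret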
This is another smart design decision of tern:
- We want our migration tool to play nicely with PostgreSQL's ecosystem and conventions.
Downloading and Installing
All of the examples assume tern is installed and in the PATH. Here's how to install tern.
We download it, unpack it (there are just two files: LICENSE and tern), put the tern binary in our PATH, and run it.
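Here's a rough sketch of that sequence. The release URL and archive name are assumptions based on a typical GitHub release layout, so check the project's releases page for the real names:
$ curl -LO https://github.com/jackc/tern/releases/download/v2.3.2/tern_2.3.2_linux_amd64.tar.gz
$ tar xzf tern_2.3.2_linux_amd64.tar.gz   # unpacks LICENSE and the tern binary
$ mv tern ~/bin/                          # or anywhere else on your PATH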
$ tern version
tern v2.3.2
That's it. Seriously. tern doesn't even rely on any of Linux's shared libraries:
$ ldd ./tern
not a dynamic executable
It's just one binary that we download and put in our PATH. No JVM required. No fiddling with a CLASSPATH. No installing a scripting language like JavaScript or Python (along with its package managers and virtual envs).
This is yet another smart design choice of tern:
- We want our migration tool to be trivial to install and run by developers and by continuous integration workflows.
Rollbacks to Earlier Schema Versions
We have not discussed tern's ability to roll back to earlier versions of a schema, or to apply a version a few steps behind the latest. But it is possible if we write our migration files with a special ---- create above / drop below ---- SQL comment that tern knows to look for. Our migration files end up being written in the following style, with the SQL necessary to undo the current migration underneath the SQL that applies the migration:
alter table myapp.users add column last_name text constraint last_name_required check (last_name != '') not null;
---- create above / drop below ----
alter table myapp.users drop column last_name;
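With that marker in place, rolling back is a matter of asking tern migrate for an earlier target version. I believe this is done with a destination argument along the following lines, but treat the exact flag as an assumption and confirm against tern's help:
$ tern migrate \
--migrations ./migrations \
--version-table myapp_schema_version \
--destination 1
The idea is that tern runs the "drop below" half of migration 002 and returns the schema to version 1.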
But be warned! In production, if last names are added to users, then the last_name column goes away, and then the last_name column is brought back, it won't come back with the original data; that data is lost forever.
So despite tern's ability to roll back and forth between schema versions, it may be preferable to write migrations in the simpler style shown above. With databases (and data!), it's generally wise to only ever move forward.
Conclusion
A lot of other database migration tools are heavyweight, complex things that introduce whole ecosystems that must be installed and learned and essentially "bought into". These heavyweight tools introduce enough friction that it may be tempting to avoid using a migration tool at the start of a project.
tern is so streamlined that it is almost frictionless to use. This raises the probability that more of us will use a database migration tool from the start of a project.
Let's recap.
We want our migration tool to...
- allow us to keep all of our migration files in the same directory
- allow our migration files to be plain text so that we can check them into git
- allow our migration files to be SQL and not some vendor-specific language
- enable our project to have its own namespace in the database
- keep schema version data per project in the same database
- play nicely with PostgreSQL's ecosystem and conventions
- be trivial to install and run by developers and by continuous integration workflows
It's a completely reasonable list of needs, and yet so many tools fail to satisfy them! tern satisfies them all, so try it for your next project if you are not using it already.