Database migrations with Liquibase

Liquibase

Liquibase is very powerful and flexible database migration tool. It can be used as a stand-alone application and thus can update the schema of any RDBMS it supports, but most of the times it is more beneficial to have it integrated into Java-based applications. In this post I will cover the technique of integrating it with the project.

Database migration in general

Database migration is the process of updating the schema of a (developer, test or production) database from state A to state B. Database states are bound to different revisions of the application source tree. When updating the schema from state A to state B, A can be anything including an initial empty database, but B must be at least 1 revision later. These states often represent different versions of an application where schema evolution was part of the ongoing development process.  Database migration tools are designed to achieve the goal of making this process traceable, visible, convenient and more fault-tolerant. Using database migration tools provides pretty the same benefits as using revision control systems for source code.

Application development frameworks / ORM providers often have support for migrations but most of the time they are used for rapid prototyping in development time and don’t meet the requirements for production use. For example, Hibernate has the option hbm2ddl which can have various settings. One of them is hbm2ddl=update. This way, Hibernate automatically fires the necessary DDL statements to create or modify the schema to fit the application’s entities – but executing these statements might result in data loss. For more information, check the official documentation.

Leading migration solutions are database-agnostic, so the migration can be portable between RDBMS products (obviously with certain limitations). This unburdens the process of changing the RDBMS that our application uses.

At this point we have to mention that these tools are designed for schema migration but can be also used for data migrations (insert, update, delete) if we don’t have any better idea (we can use INSERT, UPDATE and any other query in the changeset at any time).

What is Liquibase?

Liquibase is one of the well known migration tools out there for Java applications. In general it can be used to:

  • apply a set of changes,
  • analyze the schema of a database and export the changeset for later use (can be useful for introducing liquibase to an already in-production project)

It has an extensive feature set yet it’s simple enough to get started with.

Changelog

The very heart of a Liquibase-managed migration is the changeLog file. It defines what changes are necessary in what order. Traditinally, it’s an XML document but these days there is support for the following formats:

  • XML
  • YAML
  • JSON
  • SQL (required metadata in commented rows, native sql changes inlined between them)
  • Groovy

We will stick to XML in the samples. Generally all practices are available in any formats and XML changesets are the most common in projects.

General approach

The changelog file describes all the necessary schema changes in a strict incremental approach, each change depending of all the previous changes in the order of definition (or to be more precise, expecting that previous changes are/will be ran in the order of definition). The changelog file is continuously appended by the developers and once a change is introduced into the changelog file (and potentially executed against some DB), it must never be altered, because it could mess up Liquibase’s expectations. This is because Liquibase maintains a DATABASECHANGELOG table that records what changes are already ran (with an MD5 hash computed from metadata and content of a changeSet). The hash is (re)computed on all described changes every time an update is executed and Liquibase checks for DATABASECHANGELOG table to determine if the actual changeSet is ran or not.

Example changelog


This is a basic changeLog XML file which uses a database agnostic refactor command to create a table with two columns. Straightforward, and portable.

The changeLog file is designed to hold your change definitions which is covered in the next section.

The full documentation on the changelog file can be found here.

The basics: refactor tags, native SQL, inclusions

Refactor tags, one example

In the terminology of Liquibase, refactors are common DDL tasks such as create, rename, drop, alter and so on.

The full list of available refactors is here.

In an optimal scenario, you can have a migration plan that is fully portable between about a dozen of RDBMS systems.

Here is an example:

Native SQL

You can use any native SQL queries as part of your migration. Three possible approaches are there:

  • Use the SQL file changeLog format and inline your queries between commented blocks, see the previous section
  • Use native SQL directly in your XML/JSON/etc changeLog file (preferred if the SQL query is very short)
  • Place your native SQL queries in seperated .sql files and include them into a changeSet (preferred if it’s more than 2-3 short queries)

The first approach is rarely used so we will include examples only for the second and third.

Inline native SQL


Load external SQL

ChangeLog inclusions

You can include any number of additional changeLog files into the main file. This can be useful for better arrangement of your migration to prevent an infinitely growing XML file.

In the example we divided the changeSets into a central import-only file, and two seperated changeLog files for our major revisions.

Included changeLog files


The order of included files are significant as usual.

Advanced features: preconditions, custom preconditions, boolean logic

Basics

You can define some preliminary requirements before a changeSet is run. If the precondition evaluates to true, the change will run. If it’s false, the developer can choose the outcome:

  • HALT: An error message will be emitted and the migration process will stop. The changeSet and following changeSets won’t be marked as run.
  • CONTINUE: Skips the actual changeSet and continue on the next. The changeSet won’t be marked as run and on the next update, Liquibase will attempt to execute it again
  • MARK_RAN: The changeSet won’t be run, but will be marked as run. This is commonly used when introducing Liquibase to an already used database (you expect some object to not exists, and on fail, you know it’s already there, so you mark it as ran)
  • WARN: A warning message will be emitted, continues on the next changeSet.

There are three types of preconditions:

  • database-agnostic precondition tags (similar to refactor commands): tableExists, foreignKeyConstraintExists and so on,
  • sql-check: native SQL query with a single integer result with a provided expected value.
  • custom preconditions: A Java class implementing the liquibase/precondition/CustomPrecondition interface. Arbitrary preconditions can be implemented this way.

The full list of available preconditions and detailed info can be found here.

Precondition tag example

Database-agnostic preconditions can be used just like refactor commands.

Note that the following example should not be used inside a changeSet but inside a databaseChangeLog, as it is related to the whole migration process and not a specific changeSet.

Example precondition

As you can see, this preConditions tag contains two preconditions, which is evaluated in the order of definition and they are evaluated using an implicit AND operator.

SQL checks example
Example SQL check precondition

Custom precondition example

A custom precondition is in fact a Java class created by the actual application developer which implements the liquibase/precondition/CustomPrecondition interface. Using the Liquibase API, an arbitrary precondition can be used. There are some conventions to make sure your class signals condition errors the right way, examine the handling of the exceptions in the example.

You can assign any number of attributes to your class. They should be non-final private members without assigned value. Liquibase will fill them using reflection with the values supplied in the corresponding XML attributes.

In the following example, by nesting the precondition into a not element, we expect false evaluation of the custom precondition, so we can achieve a ‘foreign key not exists’ precondition before actually creating one.

The following custom precondition can be used to check if a column on a table has a foreign key constraint for a foreign table without knowing the name of this constraint.

Custom precondition

Note that we deleted the boilerplate code which should be added for real use.

Custom precondition class

AND/OR/NOT logic

Nesting andor and not tags is also possible. Example can be found above, we simply inversed the output of the precondition with a not tag. Pretty straightforward feature.

Further advanced features: custom changes

Just like the custom preconditions with Java classes are available, a custom change can be created as a Java class implementing the interface liquibase/change/custom/CustomTaskChange.

Why would anyone implement a database DDL/DML in Java when there are DB agnostic refactor commands or as last resort native SQL queries. Well, the perfect example would be some special data migration, which is easily implemented in Java because it’s rich feature set (for example, complex String manipulations, hashing plain text password columns, and so on), but would be painful/hacky in the standard ways.

You can include a custom change implementation with the customChange tag.

Using the custom change

Custom change implementation

Liquibase commands

Liquibase supports a number of commands. The command is actually what Liquibase will do after invocation, and one invocation is initiated with exactly one command of the following:

  • Update: Look at the provided changeLog, calculate MD5 hashes for each changeSets. Look into the DATABASECHANGELOG table, and check which changeSets did not run yet. Run them and update DATABASECHANGELOG, marking them ran.
  • Rollback: Roll back previously executed changeSets
  • Diff: Checks the database and reports the differences between it’s schema and the provided changeLog
  • Generate changelog: Analyzes the database schema and generates a changeLog which can be used to produce the same schema on an initial database. (Generally it works well but the documentation states that for various RDBMS systems it lacks some features, such as functions for Oracle, which must be added by hand after).

Integration into the application

This topic is strictly about the technical details of integrating Liquibase into a Java SE or EE app and doesn’t cover the necessary steps when you have to introduce it to existing databases with schema and data. That is discussed in a seperated topic.

Integrating Liquibase

Liquibase is available from Maven Central. The current latest version is used here at the time of writing the article, so you should check if there’s a new version.

Liquibase Maven artifact

If you want to use the Liquibase Maven plugin to execute migrations from a maven goal, you have to include this:

Liquibase maven plugin artifact

If you want to use Liquibase from Gradle, the artifact is also available: http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22gradle-liquibase-plugin%22

Starting Liquibase from some build goal

There are a number of options.

  • For ant, info here
  • For maven, info here
  • For gradle, there seems to be a mature plugin to do the job, info here
Java EE application

Liquibase can be integrated into the deployment process of Java EE applications. For this, you must edit the deployment descriptor (web.xml) and add the Liquibase servlet listener.

web.xml

Important: always place Liquibase listener before any other listeners. In the Java EE specification, there is a strict rule for container implementations to call listeners in the defined order. Not putting into the first position might cause interference with ORM providers and so on.

This way, at each (re)deployment, the schema migration will be handled automatically by Liquibase.

There is another option for the cases when the application uses a CDI container. The official documentation is here.

Introduction into existing databases

Introduction into existing databases means you have one or more databases with schema and data (which are valuable so the introduction of Liquibase must not alter them in any way).

Technically speaking, the only missing stuff from such a database is the DATABASECHANGELOG table which is created and maintained by Liquibase to track the changes. Also, from the application’s side, you need a correct databaseChangeLog that’s execution results in a schema that equals to our existing schema.

We will divide the process into two different operations:

  1. Creating the correct databaseChangeLog which can produce exactly the same schema we need
  2. Creating the correct meta table, to make Liquibase think it has been taking care of migrations since the beginning

Creating the correct databaseChangeLog

There are a number of possibilities.

  • The developer analyzes schema with his/her eyes, and writes all the changeSets with his/her hand. This is error-prone and could take a very long time (and will drive the developer mad)
  • (preferred) The developer uses Liquibase’s “generate changelog” feature, and checks the documentation after what schema constructs it skipped, and appends them by hand (for Oraclefunctions won’t be exported for sure. Each missing support for various RDBMS systems are documented)

Let’s assume we have a complete and correct changeLog that describes our schema with 100% precision, and hop to the next step.

Creating the correct meta table

Again, we have two possibilities. This time we won’t mark any of them as preferred.

  • As I wrote in the previous section, we have a correct changeLog. We could execute it against an initial empty database, cut Liquibase meta table and insert it into our production databases. Well, tempting. We could save valuable time, but needs strong attention to not mess up anything.
  • Take our changeLog and insert correct preconditions before each and every change
    • This could take some time depending the complexity of our schema
    • In general, we should create precondition which expects some database object to NOT exist (that object we want to create). Upon fail (it does exist), mark this change RAN with MARK_RAN option (see above).
    • The obvious advantage is that we will have a changeLog that can be applied to any of the in-use databases and an empty database also

Extensions

Hibernate Integration

Allows automatic generation of changeSets based on the differences of the entity classes and the database schema. Can be found here.

Oracle extensions

Allows many new refactor tags for Oracle like CreateMaterializedView, AddDeferredPrimaryKey and so on. Can be found here.

Conclusions

I used Liquibase for about 3 years on a Java application development project. The tool passed the test of time and its various features were used daily. As the time passed, the application went under several structure changes, like migration from a container-based operation to a standalone operation (GlassFish -> self-contained Jetty), invocation from build tool and direct invocation was introduced (using JCommander), etc, but we always kept Liquibase around to do its work.

As a personal opinion, the best thing about the Liquibase integration into the source tree was that when the developer was checking older revisions of the source tree (with some visual Git repository viewer), the commits were composed of application source code changes and also database schema changes, hand in hand, and it was easy to track which part of the schema was modified by whom and when.

There were occasional complex data migration tasks where we used Java-based custom changes. It was very easy to do the necessary DML operations, because we got a standard JDBC connection and we could use Java’s rich feature set for the task, instead of trying to achieve our goals with raw SQL statements.

Never miss an article by subscribing to our monthly digest!

External resources

Liquibase home

Documentation home

Database changelog file

Refactor commands

Preconditions

Issue tracker

Stack overflow

Gyorgy Abraham

Latest posts by Gyorgy Abraham (see all)

Summary
Database migrations with Liquibase
Article Name
Database migrations with Liquibase
Description
Liquibase is very powerful and flexible database migration tool. In this post I will cover the technique of integrating it with the project.
Author
Publisher Name
Atos Consulting CH
Publisher Logo

One thought on “Database migrations with Liquibase

  1. Avatar
    Lucas Araújo Reply

    Can you help me?
    I have two includes in my db.changelog-master.xml, just like this:

    The first include is database schema, and my second include:

    I run the liquibase:update command in maven for create and put the insert’s in my database.

    But i have a problem, if i need insert more data in data.sql file i receive this error:
    Failed to execute goal org.liquibase:liquibase-maven-plugin:3.4.1:update (default-cli) on project project-name: Error setting up or running Liquibase: Validation Failed: 1 change sets check sum

    Theres a way to add more data in the data.sql file and the liquibase insert just the rows that isn’t in the database ?

    Thank you

Leave a Reply

Your email address will not be published. Required fields are marked *