Database versioning part 2 - Liquibase - how to use [video]

liquibase screenshotThis is a second part of a presentation about database schema versioning. If you haven’t watched/read the first part, please do so. It describes the problem of versioning and a concept of incremental SQL scripts that is a solution to it.

This time I will present one of the tools that supports that concept. This part will be very practical concentrated less on theory more on usage of the theory from the first part. The tool I am going to us is Liquibase. I choice was determined by number of developers using it, long life and frequent updates. All above together may mean, the is worth of our attention. I plan to shortly describe the idea behind it and to show how it works in a practical real life scenario.

The article has a video version attached.

Let’s assume, our company has a product. Currently the latest version is 3. As the product uses a database, a version of database schema is also 3. The schema contains some tables, columns, foreign keys etc. A team is working on a next release – version 4. During a development process, a new table is created, some columns are added, some other are dropped.

When version 4 development and testing is finished, not all the work is done.

Then the key question is – how to upgrade existing product installations of version 3 to version 4 without losing data stored in version 3. And here is where Liquibase comes into play.

 

Liquibase installation

You can find Liquibase on www.liquibase.org website. It is licensed under the Apache 2.0 License. Liquibase is a software installed on a developer computer. It works with the most popular database engines: MySQL, PostgreSQL, Oracle, SQL Server, Sybase, DB2 and more. For purpose of this presentation I will use SQL Server 2014 to show you how it works. And when first connected to a database it creates two tables for its internal purposes.

Before I start showing you how it can be used, take a look how to set it up. Getting it ready to work is extremely simple:

  1. You can download it from www.liquibase.org.
  2. Extract it to a local directory.
  3. Add the directory path to environment PATH variable.

And that is it. Liquibase is ready to work.

In a result I can execute liquibase.bat and it prints me a help screen with a list of commands and parameters.

 

New database development

Let’s assume I have just made a decision to develop a new application with my team. I have created a database that will be used by this application. The database is called MyDatabase and it is empty. There are no tables in it.

I create databaseChangeLog.sql file in my project directory. This file has to have a header that will tell Liquibase that it is a SQL file.

--liquibase formatted sql

It is necessary because a few different file formats are supported. I will tell you more later about the formats. In this presentation I will use SQL format.

I will be adding SQL scripts to this file so it should be treated as all other source code files in my project. For example it should be checked into a source code repository like Git or Subversion. As I haven’t made any change yet, I leave this file just with the header.

 

Liquibase initialization

As you remember I said Liquibase requires two tables which are not yet created. Fortunately, I do not need to do it manually, I can let Liquibase do the job. I just need to make Liquibase connect to MyDatabase. I can use update command from liquibase.bat. Additionally, I need to pass a few parameters to tell the tool which JDBC driver to use, its location, database URL and credentials. You can see them below.

liquibase --driver=com.microsoft.sqlserver.jdbc.SQLServerDriver --classpath="c:\Program Files\Microsoft JDBC Driver 4.0 for SQL Server\sqljdbc_4.0\enu\sqljdbc4.jar" --changeLogFile=databaseChangeLog.sql --url="jdbc:sqlserver://localhost\DEVSQL01;databaseName=MyDatabase" --username=sa --password=saPass update

It takes a few seconds to connect to the database, check its state … and it is done. Two tables were created in MyDatabase.

Liquibase DatabaseChangeLog table

Now, let me quickly describe what those tables are for. On the image below, you can see a sample content of the DATABASECHANGELOG table.

Liquibase DatabaseChangeLog table content

Each row represents a change made to the database. It contains many, more or less useful, information about the change like timestamp when it was applied, who created a script for that change.

The first two columns identify a change – ID and AUTHOR. Even if you may think change ID should be enough to know which change it is about, developers of Liquibase decided to use AUTHOR column also to avoid ID conflicts. It is easier to tell each developer to maintain their own sequence of numbers making sure they are unique than to secure uniqueness of IDs across the whole development team. As a result, each change is identified by author name and change id.

Each change visible in the table comes from the change log file which in our case is called databaseChangeLog.sql.

liquibase table and file

If Liquibase is requested to upgrade a database structure to the latest version using databaseChangeLog.sql, there is nothing simpler. It looks into DATABASECHANGELOG table, compare it to content of the file and very easily decide which changes are missing in the table. It is worth to mention that scripts in the file are marked with id and author. Then, there is nothing left besides executing the missing scripts from the file on the database and adding proper rows to the table.

Finally, it is time to see how it works in practice.

 

Upgrading database schema with Liquibase

During the development process, I have written a script to create a table accounts. My change starts with a header that marks the change as the first one of all my changes. Below, you can also notice, there is a rollback section.

It tells Liquibase how to roll back the change in case of problem or on request because later I can tell Liquibase to roll back my change. Now, I want to apply it on the database. To do so, I only need to execute liquibase.bat with update option. The same I used to create the two tables at the beginning. After doing this, the table shows up in MyDatabase.
I think it is time to take a look at DATABASECHANGELOG table content … The first row showed up with a change ID=1 of AUTHOR=Me. Thanks to that, Liquibase will know in the future that this change will have already been executed and to not apply it again.

However, the development has not finished, yet and I need one more table and some initial data in both tables. All I need to do is to add the changes at the end of the file. Please notice, each of them has rollback section at the end. Rollback for CREATE TABLE is to DROP TABLE, rollback for inserting rows is deleting them.

--liquibase formatted sql
--changeset Me:1
CREATE TABLE [dbo].[accounts](
[accountId] [int] NOT NULL,
[creationDate] [datetime] NOT NULL,
[name] [varchar](100) NULL,
CONSTRAINT [PK_accounts] PRIMARY KEY CLUSTERED
(
[accountId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[accounts] ADD CONSTRAINT [DF_accounts_creationDate] DEFAULT (getdate()) FOR [creationDate]
GO
--rollback drop table [dbo].[accounts];

--changeset Me:2
CREATE TABLE [dbo].[transactions](
[transId] [int] IDENTITY(1,1) NOT NULL,
[creationDate] [datetime] NOT NULL,
[accountId] [int] NOT NULL,
[value] [money] NULL,
CONSTRAINT [PK_transactions] PRIMARY KEY CLUSTERED
(
[transId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
--rollback drop table [dbo].[transactions]

Now, I am going to use update option again to apply the latest database structure. The second table shows up and initial data is inside.

liquibase --driver=com.microsoft.sqlserver.jdbc.SQLServerDriver --classpath="c:\Program Files\Microsoft JDBC Driver 4.0 for SQL Server\sqljdbc_4.0\enu\sqljdbc4.jar" --changeLogFile=databaseChangeLog.sql --url="jdbc:sqlserver://localhost\DEVSQL01;databaseName=MyDatabase" --username=sa --password=saPass update

I am sure you can easily guess content of DATABASECHANGELOG table. Yes, it now contains three rows. Two new rows have just been added.

databaseChangeLog content - three rows

 

Rolling database changes back

Sometimes it happens that unexpected events occur like a defect created during the development or just design alters and some previous changes are no longer needed or even must be rolled back.

Although, Liquibase has a few options to roll back, I am going to use just one of them – rollbackCount. It allows to rollback a bunch of last changes. In my example I provide number 2 which means to rollback last two changes.

liquibase --driver=com.microsoft.sqlserver.jdbc.SQLServerDriver --classpath="c:\Program Files\Microsoft JDBC Driver 4.0 for SQL Server\sqljdbc_4.0\enu\sqljdbc4.jar" --changeLogFile=databaseChangeLog.sql --url="jdbc:sqlserver://localhost\DEVSQL01;databaseName=MyDatabase" --username=sa --password=saPass rollbackCount 2

Let’s see what happens... Transactions table disappeared like it would have never existed. It is interesting to verify content of DATABASECHANGELOG table.

databaseChangeLog after rolling back

Now it contains only the first row, which means the next two can still be applied to the database. But I am not going to do that now.

 

Generating upgrade and rollback scripts

Another developer, Miranda, looks at my scripts and discovers that a foreign key between accounts and transactions tables is missing. She creates a script to add it with her name and id=1 in the header. The script is added to databaseChangeLog.sql.

...
--changeset Miranda:1
alter table [dbo].[transactions]
add constraint FK_transactions_accountId foreign key (accountId) references dbo.accounts (accountId);
--rollback alter table [dbo].[transactions] drop constraint FK_transactions_accountId;

As you already know how to update database structure using update command, I will do a similar task using updateSQL option. If you do not have administrator access to a database you want to update, for example the database is hosted by your customer or there is a production DBA in your company and you need to provide him with SQL scripts then updateSQL option is for you. Instead of applying the scripts to a database it creates a SQL file that can be executed later. This is how syntax looks like.

liquibase --driver=com.microsoft.sqlserver.jdbc.SQLServerDriver --classpath="c:\Program Files\Microsoft JDBC Driver 4.0 for SQL Server\sqljdbc_4.0\enu\sqljdbc4.jar" --changeLogFile=databaseChangeLog.sql --url="jdbc:sqlserver://localhost\DEVSQL01;databaseName=MyDatabase" --username=sa --password=saPass updateSQL > update.sql

By default, SQL statements are produced to the standard output but I redirect it to a file – update.sql. After execution, update.sql file appears in the project directory and the file contains three missing changes: mine 2 and 3, and Miranda’s first one.

-- *********************************************************************
-- Update Database Script
-- *********************************************************************
-- Change Log: databaseChangeLog.sql
-- Ran at: 02.12.14 07:14
-- Against: sa@jdbc:sqlserver://localhost\devsql01:60245;authenticationScheme=nativeAuthentication;xopenStates=false;sendTimeAsDatetime=true;trustServerCertificate=false;sendStringParametersAsUnicode=true;selectMethod=direct;responseBuffering=adaptive;packetSize=8000;multiSubnetFailover=false;loginTimeout=15;lockTimeout=-1;lastUpdateCount=true;encrypt=false;disableStatementPooling=true;databaseName=MyDatabase;applicationName=Microsoft JDBC Driver for SQL Server;applicationIntent=readwrite;
-- Liquibase version: 3.2.2
-- *********************************************************************
-- Lock Database
-- Changeset databaseChangeLog.sql::2::Me
CREATE TABLE [dbo].[transactions](
	[transId] [int] IDENTITY(1,1) NOT NULL,
	[creationDate] [datetime] NOT NULL,
	[accountId] [int] NOT NULL,
	[value] [money] NULL,
 CONSTRAINT [PK_transactions] PRIMARY KEY CLUSTERED 
(
	[transId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [dbo].[DATABASECHANGELOG] ([ID], [AUTHOR], [FILENAME], [DATEEXECUTED], [ORDEREXECUTED], [MD5SUM], [DESCRIPTION], [COMMENTS], [EXECTYPE], [LIQUIBASE]) VALUES ('2', 'Me', 'databaseChangeLog.sql', GETDATE(), 2, '7:a367eeef674297d94bf89b3dfb99f2a2', 'sql', '', 'EXECUTED', '3.2.2') GO
-- Changeset databaseChangeLog.sql::3::Me insert into dbo.accounts (accountId, creationDate, name) values (1, getDate(), 'account1') GO
insert into dbo.accounts (accountId, creationDate, name) values (2, getDate(), 'account2') GO
insert into dbo.transactions (creationDate, accountId, value) values (getDate(), 1, 10.34), (getDate(), 1, 0.99), (getDate(), 2, 300) GO
INSERT INTO [dbo].[DATABASECHANGELOG] ([ID], [AUTHOR], [FILENAME], [DATEEXECUTED], [ORDEREXECUTED], [MD5SUM], [DESCRIPTION], [COMMENTS], [EXECTYPE], [LIQUIBASE]) VALUES ('3', 'Me', 'databaseChangeLog.sql', GETDATE(), 3, '7:6726de1d7120531df2c31926b5bb9b21', 'sql', '', 'EXECUTED', '3.2.2') GO
-- Changeset databaseChangeLog.sql::1::Miranda alter table [dbo].[transactions] add constraint FK_transactions_accountId foreign key (accountId) references dbo.accounts (accountId) GO
INSERT INTO [dbo].[DATABASECHANGELOG] ([ID], [AUTHOR], [FILENAME], [DATEEXECUTED], [ORDEREXECUTED], [MD5SUM], [DESCRIPTION], [COMMENTS], [EXECTYPE], [LIQUIBASE]) VALUES ('1', 'Miranda', 'databaseChangeLog.sql', GETDATE(), 4, '7:9f6b6316dbc551dc22f68140e6edacb2', 'sql', '', 'EXECUTED', '3.2.2') GO
-- Release Database Lock

 

You can see insert statements generated at the end of each change to add rows to DATABASECHANGELOG. If your customer or a production DBA requires also rollback scripts just in case, you can generate it by using futureRollbackSQL option.

liquibase --driver=com.microsoft.sqlserver.jdbc.SQLServerDriver --classpath="c:\Program Files\Microsoft JDBC Driver 4.0 for SQL Server\sqljdbc_4.0\enu\sqljdbc4.jar" --changeLogFile=databaseChangeLog.sql --url="jdbc:sqlserver://localhost\DEVSQL01;databaseName=MyDatabase" --username=sa --password=saPass futureRollbackSQL > rollback.sql

After execution, rollback.sql appears and it contains rollback scripts for each changes from update.sql file in a reversed order.

-- *********************************************************************
-- SQL to roll back currently unexecuted changes
-- *********************************************************************
-- Change Log: databaseChangeLog.sql
-- Ran at: 02.12.14 07:15
-- Against: sa@jdbc:sqlserver://localhost\devsql01:60245;authenticationScheme=nativeAuthentication;xopenStates=false;sendTimeAsDatetime=true;trustServerCertificate=false;sendStringParametersAsUnicode=true;selectMethod=direct;responseBuffering=adaptive;packetSize=8000;multiSubnetFailover=false;loginTimeout=15;lockTimeout=-1;lastUpdateCount=true;encrypt=false;disableStatementPooling=true;databaseName=MyDatabase;applicationName=Microsoft JDBC Driver for SQL Server;applicationIntent=readwrite;
-- Liquibase version: 3.2.2
-- *********************************************************************
-- Lock Database
-- Rolling Back ChangeSet: databaseChangeLog.sql::1::Miranda
alter table [dbo].[transactions] drop constraint FK_transactions_accountId
GO
DELETE FROM [dbo].[DATABASECHANGELOG] WHERE ID='1' AND AUTHOR='Miranda' AND FILENAME='databaseChangeLog.sql' GO
-- Rolling Back ChangeSet: databaseChangeLog.sql::3::Me delete from dbo.transactions where accountId in (1, 2) GO
delete from dbo.accounts where accountId in (1, 2) GO
DELETE FROM [dbo].[DATABASECHANGELOG] WHERE ID='3' AND AUTHOR='Me' AND FILENAME='databaseChangeLog.sql' GO
-- Rolling Back ChangeSet: databaseChangeLog.sql::2::Me drop table [dbo].[transactions] GO
DELETE FROM [dbo].[DATABASECHANGELOG] WHERE ID='2' AND AUTHOR='Me' AND FILENAME='databaseChangeLog.sql' GO
-- Release Database Lock

Please notice, the file contains changes that has not yet been applied to the database. The file makes sense only after running update.sql. It is a very useful option because it allows preparing all scripts without actually making any change in the database.

 

Other options

This is pretty much all what I wanted to show you. I can only tell you a little more about Liquibase features that I did not covered in this presentation.

You probably remember me mentioning at the beginning that for purpose of this presentation that I was going to use SQL format of database change log file. So now, I can tell you that you have a few other options. Liquibase supports XML in the first place. If you use this format, Liquibase understands your changes and can automatically infer what should rollback script contain. For example, if your change is a table creation, you do not need to explicitly write that dropping the table is a rollback action, as I had to in case of SQL format, Liquibase just knows it. If for some reasons you do not want to use XML, SQL, you can also choose YAML or JSON formats.

Regarding other cool features, Liquibase commands can be executed not only from a command line, but it also integrates with Ant and Maven. It can be easily used in Java.

During my presentation I used only a few options out of many available in Liquibase. All of them can be found in the documentation. There exist various update and rollback options, but also you can tag a state of a database, generated JavaDoc documentation, validate changelog for errors and many more. I think you will find there exactly what you need.

 

Video presentation