2.10.12

HAcid: multi-row transactions in HBase

HAcid is a client library for HBase applications to enable the support for multi-row ACID transactions in HBase. The isolation level of these transactions is Snapshot Isolation, but Serializability is currently supported  as an unstable feature.

HAcid open-source repo at bitbucket.org 


This is the end result of my Master's Thesis at Aalto University. It was inspired by Google Percolator, but is still different in many ways. While Percolator uses locks for controlling concurrency, HAcid is lock-free because it employs optimistic concurrency control.

Using HAcid is straightforward:
import fi.aalto.hacid.*;

Configuration conf = HBaseConfiguration.create();
HAcidClient client = new HAcidClient(conf);

// To enable transactions, use this wrapper instead of HTable
HAcidTable table = new HAcidTable(conf, "mytable");

// Start a new transaction
HAcidTxn txn = new HAcidTxn(client);

// Use HAcidGet instead of HBase's Get
HAcidGet g = new HAcidGet(table, Bytes.toBytes("row1"));
g.addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("qual1"));
Result r = txn.get(g);

// Use HAcidPut instead of Put
HAcidPut p = new HAcidPut(table, Bytes.toBytes("row1"));
p.add(Bytes.toBytes("fam1"), 
      Bytes.toBytes("qual1"), 
      Bytes.toBytes("value"));
txn.put(p);

// Commit the transaction
boolean outcome = txn.commit(); 
// true is "committed", false "aborted"

The algorithms in the HAcid library rely heavily on HBase's single-row transactions, in particular CheckAndPut. One of the tricks employed was explained on this blog already. My Thesis is a complete documentation of the system.

The license is Apache 2.0.
Feel free to comment, ask, fork at bitbucket, etc.