
2.10.12

HAcid: multi-row transactions in HBase

HAcid is a client library that enables support for multi-row ACID transactions in HBase applications. The isolation level of these transactions is Snapshot Isolation; Serializability is currently supported as an unstable feature.

HAcid open-source repo at bitbucket.org 


This is the end result of my Master's Thesis at Aalto University. It was inspired by Google Percolator, but differs in many ways. While Percolator uses locks for concurrency control, HAcid is lock-free because it employs optimistic concurrency control.

Using HAcid is straightforward:
import fi.aalto.hacid.*;

Configuration conf = HBaseConfiguration.create();
HAcidClient client = new HAcidClient(conf);

// To enable transactions, use this wrapper instead of HTable
HAcidTable table = new HAcidTable(conf, "mytable");

// Start a new transaction
HAcidTxn txn = new HAcidTxn(client);

// Use HAcidGet instead of HBase's Get
HAcidGet g = new HAcidGet(table, Bytes.toBytes("row1"));
g.addColumn(Bytes.toBytes("fam1"), Bytes.toBytes("qual1"));
Result r = txn.get(g);

// Use HAcidPut instead of Put
HAcidPut p = new HAcidPut(table, Bytes.toBytes("row1"));
p.add(Bytes.toBytes("fam1"), 
      Bytes.toBytes("qual1"), 
      Bytes.toBytes("value"));
txn.put(p);

// Commit the transaction
boolean outcome = txn.commit(); 
// true means "committed", false means "aborted"

The algorithms in the HAcid library rely heavily on HBase's single-row transactions, in particular checkAndPut. One of the tricks employed was already explained on this blog. My Thesis documents the system in full.

The license is Apache 2.0.
Feel free to comment, ask, fork at bitbucket, etc.

22.5.12

First and last rows in an HBase table

HBase's Client API is quite limited.
This is because HBase itself must support only simple operations so that the data store can be "cloudy", that is, highly distributed and highly scalable. For this reason the API will probably stay simple.

One interesting question is:
How do I get the last row in an HBase table?

And the disappointing answer is:
You have no direct access to the last row of an arbitrary HBase table.

The word "direct" is important, because technically you can get the last row, in an indirect and inefficient way. Just scan the table starting from any row, and when the scan stops, the last result returned is the last row in the table. This is obviously impractical if your table is even slightly big.
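In code, the indirect approach amounts to exhausting a scanner and keeping the final result. Since HBase's ResultScanner is Iterable&lt;Result&gt;, the logic can be factored into a generic helper; this is a sketch, and lastOf is a hypothetical name, not part of the HBase API:

```java
import java.util.Iterator;
import java.util.List;

public class LastRow {
    // Exhausts an iterator and returns its final element (null if empty).
    // With HBase, lastOf(scanner.iterator()) on a full-table ResultScanner
    // yields the last row -- at the cost of reading every row in the table.
    static <T> T lastOf(Iterator<T> it) {
        T last = null;
        while (it.hasNext()) {
            last = it.next();
        }
        return last;
    }

    public static void main(String[] args) {
        // Stand-in for scan results; a real scan would stream Result objects.
        List<String> rows = List.of("row1", "row2", "row3");
        System.out.println(lastOf(rows.iterator())); // prints "row3"
    }
}
```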

However, if you can afford to change the row key schema of a table, efficient access is possible.

The strategy is to keep the last row key constant, so you get direct access to it, and to allow the table to grow at the beginning or in the middle. Getting the first row requires a plain scan that stops after returning the first result.

This strategy works well for tables whose row keys are consecutive integers, such as timestamps or numeric IDs. Normally, when row keys are integers, the table receives rows in increasing key order. But if the table instead receives rows in decreasing order of row key, you get easy access to both the first and last rows. This works because HBase tables are always sorted by row key.

The first inserted row should have the highest row key. Subsequent inserted rows are "prepended" to the table.
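One common way to obtain this decreasing order from a naturally increasing sequence (such as timestamps or auto-incremented IDs) is to store the complement Long.MAX_VALUE - id as the row key. Because HBase compares row keys as unsigned bytes and big-endian-encoded non-negative longs sort in numeric order, newer rows get smaller keys and land at the front of the table. A minimal sketch of the idea (reversedKey is an illustrative name, not an HBase API):

```java
public class ReversedKeys {
    // Maps an increasing sequence of non-negative ids to a decreasing
    // sequence of row keys, so the newest row always sorts first.
    static long reversedKey(long id) {
        return Long.MAX_VALUE - id;
    }

    public static void main(String[] args) {
        long older = reversedKey(1000L); // first inserted comment
        long newer = reversedKey(1001L); // next comment
        // The newer row key is smaller, so its row is "prepended".
        System.out.println(newer < older); // prints "true"
    }
}
```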


Essentially, we are turning the table "upside down" with regard to row keys.

Consider an example of a table where rows are comments on a blog post. Row keys are (long) integers, such that the first inserted row has the highest row key, and each subsequent row receives a smaller row key than all existing rows. For simplicity, assume in our example that the largest long value is 20.
The table starts with the first comment inserted:
rowkey   commentmsg     name      date
20       "Nice post"   "Smith"  "May 22"

The next comment added should have a row key that is smaller than the largest row key. For example, the number before 20.
rowkey   commentmsg     name      date
19       "I agree"     "John"   "May 23"
20       "Nice post"   "Smith"  "May 22"

When row keys are integers as IDs, it is convenient to use consecutive decreasing numbers, so essentially we are always prepending the table when inserting a new row.
rowkey   commentmsg     name      date
18       "Cool"        "Roger"  "May 24"
19       "I agree"     "John"   "May 23"
20       "Nice post"   "Smith"  "May 22"

So for retrieving the first row, one just needs to:
Scan scan = new Scan();
ResultScanner scanner = commentsTable.getScanner(scan);
Result firstRow = scanner.next();
scanner.close();

Retrieving the last row is simple, since the row key for the last should be constant:
Get get = new Get(Bytes.toBytes(20L));
Result lastRow = commentsTable.get(get);

If you are wondering how to "prepend" when inserting rows, that is the hardest part, but it is not very hard in practice. Basically, one just needs to: (1) retrieve the row key of the current first row, and (2) try to insert a row immediately before it.

Step 1 is simply retrieving the first row, and getting its row key. Step 2 uses checkAndPut to deal with concurrent clients. This is necessary because between step 1 and step 2 another client could have prepended the table, hence changing the knowledge of the first row.

The prepending code should look like:
// Get the first row key (Step 1)
Scan scan = new Scan();
ResultScanner scanner = commentsTable.getScanner(scan);
Result firstRow = scanner.next();
scanner.close();

long prependedRowKey = Bytes.toLong(firstRow.getRow()) - 1;

boolean prependSucceeded = false;
// Try to prepend (Step 2)
do {
    prependSucceeded = commentsTable.checkAndPut(
        Bytes.toBytes(prependedRowKey),
        Bytes.toBytes("colfam"),
        Bytes.toBytes("commentmsg"),
        null,
        new Put(Bytes.toBytes(prependedRowKey)).add(
            Bytes.toBytes("colfam"),
            Bytes.toBytes("commentmsg"),
            Bytes.toBytes("I like your post")
        )
    );

    if (!prependSucceeded) {
        prependedRowKey--;
    }
} while (!prependSucceeded);

The checkAndPut call in the loop is the crucial step. In one atomic operation, it checks whether colfam:commentmsg is null in the row we are trying to create, and writes to colfam:commentmsg only if the check passes. The column you use for checking should be one that always has a value if the row exists. If you do not have such a column in the table, you can create a flag column called e.g. "exists". The Put inside the checkAndPut can also be customized according to your needs.
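The retry loop has the same shape as a compare-and-set loop over a sorted map: checkAndPut with a null check behaves like putIfAbsent. As a cluster-free illustration of the concurrency logic (using java.util.concurrent's ConcurrentSkipListMap in place of an HBase table), the prepend algorithm is equivalent to:

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class PrependDemo {
    // A sorted map standing in for the HBase table (row key -> comment).
    static final ConcurrentSkipListMap<Long, String> table =
        new ConcurrentSkipListMap<>();

    // Mirrors the checkAndPut retry loop: try the key just before the
    // current first row; if a concurrent writer took it, decrement and retry.
    static long prepend(String comment) {
        long key = table.firstKey() - 1;
        // putIfAbsent plays the role of checkAndPut with a null check.
        while (table.putIfAbsent(key, comment) != null) {
            key--;
        }
        return key;
    }

    public static void main(String[] args) {
        table.put(20L, "Nice post");            // first inserted comment
        System.out.println(prepend("I agree")); // prints 19
        System.out.println(prepend("Cool"));    // prints 18
        System.out.println(table.firstKey());   // prints 18
    }
}
```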


Conclusion:
If your table fits the requirements for this strategy, you have easy access to the first and last rows of the table. These are efficient operations, and they are also safe in concurrent scenarios. The prepending algorithm could suffer from contention under extreme concurrency, but in practice this is unlikely to cause significant latency.

Regarding region management and general performance of this approach, there are no apparent problems.

I have been using this strategy in a project I am working on, and it has been unit tested, benchmarked and used without headaches.