In a previous post ( OpenTSDB and Google Cloud Bigtable) we discussed OpenTSDB, an open source distributed database specifically designed for storing timeseries data. We also explained how OpenTSDB relies on Apache HBase for a reliable and scalable data backend. However, deployment and administration of an HBase cluster is not a trivial task, as it requires a full Hadoop setup. This means that it takes a big data engineer (or better a team of them) to plan for the cluster sizing, provision the machines and setup the Hadoop nodes, configure all services and tune them for optimal performance. If this is not enough, Operations teams have to constantly monitor the cluster, deal with hardware and service failures, perform upgrades, backup regularly, and a ton of other tasks that make maintenance of a Hadoop cluster and OpenTSDB a challenge for most organizations. With the release of Google Bigtable as a cloud service and its support for the HBase API, it was obvious that if we managed to integrate OpenTSDB with Google Bigtable, we would enable more teams to have access to the powerful functionality of OpenTSDB by removing the burden from maintaining an HBase cluster. Nevertheless, integration of OpenTSDB with Bigtable was not as seamless as dropping a few jars in its release directory. This happened because the OpenTSDB developers went over and above the standard HBase libraries, by implementing their very own asynchbase library. Asynchbase is a fully asynchronous, non-blocking, thread-safe, high-performance HBase API. And no one can put it better than the asynchbase developers themselves who claim that ‘ This HBase client differs significantly from HBase's client. Switching to it is not easy as it requires one to rewrite all the code that was interacting with any HBase API.’ This meant that integration with Google Bigtable required OpenTSDB to switch back to the standard HBase API. We saw the value of such an effort here at Pythian and set about developing this solution.
git clone -b bigtable git@github.com:pythian/opentsdb.git cd opentsdb sh build-bigtable.sh
OpenTSDB provides a script that uses HBase shell to create its tables. To create the tables run the following command: env COMPRESSION=NONE HBASE_HOME=/path/to/hbase-1.1.2 \ ./src/create_table.sh
export HBASE_CONF=/path/to/hbase-1.1.2/conf mkdir -p <tmp_dir> ./build/tsdb tsd --port=4242 --staticroot=build/staticroot \ --cachedir=<tmp_dir>
We would like to thank the OpenTSDB developers (Benoît Sigoure and Chris Larsen) for their brilliant work in building such great software and for embracing the asyncbigtable library. Their insights and code contributions helped us deal with some serious issues. Also, we would like to thank the Google Cloud Bigtable team because they expressed genuine interest in this project and they were very generous in providing us with cloud infrastructure and excellent support.
Ready to optimize your use of Google Cloud's AI tools?