How to build your very own Cassandra 4.0 release

Over the last few months, I have been seeing references to Cassandra 4.0 and some of its new features. When that happens with a technology I am interested in, I go looking for the preview releases to download and test. Unfortunately, so far, there are no such releases. But I am still interested, so I've found it necessary to build my own Cassandra 4.0 release. This is, in my humble opinion, not the most desirable way to do things, since there is no Cassandra 4.0 branch yet. Instead, the 4.0 code is on the trunk. So if you do two builds a commit or two apart (and there are typically at least three or four commits a week right now), you get a slightly different build. It is, in essence, a moving target.
All that said, I decided that if I could do it, the least I could do is write about how to do it, so everyone who wants to try it can avoid a couple of dumb things I did when I first tried it. Building your very own Cassandra 4.0 release is actually pretty easy. It consists of five steps:
- Make sure you have your prerequisites
  - Java SDK 1.8 or Java 11 (open source or Oracle)
  - Ant 1.8
  - Git CLI client
  - Python >=2.7, <3.0
- Download the Git repository
  - git clone https://gitbox.apache.org/repos/asf/cassandra.git
- Build your new Cassandra release
  - cd cassandra
  - ant
- Run Cassandra
  - cd ./bin
  - ./cassandra
- Have fun
  - ./nodetool status
  - ./cqlsh
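The clone and build steps above can be sketched as one small script. This is only a sketch under assumptions: build_trunk, run and DRY_RUN are hypothetical names I made up for illustration, and it uses ant -f to point ant at the build file instead of changing directory first.

```shell
#!/bin/sh
# Sketch only: build_trunk, run and DRY_RUN are hypothetical names.
# The repository URL is the one from the step list above.
CASSANDRA_REPO=${CASSANDRA_REPO:-https://gitbox.apache.org/repos/asf/cassandra.git}

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Clone trunk into $1 (default: cassandra) and build it with ant.
build_trunk() {
    dest=${1:-cassandra}
    run git clone "$CASSANDRA_REPO" "$dest" || return 1
    # ant -f names the build file, so no cd is needed for the build.
    run ant -f "$dest/build.xml" || return 1
}

# Usage (not executed here):
#   DRY_RUN=1 build_trunk      # only show what would be run
#   build_trunk && cd cassandra/bin && ./cassandra
```

Running it with DRY_RUN=1 first is a cheap way to see exactly which commands would be executed before committing to a download of a few hundred megabytes.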
Step 1) Verify, and if necessary, install your prerequisites
For Java, you can confirm the JDK is present by typing:
john@Lenny:~$ javac -version
javac 1.8.0_191
For ant:
john@Lenny:~$ ant -version
Apache Ant(TM) version 1.9.6 compiled on July 20 2018
For git:
john@Lenny:~$ git --version
git version 2.7.4
For Python:
john@Lenny:~$ python --version
Python 2.7.12
If you have all of the right versions, you are ready for the next step. If not, you will need to install the required software, which I am not going to go into here.
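If you would rather not eyeball four version strings, the checks can be scripted. The following is a minimal sketch, assuming dotted numeric versions like the ones shown above; version_ge and require_tool are hypothetical helper names, not part of any of these tools.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is >= B.
# Only the leading digits of each field count, so the _191 in
# 1.8.0_191 is ignored.
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}
        y=${b%%.*}
        # Drop the field that was just read.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
        x=${x%%[!0-9]*}
        y=${y%%[!0-9]*}
        if [ "${x:-0}" -gt "${y:-0}" ]; then return 0; fi
        if [ "${x:-0}" -lt "${y:-0}" ]; then return 1; fi
    done
    return 0
}

# require_tool NAME: report whether NAME can be found on the PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Usage (not executed here):
#   for t in javac ant git python; do require_tool "$t"; done
#   version_ge "$(ant -version | awk '{print $4}')" 1.8 && echo "ant is new enough"
```

The awk field in the usage line assumes ant prints its version in the same position as in the output above.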
Step 2) Clone the repository
Verify you do not already have an older copy of the repository:
john@Lenny:~$ ls -l cassandra
ls: cannot access 'cassandra': No such file or directory
If you do find a cassandra directory, delete it, move it out of the way, or work from a different directory. Otherwise:
john@Lenny:~$ git clone https://git-wip-us.apache.org/repos/asf/cassandra.git
Cloning into 'cassandra'...
remote: Counting objects: 316165, done.
remote: Compressing objects: 100% (51450/51450), done.
remote: Total 316165 (delta 192838), reused 311524 (delta 189005)
Receiving objects: 100% (316165/316165), 157.78 MiB | 2.72 MiB/s, done.
Resolving deltas: 100% (192838/192838), done.
Checking connectivity... done.
Checking out files: 100% (3576/3576), done.
john@Lenny:~$ du -sh *
294M cassandra
At this point, you have used up 294 MB and you have an honest-for-real git repo clone on your host (in my case, a Lenovo laptop running the Windows 10 Linux subsystem). Your repository looks something like this:
john@Lenny:~$ ls -l cassandra
total 668
drwxrwxrwx 1 john john 512 Feb 6 15:54 bin
-rw-rw-rw- 1 john john 260 Feb 6 15:54 build.properties.default
-rw-rw-rw- 1 john john 101433 Feb 6 15:54 build.xml
-rw-rw-rw- 1 john john 4832 Feb 6 15:54 CASSANDRA-14092.txt
-rw-rw-rw- 1 john john 390460 Feb 6 15:54 CHANGES.txt
drwxrwxrwx 1 john john 512 Feb 6 15:54 conf
-rw-rw-rw- 1 john john 1169 Feb 6 15:54 CONTRIBUTING.md
drwxrwxrwx 1 john john 512 Feb 6 15:54 debian
drwxrwxrwx 1 john john 512 Feb 6 15:54 doc
-rw-rw-rw- 1 john john 5895 Feb 6 15:54 eclipse_compiler.properties
drwxrwxrwx 1 john john 512 Feb 6 15:54 examples
drwxrwxrwx 1 john john 512 Feb 6 15:54 ide
drwxrwxrwx 1 john john 512 Feb 6 15:54 lib
-rw-rw-rw- 1 john john 11609 Feb 6 15:54 LICENSE.txt
-rw-rw-rw- 1 john john 123614 Feb 6 15:54 NEWS.txt
-rw-rw-rw- 1 john john 2600 Feb 6 15:54 NOTICE.txt
drwxrwxrwx 1 john john 512 Feb 6 15:54 pylib
-rw-rw-rw- 1 john john 3723 Feb 6 15:54 README.asc
drwxrwxrwx 1 john john 512 Feb 6 15:54 redhat
drwxrwxrwx 1 john john 512 Feb 6 15:54 src
drwxrwxrwx 1 john john 512 Feb 6 15:54 test
-rw-rw-rw- 1 john john 17215 Feb 6 15:54 TESTING.md
drwxrwxrwx 1 john john 512 Feb 6 15:54 tools
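Going back to the check at the start of this step: if an old cassandra directory is in the way, moving it aside is safer than deleting it. Here is a minimal sketch; move_aside and the .old.N naming are my own hypothetical choices, nothing the Cassandra project prescribes.

```shell
#!/bin/sh
# move_aside DIR: if DIR exists, rename it to DIR.old.N using the
# first N that is not taken, and print the new name.
# Does nothing when DIR does not exist.
move_aside() {
    dir=$1
    if [ ! -e "$dir" ]; then
        return 0
    fi
    n=1
    while [ -e "$dir.old.$n" ]; do
        n=$((n + 1))
    done
    mv "$dir" "$dir.old.$n" && echo "$dir.old.$n"
}

# Usage (not executed here):
#   move_aside cassandra
#   git clone https://gitbox.apache.org/repos/asf/cassandra.git
```

Keeping the old checkout around also makes it easy to compare two builds taken a few commits apart.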
Step 3) Build your new Cassandra 4.0 release
Remember what I said in the beginning? There is no branch for Cassandra 4.0 at this point, so there is nothing to check out; you simply build from the trunk:
john@Lenny:~$ cd cassandra
john@Lenny:~/cassandra$ ant
Buildfile: /home/john/cassandra/build.xml
…
BUILD SUCCESSFUL
Total time: 1 minute 4 seconds
That went quickly enough. Let's take a look and see how much larger the directory has gotten:
john@Lenny:~$ du -sh *
375M cassandra
Our directory grew by 81 MB, pretty much all of it in the new build directory, which now has 145 new files, including ./build/apache-cassandra-4.0-SNAPSHOT.jar. I am liking that version 4.0 right in the middle of the filename.
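Since trunk is a moving target, it may be worth noting which commit a given build came from. A minimal sketch, assuming a git new enough to support the -C option; build_label is a hypothetical helper name.

```shell
#!/bin/sh
# build_label DIR: print "<short commit hash> <commit date>" for the
# checkout in DIR, so a build can be tied back to a point on trunk.
build_label() {
    git -C "$1" log -1 --format='%h %cd' --date=short
}

# Usage (not executed here):
#   build_label ~/cassandra > ~/cassandra/build/BUILT_FROM.txt
```

The BUILT_FROM.txt file name in the usage line is just an example; any place you will find again later works.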
Step 4) Start Cassandra up
This one is easy if you do the sensible thing:
john@Lenny:~/cassandra$ cd ..
john@Lenny:~$ cd cassandra/bin
john@Lenny:~/cassandra/bin$ ./cassandra
john@Lenny:~/cassandra/bin$ CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.deserializeLargeSubset (Lorg/apache/cassandra/io/util/DataInputPlus;Lorg/apache/cassandra/db/Columns;I)Lorg/apache/cassandra/db/Columns;
CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.serializeLargeSubset (Ljava/util/Collection;ILorg/apache/cassandra/db/Columns;ILorg/apache/cassandra/io/util/DataOutputPlus;)V
CompilerOracle: dontinline org/apache/cassandra/db/Columns$Serializer.serializeLargeSubsetSize (Ljava/util/Collection;ILorg/apache/cassandra/db/Columns;I)I
…
INFO [MigrationStage:1] 2019-02-06 21:26:26,222 ColumnFamilyStore.java:407 - Initializing system_auth.role_members
INFO [MigrationStage:1] 2019-02-06 21:26:26,234 ColumnFamilyStore.java:407 - Initializing system_auth.role_permissions
INFO [MigrationStage:1] 2019-02-06 21:26:26,244 ColumnFamilyStore.java:407 - Initializing system_auth.roles
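If you would rather not watch the log scroll by, one option is to poll until the node answers. This is a minimal sketch, assuming nodetool status exits with a non-zero status while the node is still starting; wait_for is a hypothetical helper, not part of Cassandra.

```shell
#!/bin/sh
# wait_for TRIES CMD...: run CMD up to TRIES times, one second apart,
# and succeed as soon as CMD succeeds. CMD's output is discarded.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Usage (not executed here), from the cassandra/bin directory:
#   wait_for 60 ./nodetool status && echo "node is up"
```

Sixty tries is an arbitrary choice; a slow laptop may need more.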
We seem to be up and running. It's time to try some things out:
Step 5) Have fun
We will start out making sure we are up and running by using nodetool to connect and display a cluster status. Then we will go into the CQL shell to see something new. It is important to note that since you are likely to have nodetool and cqlsh already installed on your host, you need to use the ./ in front of your commands to ensure you are using the 4.0 version. I have learned the hard way that forgetting the ./ can result in some very real confusion.
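To see which copy a bare command name would actually run, you can ask the shell. The following is a minimal sketch; tool_origin is a hypothetical helper, and the directory passed to it is assumed to be the bin directory of your new build.

```shell
#!/bin/sh
# tool_origin NAME BINDIR: report whether typing NAME without ./ would
# run the copy in BINDIR or some other copy found on the PATH.
tool_origin() {
    found=$(command -v "$1" 2>/dev/null) || found=
    case $found in
        "$2"/*) echo "build: $found" ;;
        "")     echo "not on PATH: $1" ;;
        *)      echo "other install: $found" ;;
    esac
}

# Usage (not executed here):
#   tool_origin nodetool "$HOME/cassandra/bin"
#   tool_origin cqlsh "$HOME/cassandra/bin"
```

Anything reported as "other install" is a copy that a forgotten ./ would silently pick up instead of the 4.0 build.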
john@Lenny:~/cassandra/bin$ ./nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 115.11 KiB 256 100.0% f875525b-3b78-49b4-a9e1-2ab0cf46b881 rack1
john@Lenny:~/cassandra/bin$ ./cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 4.0-SNAPSHOT | CQL spec 3.4.5 | Native protocol v4]
Use HELP for help.
cqlsh> desc keyspaces;
system_traces system_auth system_distributed system_views
system_schema system system_virtual_schema
cqlsh>
We got a nice cluster with one node, and we see the usual built-in keyspaces. Well, um… not exactly. We see two new keyspaces: system_virtual_schema and system_views. Those look very interesting. In my next blog, I'll be talking more about Cassandra's new virtual table facility and how very useful it is going to be someday soon. I hope.