Cassandra RandomPartitioner Tokenizing

Nov 2, 2010 / By Laine Campbell

So I’m creating a new cluster, and after setting it up I needed to work out my tokens. As we’re told in http://wiki.apache.org/cassandra/Operations:

Token selection:

Using a strong hash function means RandomPartitioner keys will, on average, be evenly spread across the Token space, but you can still have imbalances if your Tokens do not divide up the range evenly, so you should specify InitialToken to your first nodes as i * (2**127 / N) for i = 0 .. N-1. In Cassandra 0.7, you should specify initial_token in cassandra.yaml.

Here’s a nice simple code snippet to figure out your RandomPartitioner tokens based on the size of your cluster:

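#!/usr/bin/env python
# Print evenly spaced RandomPartitioner tokens for a cluster of N nodes.
# Usage: ./tokenizer.py N

import sys

def tokens(nodes):
    # Integer division (//) keeps the tokens exact in both Python 2 and 3.
    for i in range(1, nodes + 1):
        print(i * (2 ** 127 - 1) // nodes)

# To prompt instead: nodes = int(raw_input("How many nodes? "))
nodes = int(sys.argv[1])
tokens(nodes)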

For a six-node cluster, this should give something like this:

[root@cassandra06 conf]# ./tokenizer.py 6
28356863910078205288614550619314017621
56713727820156410577229101238628035242
85070591730234615865843651857942052863
113427455640312821154458202477256070484
141784319550391026443072753096570088105
170141183460469231731687303715884105727
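
Note that the script above counts from 1 to N and scales by 2**127 - 1, while the wiki formula quoted earlier counts from 0 to N-1 and scales by 2**127. For comparison, here is a minimal sketch of the wiki version. The function name wiki_tokens and the "node N initial_token" output format are my own choices for illustration, not anything Cassandra requires.

```python
def wiki_tokens(node_count):
    """Return i * (2**127 // N) for i = 0 .. N-1, following the wiki formula."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer division keeps every token an exact integer.
    step = 2 ** 127 // node_count
    return [i * step for i in range(node_count)]


# Hypothetical output format: one line per node, to copy into that node's
# cassandra.yaml as initial_token (Cassandra 0.7).
for index, token in enumerate(wiki_tokens(6)):
    print("node %d initial_token: %d" % (index, token))
```

With six nodes this yields the same spacing as the script above, but the ring starts at token 0 instead of ending at 2**127 - 1.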
