Skip to content

Insight and analysis of technology and business strategy

Scala TreeSet: mutable vs immutable

I was exploring immutable RedBlackTree tree implementation in Scala (2.12.4) when I noticed something that wasn't clear to me: [code language="scala"] private[this] def upd[A, B, B1 >: B](tree: Tree[A, B], k: A, v: B1, overwrite: Boolean)(implicit ordering: Ordering[A]): Tree[A, B1] = if (tree eq null) { RedTree(k, v, null, null) } else { val cmp =, tree.key) if (cmp < 0) balanceLeft(isBlackTree(tree), tree.key, tree.value, upd(tree.left, k, v, overwrite), tree.right) else if (cmp > 0) balanceRight(isBlackTree(tree), tree.key, tree.value, tree.left, upd(tree.right, k, v, overwrite)) else if (overwrite || k != tree.key) mkTree(isBlackTree(tree), k, v, tree.left, tree.right) else tree } [/code] Comparing keys via compare is followed by explicit key not " equals" condition. I compared similar parts of mutable RedBlackTree implementation and haven't found anything similar. So I started to look for the collections that may work differently for the mutable and immutable version. So TreeSet, overwrite is passed as false compared to true TreeMap: this makes sense, for tree map, inserting the same key will replace the value, while this is not needed for TreeSet. Although key equality is still checked, it's still not clear why. Simple test show that mutable and immutable TreeSet work really differently in the following example. Let's define class extending Ordered with two fields that compare distinguishes only the first one: [code language="scala"] case class Dummy(x: Int, y: Int) extends Ordered[Dummy] { override def compare(that: Dummy): Int = Ordering[Int].compare(this.x, that.x) } [/code] Now try to insert the different one by " equals" instances into mutable and immutable versions of sets: [code language="scala"] val mutableTree = scala.collection.mutable.TreeSet.empty[Dummy] mutableTree.add(Dummy(0, 0)) mutableTree.add(Dummy(0, 1)) mutableTree var immutableTree = scala.collection.immutable.TreeSet.empty[Dummy] immutableTree += Dummy(0, 0) immutableTree += Dummy(0, 1) immutableTree [/code] So were the results (2.12.4): [code language="scala"] Welcome to Scala 2.12.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_121). Type in expressions for evaluation. Or try :help. scala> case class Dummy(x: Int, y: Int) extends Ordered[Dummy] { | override def compare(that: Dummy): Int = Ordering[Int].compare(this.x, that.x) | } defined class Dummy scala> val mutableTree = scala.collection.mutable.TreeSet.empty[Dummy] mutableTree: scala.collection.mutable.TreeSet[Dummy] = TreeSet() scala> mutableTree.add(Dummy(0, 0)) res0: Boolean = true scala> mutableTree.add(Dummy(0, 1)) res1: Boolean = false scala> mutableTree res2: scala.collection.mutable.TreeSet[Dummy] = TreeSet(Dummy(0,0)) scala> var immutableTree = scala.collection.immutable.TreeSet.empty[Dummy] immutableTree: scala.collection.immutable.TreeSet[Dummy] = TreeSet() scala> immutableTree += Dummy(0, 0) scala> immutableTree += Dummy(0, 1) scala> immutableTree res5: scala.collection.immutable.TreeSet[Dummy] = TreeSet(Dummy(0,1)) [/code] As we can see mutable TreeSet has not got Dummy(0, 1) while the immutable TreeSet did. More interesting, is what will happen if we try to remove an element: [code language="scala"] scala> mutableTree.remove(Dummy(0, 1)) res6: Boolean = true scala> mutableTree res7: scala.collection.mutable.TreeSet[Dummy] = TreeSet() [/code] Mutable version worked exactly based on compare. For Immutable version: [code language="scala"] scala> immutableTree -= Dummy(0, 0) scala> immutableTree res9: scala.collection.immutable.TreeSet[Dummy] = TreeSet() [/code] Nice, immutable did the same (it has Dummy(0, 1) and removing Dummy(0, 0) works). If we look at immutable RedBlackTree implementation for the del function, we can see that there is simple: [code language="scala"] val cmp =, tree.key) if (cmp < 0) delLeft else if (cmp > 0) delRight else append(tree.left, tree.right) [/code] and there is no specific case for cmp == 0 && k != tree.key. This is not a bug as this violates Ordered contract
It is important that the equals method for an instance of Ordered[A] be consistent with the compare method. However, due to limitations inherent in the type erasure semantics, there is no reasonable way to provide a default implementation of equality for instances of Ordered[A]. Therefore, if you need to be able to use equality on an instance of Ordered[A] you must provide it yourself either when inheriting or instantiating.
I was still interested in finding out why this condition was added, so I went to git and found an original commit. It was exactly what I had expected for an overwrite parameter but there was no clear explanation why key compare was added:
Here we had an issue that RedBlack does not work the same way for sets - which are not supposed to replace an element if it is the same (wrt equals) and maps - which should replace the corresponding values. Adding an overwrite parameter which decides whether to overwrite added keys if they are the same in the ordering.
And also worth noting is that HashSet has nothing to do with ordering so they end up having both values inserted: [code language="scala"] scala> var immutableHash = scala.collection.immutable.HashSet.empty[Dummy] immutableHash: scala.collection.immutable.HashSet[Dummy] = Set() scala> immutableHash += Dummy(0, 0) scala> immutableHash += Dummy(0, 1) scala> immutableHash res11: scala.collection.immutable.HashSet[Dummy] = Set(Dummy(0,0), Dummy(0,1)) [/code]

Top Categories

  • There are no suggestions because the search field is empty.

Tell us how we can help!