### Red-Black tree¶

A Red-Black tree is a self-balancing binary search tree. The main difference to a normal binary tree is an extra bit that stores the parity (the color) of each node. This bit can be either be 0 (black) or 1 (red). An example of a red-black tree is shown in Figure 1 (Hardcoded)

The following invariant must be satisfied every time an operation is being made on node of a Red-Black tree:

1. Every node is either red or black
2. The root is always black
3. Every leaf is black
4. If a node is red, then both its children are black
5. All paths from a node to descendant nodes contain the same number of black nodes

The node of a Red-Black tree can be described by the following data type:

In [8]:
from enum import Enum

class RBNode:

def __init__(self, color, key, left, right, parent):
self.left = left
self.right = right
self.parent = parent
self.key = key
self.color = color

def __str__(self):
return str("(%s)%s" % (self.color, self.key))

class NodeColors(Enum):
BLACK = 0;
RED = 1;


The data structure of the tree is nothing more than a collection of nodes with the root node as the starting point. The constructor initializes a root node to a provided value

In [9]:
class RBTree:

def __init__(self, key):
self.nil = RBNode(NodeColors.BLACK, None, None, None, None)
self.root = RBNode(NodeColors.BLACK, key, self.nil, self.nil, self.nil)


Your task for this assignment is to construct algorithms that manipulate Red-Black trees.

## Constructing a Red-Black Tree¶

We start constructing a Red-Black tree by calling the constructor

tree = RBTree(5)

Inserting a node consists of three steps:

1. Finding the appropriate branch to insert the node in the tree
2. Inserting the node and creating links to the tree
3. Rebalancing the tree

For Step 1, the procedure is not much different from a normal binary search tree. Specifically, the algorithm first finds the appropriate position for the new node by searching through the tree. In the following snippet, z is the value we would like to insert. However, the snippet has a mistake.

def search(self, tree, z):
y = tree.nil
x = tree.root
while x != tree.nil:
y = x
if z.key < x.key:
x = x.left
else:
y = x.right
return y


T1 (10 points): Spot and fix the mistake in the search method

In [10]:
def search(tree, z):
y = tree.nil
x = tree.root
while x != tree.nil:
y = x
if z.key < x.key:
x = x.left
else:
#             y = x.right
x = x.right
return y


Using the (hopefully correct by now!) search method, we can locate the node to insert our new node as a child. This happens by creating the node and linking it to the appropriate super/sub nodes. Note that for new nodes, the starting color is always RED.

T2 (20 points): Write code to construct and link a new node. Hint: Use the search method inside the insert method. Hint: Make the method return the new node, this is usefull for later

In [11]:
def insert(tree, z):
newNode = RBNode(NodeColors.RED, z, tree.nil, tree.nil, None)
parent = search(tree, newNode)

if newNode.key < parent.key:
parent.left = newNode
else:
parent.right = newNode

newNode.parent = parent;
return newNode

In [12]:
#Test you implementation here
tree = RBTree(5)
insert(tree, 10)

print(tree.root.right)

(NodeColors.RED)10


After inserting the node, the tree potentially needs to be rebalanced. An example of why this is so can be seen in the following plot. What we see here is that the insertion of node 4 violates invariant number 4.

The rebalancing operation can be quite complicated; the pseudocode below takes care of all the cases.

proceduce rb_balance(T, z)
'' T is an RBTree
'' z is the newly inserted node

while z.parent.color == RED
if z.parent == z.parent.parent.left
y <- z.parent.parent.right

if y.color == RED
z.parent.color <- BLACK
y.color <- BLACK
z.parent.parent.color <- RED
z <- z.parent.parent
else
if z == z.parent.right
z <- z.parent
left_rotate(T, z)
z.parent.color <- BLACK
z.parent.parent.color <- RED
right_rotate(T, z.parent.parent)
else
y <- z.parent.parent.left
if y.color == RED
z.parent.color <- BLACK
y.color <- BLACK
z.parent.parent.color <- RED
z <- z.parent.parent
else
if z == z.parent.left
z <- z.parent
right_rotate(T, z)
z.parent.color <- BLACK
z.parent.parent.color <- RED
left_rotate(T, z.parent.parent)
T.root.color <- BLACK

You might be wondering with left_rotate and right_rotate are. They are functions that represent common operations on trees. To understand how they work have a look at the plot below.

Below you can find the Python code for the left_rotate method.

In [13]:
def left_rotate(tree, x):
y = x.right
x.right = y.left

if y.left != tree.nil:
y.left.parent = x
y.parent = x.parent

if x.parent == tree.nil:
tree.root = y
elif x == x.parent.left:
x.parent.left = y
else:
x.parent.right = y
y.left = x
x.parent = y


T3 (20 points): Implement the code for right_rotate

In [14]:
def right_rotate(tree, x):
y = x.left
x.left = y.right

if y.right != tree.nil:
y.right.parent = x
y.parent = x.parent

if x.parent == tree.nil:
tree.root = y
elif x == x.parent.right:
x.parent.right = y
else:
x.parent.left = y
y.right = x
x.parent = y

In [15]:
tree = RBTree(5)
insert(tree, 3)
insert(tree, 2)
insert(tree, 4)
insert(tree, 7)

print(tree.root)
print(tree.root.left)
print(tree.root.left.left)
print(tree.root.left.right)
print(tree.root.right)
print()

right_rotate(tree, tree.root)

print(tree.root)
print(tree.root.left)
print(tree.root.right)
print(tree.root.right.left)
print(tree.root.right.right)

(NodeColors.BLACK)5
(NodeColors.RED)3
(NodeColors.RED)2
(NodeColors.RED)4
(NodeColors.RED)7

(NodeColors.RED)3
(NodeColors.RED)2
(NodeColors.BLACK)5
(NodeColors.RED)4
(NodeColors.RED)7


T4 (25 points): Implement the rebalancing pseudocode in Python

In [16]:
def balance(tree, z):
while z.parent.color == NodeColors.RED:

if z.parent == z.parent.parent.left:
y = z.parent.parent.right
if y.color == NodeColors.RED:
z.parent.color = NodeColors.BLACK
y.color = NodeColors.BLACK
z.parent.parent.color = NodeColors.RED
z = z.parent.parent
else:
if z == z.parent.right:
z = z.parent
left_rotate(tree, z)
z.parent.color = NodeColors.BLACK
z.parent.parent.color = NodeColors.RED
right_rotate(tree, z.parent.parent)

else:
y = z.parent.parent.left
if y.color == NodeColors.RED:
z.parent.color = NodeColors.BLACK
y.color = NodeColors.BLACK
z.parent.parent.color = NodeColors.RED
z = z.parent.parent
else:
if z == z.parent.left:
z = z.parent
right_rotate(tree, z)
z.parent.color = NodeColors.BLACK
z.parent.parent.color = NodeColors.RED
left_rotate(tree, z.parent.parent)

tree.root.color = NodeColors.BLACK

In [17]:
#Test you implemention here
tree = RBTree(5)
first_node = insert(tree, 10)
balance(tree, first_node)

second_node = insert(tree, 15)
balance(tree, second_node)

print(None)
print(tree.root)
print(tree.root.left)
print(tree.root.right)
print()

third_node = insert(tree, 12)
balance(tree, third_node)
print(tree.root)
print(tree.root.left)
print(tree.root.right)
print(tree.root.right.left)

None
(NodeColors.BLACK)10
(NodeColors.RED)5
(NodeColors.RED)15

(NodeColors.BLACK)10
(NodeColors.BLACK)5
(NodeColors.BLACK)15
(NodeColors.RED)12


## Validating the Red-Black tree invariants¶

T5 (25 points): Create a method that will exhaustively search whether the five invariants described above are maintained in the tree.

In [69]:
import queue

def check(tree):
"@return: True, only if tree satisfies all the Red-Black tree invariants"
if not either_red_black(tree):
print("Not all nodes either red or black")
return False

if not root_is_black(tree):
print("Root is not black")
return False

if not every_leaf_black(tree):
print("Not every leaf is black")
return False

if not red_node_children_black(tree):
print("There is/are red node(s) with non-black children")
return False

if not same_number_black(tree):
print("Not every path down had the same amount of black nodes")
return False

print("All invariants are met")
return True

def either_red_black(tree):
q = queue.Queue()
q.put(tree.root)

while (not q.empty()):
current = q.get()

if current == None:
continue

if current.color != NodeColors.BLACK and current.color != NodeColors.RED:
return False

q.put(current.left)
q.put(current.right)

return True

def root_is_black(tree):
return tree.root.color == NodeColors.BLACK

def every_leaf_black(tree):
q = queue.Queue()
q.put(tree.root)

while( not q.empty()):
current = q.get()

if current == None:
continue

if current.left == None and current.right == None:
if current.color != NodeColors.BLACK:
return False

q.put(current.left)
q.put(current.right)

return True

def red_node_children_black(tree):
q = queue.Queue()
q.put(tree.root)

while( not q.empty()):
current = q.get()

if current == None:
continue

if current.color == NodeColors.RED:
if current.left.color == NodeColors.RED:
return False
elif current.right.color == NodeColors.RED:
return False

q.put(current.left)
q.put(current.right)

return True

def same_number_black(tree):

try:
__same_number_black(tree.root)
except:
return False

return True

def __same_number_black(node):
if node == tree.nil:
return 1

left = __same_number_black(node.left)
right = __same_number_black(node.right)

if left != right:
print("ErrorError")
raise InvariantException()

if node.color == NodeColors.BLACK:
left += 1

return left

In [70]:
#Test all of the five invariants here
import copy

print(tree.root.color)
print(check(tree), "\n")

tree1 = copy.deepcopy(tree)
wrongNode = insert(tree1, 55)
wrongNode.color = None
print(check(tree1), "\n")

tree2 = copy.deepcopy(tree)
tree2.root.color = NodeColors.RED
print(check(tree2), "\n")

tree3 = copy.deepcopy(tree)
wrongNode = insert(tree3, 55)
wrongNode.left = None
wrongNode.right = None
print(check(tree3), "\n")

tree4 = copy.deepcopy(tree)
wrongNode = insert(tree4, 55)
wrongNode.parent.color = NodeColors.RED
print(check(tree4), "\n")

tree5 = copy.deepcopy(tree)
wrongNode = insert(tree5, 55)
wrongNode.color = NodeColors.BLACK
print(check(tree5), "\n")

NodeColors.BLACK
Not every path down had the same amount of black nodes
False

Not all nodes either red or black
False

Root is not black
False

Not every leaf is black
False

There is/are red node(s) with non-black children
False

Not every path down had the same amount of black nodes
False



## Proving performance properties¶

T6 (10 points): Prove that the average and worse case complexity for the insertion in Red-Black trees is $O(log(n))$.

## References¶

1. A Red-Black tree in Java
2. A Red-Black tree in C
3. Wikipedia on Red-Black Trees