Huffman coding problem
  • array
  • huffmancoding
  • string
  • greedy
  •   
  • Posted: 5 years ago
  • Edit
  • answers (1)
  • views (7425)

You are given a string and the frequency of its characters in an integer array. Write an encoding algorithm to compress the data in it losslessly.


Posted Answers

We can use Huffman coding using min heap implemented as priority queue. The algorithm works as :


  • Build the Huffman tree :-

    • Create leaf nodes for each unique character in the string and add it to the min heap.
    • Remove the two nodes with minimum frequency from the heap.
    • Create a new internal node with frequency equal to the sum of the extracted two nodes in the last step,
      as a parent of these two nodes. Add this new internal node to the heap.
    • Repeat the last two steps until there is one node in the heap. This node will be the root and completes the
      Huffman tree.


  • Print the Huffman code from the Huffman tree :-

    • Traverse the binary Huffman tree from root to leaves. Print 0 for one branch and 1 for the other at each internal node. The accumulated 0s and 1s at each leaf constitutes the Huffman encoding corresponding
      the character.


Since each insertion in the priority queue takes O(logn) time and number of nodes for n unique characters in
the string is 2n-1
Therefore Time complexity = O(nlogn).
Example -

/* Main function to build Huffman tree and print code */
public static void HuffmanCode(String word){
/* Count the frequency of each character in the string */
for(int i=0; i < word.toCharArray().length; i++)
frequency[word.toCharArray()[i]]++;

/* Build the huffman tree */
Node node = buildTree(frequency);

/* Print the Huffman code */
System.out.println("Character\t Frequency\t Huffman Code");
printCode(node, new StringBuffer());
}


/* This function builds the Huffman tree */
public static Node buildTree(int[] frequency){
/* Initialize the priority queue */
PriorityQueue pq = new PriorityQueue ();

/* Create leaf node for each unique character in the string */
for(int i = 0; i < frequency.length; i++){
if(frequency[i] > 0)
pq.offer(new Node(i, frequency[i], null, null));
}

/* Until there is only one node in the priority queue */
while(pq.size() > 1){
/* Minimum frequency is kept in the left */
Node left = pq.poll();
/* Next minimum frequency is kept in the right */
Node right = pq.poll();
/* Create a new internal node as the parent of left and right */
pq.offer(new Node('\0', left.frequency+right.frequency, left, right));
}

/* Return the only node which is the root */
return pq.poll();
}

/* This function prints the Huffman code from the Huffman tree (starting from root to leaf) */
public static void printCode(Node root, StringBuffer s){
/* If it is a leaf node */
if(root.isLeaf()){
/* Print the character, its frequency and the code */
System.out.println(root.data + root.frequency + s);
}

/* If it is not a leaf node */
if(!root.isLeaf()){
/* Traversing left branch */
s.append('0');
printCode(root.left, s);
s.deleteCharAt(s.length()-1);
/* Traversing right branch */
s.append('1');
printCode(root.right, s);
s.deleteCharAt(s.length()-1);
}
}

public static class Node(){
private final char char;
private final int frequency;
private final Node left;
private final Node right;

Node(char char, int frequency, Node left, Node right){
this.char = char;
this.frequency = frequency;
this.left = left;
this.right = right;
}

public boolean isLeaf(){
/* Return true if the node has no left and right child */
return (left == null & & right == null);
}
}

You need to Sign In to post your solution.