HUFFMAN CODING

Introduction

Huffman Coding is used to encode the words in transmission. If we use a fixed length encoding, say ASCII codes(ASCII Codes: 0 to 127 = 0 to 27-1 i.e 7 bits for each character) then for a word of length 10 with 4 characters will take 10 * 7 = 70 bits to be transmitted irrespective of the frequency of each character.

To reduce this, we can use Huffman coding algorithm(for variable length encoding) and it is based on the frequency of occurrence of a character. Most frequently occurring character will use less number of bits for encoding. Similarly less frequently occurring character will use more number of bits for encoding

It generates prefix codes in which no code word generated is a prefix of another code in the word to be transmitted.

Working Procedure

  1. Create n node trees where each node corresponds to the character specifying in the text to be encoded and label with it's frequency/weight.
  2. Find two trees with smallest frequency, create a new tree with the label equal to sum of frequency of the two nodes picked. Make sure the node with smaller frequency(among the two nodes picked) should be the left child.
  3. Repeat step 2 until one single tree is obtained. All the characters of the word to be encoded will be the leaf nodes in this tree.
  4. Then, assign 0's to left edges and 1's to right edges of the tree.
  5. To get the code words for a character, record the label of the edge along the path from root to that character.

That's it from this blog post. If you liked it then do share this blog with your friends or people who wanna get into programming world. Thank You!

Copyright © NStF Blogs 2021