August 7, 2018

Run Length Encoding - Part 2

In the last post we introduced the very simple Run Length Encoding data-compression scheme. Here's how we do it. The code below is interactive, try editing it to observe how the functions behave.

Encoder - parsing

We begin by parsing the string with a RegEx into groups of repeating characters:

(defn encoder-groups [string]
  (re-seq #"(.)\1*" string))
  
(encoder-groups "AABBBCCCC")

Destructuring

Now that we've parsed the string into a data structure, we can extract the appropriate values. We return each character preceded by its number, but only if it is more than 1:

(defn encoder-values [[group char]]
  (str (if (> (count group) 1)
		    (count group)
	            "")
              char))
              
(encoder-values ["AA" "A"])

map, apply str

Finally, we combine this into one function that maps the result of one to the other and stitches it all together:

(defn run-length-encode [s]
  (let [groups (re-seq #"(.)\1*" s)]
    (apply str
           (map encoder-values groups))))
              
(run-length-encode "aabbbcccc")              

Decoder - parsing

(defn decoder-groups [string]
 (re-seq #"(\d+)?(.)" string))
  
(decoder-groups "12WB12W3B24WB")

Destructuring

(defn decoder-values [[_ n x]]
  (apply str
         (repeat (js/parseInt (or n "1")) x)))

(decoder-values ["12W" "12" "W"])

map, apply str

Here is the complete decoder function:

(defn run-length-decode [s]
  (let [groups (re-seq #"(\d+)?(.)" s)]
    (apply str
      (map decoder-values groups))))

(run-length-decode "12WB12W3B24WB")
Tags: coding exercises Exercism Clojure