Circular Lists with Clojure
I was working on some code that repeatedly executed a defined set of queries against a database for a give amount of time. Think throughput testing. The list of queries was very small (like 5) but the number of times each was executed was pretty high (500+). After it executed the 5th query, I wanted it to execute the first one again and continue on. So I thought the easiest way to go about this was to have a circular-type list. I figured in Clojure it probably wouldn't be circular, but rather a lazy list that repeated itself. I looked around the source a bit (the internet connection at work was down, so there was no Google searching) and I wasn't coming up with anything. Not finding anything that fit my needs, I thought about it and made a couple of attempts at a solution. I was pretty unhappy with the solutions I was coming up with. I went back to the Clojure source trying to figure out how best to implement this and I stumbled on cycle. It was exactly what I needed. What struck me about cycle, was how concise and simple it was. First, it's use:
(take 10 (cycle (list 1 2 3 4)))
;=> (1 2 3 4 1 2 3 4 1 2)
It takes the collection passed in and creates a lazy sequence of the items in the list repeated infinitely. I would keep consuming items from this infinite list for some defined amount of time and then stop:
(doseq [query (cycle (query-list...))
:while (keep-going? @run-state)]
(execute-query query...))
In the above code, run-state is an atom that is updated once a particular point of time has been reached and then the query run completes. What struck me about the cycle function, was how simple it was:
(defn cycle
"Returns a lazy (infinite!) sequence of repetitions of the
items in coll."
{:added "1.0"}
[coll]
(lazy-seq
(when-let [s (seq coll)]
(concat s (cycle s)))))
It's a very concise function. Here's a rundown of what it does. The lazy-seq macro takes as it's body an expression that yields a sequence (or nil). It's lazy, so it doesn't actually fully realize the sequence. Inside the lazy-seq call is a when-let which just ensures that what is passed in is a collection that has at least one element in it. If it is not a collection of at least one element, the lazy-seq is just nil. If it does, the sequence that the lazy-seq function is looking for comes from the concat statement. Concat is simple, it takes one or more collections and returns a single collection containing all of the elements in each passed in collection:
(concat (list 1 2 3 4) (list 5 6 7 8))
; =>(1 2 3 4 5 6 7
With that info, the concat in cycle glues together the entire collection that was passed in, followed by a recursive call to itself (passing in the same collection). So with that, if the same 1, 2, 3, 4 list is passed in to cycle, the last line looks like:
(concat (list 1 2 3 4) (cycle (list 1 2 3 4)))
After the first four items are consumed, the recursive invocation of cycle will happen, again producing (list 1 2 3 4) followed by another recursive invocation. Since it's lazy and infinite, this can continue until the query execution time is up, whether it's 5 seconds or 5 hours.
August 12th, 2010 - 08:20
Very nice post, if someone is stuck with Java there is a cycle in Google Guava in the Iterators package that helps dealing with Iterators (Guava is highly recommended for their Iterator helpers like filter etc.).
http://google-collections.googlecode.com/svn/trunk/javadoc/index.html?com/google/common/collect/Iterators.html
Stephan
http://codemonkeyism.com