Object Commando languages, development and design


Being Lazy with Clojure

I've worked in several functional languages, but most of my experience is in OCaml. OCaml is a functional language, but it is not lazy by default. There are libraries that make data structures lazy, but you have to go out of your way to use them. Clojure does not evaluate everything lazily either, but most of its sequence functions return lazy sequences by default. One interesting ramification of this shows up in the lazy lists it creates. I'm still learning the language, so I figured I'd parse a CSV-style file of contact information that I had dumped from a contact application I use. I wanted to get the lines of the file into a list so I could look at them from a Clojure perspective. To do this, I wrote code like the following:

;; reader is assumed to be referred in already (for example from clojure.java.io)
(with-open [rdr (reader "/some/directory/contactinfo.csv")]
  (def lst (line-seq rdr)))

The code opens the file, evaluates the body, and then closes the file (think of the def lst... part as sitting in a try block, with the close in the finally). I expected line-seq to read every line and store the result in lst. Much to my surprise, when I tried to look at the first element of lst, I got an error:

java.io.IOException: Stream closed
[Thrown class java.lang.RuntimeException]

This pointed out two interesting things to me. First, because the list is lazy, accessing it tried to pull the first line out of the file at that moment, and since I had used the with-open macro, the file was already closed. Second, if it was still trying to read from the closed file, line-seq was not reading the lines up front as I had expected. The book I'm reading by Stuart Halloway does discuss this; I just forgot. My first reaction was to pass a function that does the parsing into the code above, so that all of the operations on the file would happen before the file was closed. The file turned out not to be easy to parse line by line, though, since most of the data items spanned several lines. I found that the right function was slurp:

([f] [f enc])
Reads the file named by f using the encoding enc into a string
and returns it.
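
Before turning to slurp, here is a minimal sketch of that first idea, doing the work while the file is still open. It assumes a current Clojure where reader lives in clojure.java.io, and process-lines is a hypothetical helper name of my own, not something from the book or the library. The doall forces the lazy sequence before with-open closes the reader, so f needs to return a sequence:

```clojure
(require '[clojure.java.io :as io])

(defn process-lines
  "Hypothetical helper: opens path, applies f to the lazy seq of lines,
  and fully realizes the result before the reader is closed.
  f must return a sequence, since doall is called on its result."
  [path f]
  (with-open [rdr (io/reader path)]
    (doall (f (line-seq rdr)))))

;; Usage, with the path from the post:
;; (def lst (process-lines "/some/directory/contactinfo.csv" identity))
```

Because everything is realized inside with-open, nothing is left to read from the closed stream afterwards. The cost is that the whole result is held in memory at once.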

slurp simply pulls the entire file into a String. From there I was able to differentiate the contact titles from the contact details easily with the Clojure re-partition function:

([re string])
Splits the string into a lazy sequence of substrings, alternating
between substrings that match the pattern and the substrings
between the matches. The sequence always starts with the substring
before the first match, or an empty string if the beginning of the
string matches.

The only trick to using it was to go back through the list and discard the substrings between the matches. I was easily able to do this with the filter function. More on Clojure and lazy lists to come.
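
As a rough sketch of that filtering step: the pattern and sample data below are made up for illustration, since the real contact file's format isn't shown here, and matches-only is a hypothetical name. The parts vector is written out by hand in the shape the re-partition docstring describes, because re-partition comes from the contrib string utilities and may not be available in every setup:

```clojure
;; Hypothetical pattern for a contact title such as "Name:".
(def title-re #"[A-Z][a-z]+:")

;; Hand-written stand-in for what re-partition is documented to return for
;; "Name: Ada\nPhone: 555\n": non-matches and matches alternate, starting
;; with an empty string because the input begins with a match.
(def parts ["" "Name:" " Ada\n" "Phone:" " 555\n"])

(defn matches-only
  "Keeps only the substrings that match re in full."
  [re parts]
  (filter #(re-matches re %) parts))

(matches-only title-re parts)
;; => ("Name:" "Phone:")

;; If only the matches are needed, core re-seq returns them directly:
(re-seq title-re "Name: Ada\nPhone: 555\n")
;; => ("Name:" "Phone:")
```

Note that filter is lazy as well, but since slurp has already read everything into a String, there is no closed stream to trip over this time.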

Filed under: Languages