The last couple of weeks have been busy! First Strange Loop, then the first Clojure Conj. The Conj was the first Clojure conference of its kind and was packed with technical content. It was single track, so I didn't have to worry about missing anything. Below I'll summarize some of the highlights of the conference from my perspective.
(not= DSL macros) - Christophe Grand
A very valuable talk. Christophe's central theme was that many of the syntactic gains from macros can also be achieved with functions and other Clojure fundamentals. This was the beginning of a theme for the conference. He gave specific examples from his framework, Enlive. His stories centered on users wanting to extend functionality provided by the library. Since much of it was implemented with macros, those users ended up frustrated and unable to do what they wanted. He then made substantial changes to the codebase, reimplementing much of the macro functionality as functions. Backing the library with functions, with a smaller macro layer on top, enabled his users to take better advantage of the framework. This boiled down to the realization that a DSL can live in functions, through clear naming and composability, as well as in macros. He gave some excellent examples in the construction of a function-based DSL for regular expressions (code here) that ended up having functionality not possible with the normal regular expression DSL (or with macros). I've got it on my TODO list to look over Christophe's Enlive and regular expression code in more depth.
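To make the idea of a function-based DSL concrete, here is a minimal sketch of my own. It is not Christophe's code; every function name below (literal, group, in-sequence, either, one-or-more) is made up for illustration. Each function returns a regex fragment as a plain string, so fragments compose like any other Clojure values.

```clojure
(require '[clojure.string :as string])
(import 'java.util.regex.Pattern)

;; All names here are hypothetical; each function returns a regex fragment string.
(defn literal [^String s] (Pattern/quote s))        ; match s exactly, escaping metacharacters
(defn group [part] (str "(?:" part ")"))            ; non-capturing group
(defn in-sequence [& parts] (apply str (map group parts)))
(defn either [& parts] (group (string/join "|" (map group parts))))
(defn one-or-more [part] (str (group part) "+"))
(def digit "\\d")

;; Fragments are ordinary values, so they can be named and reused.
(def version-part (one-or-more digit))
(def version (in-sequence version-part (literal ".") version-part))
(def version-or-latest (either version (literal "latest")))

;; Only the final step turns the string into a java.util.regex.Pattern.
(def version-pattern (re-pattern version-or-latest))
```

Because the pieces are plain functions and strings, a user can pass them to map, store them in collections, or build new combinators without touching a macro.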
Clojure Protocols - Sean Devlin
I'm pretty familiar with protocols and have used them at Revelytix quite a bit, but I still got a lot of value out of Sean's talk. His talk was mainly about creating a protocol over java.util.Date and other date-like types in Java. His example created a protocol with a to-ms function that converted some sort of date representation to a long representation of that date. From that small abstraction he built many functions that were consumers of it. I think this was interesting because it was a small abstraction that was really only used internally in that source file. The functions that were more useful to consumers of the code were not extensions of the protocol, but rather used the protocol internally. I thought this was particularly elegant because callers need to know nothing about protocols, or even that the code uses them.
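A rough sketch of that shape, as I understood it, looks like the following. Only the to-ms name comes from the talk; the protocol name, the types I extend it to, and the helper functions before? and ms-between are my own assumptions for illustration.

```clojure
;; The protocol name ToMs is an assumption; only to-ms comes from the talk.
(defprotocol ToMs
  (to-ms [this] "Return this date-like value as milliseconds since the epoch."))

(extend-protocol ToMs
  java.util.Date
  (to-ms [d] (.getTime d))
  java.util.Calendar
  (to-ms [c] (.getTimeInMillis c))
  Long
  (to-ms [n] n))

;; Hypothetical consumer functions: callers use these and never see the protocol.
(defn before? [a b] (< (to-ms a) (to-ms b)))
(defn ms-between [a b] (- (to-ms b) (to-ms a)))
```

Supporting another date-like type later means extending the protocol once; before? and ms-between pick it up without changes.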
Finger Trees - Chris Houser
Chouser gave a great talk on a data structure I had not heard of before, called Finger Trees. This is a new (not quite done yet) feature that will be included in Clojure and that can have benefits over the other Clojure persistent data structures for specific use cases. His slides can be found on GitHub for more info. It seemed to me that the biggest wins of the data structure are the amortized constant time splitting/appending and counting. It achieves this by storing metadata at each tree root node, which makes summarization very easy. I read over some of the code as he was speaking (the code can be found here) and am looking forward to going over it in more depth.
Keynote - Rich Hickey
It was good to hear Rich Hickey speak in person. This was the first time I've heard him. He went over some of the things to come in the near term for Clojure. I would say the main focus of his talk was on improving the performance of Clojure. One of those performance improvements was declaring when variables are allowed to be rebound. He explained that the runtime currently has to check whether something has been rebound, even though most of the time there is little to no chance that it has been. A good example is functions: the vast majority of the time they are not rebound or redefined in production code (typically only at the REPL when doing development), yet this overhead is always paid. He went over a new bit of metadata that would declare a variable as able to be rebound. I think this is a good change even without performance in mind. Right now we already have a convention that we put asterisks (or ear muffs) around variables that we intend for rebinding. This just codifies that convention and we get a performance boost as a bonus. He also went in depth on some Java primitive performance changes he was making. These follow a similar theme: a reduction in the complexity of the various primitive types, with a performance boost as a bonus. The mismatch between Java primitives, auto-boxing and Clojure can be pretty confusing. His changes simplify many of those problems and, through some new function subclasses, auto-boxing can be avoided. There were a lot of useful nuggets in Rich's talk.
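For reference, the rebinding metadata ended up in later Clojure releases (1.3 onward) as ^:dynamic. The syntax shown in the talk may have differed, and the var and function names below are made up, so treat this as a small sketch of the idea only.

```clojure
;; *log-level* and current-level are hypothetical names for illustration.
;; ^:dynamic marks the var as one that binding is allowed to rebind.
(def ^:dynamic *log-level* :info)

(defn current-level [] *log-level*)

;; Inside binding, the var takes a thread-local value; outside it keeps the root value.
(def level-inside-binding
  (binding [*log-level* :debug]
    (current-level)))
```

The ear muffs are still only a naming convention; it is the metadata that tells the compiler the var may be rebound.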
One Ring to Bind Them - Mark McGranaghan
This was a talk on Ring specifically, but I would say it had more to do with the design of good Clojure APIs. He covered how abstractions built on top of Clojure sequences with well-named functions can create a very appealing API. This talk was the crescendo of the theme that using Clojure fundamentals can lead to great code. As a testament to the library's ease of use, quite a few other libraries have been built on top of it.
From Concurrency to Parallelism - David Liebke
David gave a talk on some future Clojure functionality that will provide more parallelism options. This was a great talk to have attended after hearing Guy Steele at Strange Loop. Much of what Guy discussed is in this experimental branch of Clojure. Slides of his talk can be found here. He started by comparing map and pmap and then went over how pmap is different from the new parallel reduce stuff. He gave some example stats on the performance characteristics of these things. He had a pretty big caveat on how useful the stats were, but I think they did a good job of illustrating his point. Implementing this was definitely complex and David went into that a bit, but using the functions seemed very idiomatic and not much different from using plain reduce. It will be exciting to see this progress.
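As a reminder of the starting point of that comparison, here is a tiny sketch using only the existing map and pmap from clojure.core. It does not touch the experimental parallel reduce work; slow-square and inputs are names I made up.

```clojure
;; slow-square is a hypothetical stand-in for an expensive, side-effect-free function.
(defn slow-square [n]
  (Thread/sleep 10)
  (* n n))

(def inputs (range 8))

;; map applies the function one element at a time on the calling thread.
(def sequential (doall (map slow-square inputs)))

;; pmap applies it on several threads but still returns results in input order.
(def parallel (doall (pmap slow-square inputs)))
```

Both calls produce the same sequence; only how the work is scheduled differs, which is why pmap pays off only when each call is expensive enough to outweigh the coordination cost.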
Step Away from the Computer - Rich Hickey
This was the keynote of the conference from Rich. It was a non-technical talk that focused on thinking things through without distractions before writing code or "solving" a problem. He went over some "how your brain works" material that seemed similar to Pragmatic Thinking and Learning. He stressed the importance of thorough research, notes and evaluation when solving a problem. I think the areas that he discussed are something that developers can always improve upon.
There were many other good talks that I did not mention. It was good to get an update on Clojure support in Eclipse and to hear some of the motivations behind lazy test and zippers. The lightning talks were also good, especially the ones covering Aleph and Infer. There was also a lightning talk by Alex Miller on zippers over records, which we have been doing here at Revelytix. There was definitely a lot of excitement around Clojure at the conference. The atmosphere was very friendly, with lots of discussion between and after sessions. I talked with several people interested in semantic web technologies, triplestores and the work we're doing at Revelytix. It has encouraged me to blog more about it!
I thought I'd write up my thoughts on last week's Strange Loop here. I think the conference went very well. Sitting next to Alex at work keeps a steady stream of Strange Loop excitement going for months before the conference, and I think it delivered. The Pageant was a great venue, with plenty of space. The Moonrise rooms could get a little cramped (standing room only, people sitting on the floor, etc.) but the bigger rooms were definitely adequate. Having lunch and dinner open is a nice bonus because the Loop has a ton of great restaurants. I didn't watch much of the Strange Passion Talks, but enjoyed having a few beers and talking with other developers. Below are the highlights of the conference from my perspective.
Hilary Mason - Machine Learning: A Love Story
I really enjoyed Hilary’s talk and her presentation style. She didn't use bulleted lists of what she was going to cover; instead she would put up just a background image and talk through it (slides here). I thought that this was a good high-level overview of data mining basics. It brought back memories of grad school and working through Data Mining by Jiawei Han, which she mentioned after a question from the audience (it's the "purple" data mining book she referred to). The talk had several very practical applications of data mining and some good pointers for developers wanting to learn more.
Java Provisioning in the Cloud - Adrian Cole
Adrian gave a rundown of the goals of the JClouds project, the niche that it's trying to fill and some of the companies using it (along with info on how they are using it). He also covered some of the philosophy of what goes into JClouds and what doesn't. It really helped me appreciate the fine line that the JClouds folks walk in working with the various cloud providers. There was an example or two in Java, with several in Clojure. It was nice to meet Adrian face to face after quite a few discussions online!
Expression problem - Chouser
Great talk. He walked through a somewhat typical code path in Java that would necessitate a “wrapper” class. He then improved the code slightly by monkey patching in a Java-like syntax. He then wrote the same code in Clojure, walking through the implementation and then moving it into multi-methods. He then walked through a refactor from multi-methods to protocols (something we have been doing recently at Revelytix as well). He then gave a rundown of the pros/cons of protocols vs. multi-methods. I found most of this talk to be review, but the material was covered well. The slides from the talk are here, and there should also be video of it soon.
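To give a feel for that multi-method to protocol refactor, here is a small sketch of my own (not Chouser's example). Both versions add a new operation to existing Java classes without wrapping them; the names describe-mm, Describable and describe are hypothetical.

```clojure
;; Multi-method version: dispatch on the class of the argument.
(defmulti describe-mm class)
(defmethod describe-mm String [s] (str "string of length " (count s)))
(defmethod describe-mm Long [n] (str "integer " n))

;; Protocol version of the same operation, extended to the same existing classes.
(defprotocol Describable
  (describe [x] "Return a short human-readable description of x."))

(extend-protocol Describable
  String
  (describe [s] (str "string of length " (count s)))
  Long
  (describe [n] (str "integer " n)))
```

The trade-off, roughly: multi-methods can dispatch on any function of their arguments, while protocols dispatch only on the type of the first argument and are faster for that common case.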
How to Think about Parallel Programming: Not! - Guy Steele
Guy Steele's talk was on how we can make programs parallel. His focus was on certain operations having wiggle room in how the code is executed. Specifically, his example was a reduce type of operation where the calculations are not linear but divide and conquer: the divide happens on separate threads and the conquer step merges the results from the threads. His examples were in Fortress but seem easily applicable to other languages. I found that I have a big gap in my knowledge in this area and am looking forward to doing more research on this topic.
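The divide and conquer idea can be sketched in Clojure with plain futures. This is my own illustration, not code from the talk (which used Fortress); dc-reduce and its parameters are made-up names, and it assumes the combining function is associative and that identity-val is its identity element.

```clojure
;; dc-reduce is a hypothetical helper. It assumes combine is associative and
;; identity-val is its identity, so the grouping of the work does not change the result.
(defn dc-reduce
  [combine identity-val leaf-size v]
  (if (<= (count v) leaf-size)
    ;; Small enough: fall back to an ordinary sequential reduce.
    (reduce combine identity-val v)
    (let [mid   (quot (count v) 2)
          ;; Divide: the left half is reduced on another thread...
          left  (future (dc-reduce combine identity-val leaf-size (subvec v 0 mid)))
          ;; ...while the right half is reduced on the current thread.
          right (dc-reduce combine identity-val leaf-size (subvec v mid))]
      ;; Conquer: merge the two partial results.
      (combine @left right))))

(def total (dc-reduce + 0 100 (vec (range 1000))))
```

The associativity requirement is the "wiggle room": because (a + b) + c equals a + (b + c), the runtime is free to group the work however the available threads allow.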
Enterprise NoSQL: Silver Bullet or Poison Pill? - Billy Newport
It's easy to get carried away with new technology. With all the hype around NoSQL, it's certainly ripe for misuse. I liked Billy's reality check on the technology. Slides from the talk are here. His point wasn't that you shouldn't use a NoSQL database, but rather that there are trade-offs. He enumerated examples of clear wins and clear losses. He reminded the audience several times that the laws of physics don't change, even with Map/Reduce. I think I already knew a lot of what he was saying, but it wasn't as firmed up as it was after his talk. Thanks for keeping our feet on the ground, Billy!
Querying Big Data Rapidly and Robustly with Cascalog - Nathan Marz
This was a talk given by Nathan Marz about his Clojure-based Hadoop/Cascading library. There was a lot of excitement during and after this talk. Nathan created a nice Clojure DSL that gives a very succinct representation of Hadoop queries. It shows what a language with syntax as flexible as Clojure's can do. For those familiar with SPARQL, the syntax will look especially familiar since his DSL and SPARQL are both based on Datalog. I definitely have that library on my "To Learn" list.
Outside In TDD - Brian Marick
Brian Marick went through some examples of his Clojure-based Midje mocking framework. I had some difficulty following some of the syntax and examples, so I think I'll need to download it and try it out before coming to any conclusions about the framework. I did like how Brian had molded Emacs and SLIME into an even quicker REPL environment and included it with Midje. His point on filing down the rough edges of your development process (in this case with Emacs extensions) is a good take-away from the talk. With the combination of some Emacs extensions and the Midje Clojure code, it looks like you could focus on the testing task at hand in a pretty fast and slick way.
In conclusion, Strange Loop was a great investment of time with a lot of top-notch speakers and great topics. There was a lot of technical content with very little marketing. There are several other talks I wanted to make, and I'm hoping to catch the videos when they are released. I think Strange Loop's content is broad enough to appeal to any passionate developer and would definitely recommend going next year.
I gave a talk yesterday at Strange Loop giving a high-level overview of what I've been working on for the last 6 months. The abstract of the talk was:
Determining what RDF repository to use for a project can be a
daunting task. With so many repository choices, benchmarks and usage
scenarios, where do you start? This talk discusses how Revelytix
answered that question. The talk will cover the test framework
written by Revelytix in Clojure, including a language for defining
tests, a harness for executing the tests and using CouchDB to store
the results. Example Clojure code will be included along with a
discussion around the available RDF benchmarks. The talk will also
discuss the test harness using EC2 instances for cheap performance
testing and how we interpreted those results using Incanter.
It was a short talk, so I went through the material pretty fast, but the slides can be found here.