Object Commando languages, development and design

13Jun/10

Mocking Clojure Protocols with Atticus

Atticus is a mocking library written by Hugo Duncan. For more information on Atticus, you can see Hugo's blog post from May here. I have added the ability to mock protocols to Atticus and would like some feedback as to the best approach for binding the mocked protocol instance. There's a survey below, but first a little background on the implementation.

Mocking Protocols

What makes protocol instances a bit tough to mock is that they're not just straight Clojure functions: they dip a bit into Java and a bit into Clojure. Because of that, Atticus without modification couldn't mock them. Below is an example of code that uses this new functionality:

(defprotocol Squared
  (square [impl x]))

(deftest mock-protocol-test
  (expects
   [(instance Squared
	      (square [impl y] (once (* y y))))]
   (is (= 9 (square instance 3)))))

Originally I had a marker function named mock-protocol-instance, which didn't serve much purpose and was a bit awkward. After talking with Hugo, I switched to the syntax above. The first item in the list is instance, the symbol that the mocked protocol instance is bound to. The next item is the protocol, followed by the functions. The (once ...) wrapped around the body of the function is existing Atticus functionality that expands into code ensuring the function is called exactly once. An example of mocking a regular function in Atticus is below:

(deftest test-cube
  (expects
   [(cube [y] (once (* y y y)))]
   (is (= 8 (cube 2)))))

The difference between this code and the first is that the binding syntax here is familiar: it is similar to letfn. Below is an example of letfn:

(letfn [(add5 [x] (+ x 5))
	(subtract5 [x] (- x 5))]
  (= 7 (add5 (subtract5 7))))

In the style above, the first element is the function name, so anything that refers to add5 in the body of the letfn gets the function bound above it. This letfn style of binding makes sense for Atticus when mocking functions; both have the same goal of binding a function temporarily. Where it gets trickier is in mocking the protocol. In the first example above, the first element in the list is special: it's the symbol bound to the protocol instance. That is really more appropriate for a let style of binding, where one element is the symbol and the other is an expression. Unfortunately, switching the expects macro to a let style of binding makes the syntax a little more cumbersome for mocking plain functions, because you would have to add "fn". It would probably look something like this:

(deftest mock-protocol-test-alt
  (expects
    [instance (Squared
	       (square [impl y] (once (* y y))))
     cube (fn [y] (once (* y y y)))]
        (is (= 9 (square instance 3)))
	(is (= 8 (cube 2)))))

The above is just spitballing, but the key point is that expects would use a let style of binding. The first binding is a let style binding of the protocol instance, and the second binds a function.

Put it to a vote!

The question is: which one is better? Is sticking with the letfn binding, and the brevity it allows, worth it even though the protocol mocking is a bit different? (The letfn style is shown in the first and second code examples above.) Or is it confusing enough to warrant a little extra code around mocking functions (the example immediately above)? Is there another approach that would be better? Below is a quick survey on which one is preferable. Thanks for giving your input!


9May/10

Narrowing the Scope of Globals with Let

I've been reading On Lisp by Paul Graham. It's about becoming a better Lisp programmer. It's written for Common Lisp, but I have found that quite a bit of it carries over into Clojure. One interesting code snippet I found in the book was on using let when two functions require the same value. Previously I would have done this with a def, like below:

(def step-by 7)
(defn increment [x] (+ x step-by))
(defn decrement [x] (- x step-by))

Paul Graham instead approaches it like:

(let [step-by 7]
  (defn increment1 [x] (+ x step-by))
  (defn decrement1 [x] (- x step-by)))

Obviously the example above is trivial, but there are times when shared immutable data is necessary: database connection properties, connection URI information, etc. I often have a small number of functions that need that sort of data, and for those situations I like this second approach. It scopes the variable more narrowly, and I can't think of any drawbacks. I think this also highlights a difference in approach between imperative and functional languages in general. Paul Graham (on page 30 of On Lisp) describes imperative code structure as more "solid and blockish", whereas functional code tends to be more "fluid". The first example above fits the blockish imperative model, where you define your variables and functions at the same level; this is exactly how I would go about it in Java. I've noticed that the Clojure code I've been writing, and thinking back, the OCaml code I've written, is definitely more fluid, with a less rigid block structure. I've gone through about six chapters of the book so far and am looking forward to the chapters on macros.

19Jan/10

Spring Remoting – A Step Toward SOA?

Spring Remoting

Spring Remoting is an RMI-style facility built into the Spring framework. Basically, you define an interface and an implementation in a remote application. Spring then places a proxy in your application; when the proxy is called, it goes over HTTP to the remote implementation and returns the result as if the implementation were local. It's really quite easy to set up with Spring: a few lines of configuration pointing at the remote implementation, a few lines to expose that implementation over HTTP, and you're set. This ends up being a very cheap way to start exposing services in your applications. There are definitely some downsides to this approach. The first is that it's Java-only. The Hessian/Burlap extensions are an option, but deeper object graphs have difficulty travelling across the wire. Another is the set of dependency problems that can occur when using an RMI-like solution.
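As a rough illustration, the wiring looks something like the sketch below. The two exporter/proxy classes are Spring's HTTP invoker support; the AccountService interface, bean names, and URL are hypothetical placeholders.

```xml
<!-- Server side: expose an existing accountServiceImpl bean over HTTP. -->
<bean name="/accountService"
      class="org.springframework.remoting.httpinvoker.HttpInvokerServiceExporter">
  <property name="service" ref="accountServiceImpl"/>
  <property name="serviceInterface" value="com.example.AccountService"/>
</bean>

<!-- Client side: a proxy that implements AccountService and calls the server. -->
<bean id="accountService"
      class="org.springframework.remoting.httpinvoker.HttpInvokerProxyFactoryBean">
  <property name="serviceUrl" value="http://server:8080/remoting/accountService"/>
  <property name="serviceInterface" value="com.example.AccountService"/>
</bean>
```

The client code simply injects AccountService and calls it like any local bean, which is exactly what makes the cost of entry so low.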

RMI and Dependencies

Probably the most significant downside to RMI doesn't really show up until you have used it for a while. Maybe you just have a few services that need to be exposed; Spring Remoting seems easy, so you use it. But then it grows, other applications start using it, and it becomes more critical. The question is: what objects are being transferred over RMI? If you call the service to find the address associated with user John Doe, how is the address returned? Probably as some sort of Address class. Then the big question: where does the Address class live? The problem is that the Address class needs to be available both to the server (which knows how to look up addresses) and to each of the clients calling it. Changes to the service or to the transferred objects can have significant ripple effects across applications. The problem is easy to understand, but it slowly creeps up on a project and becomes a dependency nightmare. I thought of this as a logical first step toward a true web service. The problem with that line of thinking is that if it stays this way too long, there's already too much damage and the refactoring is too costly.

Why not start with a web service? Web services are somewhat expensive to create, so you have to make sure one is necessary. First, you must define some form of input to be accepted. Maybe this is an XML or a JSON object; whether or not there is a proper schema doesn't really matter, it still needs to be thought about and defined, formally or informally. Next, code needs to be written to translate between the request and the business objects of the back-end system, and the same translation needs to happen for the response. The client also needs to translate to and from this same intermediate format. There are obviously things that can make this easier, like code generation, but it's still additional work. In the early phases of a project, when the inputs and outputs might be changing substantially, this can lead to developers thrashing with the services and producing very little.

Hibernate and RMI

Another potential RMI gotcha is attempting to transfer Hibernate POJOs. Hibernate POJOs are special: they have lazily loaded collections and other proxied objects that are more complex than basic JDK objects. The immediate consequence is that every caller of the RMI service needs not only the POJO classes on its classpath, but also the Hibernate jars. The more subtle consequence is: what happens when one of those lazily loaded collections is transferred to the caller? The objects can't be lazily loaded from the client; the client doesn't have the database connection, etc. From here you really have three options. The first is to enable remote lazy loading (example here). I've not used this; it seems far too complex and error-prone. The second involves marking all associations non-lazy (or using joins); lazy fetching is a nice performance feature of Hibernate, and the service would no longer be able to leverage it. The third is to add a custom object serializer to Spring Remoting that exchanges the lazy collections for real collections. This removes the dependency on Hibernate and essentially forces non-lazy loading of all associations. All of these solutions make RMI less attractive, and all of them are a good indication that you should rethink the need for remoting, or rethink using RMI over a proper web service.
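To make the third option concrete, here is a minimal, hedged sketch of the serialization trick using only the JDK's own replaceObject hook. To stay self-contained it swaps any non-ArrayList List; a real version would instead test for Hibernate's persistent collection types, and the class name here is made up.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stream that detaches collections during serialization.
// A Hibernate-aware version would check for PersistentCollection instead
// of plain java.util.List.
class CollectionReplacingOutputStream extends ObjectOutputStream {
    CollectionReplacingOutputStream(OutputStream out) throws IOException {
        super(out);
        // Ask the stream to consult replaceObject() for every object written.
        enableReplaceObject(true);
    }

    @Override
    protected Object replaceObject(Object obj) {
        if (obj instanceof List && !(obj instanceof ArrayList)) {
            // Copy into a plain collection, detaching it from any proxy,
            // which also forces the collection to be fully loaded.
            return new ArrayList<Object>((List<?>) obj);
        }
        return obj;
    }
}
```

The caller would then plug a stream like this into the remoting exporter, so clients only ever receive plain JDK collections.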

Other workarounds - is it worth it?

There are several techniques that can reduce the symptoms of these problems. Setting the Hibernate issues aside, you can define dedicated request and response objects for the data passed into and returned from the RMI service. This reduces the amount of data exposed by the service, requires well-defined input and output, and makes it easier to refactor to a web service later. All of it adds up to a decent amount of extra work, though; I think in the end the extra time involved evens out with, or exceeds, that of a proper web service.
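A minimal sketch of what those dedicated request/response objects might look like (all names here are hypothetical, continuing the address lookup example from above):

```java
import java.io.Serializable;

// Plain serializable carriers, deliberately decoupled from the
// Hibernate-mapped entities used inside the server.
class AddressRequest implements Serializable {
    private final String userId;
    AddressRequest(String userId) { this.userId = userId; }
    String getUserId() { return userId; }
}

class AddressResponse implements Serializable {
    private final String street;
    private final String city;
    AddressResponse(String street, String city) {
        this.street = street;
        this.city = city;
    }
    String getStreet() { return street; }
    String getCity() { return city; }
}

// The remote interface then speaks only in these carriers, never in
// entity classes, so clients need no Hibernate jars on the classpath.
interface AddressService {
    AddressResponse findAddress(AddressRequest request);
}
```

Because the carriers are this narrow, swapping the transport later (RMI today, a web service tomorrow) only touches the translation layer, not every client.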

Lesson Learned

I think the lesson I've learned is that Spring Remoting does not give you cheap services. Rather, it gives you services with a low cost of entry, but that cost climbs much more quickly. With web services you pay more up front and less over the long term. Maybe the best of both worlds is to use RMI/Spring Remoting in the very early stages of a project (i.e. before going to production) so that the service can be ironed out. What input data is really needed? What should be returned? Do we know most of what the service needs to do? With answers to these questions (which will only be known after some development), we are better armed to create a real web service. At that point, the RMI implementation can be swapped out and refactored to a web service, hopefully avoiding the longer-term RMI issues discussed above.

Filed under: Best Practices
12Feb/09

Clean Code By Robert Martin – Part 1

I've just started reading through Clean Code by Robert C. Martin. It's a book about writing good software at a low level: things like how to name variables, how methods should look, and so on. His examples are in Java, but the concepts are very general. I like what he has to say in the book and agree with a lot of it. Some comments on things I found interesting are below.

Switch Statements

He shares my dislike for switch statements. He gives the rather typical refactoring scenario of changing switch code to be more object-oriented through an abstract factory and polymorphism. The example code he gives is:


public Money calculatePay(Employee e) throws InvalidEmployeeType {
  switch (e.type) {
    case COMMISSIONED:
      return calculateCommissionedPay(e);
    case HOURLY:
      return calculateHourlyPay(e);
    case SALARIED:
      return calculateSalariedPay(e);
    default:
      throw new InvalidEmployeeType(e.type);
  }
}

This is a pretty textbook refactor to an abstract factory: create a polymorphic method like calculatePay() and have the Employee subclasses provide their own implementations. In the book he discusses burying that code deeper in the stack. In the enterprise Java world, I find myself putting that logic into Hibernate. In this case, the Employee has a type: COMMISSIONED, HOURLY, or SALARIED. Typically I make Employee abstract and, in the Hibernate mapping, indicate the discriminator as something like "employeeType". Then I map all three employee types as their own subclasses of Employee (maybe using single-table inheritance). I end up with three more classes, CommissionedEmployee, HourlyEmployee and SalariedEmployee, but they should be pretty small. With this set up, Hibernate does the dirty work previously done by the abstract factory described in Clean Code.
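A minimal sketch of the polymorphic end state (the class names come from the text above, but the pay calculations themselves are made-up placeholders, with Money simplified to integer cents to keep the sketch self-contained):

```java
// Each subclass supplies its own pay calculation, so the switch (and the
// InvalidEmployeeType failure mode) disappears from the calling code.
abstract class Employee {
    abstract int calculatePay();
}

class HourlyEmployee extends Employee {
    private final int hours;
    private final int rateCents;
    HourlyEmployee(int hours, int rateCents) {
        this.hours = hours;
        this.rateCents = rateCents;
    }
    int calculatePay() { return hours * rateCents; }
}

class SalariedEmployee extends Employee {
    private final int annualCents;
    SalariedEmployee(int annualCents) { this.annualCents = annualCents; }
    int calculatePay() { return annualCents / 26; } // biweekly pay period
}

class CommissionedEmployee extends Employee {
    private final int baseCents;
    private final int commissionCents;
    CommissionedEmployee(int baseCents, int commissionCents) {
        this.baseCents = baseCents;
        this.commissionCents = commissionCents;
    }
    int calculatePay() { return baseCents + commissionCents; }
}
```

Callers just invoke calculatePay() on whatever Employee they hold; with the Hibernate mapping described above, the discriminator column decides which subclass (and therefore which calculation) gets instantiated.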

How to Name Your Interfaces

This always seems to be a hot-button issue. I'm not entirely sure why. There is the more straightforward case of several up-front implementations of an interface: for example, in a banking scenario you have Account, and the implementers of Account might be SavingsAccount, CheckingAccount, etc. What seems to cause controversy is when initially there is only one implementation. As an example, maybe there is an account service. In Clean Code, the suggestion is AccountService and AccountServiceImpl. As a counterexample, in Implementation Patterns, I remember Kent Beck taking the approach of IAccountService and AccountService. In general, I have found fierce opposition to prefixing interfaces with I. My personal preference: if the implementations can be named differently (as with Account above), they should be; if there's only one, I prefer Kent Beck's approach. To me, Impl seems redundant and doesn't tell me much; every implementation of the interface is an "impl". However, I have found that there are usually far more pressing issues in the code than whether or not a class name ends in Impl.

Comments

After reading through the chapter on comments and giving it some thought, I agree with a lot of what he has to say. One quote in particular at first struck me as odd: "The proper use of comments is to compensate for our failure to express ourself in code. Note that I used the word failure. I meant it. Comments are always failures." But when I thought about it, it made a lot of sense. The code is what matters; it's what is executed, and in the end comments are just decoration. We know the code has to be up to date; the comments are questionable. If I'm putting in comments to help readers understand the code, then the code is not very understandable on its own, and I should work to make it better.

Several times he mentions that when too many comments appear, or the same comment appears too frequently, we automatically block them out. I agree, and I find myself blocking out most comments most of the time. I think the reason these noisy or redundant comments make their way into the code is a culture of comments. Early on, we are taught that comments are good and that we should write them. So when a code review happens and there are no comments on the methods of a class, there's usually an "oh, there should be a comment here" response. It usually has nothing to do with the actual method, and most of the time (myself included) we don't ask why there should be a comment there. Is it because the method name isn't intention-revealing? Is a parameter or the return type ambiguous?

Filed under: Best Practices