Object Commando languages, development and design


The Magical Island of Kanren – core.logic Intro Part 1


Previously I blogged on Appendo the Great without going into much detail on core.logic, what it is, how to use it etc. Hopefully this blog post will begin to fix that. This post was also inspired by a tweet from Ambrose Bonnaire-Sergeant on explaining a basic logic program with an unbound variable. That tweet got me thinking about how I would explain it.

First, core.logic is a Clojure implementation of miniKanren. miniKanren is a Logic Programming library embedded in Scheme. As the site states, it's mini in that there is another Kanren system with more bells and whistles, but is quite a bit more complex. The core.logic implementation follows the Scheme version pretty closely from a library user's perspective, so examples in books/papers such as Byrd's dissertation or The Reasoned Schemer translate pretty closely to the equivalent Clojure code.

The approach taken by the miniKanren folks gives the developer a lot of power by letting them pick the right tool for the job. Embedding the logic programming paradigm into a functional language allows you to use functional tools to solve your problems and delve into logic when it makes things easier. There's no need to interface with another language, you're in Clojure and you can use it directly.


Getting core.logic incorporated into your project works the same as any other Java or Clojure library, just include it in your project dependencies. Including it in your Leiningen project.clj would look like:

  :dependencies [...
                  [org.clojure/core.logic "0.6.5"]

Then lein deps and you're set. The examples below assume having a namespace setup like:

(ns logic-examples.ancenstors
  (:refer-clojure :exclude [==])
  (:use [clojure.core.logic]))

The Island of Kanren

Imagine the magical island of Kanren. On this island, just declaring things makes them so. There can be multiple right answers to a question and sometimes there are no answers at all. Here, seemingly familiar operations take on a different meaning and the natives speak with a unique dialect. Your code lives in functions and macros, just like any Clojure program. The key difference is your Kanren programs are really more like puzzles that you want the Kanren-ites to solve. Your logic functions on their own aren't much good, but the Kanren natives can do amazing things with them to solve your puzzle. The only way to get answers off of the island is to send your puzzle over with a raft [ ]. Unfortunately, there are some small holes in the raft and we need to put a bucket on it before we send the raft over. I'll call this bucket q, but you can call it whatever you like. Armed with our raft and bucket we can send it over with a puzzle and the Kanren natives will solve the program and send the bucket back. This bucket will contain the things (bindings) that the natives needed to solve the program. The easiest, yet still confusing program, just declares something that is always true:

(run* [q]
  (== 1 1))
= > (_.0)

It may seem strange at first, but the natives of Kanren have sent back a bucket with a single value _.0 in it. In the language of Kanren, this means anything and nothing. It means to solve the puzzle I sent them (which just declares that 1 is 1) q can take on any value. The number 7, a list of strings, nil etc. This makes sense as the Kanren natives can solve the puzzle with anything since the q does not actually play a role in solving the puzzle. We could make it a little more difficult and send over another raft asking for a 1

(run* [q]
  (== q 1))
= > (1)

Here we sent over a puzzle that could only be solved with q having the value of 1. There is no other way for the Kanren-ites to solve the puzzle (and still have it work), so they sent back a value of 1 for q. The reason I relate q to a bucket in a raft, is that's our only way of getting information off of the island. All other information stays on the island:

(run* [q]
  (fresh [x y]
    (== x "foo")
    (== y "bar")
    (== q 1)))
= > (1)

The fresh call here is just a way to get a new variable (er, new buckets). This new variable is just like q, but it always stays on the island. Note our answer is the same as before because nothing has changed with the bucket vs. the program before. It's possible that q has more than one value (examples will be below) and it's also possible for q to be a collection:

(run* [q]
  (fresh [x y]
    (== x "foo")
    (== y "bar")
    (== q [1 x y])))
= > ([1 "foo" "bar"])

The above gets the information from x and y by stuffing it in q. So q is where we want the answer to be stored since that is what comes back on the raft. If we don't put the answer in the q bucket, we won't get anything useful. It's also possible that q has no answer. We could play a trick on the Kanren-ites and send them an unsolvable puzzle:

(run* [q]
  (== q 1)
  (== q 2))
= > ()

They saw through my ruse. It's not possible to solve a puzzle where q is both 1 and 2, so they send back an empty bucket.

Less Metaphor more Examples

Above, I covered the mechanics of running a core.logic program, er, puzzle. A logic program follows some different rules, so they need to be ran inside the run* block. The above examples only used run* which asks for all possible solutions to the program. Another key point from the above program is that I never assigned a value to q. I declared that q should equal 1. The Kanren-ites needed to take it from there. Instead I could have declared it equal to anything, a string, list, character etc. To cover some of the core.logic building blocks, I'll side-step integers and lists for the time being. To demonstrate some of these basic features, I'll be using some of the relation and fact features of core.logic. First I'll start with a father and mother relation:

(defrel father Father Child)
(defrel mother Mother Child)

(facts father [['Vito 'Michael]
               ['Vito 'Sonny]
               ['Vito 'Fredo]
               ['Michael 'Anthony]
               ['Michael 'Mary]
               ['Sonny 'Vicent]
               ['Sonny 'Francesca]
               ['Sonny 'Kathryn]
               ['Sonny 'Frank]
               ['Sonny 'Santino]])

(facts mother [['Carmela 'Michael]
               ['Carmela 'Sonny]
               ['Carmela 'Fredo]
               ['Kay 'Mary]
               ['Kay 'Anthony]
               ['Sandra 'Francesca]
               ['Sandra 'Kathryn]
               ['Sandra 'Frank]
               ['Sandra 'Santino]])

From this basic setup we can ask some basic questions:

(run* [q]
  (father 'Vito q))
= > (Sonny Michael Fredo)

(run* [q]
  (father q 'Michael))
= > (Vito)

These questions use the father relation. Given Vito is the father, what values of q would satisfy the father relation? We could do the same thing for the reverse. Given a child, who's the father? Note in the first example that there are three valid solutions to the relation, and any of them are correct. If I only asked for one solution:

(run 1 [q]
  (father 'Vito q))
= > (Sonny)

I would just get Sonny.


To ask better questions, we'll need conde. First lets define as a function, what the definition of parent is:

  (defn parent [p child]
      ((father p child))
      ((mother p child))))

(run* [q]
  (parent q 'Michael))
= > (Vito Carmela)

The function parent above uses something new. It's one of the core building blocks of core.logic, called conde. I think of conde as a very fancy or. If you think of an or statement in a functional or imperative languages, it finds the first true value, then stops. On the Island of Kanren, there can be many correct answers to a puzzle, and again the Kanren-ites are trying to solve the puzzle every way possible. So in the above, there are two possibilities when using parent according to the defined relation. It tries the father relation first. If that passes, it will count as one answer. It will then come back through, act as if the first condition failed and will run the second statement. In this case, the first and second line are mutually exclusive, so it's pretty similar to a straight or.

(run* [q]
  (fresh [x y z]
      ((== x "1st"))
      ((== y "2nd"))
      ((== z "3rd")))
    (== q [x y z])))
= > (["1st" _.0 _.1] [_.0 "2nd" _.1] [_.0 _.1 "3rd"])

The above illustrates this trying of the various clauses. Each of the above three clauses will always pass if executed. The three answers returned are all separate, so the first answer only has a value for x. The y and z are still fresh (and could be any value). The same is true for the second and third answer.

More Relations

These functions can also stack on each other:

(defn grandparent [gparent child]
  (fresh [p]
    (parent gparent p)
    (parent p child)))

(run* [q]
  (grandparent q 'Anthony))
= > (Vito Carmela)

(run* [q]
  (grandparent 'Vito q))
= > (Francesca Frank Vicent Mary Kathryn Santino Anthony)

The definition of grandparent is that the grandparent gparent is the parent of some new person p and that person p is the parent of child. The Kanren-ites can also solve more complex puzzles where the grandparent and child is unknown but the parent is known:

(run* [q]
  (fresh [gparent p child]
    (== p 'Michael)
    (parent p child)
    (grandparent gparent child)
    (== q {:grandparent gparent :parent p :child child})))

= > ({:grandparent Vito, :parent Michael, :child Mary}
     {:grandparent Vito, :parent Michael, :child Anthony}
     {:grandparent Carmela, :parent Michael, :child Mary}
     {:grandparent Carmela, :parent Michael, :child Anthony})

There are a few things interesting about the above example. First no child or grandparent was specified. In this puzzle, we specified the parent (in this case Michael) but nothing else. The Kanren-ites had to use this constrant in trying to find the grandparent(s) and children. The second interesting piece of the above is the usage of the map as the value of q. Since there are several answers and several pieces of information for each answer, I stuffed the results in a map. For the puzzle to be solved, the appropriate values needed to be plugged into the map.

What Next?

There's a lot more to cover in core.logic. Specifically, I was pretty light on the magic and "special-ness" of relation functions, variables etc. I'll cover that in my next post along with doing useful things with lists and matching.

Since the library is fairly new, there's not a ton of resources on it yet. For more information on core.logic specifically, Ambrose's Logic Starter has good information and there's some in the core.logic README. Byrd's dissertation and The Reasoned Schemer are very good. They are written in Scheme, but they translate to Clojure easily.


Appendo the Great

I have been spending a some time recently trying to get a better understanding of logic programming. I was very much inspired by David Nolen's core.logic library. I am reading several different papers and books trying to understand the paradigm. All of them talked about an append function, but I was having difficulty understanding why it was so important.

From the Reasoned Schemer, conso, the relational form of cons, gets the title conso the magnificent. And conso certainly is magnificent, but I think appendo should get some credit too. In core.logic and miniKanren, append is called appendo. Like concat in Clojure it appends two lists together to create a third list. It can be used as a building block for many other functions. The definition of the function (from core.logic) is pretty small:

(defne appendo [x y z]
  ([() _ y])
  ([[?a . ?d] _ [?a . ?r]] (appendo ?d y ?r)))

defne above is syntax sugar around matche. It creates a function (named appendo) that accepts three arguments. It matches x, y and z with the three arguments in the list at the beginning of each clause. In this case the first clause is matched cons style. ?a is declared as the first item and ?d is declared as the rest. The second item is ignored, the third is split in the same way. Note that the ?a is shared here on the first and third parameter of the match. This makes the first item in x, the first item in z. Assuming the x and z can be matched as above, appendo is recursively called on the rest of each list (including y, untouched). Something similar to the shared ?a is going on when the recursion bottoms out (when x is empty). In this clause y is declared as z. This makes y the tail of a cons like expression (specifically the ?r) from the previous recursive call. That's a lot packed into such a small amount of code! Below I will build the case for appendo's new title.

Building Blocks

To run the below code, you need to use the right namespaces. Something like this:

(ns logic-examples.seq
  (:refer-clojure :exclude [inc reify ==])
  (:use [clojure.core.logic minikanren prelude nonrel match disequality]))

I thought I understood the appendo function, but I didn't understand why each source spent a lot of time on it. It seemed straight-forward enough. Then I had that ah-ha moment while reading up on Prolog's prefix and suffix functions. Here's something roughly equivalent in core.logic that I wrote before my ah-ha:

(defne prefixo [x y]
  ([[] _])
  ([[?a . ?xs] [?a . ?ys]] (prefixo ?xs ?ys)))

This declares that x is a prefix of y. So it could be used to delcare an x as [1 2] and y as [1 2 3 4 5]. Like above, it uses defne, matching in a similar way. I'm using the shared ?a? to ensure the first item of x is the first item of y and continue recursively. I then began to think of appendo as less like concat in Clojure and more like what list could I concat with this other list to yield some result list (user specified or not). That realization led me to change my prefixo function to be the following:

  (defn prefixo2 [x y]
    (fresh [a]
           (appendo x a y)))

This function appends x with a new (or fresh) variable to yield the value y. So substituting values into the appendo call (using the above example): (appendo [1 2 3] a [1 2 3 4 5]). When unified, a is obviously [4 5] because appending [1 2 3] with [4 5] would satisfy the result list [1 2 3 4 5]. With some great core.logic magic, a will match anything that would appear after items in x in order to satisfy y. Part of the mind-bending nature of logic programming is the mixture of declaring, binding, predicates and the wiggle room that result variable gives you. As an example, there's no need to provide a defined y in the above call:

(run* [q]
  (prefixo2 [1 2 3] q))
= > ((1 2 3 . _.0))

This code runs prefixo but leaves y fresh. This lets core.logic decide what the value should be. So it is asking what will satisfy prefixo2 when x is [1 2 3]. The answer from core.logic is (1 2 3 . _.0). ._0 is the tail of the list and could represent anything, meaning that y could be a list with 1, 2, 3 followed by nothing or 4 or ["Klaatu" "Barada" "Nikto"]. These arbitrary values are numbered to distinguish between different values in the case that there could be multiple arbitrary values that could be (but don't have to be) different. More examples of _.0 are below. The above code will also handle other constraints on the list:

(defn lasto
  "Declares x as the last item in y"
  [x y]
  (fresh [a]
         (appendo a [x] y)))

(run 3 [q]
  (lasto "Hail to the king baby" q)
  (prefixo2 [1 2 3] q))
= > ((1 2 3 "Hail to the king baby")
        (1 2 3 _.0 "Hail to the king baby")
        (1 2 3 _.0 _.1 "Hail to the king baby"))

This basically says give me three lists where 1, 2, 3 is the prefix and the last element is "Hail to the king baby". A proper [ Army of Darkness ending. The easiest one is the list containing only those four items, which is the first returned. The next solution says any one element can exist between 3 and "Hail to the king baby". The next solution says any two elements can be between the theird and the last elements etc. Passing 3 to run tells core.logic how many solutions you want it to come up with. Using run* will ask for all solutions (not a good idea with this example code). This code has two fresh variables _.0 and _.1. These could be any value. They could be different values or they could both be the number 42. They are labeled distinctly because they could have distinct values.

More appendo fun

Seeing what I could do with prefix, I then followed that with suffix:

(defn suffixo
  "Declares the elements of x as the end of list y"
  [x y]
  (fresh [a]
         (appendo a x y)))

Nothing new there. membero is a function in core.logic that declares that x (an element) occurs in a list y. A version can also be built out of appendo (though probably not as efficiently):

(defn my-membero
  "A (probably inefficient) function stating x is an element in y"
  [x y]
  (fresh [a b]
         (appendo a (llist x b) y)))         

(run* [q]
     (my-membero q [1 2 3 4 5]))
= > (1 2 3 4 5)

(run 5 [q]
  (my-membero 3 q))
= > ((3 . _.0) (_.0 3 . _.1) (_.0 _.1 3 . _.2) (_.0 _.1 _.2 3 . _.3) (_.0 _.1 _.2 _.3 3 . _.4))

The first example of it's use gives a list of [1 2 3 4 5] and asks core.logic to come up with all of the members (which is just each element). The second asks the opposite question. Declaring that 3 is in the list, what are the lists that would satisfy it. The first result has 3 as the first element (followed by anything), the second list has 3 as the second element (followed by anything) and so on.

Using appendo to find sublists was interesting:

(defn sublisto
  "The list x (in sequence) appears in y"
  [x y]
  (fresh [a b c]
         (appendo a b y)
         (appendo x c b)))

= > (set (run* [q]
        (sublisto q [1 2 3 4 5])))
= > #{() (1) (2) (1 2 3) (2 3) (3) (3 4) (4 5) (2 3 4) (1 2 3 4 5) (1 2) (2 3 4 5) (4) (3 4 5) (1 2 3 4) (5)}

Chaining it together with my-membero can find all sublists containing a particular number:

(set (run* [q]
       (sublisto q [1 2 3 4 5])
       (my-membero 2 q)))
= > #{(2) (1 2 3) (2 3) (2 3 4) (1 2 3 4 5) (1 2) (2 3 4 5) (1 2 3 4)}

Or with the help of adjacent, we can find all sublists where two numbers occur next to each other:

(defn adjacent
  "Declares x and y as adjacent items in list z"
  [x y z]
  (fresh [a b]
    (appendo a (llist x y b) z)))

(set (run* [q]
       (sublisto q (apply concat (repeat 2 [1 2 3 4 5])))
       (adjacent 5 1 q)))         

= > #{(1 2 3 4 5 1 2 3) (4 5 1 2 3) (3 4 5 1 2 3 4 5) ...}

This will only find sublists that have x followed by y, and it must be in that order. To make adjacent a little more flexible, the below version will find x and y next to each other, regardless of order:

(defn adjacent
  "Declares x and y as adjacent items in list z"
  [x y z]
  (fresh [a b]
          ((appendo a (llist x y b) z))
          ((appendo a (llist y x b) z)))))

(set (run* [q]
       (sublisto q (apply concat (repeat 2 [1 2 3 4 5])))
       (adjacent 5 1 q)))         

= > #{(1 2 3 4 5 1 2 3) (4 5 1 2 3) (3 4 5 1 2 3 4 5) ...}

(set (run* [q]
       (sublisto q (apply concat (repeat 2 [1 2 3 4 5])))
       (adjacent 1 5 q)))

= > #{(1 2 3 4 5 1 2 3) (4 5 1 2 3) (3 4 5 1 2 3 4 5) ...}

(= (set (run* [q]
          (sublisto q (apply concat (repeat 2 [1 2 3 4 5])))
          (adjacent 5 1 q)))
   (set (run* [q]
          (sublisto q (apply concat (repeat 2 [1 2 3 4 5])))
          (adjacent 1 5 q))))

= > true

That's a lot of power from a little function. I think appendo deserves the title. The code for the functions above can also be found here.


CouchDB with Clojure

IBM developerWorks just published an article I wrote on CouchDB and Clojure. It uses clj-http, Clutch and CouchDB. It aims to cover the REST APIs of CouchDB by way of Clojure, using parallel examples in Clutch and straight HTTP calls using clj-http. It also discusses CouchDB views and replication. Big thanks to Tunde for Clutch, Mark for clj-http, the CouchDB folks and Alex for reviewing the article.


Getting Specific Document Revisions with Clutch 0.2.4

I had intended to get this out last week, but didn't quite make it. On 2/3/2011, version 0.2.4 of Clutch. One of the new features is support for getting a specific revision of a document. The code to do that looks like:

(clutch/get-document document-id {:rev revision-id} #{:rev})

The call to get a specific revision is the same as getting the most recent version except for the two extra paramters to the get-document function. The first is {:rev revision-id} and is a query-paramter map that Clutch uses when asking CouchDB for the document. In the example above, I'm asking CouchDB for a document with document-id document-id and a revision of revision-id. Typically when sending query paramters to CouchDB, you want to JSON encode them, however there are some special built-in CouchDB paramters (such as revision) that you do not want to encode using JSON. The last argument is #{:rev}, telling Clutch to not encode the :rev keypair in the query map. The third argument above can be any function that determines which revision to exclude.

CouchDB 1.0.1

Included in this release is changes needed for the Clojure view server to work with CouchDB 1.0.1 (thanks Tunde!). Previously, attempting to use the Clutch view server with the latest version of CouchDB would result in server side timeouts calling to the Clutch view code. For more information on setting up the Clojure view server see the README.

Tagged as: , No Comments

Blogging with org2blog

It's been a while since I've blogged! I'm hoping to blog more this year than I did last year. One improvement I'm making to my blog process is org2blog mode. I spend a lot of my day in Emacs and so it's pretty typical that I write notes and at least a portion of blog posts in Emacs, often all of it. org-mode for Emacs is easy to use library for marking up plain text. I had been using it to write blog posts for sometime, however I would typically export from org-mode to HTML, then paste it into WordPress's editor. This was time consuming (though still easier than writing the whole entry in the WordPress editor). As a result, fewer blog posts made the transition from org-mode to WordPress. This process is now very stream-lined thanks to org2blog.

Setting up org2blog

The first snag I hit setting up org2blog was an outdated version of org-mode. Although the project page doesn't specify a version (just saying the latest) I found that it needs definitely newer than version 6.34. If you're using the version shipped with emacs, you will probably need to upgrade (23.1 which is what I'm using came with 6.21). When I tried to use org2blog to post a new draft of this blog post with that old version of org-mode, I received an error that it could execute the org-find-exact-headline-in-buffer function. It looks like that function didn't exist in the older copies of org-mode. I pulled from the org-mode git repo and followed these instructions to set up the new version.

To push up new blog entries to WordPress, org2blog uses the elisp xml-rpc library. This is easy to install if you have ELPA (if you don't have ELPA, follow these instructions). With the proper version of org and xml-rpc, org2blog is ready to be installed. I pulled down the org2blog source code (from git://github.com/punchagan/org2blog.git) and added the below elisp to my Emacs startup:

(setq load-path (cons "~/path/to/org2blog/" load-path))

(require 'org2blog)

(setq org2blog/wp-blog-alist
       :url "http://blog.address.com"
       :username "Blog User Name"
       :default-title ""
       :default-categories ("Some Category")
       :tags-as-categories nil)))

You can eval the above code or restart Emacs. To make sure it's working you can try M-x org2blog/wp-login to see if it can connect to your blog.

Using org2blog

org2blog mode has several interactive functions for blogging. To create a new entry use M-x org2blog/wp-new-entry. This creates a new (unsaved) buffer that has org-mode markup already in it to indicate category, blog title, date, etc. After adding content and setting a title etc, running M-x org2blog/wp-post-buffer will upload the contents of the buffer as a new draft (not yet publishing it), to the blog you specify. The blog matches blog-name that you configured in your org2blog setup. After transporting the blog post, it then asks you if you'd like to open the draft in a browser. I like this feature a lot, it makes it easy for me to verify that the markup converts properly (until I get more confidence in it) and assures me that it's still in draft mode (making sure an entry doesn't get published prematurely).

Something I found unexpected is that when it creates the post, your local copy is still live. When creating the entry it fputs identifier information in the heading along with the title. So you can post a draft, make a few changes and post that same draft again, and it will update that existing entry (not create a new entry).

Below are some useful functions that can be invoked via M-x (or bound to a key)

Function Description
org2blog/wp-new-entry Creates a new blank blog entry with the headers added
org2blog/wp-post-buffer Posts the buffer as a draft to the WordPress instance
org2blog/wp-preview-buffer-post Opens a tab in a browser to preview the blog post
org2blog/wp-post-buffer-and-publish Publishes a draft