Object Commando languages, development and design

25 Apr 2010

Installing Parliament on Ubuntu

What is Parliament?

Parliament is an open source triple store that is an improved version of DAMLDB. There is some good information on using the triple store in the User's Guide. What's interesting about Parliament is how it stores the triples. Relational databases are a more common implementation of an RDF store, but Parliament goes a different way and takes a linked-list style of approach. It uses BerkeleyDB for storing the URI values and then stores the triple of references in a linked list. For more information on their approach, there's a great paper on it that can be found here.

Building Parliament for Ubuntu

Parliament has binaries on its website for Windows and Mac, but none for Ubuntu (or any other Linux distro). Parliament is written in C++ and Java, so make sure you have the g++ package and the JDK installed. You'll also need a Subversion client to get the source and Ant to build the Java code. Below are the steps I went through to get Parliament to run on Linux:

  1. Build Boost Jam
    • Download and unzip Boost Jam
    • Run build.sh in the boost-jam directory
    • Put the resulting bjam executable (used in the next step) in your PATH
  2. Build the Boost C++ Libraries
    • Download and unzip Boost
    • cd into the boost directory and build Boost with the following command (modify accordingly if you're not on a 64-bit system):
      bjam -q --build-dir=linux/build --stagedir=linux/stage \
        --layout=versioned --with-test architecture=combined address-model=64 \
        variant=debug,release threading=multi link=shared runtime-link=shared \
        stage
      
  3. Build and Install BerkeleyDB
    • Download and unzip BerkeleyDB 4.7.x (http://www.oracle.com/technology/software/products/berkeley-db/htdocs/popup/db/4.7.25/db-targz.html)
    • cd into the build_unix directory and run ../dist/configure
    • run make
    • run sudo make install
  4. Create the following environment variables (if you don't already have them)
    • JAVA_HOME=/usr/lib/jvm/java-6-sun
    • BOOST_ROOT=/path/to/boost/boost_1_42_0
    • BDB_HOME=/usr/local/BerkeleyDB.4.7/
    • BOOST_BUILD_PATH=$BOOST_ROOT/tools/build/v2
    • BOOST_TEST_LOG_LEVEL=message
  5. Build Parliament
    • Check out the Parliament source: svn checkout --username anonsvn https://projects.semwebcentral.org/svn/parliament/trunk
    • Copy the parliament_dir/doc/Linux/*.jam files to ~/
    • The Parliament build uses pushd and popd, which are not built into /bin/dash (which is what /bin/sh is symlinked to in Ubuntu). To fix this, I changed the /bin/sh symlink to point to /bin/bash
    • Copy build.properties.template from the Parliament source directory to build.properties
    • Comment out the build configurations for the other platforms (Mac and Windows) and make sure the following line is uncommented: nativeBuildParams=toolset=gcc-4.4 address-model=64 variant=release
  6. Source Changes
    • When I tried to build Parliament the first time, I received an error that the function remove could not be found when compiling Parliament/KbCore/FileHandle.cpp. It was due to the lines of code below:
      #if defined(PARLIAMENT_WINDOWS)
      	if (!DeleteFile(filePath.c_str()))
      #else
      	if (remove(filePath.c_str()) == -1)
      #endif
      

      I added an include to the top of the file:

      #if !defined(PARLIAMENT_WINDOWS)
      #	include <errno.h>
      #	include <fcntl.h>
      #	include <sys/stat.h>
      #	include <stdio.h> // <-- Added this
      

      to fix the problem.

    • The Ant build file includes the Mac environment variable for the C libraries, but not the Linux one. I changed the build.xml in the Parliament directory to use either the Linux or the Mac environment variable depending on the build architecture:
      <condition property="libraryEnvVariable" value="DYLD_LIBRARY_PATH"> <!-- Line 290 -->
      	<os family="mac"/>
      </condition>
      <condition property="libraryEnvVariable" value="LD_LIBRARY_PATH">
      	<and>
      		<os family="unix"/>
      		<not><os family="mac"/></not>
      	</and>
      </condition>
      ...
      <env key="${libraryEnvVariable}" path="${artifactsDir}/${nativeArtifactsDir}"/> <!-- Line 302 -->
      
  7. From the source tree root, run ant
  8. Copy Parliament-v2.6.7.0-InsertPlatformHere.zip from the target directory to your install directory (can be anywhere)
  9. In the install directory, copy all of the files from gcc-4.4/release/64/ to the ParliamentKB directory
  10. Run the StartParliament.sh script to start Parliament
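
The environment setup from step 4 and the /bin/sh issue from step 5 can be sketched as a small shell snippet. This is only a sketch under the assumptions of this post: the values are the ones listed above (the /path/to/boost prefix is a placeholder you need to replace), and the sh_target variable and the messages printed are my own illustration, not part of the Parliament build.

```shell
#!/bin/sh
# Environment variables from step 4. The paths are the ones quoted in the
# post; adjust them to where the JDK, Boost and BerkeleyDB live on your box.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export BOOST_ROOT=/path/to/boost/boost_1_42_0   # placeholder path
export BDB_HOME=/usr/local/BerkeleyDB.4.7/
export BOOST_BUILD_PATH="$BOOST_ROOT/tools/build/v2"
export BOOST_TEST_LOG_LEVEL=message

# Report what /bin/sh resolves to. The build needs pushd/popd, which bash
# provides and dash does not. sh_target is an illustrative name.
sh_target=$(readlink -f /bin/sh 2>/dev/null || echo unknown)
case "$sh_target" in
  *bash) echo "/bin/sh resolves to $sh_target (pushd/popd available)" ;;
  *)     echo "/bin/sh resolves to $sh_target (pushd/popd may be missing)" ;;
esac
```

Putting the exports in a file and sourcing it before running ant keeps the settings out of your login profile, which is handy if you only need them for this one build.
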
12 Apr 2010

Clojure Protocols Part 3

Recently there have been some changes to the Clojure Protocols code on GitHub. They are not huge changes, but they are enough that the examples I wrote in Part 1 and Part 2 will no longer work. I thought I'd finish out my protocol blog entries by showing how I used protocols, including the new syntax. I also have a better understanding of how reify can be used (thanks Meikel) and will include some of that.

First, the goal of the protocol usage. I have been working on some comparisons and evaluations of triplestores. Triplestores are used to store RDF data, which is a series of subject/predicate (or property)/object triples. There are many triplestores out there, and many of them have several interfaces. For example, Oracle has a JDBC interface that uses stored procedures and a Jena API that incorporates pieces of the Jena framework. This was some pretty low-hanging fruit from an abstraction perspective. Whether inserting a new triple through Oracle JDBC, Jena (with Oracle) or one of the other triplestore implementations, the operation is the same on the surface: take this subject, predicate and object and store it. The same could be said for querying with SPARQL or deleting entries. I ended up with a protocol named TriplestoreOperations, shown below:

(ns revelytix.triplestore-operations)

(defprotocol TriplestoreOperations
  "Interface for the various operations allowed by a triple store"
  (create-graph [impl graph-name] "Creates a new graph of name graph-name")
  (delete-graph [impl graph-name] "Deletes graph graph-name if graph exists")
  (insert-quad [impl graph-name subject predicate object]
    "Creates a new triple, data is assumed to be a full URI")
  ;; ...
  )

This syntax is the same as before. The first argument is used to pass in the implementation of TriplestoreOperations. The graph-name, or model in Oracle terms, is what holds the triples. The protocol lives in one namespace (triplestore-operations above) and the implementations of the protocol are in separate namespaces. The first is an Oracle JDBC implementation of TriplestoreOperations. It's parameterized by the database connection details and the name of the table to store the data in.

(ns oracle.oracle-jdbc
  (:use clojure.contrib.sql
        triplestore-operations))

(deftype OracleJdbcOperations [db table-name]
  TriplestoreOperations
  (delete-graph [impl graph-name]
    (let [drop-model-string (create-sql-string DROP-MODEL-SQL graph-name)
          drop-table-string (create-sql-string DROP-TABLE-SQL table-name)]
      (with-connection db
        (with-open [drop-model-statement (.prepareCall (connection) drop-model-string)]
          (drop-entailment-if-exists db graph-name "RDFS")
          (.execute drop-model-statement)
          (do-commands drop-table-string)))))
  (create-graph [impl graph-name]
    (let [create-model-string (create-sql-string CREATE-MODEL-SQL graph-name table-name)
          create-table-string (create-sql-string CREATE-TABLE-SQL table-name)]
      (with-connection db
        (with-open [create-model-statement (.prepareCall (connection) create-model-string)]
          (do-commands create-table-string)
          (.execute create-model-statement)))))
  (insert-quad [impl graph-name subject predicate object]
    (create-family-triple table-name db graph-name subject predicate object))
  ;; ...
  )

(defn create-oracle-jdbc-triplestore-instance [table-name]
  (OracleJdbcOperations *oracle-jdbc-props* table-name)) ;; Awkward, see below

One difference between the above code and the code in Part 1 and Part 2 is that, in the previous version of deftype, the implementation parameter disappeared from the method signatures, so the create-graph function above would have had only a single parameter. I like the change; I found the original code a little confusing and kept wondering where the first parameter went.

The next implementation of the TriplestoreOperations protocol is a Jena implementation. The code below makes use of reify and feels a little more like idiomatic Clojure, and less like the implementation of a protocol is something special and different from plain functions. I like the reify syntax over deftype and I've been moving my code over to use it. I'm going to cut a decent portion of the implementation below because it mostly calls Java APIs and is a bit noisy:

(ns jena-operations
  (:use triplestore-operations)
  ;; ...
  )

(defn create-jena-operations-instance [jena-support-impl]
  (reify TriplestoreOperations
    (create-graph [impl modelString] nil)
    (delete-graph [impl modelString]
      (with-triplestore-connection
        ;; ...
        ))
    (insert-quad [impl modelString subject predicate object]
      (with-triplestore-connection
        ;; ...
        ))
    ;; ...
    ))

The reify call above also creates a new instance of the protocol TriplestoreOperations, with the functions defined inline. There's no need to separately create an instance of the type, as was done in the previous example. From a functionality perspective the end result of deftype and reify is the same; there's just a different way to get there. Reading through some of the docs, it looks like reify is more dynamic and deftype results in generated code.

One difference between Jena and the Oracle JDBC interface is that graphs don't need to be created explicitly when using Jena, so that method does nothing. The above code is slightly different as well in that the implementation parameter no longer disappears.

Another interesting part is that the JenaOperations instance is parameterized by another protocol called JenaSupport. What I have found is that many vendors support the Jena APIs, but they implement them slightly differently. It's definitely not as pluggable as something like JDBC. This JenaOperations implementation is generic for the Jena APIs and is used by several triplestores with Jena implementations. The JenaSupport protocol abstracts things like getting a Jena connection and creating the correct implementation of Model, which differ from implementation to implementation.

Development Gotchas

I have found a few issues when developing Clojure code that uses protocols. I'm using Leiningen and Lein Swank for development. First, I found that if I had AOT compilation enabled and had run lein install, the protocol definition resulted in compiled code in the classes directory of the project. This caused a problem when I tried to change a protocol definition. I'd make a change in Emacs, load the file with the updated protocol code, and the code would behave as if I had made no change to the protocol at all. What was happening was that the old version of the code, the one that had the interface code generated, was still on the classpath in the classes directory. Removing that code (through lein clean or something similar) allowed my changes to take effect. This problem stumped me for a couple of hours. I can avoid it entirely by just not using AOT compilation (I don't really need it), but others might not be able to.
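
A quick way to notice this situation before it costs a couple of hours is to check whether compiled classes are lying around. The snippet below is only a sketch: count_stale_classes is a hypothetical helper of my own (it is not a Leiningen command), and it assumes the compiled output is in a directory named classes, as described above.

```shell
#!/bin/sh
# count_stale_classes: hypothetical helper, not part of Leiningen.
# Prints the number of .class files under the given directory
# (default: ./classes), or 0 if the directory does not exist.
count_stale_classes() {
  dir="${1:-classes}"
  if [ -d "$dir" ]; then
    find "$dir" -type f -name '*.class' | wc -l | tr -d ' '
  else
    echo 0
  fi
}

# Warn if AOT-compiled code is still present in the project.
n=$(count_stale_classes classes)
if [ "$n" -gt 0 ]; then
  echo "$n compiled class file(s) found in classes/; consider running: lein clean"
fi
```

Run from the project root before starting the REPL, it prints a reminder only when there is something to clean up.
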

Another gotcha I found was in the loading of files that use implementations of protocols. In the example above, let's say I have a test file (I'll call it test-A) that executes functions from TriplestoreOperations on the JenaOperations implementation, which in turn uses the Oracle implementation of JenaSupport. Just loading the test-A.clj file does not cause the Jena implementation of TriplestoreOperations, or the Oracle version of JenaSupport, to be loaded. Instead, it just complains that there is no implementation of TriplestoreOperations for 'nil'. Loading those files individually fixes the problem; it just doesn't happen automatically for me.

Filed under: Clojure, Languages