Tuesday, July 10, 2012

Recently I needed to parse a list of reports and their recipients and create a list of the recipients with the number of reports they each received. The starting point was a CSV file of each report and a semicolon-delimited list of its recipients, like so:

REPORT_NAME    RECIPIENTS
report_1       bob@bigcorp.com; alice@bigcorp.com
report_2       alice@bigcorp.com; eve@bigcorp.com
...            ...

With an end result of:

EMAIL                COUNT
bob@bigcorp.com      1
alice@bigcorp.com    2
eve@bigcorp.com      1

Pretty simple. Now, full disclosure, I ended up doing this in Excel because I needed it done quickly, but later, when I had a few minutes to spare, I went back and thought about how I would do it in Clojure. Here's what I came up with:

A couple of things to note here:

One, because we're using the ->> macro, we eliminate a lot of the tedious "b = do_something(a); c = do_something_else(b); etc." code that you're probably used to in other languages (this is, by the way, something you could accomplish by function composition -- i.e. c = do_something_else(do_something(a)) -- but it tends to look awful and make your code unreadable). The ->> macro allows for much tidier function composition by making the result of each function evaluation the last argument to the next function call. Hence the file location becomes the argument to text-reader, resulting in a file, which becomes the argument to slurp, which returns a giant text string, and so on.

Two, whereas in procedural code we would normally use something like a hash map with each email address associated with a counter that is incremented each time we encounter that email address, in Clojure we tend to avoid the use of mutable state by various means; in this case, recursion. The loop/recur special forms are used for recursion. loop sets up a recursion point, complete with bindings of the kind you're used to seeing in function definitions and let bindings, then recur calls the "function" that loop set up with new values. In this case, the new values are the hash map with the emails and their respective counts, the next email address to be counted, and the remainder of the list. When the list is empty, the function prints a nicely formatted list of the emails and their counts.
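As a rough sketch of the two points above (not the original listing), the following assumes the file contents are already in a string. sample-csv is made-up data mirroring the example table, and count-emails and recipient-counts are hypothetical names chosen for illustration.

```clojure
(require '[clojure.string :as str])

;; Made-up sample data mirroring the table at the top of the post.
(def sample-csv
  (str "REPORT_NAME,RECIPIENTS\n"
       "report_1,bob@bigcorp.com; alice@bigcorp.com\n"
       "report_2,alice@bigcorp.com; eve@bigcorp.com\n"))

(defn count-emails
  "Tallies addresses with loop/recur. Each pass hands recur a new map,
  the next address, and the rest of the list; nothing is mutated."
  [emails]
  (loop [counts    {}
         email     (first emails)
         remaining (rest emails)]
    (if (nil? email)
      counts
      (recur (assoc counts email (inc (get counts email 0)))
             (first remaining)
             (rest remaining)))))

(defn recipient-counts
  "Threads the CSV text through each step; ->> supplies the previous
  result as the last argument of the next form."
  [csv-text]
  (->> csv-text
       str/split-lines
       rest                                   ; drop the header row
       (map #(second (str/split % #"," 2)))   ; keep the RECIPIENTS column
       (mapcat #(str/split % #";"))           ; one entry per address
       (map str/trim)
       count-emails))

;; Print one "address count" line per recipient.
(doseq [[email n] (recipient-counts sample-csv)]
  (println email n))
```

This sketch returns the map and prints it separately rather than printing inside the loop. For the tally alone, clojure.core's frequencies function gives the same result as count-emails.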

Thursday, January 12, 2012

namespace '%s' not found after loading '%s'

While working through Rob Rowe's introduction to C# / Clojure interop, I ran into the following error, which illuminates something useful about Clojure's expectations when it comes to namespaces. Much could be said about this topic, but I will leave that to the experts and just explain the problem I was having and how I solved it.

In Rob's post, he uses some sample code that calculates batting averages and standings for baseball teams. His Clojure code begins with the namespace directive "(ns one)", which tells the Clojure compiler into which namespace to register these functions. When testing out his code, I created a project named Baseball, which creates a file called baseball.clj:

When I tried to build and run his code, I got the following error:

I chose "Debug the program" and was given the choice of debugging in the currently running instance of Visual Studio or starting a new one. I chose the currently running instance:

And got this:

Not very helpful. But when I clicked View Detail, I was presented with both an error message and, more usefully, a stacktrace:

Here is the stacktrace in full:

   at clojure/core$fn__13329$throw_if__13331.__invokeHelper_2v(clojure/core$fn__13329$throw_if__13331_base this, Object pred, Object fmt, Object args) in core.clj:line 4684
   at clojure/core$fn__13329$throw_if__13331.doInvoke(Object , Object , Object )
   at clojure/core$fn__13368$load_one__13370.__invokeHelper_3(clojure/core$fn__13368$load_one__13370_base this, Object lib, Object need_ns, Object require) in core.clj:line 4732
   at clojure/core$fn__13368$load_one__13370.invoke(Object , Object , Object )
   at clojure/core$fn__13461$compile__13463.__invokeHelper_1(clojure/core$fn__13461$compile__13463_base this, Object lib) in core.clj:line 4916
   at clojure/core$fn__13461$compile__13463.invoke(Object )
   at clojure.lang.Var.invoke(Object arg1)
   at BootstrapCompile.Compile.Main(String[] args)

Now I was getting somewhere: The highlighted lines were the call just before the error was thrown, so by looking there I should be able to get some idea of what went wrong. So I re-ran the program and this time chose to open a new instance of Visual Studio. I was asked to supply the source code for the lines I was debugging. Fortunately, the Find Source dialog started me out just one folder away from the source by default, in C:\Users\daniel_cotter\AppData\Local\Microsoft\VisualStudio\10.0\Extensions\jmis\vsClojure\1.1.0\Runtimes\1.2.0:

I chose clj-source and was presented with this much more helpful view (I've skipped to line 4732, where the error was thrown):

Lo and behold, the exact error I was getting. The line above it says, almost literally, "throw an error if a namespace is needed and not found." Thinking about this and looking at the code that was causing the error, I realized that the namespace had been set to "one" (remember "(ns one)"?). I went back and changed the namespace to "baseball" to match the filename ("baseball.clj"), recompiled, and it worked:

Summary
The lesson I'm taking from all this? Be careful with your namespaces. For those of us accustomed to .NET's liberal namespacing conventions (basically, a namespace block is all you need, and it doesn't have to match the filename), Java's conventions are much stricter, and Clojure seems to follow suit, even when in a .NET environment. Personally, I have long found namespacing issues one of the more confusing parts of Clojure, probably because I am not well-versed in Java.
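A minimal sketch of the fix, assuming a file named baseball.clj on the load path. batting-average and ns-found? are hypothetical names; ns-found? only imitates the loader's test using find-ns and is not the code from core.clj.

```clojure
;; baseball.clj -- the ns name matches the file name, so a loader asked
;; for the lib `baseball` finds a namespace called `baseball` afterwards.
(ns baseball)

(defn batting-average
  "Hypothetical helper: hits divided by at-bats, or 0.0 with no at-bats."
  [hits at-bats]
  (if (zero? at-bats)
    0.0
    (double (/ hits at-bats))))

(defn ns-found?
  "Simplified stand-in for the loader's check: does a namespace with
  this name exist once the file has been loaded?"
  [lib]
  (boolean (find-ns lib)))

(ns-found? 'baseball) ; true, because (ns baseball) created it
(ns-found? 'one)      ; false in a fresh session: nothing declared (ns one)
```

Had the file still begun with (ns one), the first check would come back false, which is the condition the error message reports.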

Wednesday, January 11, 2012

clojure-csv and Python dialects

On a recent project, I had the choice of using either Clojure or Python to parse some CSV files. Now, Python has a built-in library for handling CSV files, which Clojure does not have. However, being slightly enamored of Clojure, I started investigating my options. The best-known CSV-parsing library seems to be David Santiago's clojure-csv, which contains functions for reading and writing CSV files.

Problem solved, right? Not so fast -- I not only had to parse these files, I also had to account for differences in format between the systems reading and writing them, such as the delimiter, quote character, linefeed character, etc. Python conveniently has a Dialect class, which can be used to inform the CSV parser of the differences in formats between files. clojure-csv uses dynamic variables to store some of these constants, which means that the code consuming the library can easily set them.

In an attempt to duplicate this functionality in Clojure, I came up with the idea of creating a format hash-map with these values in it and using a macro to bind its values to the dynamic variables in clojure-csv. Here's what I came up with:

In this code, I'm using the binding macro to rebind these global dynamic variables for the scope of the macro body, after which point they will be reset to their original values.
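As a sketch of the idea (not the original macro), the following stays self-contained by defining its own stand-in dynamic vars. The names *delimiter* and *end-of-line* are illustrative and may not match what clojure-csv actually exposes; with-format and write-row are hypothetical as well.

```clojure
(require '[clojure.string :as str])

;; Stand-ins for a CSV library's dynamic vars (illustrative names only).
(def ^:dynamic *delimiter* \,)
(def ^:dynamic *end-of-line* "\n")

(defn write-row
  "Joins cells using whatever format is currently bound. No quoting is done."
  [cells]
  (str (str/join *delimiter* cells) *end-of-line*))

(defmacro with-format
  "Evaluates body with the dynamic vars rebound from the fmt map.
  Keys missing from fmt keep their current values."
  [fmt & body]
  `(let [fmt# ~fmt]
     (binding [*delimiter*   (get fmt# :delimiter *delimiter*)
               *end-of-line* (get fmt# :end-of-line *end-of-line*)]
       ~@body)))

(write-row ["a" "b"])                        ; default format
(with-format {:delimiter \; :end-of-line "\r\n"}
  (write-row ["a" "b"]))                     ; semicolon and CRLF inside the body
(write-row ["a" "b"])                        ; defaults are back
```

Keeping the format in a plain map means one map per system, which is roughly the role a Dialect plays in Python.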

And here's a use of it:

That code reads in the U.S. Presidents CSV sample found here and spits it back out in a different format, using a binding-within-a-binding to do so.
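A binding-within-a-binding can be sketched on a single line of text as follows. This is an illustration under assumptions, not the original listing: *delimiter* is a stand-in var rather than clojure-csv's, parse-line, write-line and convert-line are hypothetical names, the sample row is made up, quoting is ignored, and the regex quoting call is JVM-specific.

```clojure
(require '[clojure.string :as str])

;; Illustrative stand-in for a CSV library's dynamic delimiter var.
(def ^:dynamic *delimiter* ",")

(defn parse-line
  "Splits one line on the current *delimiter*. str/split is eager, so the
  work happens while the caller's binding is still in effect."
  [line]
  (str/split line (re-pattern (java.util.regex.Pattern/quote *delimiter*))))

(defn write-line
  "Joins cells with the current *delimiter*."
  [cells]
  (str/join *delimiter* cells))

(defn convert-line
  "Reads with one delimiter and writes with another: the outer binding
  governs parsing, the inner one governs writing."
  [line in-delim out-delim]
  (binding [*delimiter* in-delim]
    (let [cells (parse-line line)]
      (binding [*delimiter* out-delim]
        (write-line cells)))))

(convert-line "1,George Washington,1789" "," "|")
;; => "1|George Washington|1789"
```

One thing to watch with this pattern is laziness: a lazy sequence created under the outer binding but realized under the inner one would see the inner value, which is why the parse step here is eager.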

Thursday, January 5, 2012

Okay, I'll start a blog.

Okay. Well, I did it. I created a blog.

Why am I starting this blog? Because I think Clojure is an awesome language. I have always had a low tolerance for ugliness in languages, and Clojure has less ugliness than any programming language I've ever worked with. In fact, it's downright elegant. Writing Clojure is like writing poetry to your lover, if you wrote poetry in concise functional code and if your lover were the REPL. But beyond that, and certainly more important to its mainstream adoption, it compiles down to JVM and CLR bytecode and to JavaScript *and* interoperates seamlessly with Java and .NET. There are lots of other reasons, like dynamic typing and concurrency primitives, but the elegance and interoperability are the big ones for me right now.

Unfortunately, the .NET side of Clojure needs a little love. It works, but the community is not nearly as big, the IDE support is spotty, and the documentation is scattered throughout StackOverflow posts, wiki pages, newsgroup threads, etc. I've been piecing together what's needed to get Clojure working on the CLR, and in forthcoming blog posts, I'll be sharing what I've learned (and am still learning).

Well, good talk. I've gotta run for now, but I'll be back later. Thanks for reading.

Daniel