Cataclysmic Mutation

Machine Learning and Whatever Else

EuroClojure Wrapup

I’m just back from the first European Clojure conference in London, and I think the prevailing opinion was that it was a great success. I certainly enjoyed it, and I met a lot of great people doing cool things. In general, I think that’s how most conferences are judged – get interesting people to come and don’t do anything to screw it up, and you can have an awesome event.

The speakers were quite varied, but several themes definitely emerged. Day one was quite heavy on the basic idea of “data” as an organizing principle for software design. This is a long-standing Clojure convention: eschew building new abstractions in favor of using the really good ones you already have. That is, the venerable “Employee” object should just be a Clojure map (a dictionary in Python, a hash table in some other languages).
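
For readers who don’t write Clojure, here is a rough sketch of the same idea in Python. The record, its fields, and the values are all made up for illustration; the point is only that generic dictionary operations cover what an “Employee” class would otherwise need custom methods for.

```python
# Hypothetical employee record kept as plain data (fields are invented).
employee = {"name": "Ada", "dept": "research", "salary": 5200}

# "Updating" builds a new dictionary; the original is left untouched.
raised = {**employee, "salary": employee["salary"] + 300}

# Projecting away a field needs no Employee-specific method either.
public_view = {k: v for k, v in employee.items() if k != "salary"}

# Grouping works on any list of dictionaries that have a "dept" key.
staff = [employee, {"name": "Grace", "dept": "ops", "salary": 4800}]
by_dept = {}
for person in staff:
    by_dept.setdefault(person["dept"], []).append(person["name"])
```

None of these operations knows anything about employees, which is exactly why they keep working when the shape of the data changes.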

This basic idea (and particularly the way it is reflected in the design of the language and standard libraries) is, I think, my favorite thing about Clojure, and a fair number of the Big Ideas coming from the world of Clojure are simply extensions of this idea into new domains.

Probably the other big theme of the conference was Overtone, the open-source audio environment that has been a really popular presence at the last few Clojure events. There were three presentations on Overtone, and the lightning talk from Chris Ford was the highlight of the conference for me, which I suppose is saying something considering there were two Rich Hickey talks and a Stuart Halloway keynote.

Chris’ talk used Overtone and Clojure to build up mathematical and functional definitions of canons (in particular, he demoed Bach’s Goldberg Variations). He showed not only how to construct the melodies and combine them to form various types of canons you can play using Overtone, but also simple and elegant implementations that manipulate them in wonderful ways. Go right now to GitHub and download the code from Chris’ talk. If you don’t know Clojure, use this as an excuse to learn it – it’s that good.

Aside from Chris’ talk, the notable successes for me were the following talks. Note that this list is personal. There were several perfectly good talks on things I simply didn’t care that much about, so don’t take this list as an exhaustive list of the “good talks”. These were just the ones that interested me the most for whatever reason.

Stefan Tilkov’s talk on Unlearning OOP

This was another talk around the “simple data” theme, and I thought he did a nice job of really grounding that idea into the language of a Java programmer who might just have found Clojure and is wondering how to structure programs in this strange new language.

Edmund Jackson’s introduction to core.logic

Overall, there was a lot of excitement around logic programming, and Edmund is a good speaker and gave a nice talk. One of the things that seems really well done about core.logic is how neatly it fits in as just another Clojure library. Part of this is the inherent suitability of Lisp as a platform for these sorts of domain-specific languages, but I also think the folks involved in core.logic simply did a very good job of designing a usable system. I’m really interested in seeing what benefits this has in terms of allowing fine-tuning of how it operates at run-time.

On the down side, this overall excitement around logic programming was also one of the few things that popped up red flags for me. I don’t want to put words into anyone’s mouth, and it’s certainly possible that I missed some nuance somewhere, but my general impression of the community’s relationship with logic programming is something like naive optimism. The feeling sometimes strayed uncomfortably towards that of someone who has just discovered logic programming and is completely unaware of the history surrounding it. Anyone who’s written any serious Prolog code has struggled with the need to step inside the pretty formalism and start messing with the gears driving the search strategy. Prolog programs generally have cuts in them, and cuts turn your nice pretty declarative model into an imperative program written in the world’s least transparent language. There seemed to be zero awareness of this at all.

That said, much of the logic programming that was discussed here was in the form of things like Datalog, where conceivably, the problems won’t be quite so likely to arise. Basically, the problems with logic programming come up when you have deep search trees. Broad ones aren’t quite so much of a problem. If you have a list of a million facts (as in a simple database), you can answer queries by simply scanning the list in linear time if necessary. If you have the sort of complex relationships between your variables such that the value you assign to one may, many steps later, leave you unable to unify another, you have to backtrack, and this backtracking can go exponential very quickly unless you do something to control it. One of the examples Edmund used was a course scheduling problem, and this is very nearly the “Hello world” of really nasty Prolog problems. It isn’t clear to me yet whether core.logic allows the type of search control that Prolog provides with its cuts, but either way, using core.logic for this sort of problem in the real world is likely going to be either impractical or impossible, not through any fault of core.logic itself – merely because it turns out that a pure declarative model is really difficult to scale in certain ways. Like I said though, I think having seamless access to Clojure under the hood gives the potential for really interesting variations – we should be able to tweak the declarative semantics to give better performance and scalability with a much better idea of how to control things. Overall, I think a move towards declarative programming where it makes sense is really a positive thing. I just think the level of optimism is probably a notch or two on the high side right now.
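
To make the “deep search tree” point concrete, here is a toy scheduler in Python. It is a sketch of naive chronological backtracking, not of how core.logic or Prolog actually search, and the course names, slots, and conflict table are invented for illustration.

```python
def schedule(courses, slots, conflicts):
    """Assign each course a slot so that conflicting courses never share one.

    Plain depth-first search with chronological backtracking and no
    pruning heuristics. Returns (assignment or None, nodes visited).
    """
    stats = {"nodes": 0}
    assignment = {}

    def clashes(course, slot):
        # A slot is unusable if a conflicting course already holds it.
        return any(assignment.get(other) == slot
                   for other in conflicts.get(course, ()))

    def search(i):
        stats["nodes"] += 1
        if i == len(courses):
            return True
        course = courses[i]
        for slot in slots:
            if not clashes(course, slot):
                assignment[course] = slot
                if search(i + 1):
                    return True
                del assignment[course]  # undo the choice and try the next slot
        return False

    found = search(0)
    return (dict(assignment) if found else None), stats["nodes"]


# Hypothetical data: three courses that all share students with each other.
conflicts = {
    "algebra": ["logic", "stats"],
    "logic": ["algebra", "stats"],
    "stats": ["algebra", "logic"],
}
courses = ["algebra", "logic", "stats"]

plan, nodes_visited = schedule(courses, ["mon", "tue", "wed"], conflicts)
no_plan, nodes_wasted = schedule(courses, ["mon", "tue"], conflicts)
```

With only two slots there is no valid schedule, and the search has to try and undo every partial assignment before it can say so. That wasted work is the part that grows explosively as the chains of dependencies get longer.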

Stuart Halloway: Evident Code at Scale

Stuart is a great speaker, without a doubt. He gave a great talk on what he means by “evident” code, particularly focusing on Datalog to provide a declarative, logic-based programming model on certain types of tasks.

I’ve already said quite a lot about what I think here – declarative programming is awesome, right up until it isn’t. It remains to be seen whether Datalog’s more limited domain keeps it clear of the pitfalls that logic programmers have been dealing with for 30 or so years now, but I’m much more optimistic in that regard than in core.logic as a general-purpose computing model, at least in the short-term.

Rich Hickey on Datomic/Reducers

One of the speakers got stuck in Prague and couldn’t make it, and Rich agreed to step in and give a second talk on the recently announced Reducers framework in addition to his scheduled keynote on Datomic.

Much has been said about Datomic, and I have nothing much to add. Rich is, of course, an excellent presenter, and I learned quite a lot more about it than I had previously known, but it doesn’t fill a need I have. I think I was nearly unique among people I spoke with in that I tend to use Clojure as a better Lisp rather than a better Java. Most people were excited about things like Datomic and Pallet – practical tools for “the enterprise”, which is, of course, perfectly valid. At the risk of heresy, I’d rather have had 40 more minutes of Chris Ford playing with the intersection of music and mathematics, but for the people who have much harder jobs than mine – the folks responsible for keeping the lights on and the trains running on time – the more practical things like Datomic, Pallet, and the various log and event handling related talks were probably really appreciated.

The reducers talk was much more interesting to me personally. I’m going to have to spend a few minutes with the implementation to really understand them better – functions returning functions that create other functions is not exactly the kind of thing you can blow through in PowerPoint very well. However, the take-away is very nice, and my first impression is that it’s as elegantly designed and implemented as the rest of Clojure, which is pretty high praise indeed.
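
The general shape, as far as I understood it from the talk, is something like the following Python sketch. This is my own illustration of “functions returning functions that create other functions”, not the actual Clojure implementation, and the names `mapping`, `filtering`, and `add` are simply ones I picked for the example.

```python
from functools import reduce


def mapping(f):
    """Return a transformer that turns one reducing function into another."""
    def transform(step):
        def new_step(acc, x):
            # Apply f to the element before handing it to the wrapped step.
            return step(acc, f(x))
        return new_step
    return transform


def filtering(pred):
    """Return a transformer that skips elements failing pred."""
    def transform(step):
        def new_step(acc, x):
            return step(acc, x) if pred(x) else acc
        return new_step
    return transform


def add(acc, x):
    return acc + x


# Keep the even numbers, square them, and sum the results in a single pass.
step = filtering(lambda x: x % 2 == 0)(mapping(lambda x: x * x)(add))
total = reduce(step, range(10), 0)
```

The thing to notice is that no intermediate list of evens or squares is ever built; the filter and the map are folded into the function that does the final reduction.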

I’m going to toss one more downer in at this point though. During the talk, Rich made a comment to the effect of “and it’s just normal Clojure data structures – no parallel arrays or any of that object-oriented brain damage”. I’m paraphrasing, but that was the gist of it. The line got some chuckles and general approval, but the thing is that the “parallel arrays” people didn’t come to that decision because they’re idiots. They came to that decision because being elegant, pure, and N times slower wasn’t an option for them. Clojure is not currently a competitive platform for a lot of really compute-intensive work. Being pure has overhead; being lazy has overhead. If you’re writing a simulation that runs for two weeks on a cluster, going twice as slow means taking a month instead. Sometimes it’s worth it to be ugly. Sometimes that means things like “we can’t make normal arrays 10% slower just for thread-safety, so let’s just add a second array class that is thread-safe and let the user choose”. Yes, it’s ugly, but sometimes ugly and working beats the elegant solution that you can’t get off the whiteboard. It’s fine that Rich chooses not to take that route. It’s not quite so fine to be openly dismissive of those who do with no reference to the context in which they made those decisions. I don’t think this is news to people, but there is certainly a tendency to act like it is.

Meikel Brandmeyer on the history of lazy-seq

I think there was more here that I was unaware of than I thought going into it. I thought I had a pretty good grasp on laziness in Clojure, but I did learn quite a lot from this talk.

Christophe Grand: Not so Homoiconic

This was another of the talks that I think really resonated with the audience. Basically Christophe is working towards preserving more of the information that gets tossed by the reader, with the goal of enabling much richer classes of IDE-style source-code transformations. If you’re not a Lisp user, think of the reader as a kind of compiler. It takes in source code and spits out some other representation. Unlike a compiler, the representation is much closer to the original code than machine code or assembly language, but there are still many lossy transformations – comments and whitespace are discarded, certain types of variable references are rewritten into very unfriendly-looking forms, etc. If you treat Clojure code as data, you can read it using the Reader, but you lose all this valuable information. Christophe’s talk was on some ongoing work he’s doing towards trying to get around that without requiring changes to the language. The outcome would be some really great tools for writing better tools, and I think that prospect excited quite a few of the attendees.
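
If you want to see the same kind of loss outside of Lisp, Python’s standard `ast` module makes a reasonable analogy. This is not the Clojure reader, just a sketch of the general problem, and it assumes Python 3.9 or later for `ast.unparse`. The variable names in the parsed snippet are invented.

```python
import ast

# One line of source with extra spacing and a comment a tool would want to keep.
source = "total = price * qty   # qty is already validated\n"

# Parsing keeps the structure of the code but not its comments or layout.
tree = ast.parse(source)

# Turning the tree back into text shows what survived the round trip.
round_tripped = ast.unparse(tree)
```

The round-tripped text still computes the same thing, but the comment and the original spacing are gone, which is precisely the information a refactoring tool would need to hand your code back to you in a recognizable state.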

Bruce Durling: Quick and Dirty Data Science with Incanter

How on earth did I forget Bruce’s talk when making the first version of these notes, given that his was probably the one I most anticipated in a “this can provide some practical help” sense? I use Clojure mostly for prototyping machine learning methods, and Incanter is one of the tools I lean on for this sort of analysis work, so I hoped I’d pick up some new tricks. Bruce did a very good job, I thought, at the two major tasks he had in front of him: explaining what Incanter does to an audience with no special background in statistics and showing how to do some simple but useful tasks.

Much of the material I already knew, but there were some bits that I hadn’t really used enough, and so I actually learned a fair number of new tips as well.

Odds and ends…

A few unrelated points. First, serious thanks to Marco Abis for organizing the conference. There were a few minor issues (day one was incredibly hot), but overall, the conference was run beautifully. Everything from the selection of speakers down to the food provided was handled very well. I was also impressed by the tone of the speakers and attendees.

Others pointed out the severe gender imbalance – I don’t know the final breakdown of male/female attendees, but it was pretty stark. However, it’s hard to say any individual conference can do a great deal to change those realities. What a conference can very definitely do, however, is send exactly the message that women aren’t welcome, and I happily didn’t notice any of that here. There was a question near the end that asked something about the culture of alcohol around meetups and dojos. There was definitely a lot of activity at the conference that took place after hours at the pub, and I know that a non-drinker would probably feel a bit put out or unwelcome. That’s a hard problem to solve. I will say that I don’t recall anyone crossing the line into any sort of obnoxious drunkenness, so I think at least a non-drinker could still participate in the community discussions. For a person who really doesn’t want to be in an environment where there’s lots of alcohol around, though, that’s not a great solution.

I was slightly surprised by the number of people who were using Clojure professionally – that is, not very many of them. I don’t know that that’s either good or bad by itself – I think having things like Overtone is much more exciting than the typical sorts of things you’d see at JavaOne or some other conference where everyone is sent by their employers for training on the latest Java Enterprise Struts Foundation Builder Factory or whatever. The “hobbyist” vibe makes for a very friendly and diverse conference, and the people are probably more excited as well.

I also can’t resist making one more slightly negative point. There seems to be, at times at least, perhaps a bit too much hero worship in the community. Everyone wants to talk about how so and so “complects” this and that thing, or reference how something is “simple” but not “easy”. I was talking with someone else who said it fairly well: one gets the impression that if Rich stood up and said, “this homoiconic thing kind of sucks, I’ve decided – oh, and whitespace should be significant”, then the community would look very serious and studious and say, “You know, he’s right, I haven’t thought of it that way before”. By no means is this some endemic problem, and I think the core contributors certainly don’t fit this description. It does give a weird cargo-cult vibe at times though.

I want to end on a positive note. There were two full days of talks and lively conversations well into the evenings, and I’ve mentioned every negative point I could come up with. That leaves a lot of things that were very right. Thanks to all the organizers, speakers, and attendees for making EuroClojure a smashing success.