Cataclysmic Mutation

Machine Learning and Whatever Else

EuroClojure Wrapup

I’m just back from the first European Clojure conference in London, and I think the prevailing opinion was that it was a great success. I certainly enjoyed it, and I met a lot of great people doing cool things. In general, I think that’s how most conferences are judged – get interesting people to come and don’t do anything to screw it up, and you can have an awesome event.

The speakers were quite varied, but there were definitely several themes that emerged. Day one was quite heavy on the basic idea of “data” as an organizing principle for software design. This is a long-standing Clojure convention — eschew building new abstractions in favor of using the really good ones you already have. That is, the venerable “Employee” object should just be a Clojure map (a dictionary in python, hash table in some other languages).

This basic idea (and particularly the way it is reflected in the design of the language and standard libraries) is I think my favorite thing about Clojure, and a fair amount of the Big Ideas that are coming from the world of Clojure are simply extensions of this idea into new domains.

Probably the other big theme of the conference was Overtone, the open-source audio environment that has been a really popular presence at the last few Clojure events. There were three presentations on Overtone, and the lightning talk from Chris Ford was the highlight of the conference for me, which I suppose is saying something considering there were two Rich Hickey talks and a Stuart Halloway keynote.

Chris’ talk used Overtone to and Clojure to build up mathematical and functional definitions of canons (in particular he demos Bach’s Goldberg Variations), and shows how to not only construct the melodies and combine them to form various types of canons you can play using Overtone, but also demonstrates simple and elegant implementations to manipulate them in wonderful ways. Go right now to github and download the code from Chris’ talk. If you don’t know Clojure, use this as an excuse to learn it – it’s that good.

Aside from Chris’ talk, the notable successes for me were the following talks. Note that this list is personal. There were several perfectly good talks on things I simply didn’t care that much about, so don’t take this list as an exhaustive list of the “good talks”. These were just the ones that interested me the most for whatever reason.

Stefan Tilkov’s talk on Unlearning OOP

This was another talk around the “simple data” theme, and I thought he did a nice job of really grounding that idea into the language of a Java programmer who might just have found Clojure and is wondering how to structure programs in this strange new language.

Edmund Jackson’s introduction to core.logic

Overall, there was a lot of excitement around logic programming, and Edmund is a good speaker and gave a nice talk. One of the things that seems really well done about core.logic is how neatly it fits in as just another Clojure library. Part of this is the inherent of Lisp as a platform for these sorts of domain specific languages, but I also think the folks involved in core.logic simply did a very good job of designing a usable system. I’m really interested in seeing what benefits this has in terms of allowing fine-tuning how it operates at run-time.

On the down side, this overall excitement around logic programming was also one of the few things that popped up red flags for me. I don’t want to put words into anyone’s mouth, and it’s certainly possible that I missed some nuance somewhere, but my general impression of the community’s relationship with logic programming is something like naive optimism. The feeling sometimes strayed uncomfortably towards that of someone who has just discovered logic programming and is completely unaware of the history surrounding it. Anyone who’s written any serious Prolog code has struggled with the need to step inside the pretty formalism and start messing with the gears driving the search strategy. Prolog programs generally have cuts in them, and cuts turn your nice pretty declarative model into an imperative program written in the world’s least transparent language. There seemed to be zero awareness of this at all.

That said, much of the logic programming that was discussed here was in the form of things like Datalog, where conceivably, the problems won’t be quite so likely to arise. Basically, the problems with logic programming come up when you have deep search trees. Broad ones aren’t quite so much of a problem. If you have a list of a million facts (as in a simple database), you can answer queries by simply scanning the list in linear time if necessary. If you have the sort of complex relationships between your variables such that the value you assign to one may, many steps later, lead you to be unable to unify another, you have to backtrack, and this backtracking can go exponentially very quickly unless you do something to control it. One of the examples Edmund used was a course scheduling problem, and this is very nearly the “Hello world” of really nasty Prolog problems. It isn’t clear to me yet whether core.logic allows the type of search control that Prolog provides with its cuts, but either way, using core.logic for this sort of problem in the real world is likely going to be either impractical or impossible, not through any fault of core.logic itself – merely because it turns out that a pure declarative model is really difficult to scale in certain ways. Like I said though, I think having seamless access to Clojure under the hood gives the potential for really interesting variations – we should be able to tweak the declarative semantics to give better performance and scalability with a much better idea of how to control things. Overall, I think a move towards declarative programming where it makes sense is really a positive thing. I just think the level of optimism is probably a notch or two on the high side right now.

Stuart Halloway: Evident Code at Scale

Stuart is a great speaker, without a doubt. He gave a great talk on what he means by “evident” code, particularly focusing on Datalog to provide a declarative, logic-based programming model on certain types of tasks.

I’ve already said quite a lot about what I think here – declarative programming is awesome, right up until it isn’t. It remains to be seen whether Datalog’s more limited domain keeps it clear of the pitfalls that logic programmers have been dealing with for 30 or so years now, but I’m much more optimistic in that regard than in core.logic as a general-purpose computing model, at least in the short-term.

Rich Hickey on Datomic/Reducers

One of the speakers got stuck in Prague and couldn’t make it, and Rich agreed to step in and give a second talk on the recently announced Reducers framework in addition to his scheduled keynote on Datomic.

Much has been said about Datomic, and I have nothing much to add. Rich is, of course, an excellent presenter, and I learned quite a lot more about it that I had previously known, but it doesn’t fill a need I have. I think I was nearly unique among people I spoke with in that I tend to use Clojure as a better Lisp rather than a better Java. Most people were excited about things like Datomic and Pallet – practical tools for “the enterprise”, which is of course, perfectly valid. At the risk of heresy, I’d rather have had 40 more minutes of Chris Ford playing with the intersection of music and mathematics, but for the people who have much harder jobs than mine – the folks responsible for keeping the lights on and the trains running on time – the more practical things like Datomic, Pallet, and the various log and event handling related talks were probably really appreciated.

The reducers talk was much more interesting to me personally. I’m going to have spend a few minutes with the implementation to really understand them better – functions returning functions that create other functions is not exactly the kind of thing you can blow through in Powerpoint very well. However, the take-away is very nice, and my first impression is that it’s as elegantly designed and implemented as the rest of Clojure, which is pretty high praise indeed.

I’m going to toss one more downer in at this point though. During the talk, Rich made a comment to the effect of “and it’s just normal Clojure data structures – no parallel arrays of any of that object-oriented brain damage”. I’m paraphrasing, but that was the gist of it. The line got some chuckles and general approval, but the thing is that the “parallel arrays” people didn’t come to that decision because they’re idiots. They came to that decision because being elegant, pure, and N times slower wasn’t an option for them. Clojure is not currently a competitive platform for a lot of really compute-intensive work. Being pure has overhead; being lazy has overhead. If you’re writing a simulation that runs for two weeks on a cluster, going twice as slow means taking a month instead. Sometimes it’s worth it to be ugly. Sometimes that means things like “we can’t make normal arrays 10% slower just for thread-safety, so let’s just add a second array class that is thread-safe and let the user choose”. Yes, it’s ugly, but sometimes ugly and working beats the elegant solution that you can’t get off the whiteboard. It’s fine that Rich chooses not to take that route. It’s not quite so fine to be openly dismissive of those who do with no reference to the context in which they made those decisions. I don’t think this is news to people, but there is certainly a tendency to act like it is.

Mikel Brandmeyer on the history of lazy-seq

I think there was more here that I was unaware of than I thought going into it. I thought I had a pretty good grasp on lazyness in Clojure, but I did learn quite a lot from this talk.

Chistophe Grand: Not so Homoiconic

This was another of the talks that I think really resonated with the audience. Basically Christophe is working towards preserving more of the information that gets tossed by the reader, with the goal of enabling much richer classes of IDE-style source-code transformations. If you’re not a Lisp user, think of the reader as a kind of compiler. It takes in source code and spits out some other representation. Unlike a compiler, the representation is much closer to the original code than machine code or assembly language, but there are still many lossy transformations – comments and whitespace are discarded, certain types of variable references are rewritten into very unfriendly-looking forms, etc. If you treat Clojure code as data, you can read it using the Reader, but you lose all this valuable information. Christophe’s talk was on some ongoing work he’s doing towards trying to get around that without requiring changes to the language. The outcome would be some really great tools for writing better tools, and I think that prospect excited quite a few of the attendees.

Bruce Durling: Quick and Dirty Data Science with Incanter

How on earth did I forget Bruce’s talk when making the first version of these notes, given that his was probably the one I most anticipated as a “this can provide some practical help” sense? I use Clojure mostly for prototyping machine learning methods, and Incanter is one of the tools I lean on for this sort of analysis work, so I hoped I’d pick up some new tricks.  Bruce did a very good job I thought at the two major tasks he had in front of him: explaining what Incanter does to an audience with no special background in statistics and showing how to do some simple but useful tasks. 

Much of the material I already knew, but there were some bits that I hadn’t really used enough, and so I actually learned a fair amount of new tips as well.

Odds and ends…

A few unrelated points. First, serious thanks to Marco Abis for organizing the conference. There were a few minor issues (day one was incredibly hot), but overall, the conference was run beautifully. Everything from the selection of speakers down to the food provided was handled very well. I was also impressed by the tone of the speakers and attendees.

Others pointed out the severe gender imbalance – I don’t know the final breakdown of male/female attendees, but it was pretty stark. However, it’s hard to say any individual conference can do a great deal to change those realities. What a conference can very definitely do however is to send exactly the message that women aren’t welcome, and I happily didn’t notice any of that here. There was a question near the end that asked something about the culture of alcohol around meetups and dojos. There was definitely a lot of activity at the conference that took place after hours at the pub, and I know that a non-drinker would probably feel a bit put out or unwelcome. That’s a hard problem to solve. I will say that I don’t recall anyone crossing the line into any sort of obnoxious drunkenness, so I think at least a non-drinker could still participate in the community discussions. If a person really doesn’t want to subject themselves to an environment where there’s lots of alcohol around, it’s not a great solution.

I was slightly surprised by the number of people who were using Clojure professionally – that is, not very many of them. I don’t know that that’s either good or bad by itself – I think having things like Overtone is much more exciting that the typical sorts of things you’d see at JavaOne or some other conference where everyone is sent by their employers for training on the latest Java Enterprise Struts Foundation Builder Factory or whatever. The “hobbyist” vibe makes for a very friendly and diverse conference, and the people are probably more excited as well.

I also can’t resist making one more slightly negative point. There seems to be, at times at least, perhaps a bit too much hero worship in the community. Everyone wants to talk about how so and so “complects” this and that thing, or reference to how something is “simple” but not “easy”. I was talking with someone else who said it fairly well, that one gets the impression that if Rich stood up and said, “this homoiconic thing kind of sucks, I’ve decided — oh, and whitespace should be significant”, then the community would look very serious and studious and say, “You know, he’s right, I haven’t thought of it that way before”. By no means is this some endemic problem, and I think the core contributors certainly don’t fit this description. It does give a weird cargo-cult vibe at times though.

I want to end on a positive note. There were two full days of talks and lively conversations well into the evenings, and I’ve mentioned every negative point I could come up with. That leaves a lot of things that were very right. Thanks to all the organizers, speakers, and attendees for making EuroClojure a smashing success.

Syncing Google Contacts to BBDB – Introducing Charrington

I use Emacs as my usual mail client, currently with notmuch, but occasionally with Gnus as well. One of the really nice features to get working with any Emacs mail client is BBDB integration. BBDB, the “ubiquitous Big Brother DataBase”, is an address book for Emacs. It can do quite a few things, but I mostly only use it to autocomplete addresses when I’m sending mail.

Both my personal and work accounts are now hosted through Google Apps, and especially with an Android phone, Google Contacts is by far the most sensible place for my contacts to call their canonical home base. However, this introduces a problem – Emacs doesn’t want to play nicely with Google. There are tools, most notably the “Little Brother DataBase” (LBDB), which has connections to all sorts of data providers. However, (a) I never managed to get the UI for LBDB to be as nice and seamless as BBDB is, and (b) having to query a remote server to tab-complete an address introduces a really annoying delay.

There are a few tools out there that claim to write BBDB records from Google’s data, but I had very poor luck with them. They were fairly hard to get running, and typically left me with files that BBDB would choke on, perhaps due to character set issues (I have numerous Icelandic and other international colleagues with “interesting” characters in their names and addresses). Most of them seem to be quite old and unmaintained as well.

With that in mind, I wrote “Charrington”. The name is perhaps a little bit indulgent – Charrington was the member of the thought police in Orwell’s “1984” who turned Winston and Julia in to Big Brother. He kept Big Brother in the loop…get it? Huh? Like I said, it’s a stretch.

In any case, you can grab the code from Github and give it a spin. I’m sure there are plenty of bugs aside from the ones I already know about, so if you run into any problems, let me know.

Not Writing Like a Scientist

In light of the recent discussion around science writing, I decided to try a little experiment. I had an opportunity to submit a position paper to a workshop in my field recently, and it seemed like a perfect opportunity to try a more informal approach to writing. Position papers are always a little weird by their very nature, and the lack of hard data to present means you have a bit more flexibility in your prose, so I figured I’d give it a shot.

I got the reviews back today, and it was fairly mixed. The paper was accepted (barely), which is actually probably appropriate. I don’t think it was a really strong paper, and the scores seemed to me to be about exactly where I would have put them myself. So in that sense, writing informally didn’t seem to hurt me too much. However, there were several negative comments as well specifically singling out “informal” language, which was a little disappointing I guess.

Oh, and as a throwaway joke, I used the word “embiggened” in the paper, obviously a reference to the famous Simpsons neologism, which at least one reviewer complained about. The complaint was simply that it “isn’t a word, perhaps the author meant ‘enlarged’”, which probably indicates that the reviewer didn’t get the reference. No one else mentioned it at all, so it’s hard to say if it went unnoticed or was simply viewed as a perfectly cromulent word for a scientific paper.

I have to make some clean-ups for the camera-ready version, and there was enough push-back that I’ll probably rewrite some of the “worst” sections in a bit more standard scientific jargon.

Arch Linux on the Lenovo Thinkpad X220

First thoughts on the hardware

For the past several years, I’ve been using a Mac of one form or another as my everyday machine. For the most part, it’s been a pretty good experience. However, that experience has gotten worse, partly because Lion is just an abysmal OS upgrade, and partly because longstanding minor annoyances with Mac OS X have gradually become more and more grating. At some point, it occurred to me that what I really wanted was Linux again. I briefly considered just installing it on my Macbook Pro, but as I started looking at available PCs, I got really excited about the Lenovo x220. It’s a small, light laptop with high-end performance matched with very good battery life and the option of a second battery that snaps seamlessly onto the laptop. For things like conferences, this seemed like a perfect solution, and I pretty quickly convinced myself to buy one.

Overall, the laptop is really nice. I ordered it with the 6-cell battery and the additional “slice” battery. Together, they promise some 15 hours of battery life under Windows, according to Engadget. My last substantial experience with Linux on my everyday laptop was when power management was a noticeable issue, but I’ve so far been very pleased with how much time I’ve been able to squeeze out of a charge.

I purchased mine with the 2.8GHz Core i7 processor, 4 GB of memory (upgraded myself to 8GB), a 320 GB hard drive (upgraded myself to an Intel x25 160 GB SSD), the “premium” (IPS) display, the 6-cell battery, external “slice” battery, fingerprint reader, 720p webcam, the Intel 6205 wireless adapter, and bluetooth 3.0 adapter.

Without the slice battery, the laptop is very light, although not quite as thin as some other options. Particularly compared to the Macbook Air and competing Ultrabooks, it looks pretty chunky, but even compared to the Macbook Pro, it’s a bit thicker. However, it manages to mostly feel every bit as solid as my Macbook Pro, and with only half the weight. That “mostly” qualifier is necessary because of the rather odd fit of the 6-cell battery. The battery doesn’t fit flush with the laptop, instead extending a bit below the bottom plane of the case. However, it’s not the bulge that draws this complaint. Rather, it’s the fact that the battery feels loose. You can wiggle the battery around in its slot quite easily. It doesn’t seem to be anything that would cause any structural concern, but it definitely detracts a bit from the overall impression of fit and finish.

Update: It turns out that the latch holding the battery into my laptop was actually broken. I’m happy to report that despite my not being able to report the issue for several months (I ordered it over two months before I was back in the US to pick it up), Lenovo agreed to not only repair it for free, but also arrange to do so here in Iceland instead of having to send it away. The process of getting through their support channels to get to that point was frustrating. I’ll perhaps write up a bit more information on that when I have time. But for now, if you have significant looseness in the battery fit, double-check that it’s being locked in place securely and contact Lenovo if you think it’s excessive. Had I done so immediately, I could have saved myself some effort.

One area in which it seems no PC vendor has been able to compete with Apple is in trackpad quality, and while the x220 can’t either, I find the trackpad to be quite good overall. Out of the box, it wasn’t especially good – even in Windows 7 which came preinstalled. However, after installing Arch Linux and configuring the synaptics driver, the trackpad is very good. It doesn’t appear possible to configure the full range of gestures available on the Mac, but the most important ones are there (except for three-finger swipes for forward/backward, which are handled on the Lenovo through dedicated keys next to the arrow keys) and perform well.

The keyboard is, as one would expect from a Thinkpad, excellent. I strongly dislike the chiclet style keyboards Apple puts on all their systems now and which have since become de rigeur for most of the PCs I considered as well. Those forward/backward navigation keys are pretty easy to hit if you’re trying to reach for the arrow keys, and more than once, I’ve been typing something into a form on a browser, reached over to hit the left arrow to go back and edit a word or two previously, and accidentally hit the “back” button leaving the page and destroying my input. This is a problem that should mostly go away over time as I acclimate to the layout, but it’s something to be aware of.

It has a very nice assortment of ports available, although I’d much prefer a DVI port in place of the ancient VGA port. There is additionally an HDMI port, and with an adapter, this can provide a high-quality connection to a modern monitor, but it’s nonetheless silly to default to VGA output.

My other major complaint is with the screen resolution, but PCs in this class almost universally sport the same 1366×768 resolution. Sony’s VAIO Z series is available with a 13” screen and full 1080p resolution, but costs over $1000 more and still requires several compromises compared to the Thinkpad. The resolution of the display is low, but it’s not a dealbreaker. Aside from that, the display is very good. I went with the IPS version of the screen (a $50 upgrade from Lenovo), and the results are as one would expect. The viewing angles are very good, and the display is bright and crisp.

A bit more about the slice battery. It snaps in quite securely; ironically a much more solid fit than the standard 6-cell battery. Obviously, using it makes the laptop much thicker, but it’s still a light and portable machine. I don’t envision using it terribly often, but it’s a wonderful option to have when attending conferences or sequences of meetings that normally end up anchoring everyone it little circles around the nearest power outlet.

Overall, I’m very happy with the machine. If the resolution was a bit higher, even 1440×900, and the VGA port was ditched for DVI, it would be very nearly perfect. As it is, it’s very good, and those flaws are minor enough for me to live with. The battery life is amazingly good for a system this small, and with the optional slice battery, there’s really not another laptop on the market that compares with regard to runtime.

The Buying Experience

A final word about the laptop itself before I jump to getting Arch running on it. Lenovo really needs to do something about the experience of actually buying something from them. Like so many PC vendors, their website is generally a poor experience. It’s not quite a Dell-level train wreck, but it’s closer to it than it is to the experience of buying Apple. There are multiple different ways of finding the system, and the options are presented inconsistently depending on which you choose. In my case, I purchased using the Linux Foundation link, which saved me a fair amount of money. So thanks, Lenovo, for that at least. However, the configuration is divided into three screens. The first is the primarly laptop configuration options (processor, RAM, etc.). The second deals with warranty and add-on software options, and the third with other peripherals. In my case (but not via any of the other paths you can take to configure and buy an x220), the first page included an option to select between the 6-cell and 9-cell battery, and then a separate check-box to select whether to add on an external slice battery. Then the third page presented a list of several additional power options, including batteries. This list included another option to buy some sort of slice battery, but the description was slightly different, and it cost $60 more than the one listed on the first page. The “more information” links on each configuration section were mostly broken, popping up an empty white overlay. A few of the links worked – I could see more information about docking stations for instance, but for the batteries, nothing appeared. At other times, the links would populate, but with the wrong content. In short, the site itself was dreadful. On the plus side, the chat support was very quickly picked up, and the person on the other end was able to tell me that the two batteries were in fact the same, and that the one on page one was there and cheaper because it was part of a pre-packaged x220 configuration on offer. He seemed like he knew what was going on without having to go look up a canned answer in a knowledge base, and that’s very refreshing. But come on Lenovo, get the site in shape. We shouldn’t need to talk to someone to understand what you’re trying to sell us.

Arch Linux on the x220

I spent a few days using the Windows 7 install that came on the laptop just to get a feel for what to expect from the hardware, and then ditched the included mechanical drive in favor of a solid state drive on which I installed 64-bit Arch Linux. The drive bay on the x220 is too short to accept a normal 2.5” laptop hard drive, but the SSD I had, an Intel x25M 160 GB model, is actually a smaller drive with an included spacer. It’s easy to remove the spacer, but doing so leaves you without any screws to put back into the drive, as the included screws are long enough to account for the spacer. I had to visit a computer repair store kind enough to let me scrounge around a bin of screws, but after that, the drive slides in pretty well with the rubber sides from the included drive.

Installation of Arch (I went with the 64-bit version) was fairly uneventful. One thing you should be aware of if you’re buying an x220 is that you need to upgrade to one of the Intel wireless chips, otherwise you’ll end up dealing with a crappy Realtek driver, but the Intel 6205 chip in the one I bought worked without issue.

Disk configuration

I configured a small /boot partition, larger partitions for / and /usr/local, and the remainder devoted to /home, save for an 8 GB swap partition (I have 8 GB of RAM installed, and swap needs to be at least that large to support suspend-to-disk functionality). Because this is all on an SSD, I added the “discard” flag to the mount options in /etc/fstab to enable TRIM support.

Trackpad

The trackpad works OK out of the box, but you’ll want to configure the synaptics driver to get it to work well. In Arch, you need to install the xf86-input-synaptics-clickpad driver from AUR. This will create a file /etc/X11/xorg.conf.d/10-synaptics.conf. Read the synaptics man page for complete details, but I’ve included my configuration below.

Section "InputClass"
    Identifier "touchpad catchall"
    Driver "synaptics"
    MatchIsTouchpad "on"
    MatchDevicePath "/dev/input/event*"
    Option "TapButton1" "1"
    Option "TapButton2" "3"
    Option "TapButton3" "2"
    Option "ClickFinger1" "1"
    Option "ClickFinger2" "3"
    Option "ClickFinger3" "2"
    Option "VertEdgeScroll" "on"
    Option "VertTwoFingerScroll" "on"
    Option "HorizEdgeScroll" "on"
    Option "HorizTwoFingerScroll" "on"
    Option "LockedDrags" "on"
    Option "LockedDragTimeout" "500"
EndSection

This enables two and three finger taps and clicks, assigned to right and middle click respectively, two finger scrolling both horizontally and vertically, edge scrolling in both directions, and “drag locking”, which is the behavior that allows you to lift and replace your finger without interrupting a continuous drag event.

Wifi configuration

If you intend to run Linux, you should definitely upgrade to the Intel wireless chipset. I ordered mine with the 6205 adapter, which works well out of the box. However, getting up to speed on how Arch wants to support the alphabet soup of wireless standards was a bit of a hassle for me. I tried numerous options, all of which worked, but none of which was what I’d call seamless. Part of the problem is that the “friendly” methods tend to assume you’ll be using some Gnome applet to manage the connections, and I’m using Xmonad. I finally hit a sweet spot with Wicd. Configuration for my university’s WPA2-Enterprise network required a bit of trial and error (despite being a CS professor, I have next to no idea what the difference is between TKIP and CCMP is), but once I managed to get it connected, wicd seems to do a good job of connecting to my preferred networks and allowing me to intervene when needed.

Power managment

Power management was always one of the big issues for Linux on laptops. The slice battery on the x220 makes it less of a big deal, but I still want to be as conservative towards my battery as possible. Fortunately, I’ve managed to get the power drain down to very acceptable levels. I haven’t yet managed to get everything working the way I want yet, but it’s quite usable right now.

After installing the necessary packages (pm-utils, cpufrequtils, acpid, etc.), I had a few issues left to resolve. The default acpid handler script wasn’t able to correctly suspend to RAM when I closed the laptop lid. By default, the script tries to catch the actual ACPI events, but a bit of investigation showed that each time I closed and opened the lid, the event number was incremented. Instead, I modified the case in the handler script (at /etc/acpi/handler.sh) dealing with lid events to read as follows:

1
2
3
4
5
6
button/lid)
    if [[ `cat /proc/acpi/button/lid/LID/state | awk '{ print $2 }'` = "closed" ]]; then
        echo "lid closed" >/dev/tty5
        pm-suspend
    fi
    ;;

Now whenever any lid event occurs, it ignores the specific event and just reads the current lid state from the proc filesystem.

Hibernate, or suspend-to-disk, still isn’t reliably working. Currently, the issue seems to involve PulseAudio, but I’ve ran into multiple issues there, so I’m not entirely sure the current breakage isn’t related to something else I’ve done. Regardless, I haven’t had enough time to really dig into the problem just yet, but I’ll update things here when I resolve the problem.

Update: Hibernation is working fine as well, actually without much action at all on my part. I’ve done a couple of system updates since I last fooled with it, so perhaps it was just a matter of getting a new kernel. I’m currently on kernel 3.1.9-2-ARCH, and hibernation works fine out of the box. You will need to add the “resume” hook to your mkinitcpio.conf (as detailed here).

One other annoying issue I haven’t tracked down yet is that sometimes, after resuming from the suspend state, there’s a weird issue with the keyboard in X. If I have a terminal up, it seems to stop responding to keypresses intermittently. For example, I might type “ps -ef”, and nothing will happen after the “ps”. However, the key events are received – if I press enter in that example, ps does receive the “-ef” flags. Exiting and restarting X solves the problem when it occurs, so it’s an annoyance more than anything else, but I definitely want to spend a bit of time tracking down the problem.

CPU frequency scaling works well. I edited the file /etc/conf.d/cpufreq to select the ondemand governor by default, and added lines into the ACPI handler script as follows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
ac_adapter)
    case "$2" in
        AC|ACAD|ADP0)
            case "$4" in
                00000000)
                    echo -n $minspeed >$setspeed
                    for CPUNUM in {0,1,2,3}; do cpufreq-set -c $CPUNUM -g powersave; done
                ;;
                00000001)
                    echo -n $maxspeed >$setspeed
                    for CPUNUM in {0,1,2,3}; do cpufreq-set -c $CPUNUM -g performance; done
                ;;
            esac
            ;;
        *)  logger "ACPI action undefined: $2" ;;
    esac
    ;;

The additions are the two for loops that set the CPU governor for each CPU in the system. I also added the cpufreq daemon to /etc/rc.conf. I’m not sure if both of these steps are required or not.

Also, for reasons I don’t understand (and are almost certainly due to my absence from the Linux world for a while), I’m a little hazy on how certain events are being processed. In particular, some aspects of CPU frequency scaling and PCI power management aren’t happening automatically. For now, I’ve just been running the following script manually when I boot the machine on battery power or unplug the charger, but I need to figure out the appropriate way to get this to happen automatically. If necessary, I can hack a few scripts to force it, but I didn’t want to go that route until I spent a bit more time figuring out if there’s an elegant solution I just don’t know about yet.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
#!/bin/sh

minspeed=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq`
maxspeed=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq`
setspeed="/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"

device_pm() {
        for CONTROL in /sys/bus/{pci,spi,i2c}/devices/*/power/control; do
                [ -w "$CONTROL" ] || continue
                echo $1 > "$CONTROL"
        done
}

case "$1" in
        true)
                echo "Device power management on"
                device_pm auto
                echo "Switching to powersave CPU governor"
                #echo -n $minspeed >$setspeed
                for CPUNUM in {0,1,2,3}; do cpufreq-set -c $CPUNUM -g powersave; done
                ;;
        false)
                echo "Device power management off"
                device_pm on
                echo "Switching to ondemand CPU governor"
                #echo -n $maxspeed >$setspeed
                for CPUNUM in {0,1,2,3}; do cpufreq-set -c $CPUNUM -g ondemand; done
                ;;
esac

exit 0

Edit: turns out the TLP daemon is apparently just misconfigured for my purposes. Changing RUNTIME_PM_ON_BAT=on to RUNTIME_PM_ON_BAT=auto in /etc/default/tlp solves the problem of PCI device power management not working automatically at boot.

Another remaining issue is that the Intel HD audio driver seems to generate a lot of wakeups according to powertop2. Note that you do need to install powertop2 from AUR instead of powertop 1.13 from the core repository, as only the newer version correctly supports reading information from sysfs.

Bluetooth can be stopped and restarted using rfkill (rfkill block bluetooth) and (rfkill unblock bluetooth). As I very rarely use bluetooth, I chose to simply disable it at boot; I’ll unblock it manually if I need to use it. Because it’s sometimes handy to have a convenient place to do this sort of thing, I created a new script in /etc/conf.d named local that I start at boot time. Currently all it does is run the rfkill command to turn bluetooth off, but I can add more things later if needed.

I also modified my xmobarrc to ensure than nothing happened at greater than about five second intervals. On my desktop, for instance, I have an entry showing the time in hh:mm:ss format, updated every second, along with CPU and memory monitors. Dropping the frequency of the system monitors to five seconds and dropping the seconds from the date so it could be updated once per minute saves a noticeable amount of power from the reduced wakeups.

With all this in place, I’m getting very good battery life. With the screen dimmed to maybe 40% brightness, I’m seeing around 7:00 to 7:30 of normal usage on the 6-cell battery. I haven’t done any real testing, but that seems to be a reasonable estimate based on the discharge rate I’m seeing. With the slice battery connect, that number jumps to maybe 15 hours. Powertop doesn’t work completely with sysfs, but Powertop2 is available in prerelease form in AUR, and it shows an average of about 7.5W when the system is idle, although often I see higher use due to my not having managed to get a completely automated solution for setting all the needed options.

Other stuff

The audio and integrated webcam worked fine out of the box. As is typical with Arch, you need to add your user to the appropriate groups for device permissions, but nothing else was needed.

Bluetooth presumably works fine. It detects the controller and turns everything on, but I almost never use bluetooth, and I haven’t tested it to see if it works yet. I actually disable it in /etc/rc.local to save power.

The fingerprint reader doesn’t seem to work, or at least that seems to be what I see online. I haven’t tried to configure it at all. Running fprintd-enroll and swiping across the reader, I do see it detect my finger, so it’s possible that it works just fine if I knew how to configure it, but it’s also possible that there’s no driver for the protocol it speaks, and thus no way to make use of the data. I don’t know.

I think that covers the major points. I’ll update this as I continue to figure things out, particularly the remaining power management issues. For now though, it works plenty well enough for my everyday usage, and if you’re in the market for a small, powerful laptop with absolutely unmatched battery life, I recommend it pretty highly.

Esjan, Part Tvö

It was such a nice day today, I decided to go back up Esjan. This time, I remembered to turn on my GPS tracker to record some information about the path.

According to Google Earth’s interpretation of my GPS log, the total trail was 4.48 miles (7.21 km), 2.5 miles of which were ascending and the remaining two miles descending (I took the longer more meandering trail up and the more direct one down). It took three hours and 22 minutes in total, almost two hours of which were ascending. The elevation gain was 2627 feet (800 meters) from the parking lot to the summit. The average grade on the ascent was 19.9%, with a maximum of 41.5%. Descending the steeper trail, the average grade was -23.1%, and -41.8% at the steepest point.

Icelandish

First off, apologies to the excellent Yessica over at the real Icelandish blog for cribbing her title. It’ll only happen just this once, but it was too appropriate to pass up.

This was an apartment listing when I was looking last fall, as rendered by Google Translate:

Climbing Mt. Esja

Summer has finally announced its imminent arrival in Iceland, and I went this weekend with some friends to Mt. Esja, a popular hiking destination just outside Reykjavík. It’s about 1000 meters high at the summit, and the walk/climb to the top was pretty brutal for my woefully out-of-shape body. With some brief rest stops, getting to the top wasn’t so bad. However, the trip left my legs a bit jellied, and that in combination with my poor choice of footwear made the descent very taxing.

I think once I buy a set of proper hiking boots, I may make a regular occurrence out of this hike.

Debunking the “Supermoon” With Clojure

So some jackass realized that the moon doesn’t orbit in a perfect circle and decided that he alone in the annals of history was in possession of the information that Earthquakes and other doom might ensue, and since the Moon was closer than usual (more on that in a bit), something *bad* was about to happen. No big deal. There are about 7,000,000,000 people on planet Earth, and a good percentage of them are, even now, right as you’re reading this, drooling uncontrollably from their slack jaws. However, this happened to make what now passes for news just before a massive earthquake hit off the coast of Japan, and so now the idiots who have taught themselves to operate a television are convinced that they’re correct. Oh, dear.

There are so many problems here, it’s hard to know where to start. Possibly that the moon isn’t actually especially close to Earth right now, and couldn’t cause this earthquake even if it was. Anyway, Phil Plait has a nice takedown up right now, and he knows a lot more about planetary physics than I do anyway, so just go read what he has to say. I’m writing this for another reason. One of my Facebook friends wondered what the actual plot of earthquake frequency and magnitude versus lunar distance actually would look like. That sounded like a fun little exercise in programming.

There are basically two questions that need to be answered. First, when have there been major earthquakes? I went to Wikipedia and grabbed a list of the 24 largest earthquakes in known history (today’s Japanese quake comes in number six). This gives me the magnitudes, but also the dates the quakes occurred. The range of dates covered by these 24 mega-quakes spans from a 1575 Chilean quake right up to today’s Japanese quake. The second thing we need is an ability to calculate the distance between the Earth and moon on any given day in history. That’s a tougher problem, but a bit of Google-fu turned up the ELP2000-82B model of lunar ephemerides. That’s what we need here. However, it’s a very complex model, and I didn’t want to spend a week debugging scientific code, but fortunately, I found an implementation on Github.

A bit more reading of code and comments showed that the model needs Julian dates. A bit more searching to find the exact formula for this conversion, and a quick implementation of said formula in Clojure yields the following.

1
2
3
4
5
6
7
(defn gregorian-to-julian
  [year month day hour minute second]
  (let [a (/ (- 14 month) 12),
        y (+ year 4800 (- a)),
        m (+ month (* 12 a) (- 3)),
        jdn (+ day (/ (+ (* 153 m) 2) 5) (* 365 y) (/ y 4) (- (/ y 100)) (/ y 400) (- 32045))]
    (+ jdn (/ (- hour 12) 24) (/ minute 1440) (/ second 86400))))

Because this was intended to be quick and dirty, I just manually set up a simple data structure for my earthquake data, and rather than spend lots of time integrating my Clojure code with the C library that implemented the lunar model, I simply used the C code to compute the distances, and then copied them directly into Clojure as well.

1
2
3
4
5
6
7
8
9
(def *quakes*
     (list [[1960 5 22] 9.5] [[2004 12 26] 9.3] [[1964 3 27] 9.2]
           [[1952 11 4] 9.0] [[1868 8 13] 9.0] [[2011 3 11] 8.9]
           [[1700 1 26] 8.9] [[2010 2 27] 8.8] [[1906 1 31] 8.8]
           [[1833 11 25] 8.8] [[1965 2 4] 8.7] [[1755 11 1] 8.7]
           [[1730 7 8] 8.7] [[2005 3 28] 8.6] [[1957 3 9] 8.6]
           [[1950 8 15] 8.6] [[2007 9 12] 8.5] [[1963 10 13] 8.5]
           [[1938 2 1] 8.5] [[1923 2 3] 8.5] [[1922 11 11] 8.5]
           [[1751 5 24] 8.5] [[1687 10 20] 8.5] [[1575 12 16] 8.5]))

To get the Julian dates for each quake, it’s sufficient to just run a quick and dirty map over the quake data as follows.

1
2
3
4
(defn build-julian-dates
  [quake-list]
  (let [gregdates (map first quake-list)]
    (map #((comp int gregorian-to-julian) (nth % 0) (nth % 1) (nth % 2) 11 0 0) gregdates)))

Now we feed that data into the C code from Github to calculate the distance the moon was from the earth at noon on the day of the quake (I made that approximation rather than look up the precise time each quake occurred). After those distances have been computed, I just copied them back into a Clojure data structure.

1
2
3
4
5
(def *distances* (list 401457.7391 406456.7146 397480.5329 391134.7747
                       367611.4645 384653.3206 391978.4295 357983.2221 404404.0269
                       388230.3723 395956.3832 360256.3522 401928.1778 375390.8061
                       368042.0999 371760.8433 403062.0740 399405.1708 395555.1038
                       364759.5125 373865.8605 361623.8578 366941.6539 401339.3053))

With that, we have enough data to plot earthquake strength versus lunar distance for the 24 largest quakes in recorded history. Clojure has a really nice library available called Incanter that makes charting and statistical analysis very easy.

1
2
3
4
5
6
7
8
9
(defn make-plot
  []
  (let [plot (scatter-plot (map second *quakes*) *distances*
                           :title "Quake Strength vs. Lunar Distance"
                           :y-label "Distance to Moon in km"
                           :x-label "Magnitude")]
    (set-y-range plot 356380 406710) ;; min/max lunar distance between 1575 and 2011
    (view plot)
    (save plot "plot.png")))

The resulting plot is shown below. As you can see, there’s no relationship at all between the strength of the strongest quakes and the current distance from the Earth to the moon. Note that the Y-axis is scaled to the minimum and maximum distances the moon has been in about 450 years, so the quakes really are distributed all over the place.

One last thing. Although the graph is pretty obvious, we really shouldn’t try and use intuition as a substitute for actual analysis, so let’s go ahead and fit a linear model to the data. Again, using Incanter,

1
(:r-square (linear-model *distances* (map second *quakes*)))

we get an value of 0.1193, or basically no relationship at all.

Building Your Own Keyboard Layout for Fun and Profit

It’s commonly known, if you’re the kind of person who commonly knows this sort of thing, that the normal QWERTY keyboard layout is not particularly well suited to typing English prose. The usual story is that it was designed to intentionally slow typists down so that the physical hammer mechanism used by early typewriters would have time to clear between key punches so as to not jam. Probably, that’s a bit suspect. But certainly the design was affected by the implementation on these typewriters to some degree. You can switch to the Dvorak layout, and there’s Colemak, and a few lesser known ones as well, but what is the optimal layout? It turns out that mathematicians and computer scientists have actually spent a bit of time thinking about this problem, and it’s a fun little exercise to look at how we might go about answering it.

I’ve seen this question appear two or three times on Reddit now, and I’ve always written a lengthy explanation. My Ph.D. work consisted of a large segment the general optimization problem that we’ll use to model keyboard layout, and I actually gave a talk on using it for that purpose in something of an “Ig nobel” approach. Anyway, it finally occurred to me that I have a convenient place where I can publish this sort of thing.

First, what is the actual problem we’re trying to solve? Basically, we want to be able to type as quickly as possible. Anyone who types with greater than “hunt-and-peck” speeds knows that certain words are harder to type than others. Personally, I notice this whenever I visit “news.bbc.co.uk”. Let’s look at that URL to see why. The first bit isn’t so bad really. “N-E-W-S” is typed with my right index finger reaching a bit, the “E-W” rolls off the left hand quite easily as the two characters are adjacent on my QWERTY keyboard layout so you can sort of drum your fingers across the two letters quite quickly. But then I have to hit the “S” with the same finger that just typed the “W”, but down a row. Moving my fingers up and down rows takes much longer than simply pressing a different finger on the home row. So there’s a little delay while I get my ring finger in position after typing the “W”. The “.bbc.” isn’t so bad. The “B” is one of the harder keys to reach, and I generally find the bottom row to be harder to type on than the top row (both of which are harder than the home row obviously), but the repeated keypress on the “B” is as easy as possible, and the “C” uses a different finger, and the “.” is on the other hand, both characteristics helping to allow faster typing of those keys. But the end of the URL, “.co.uk” has a lot of right hand action, and a lot of jumping up and down rows. Because it will be useful later, let’s think about the difficulty of pressing two physical keys on the keyboard in succession as the distance between those keys.

Note a critical insight here. This distance we’ve defined is between keys or buttons on the keyboard. It has nothing to do with letters. We can (and will) think about removing the binding of letters and symbols to keys so that we can rebind them in more optimal ways. To do this, we need this distance measure between the physical keys where we can put them.

We need a bit more information before we can start though. Namely, what is it that we hope to type quickly. After all, you may think that QWERTY is an awful layout, but if you want to type the sentence, “asdf asdf asdf asdf asdf,” it’s a pretty good system. Generally, we’re probably interested in typing ordinary English prose, but it’s not clear exactly what that means. We need to formalize that description somewhat. The key is to understand that typing quickly is simply being able to type successive pairs of characters needed quickly and easily. If “T-H” and “H-E” are both very easy to type successively, then the word “the” will be easy to type. This means we need a convenient source of what we’ll call 2-gram frequencies. The idea is that, given the type of text you wish to write, can you estimate the frequency of each pair of characters appearing in succession? We could sit down and attempt to reason our way through it, e.g., “t-h appears a lot, e-r does too…” However, a more reliable solution is to simply analyze a bunch of text that seems to be a good representative sample of the type of thing you’ll be typing, and compute the frequencies directly from that.

Let’s take a little detour into classical optimization problems. Introduced in 1957 by Koopmans and Beckman, the Quadratic Assignment Problem (QAP) asks the following question. Can you find the best way to put buildings in available physical locations such that overall amount of traffic between the buildings is minimized? Imagine you’re tasked with building a new campus for a university with 25,000 students, faculty, and staff. The university planning committee hands you a list of all the academic buildings they’ll have (physics, engineering, mathematics, nursing, psychology, literature, etc.) and a table of numbers that tell you how many people walk from one building to another during the course of an average week, for all pairs of buildings on campus. They’ve noticed anecdotally that there are a lot of students who complain that they’re always late for class because they have to walk all the way across campus to get from their calculus class to their physics classes. They want you to try and put related buildings like that closer together when you design their new campus.

Looking at the problem, you realize that what you really want to do is minimize the total amount of miles walked by all the people on campus. If 20 people walk one mile each, and two people walk five miles each, the total walked is just miles. That is, you multiply the distance between the two locations by the amount of traffic flowing between the buildings you assign to those two locations. Sum this over all the location/building assignments, and you have the total miles walked. Minimizing this is the QAP.

Because we’re computer scientists, let’s make this easy and number everything. We have buildings and locations to put them in. Let’s number the buildings from 0 to . If we agree to always write them in the same order, we can represent a possible assignment as a valid permutation of those numbers. For example,

2 7 1 4 0 6 5 3

denotes that building 2 goes in location 0, building 7 in location 1, building 1 in location 2, and so on.

We can now formalize the problem. Take all those distances between locations and stuff them in a matrix such that is the distance between location and location . Similarly, we define a cost or “flow” matrix such that is the amount of traffic flowing between building and building . Our goal is to find the permutation such that the following objective function

is minimized. Expressed in pseudocode, this function is as shown below.

int objective(P)
    f = 0
    for i=0 to N-1
        for j=0 to N-1
            f  = D[i,j] * C[P[i],P[j]]
        end for
    end for
    return f

It should hopefully be starting to fall into place. The problem of finding an optimal keyboard layout for a given set of 2-gram frequencies is just an instance of the QAP. Instead of some form of physical distance between locations, we have a generalized distance metric that intends to conform to the idea that harder to type pairs of keys are “farther apart” than easier to type pairs. And our flow or cost matrix comes directly from the 2-gram frequencies calculated from a large sample corpus of representative text we might wish to type.

To actually solve the underlying QAP instance is challenging. Unlike, say the traveling salesrep problem (TSP), QAP is very difficult to solve well even in practice. Whereas TSP instances of tens of thousands of cities can be routinely solved within a few percent of optimality by simple heuristics, QAP instances much larger than about 35 or so remain very difficult to handle in practice. Smallish instances can be directly solved via integer programming techniques, often based on some form of branch-and-bound algorithm. However, these methods can be quite slow, and fail completely for large instances. Depending on precisely how you formulate the problem, keyboard layout is likely small enough to be handled directly. However, as my expertise is on heuristics, that’s where I will focus my attention.

Going back to Reddit, another common theme is a fascination with genetic algorithms. It’s a fascination that I understand completely. Unfortunately, GAs rather, well…how to put this delicately…suck at QAP. Actually, they’re not so fantastic on a lot of problems, but QAP in particular is kryptonite to whatever super powers your GA might ever have. That said, they are fun to explore, and looking at how we might approach QAP with a GA is instructive on several levels, so I want to talk a bit about how you’d build a GA for this problem.

Without going into all the details on GAs in general, the real issue here is crossover. Recall that every valid potential solution is a permutation, and all the simple crossover operators you might be familiar with from a GA tutorial or two share the problem that they will almost never actually produce permutations if we use them here. Assume the following two permutations:

p1 = 0 1 2 3 4 5 6 7
p2 = 4 5 7 3 1 0 6 2

Let’s imagine doing a typical one-point crossover operation with crossover point equal to three. This gives us the following situation, with the parts of the parents that will not be swapped replaced by xs.

p1' = 0 1 2 x x x x
p2' = 4 5 7 x x x x

If I just swap those two pieces, I get nonsensical results.

c1 = 0 1 2 3 1 0 6 2
c2 = 4 5 7 3 4 5 6 7

In this example, offspring c1 has building 0 in two different places (and the same for buildings 1 and 2) and it doesn’t put buildings 4, 5, or 7 anywhere.

Obviously, this is not good. The reason I get bad results is because the subset of values I chose from each parent contains values not contained in the other, which means that those values must be in the other part of the parents, and will be repeated when I swap the pieces out. For example, if I swap them, the child 2 is going to get “0 1 2” as its first three numbers, but because those weren’t in the first three numbers of parent 2, they must be somewhere else, and since child 2 gets the rest of the values from parent two, you repeat them, which is bad. What if instead, I choose indexes 0, 1, 4, and 5. Now I have

p1' = 0 1 x x 4 5 x x
p2' = 4 5 x x 1 0 x x

If I swap those values to create two partial offspring, I get

c1 = 4 5 x x 1 0 x x
c2 = 0 1 x x 4 5 x x

Completing the crossover operation by copying the x bits down from the two parents yields

c1 = 4 5 2 3 1 0 6 7
c2 = 0 1 7 3 4 5 6 2

which works nicely.

Fortunately, it turns out that there is an algorithm that guarantees exactly this situation. The method is called Cycle crossover (CX), and it’s one of a number of methods that can be used on permutations. More precisely, what CX does is define the algorithm by which you can repeatedly generate these subsets of indices in such a way as to allow you go do a crossover operation without mucking up your permutations. Slightly more formally, what CX tries to do is find subsets of positions such that a “cycle” is formed, where a cycle is defined as a closure on the contained values. OK, that doesn’t help at all. That’s why I provided an example (and code).

Note that the code below actually does a bit more than I let on in the example. In general, finding one cycle isn’t enough. Cycles can be quite short compared to the length of the string, so finding one cycle would not tend to mix the parents very well. Instead, what we do is repeat the process of finding and crossing cycles until we’ve found them all. By the same token, if we crossed every cycle over, we’d end up just swapping the parents whole. Instead, every other cycle is swapped; the rest are passed down unmodified.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
void cycle_crossover(const chromosome& p1,
                     const chromosome& p2,
                     chromosome& c1,
                     chromosome& c2)
{
    mtrandom mt;

    // find a cycle starting from a random point
    int n = (int)p1.length();
    int cycle_len = 0;
    vector<int> perm = mt.permutation(n);
    int index;

    bool mix = true;            // mix cycles from different parents
    while(cycle_len < n) {
        // pick the first point in the cycle
        for(index=0; index<n; index++) {
            if(perm[index] != -1) {
                break;
            }
        }
        index = perm[index];

        // store the points in the cycle
        int start = p1[index];
        vector<int> cycle;
        do {
            cycle.push_back(index);
            perm[index] = -1;   // this point has been included in a cycle

            // set index equal to position of p2[index] in p1
            for(int i=0; i<(int)p2.length(); i++) {
                if(p1[i] == p2[index]) {
                    index = i;
                    break;
                }
            }
            cycle_len++;
        } while(p1[index] != start);

        // now that we have a cycle, update the offspring
        for(unsigned int i=0; i<cycle.size(); i++) {
            if(mix) {
                c1[cycle[i]] = p2[cycle[i]];
                c2[cycle[i]] = p1[cycle[i]];
            } else {
                c1[cycle[i]] = p1[cycle[i]];
                c2[cycle[i]] = p2[cycle[i]];
            }
        }
        mix = !mix;
    }
}

An interesting question is why one would choose CX over one of the other permutation crossover operators. The answer lies in understanding what properties of a candidate solution are important, and should thus be heritable. Let’s back off of QAP for a moment and think about the traveling salesrep problem (TSP), another permutation problem. In TSP, our goal is to find an ordering of cities such that a tour of each city in the given order is shortest (assuming a return to the first city as well). Again, assuming some numbering of cities, we might have the following two candidate solutions:

p1 = 6 4 3 7 0 2 1 5
p2 = 7 0 2 1 5 6 4 3

Note that these two solutions have no cities in the same location. And yet a bit of thought should convince you that they by necessity have the same cost, as they visit each city in exactly the same order, only starting and ending in a different city. They are the same tour. Generalizing a bit, the important part of a TSP solution is not where exactly the numbers appear in the permutation. It’s what numbers they appear next to in the permutation. So ideally, if we have two good parents in TSP, our crossover operator should try to produce offspring in which the adjacency of cities is similar to that exhibited in the parents.

Compare this to QAP. In QAP, the physical location of a building is important. The building will be connected to every other building with some flow, and the physical distance between locations is very important. Adjacency doesn’t matter at all, as location 0 and location 1 may be on opposite sides of a campus, so the fact that a solution puts one building “next to” another one purely in terms of adjacent location numbers is not a meaningful piece of information. Crossover should be designed to pass down relevant information, and for QAP, it’s absolute position in the chromosome. CX does exactly that.

With a suitable crossover operator, basic familiarity with GAs should allow you to implement one. Mutation can be done simply by swapping two elements at random within the chromosome. Selection and replacement occur as usual.

The one piece we’ve yet to address is how to build a distance matrix for optimizing your keyboard layout. I alluded briefly earlier to the notion that keys that are hard to type consecutively should be further apart according to this distance measure, but there are a lot of ways to interpret that. In addition, elements of that decision will need to take into account the specific model of keyboard you’re trying to optimize. I advocate the following approach: sit down with your keyboard and a pen and paper and start listing little rules and associated penalties. As a really quick example, you might have something like this.

  1. Default distance between two keys is 0
  2. +1 if keys are on the same hand
  3. +1 for each key in the pair that is not on the home row
  4. +10 if the keys use the same finger, but skip one row (e.g., ‘c’ and ‘e’ on a QWERTY keyboard)
  5. +12 if the keys use the same finger, but skip two rows (e.g., ‘c’ and ‘3’ on a QWERTY keyboard
  6. +9 for hard to reach keys (e.g., the ‘=’ key on most American keyboards)

The important thing here is that your rules match your expected keyboard and that the resulting typing model be roughly consistent and representative of realistic typing preferences. It’s not easy to come up with good rules. My advice would be to write a program that lets you easily tweak the rules and generate the resulting distance matrix. That way, you can experiment much more easily, as it’s unlikely you’ll be happy with your model the first few tries. The nice thing about this approach is that you can use it for very non-standard purposes. The most recent Redditor to ask about this was attempting to design a layout that could be used with one finger. That’s a perfect opportunity to use something like this, because you can easily play with suitable models of what it would be like to type on a hypothetical keyboard, and optimize your key assignments accordingly.

Finally, there’s one other really important thing you’d want to do, and it’s a real bummer as it takes us out of the nice pure world of mathematics where the basic abstract QAP formulation is all we need. This last issue is that while your computer will be capable of finding “good” layouts as measured by that overall distance times flow metric, real humans have several other constraints for which violations will make them cranky. People won’t like it if the ‘(‘ character is where the comma key normally goes and the ‘)’ is a shifted ‘2’. Look at your keyboard: you have several issues like this. The square brackets, parenthesis, and curly braces should have a coherent organization (next to their matching symbol, same level of shiftedness, etc.).

In the vernacular of optimization problems, these are called constraints, and constrained optimization is a whole category of study by itself. If you’re doing heuristic search like GAs or tabu search, you have two choices really. You can penalize the fitness of solutions that violate these constraints and hope that the algorithm will find a way to find better objective function values by satisfying them, or you can directly attempt to detect and fix violations. You can also treat the degree of violation as an extra objective to be minimized and do multiobjective optimization. My particular preference is usually to repair violations if I can and penalize when I can’t.

I’ve posted some C++ source code for generating the distance and flow matrices. Note a couple of things. My keyboard model is pretty basic. And by “basic”, I mean “shitty”. Nonetheless, it might give you a base to start from. Also, I did this a few years ago as a graduate student as part of my department’s sort of Ig Nobel Research Day where we all did something goofy with our serious research. As my research was mostly on multiobjective optimization, I generated two sets of flow matrices: one based on a corpus of my dissertation’s LaTeX source code, and the other based on a fairly large C++ code base I used for me research at the time. With multiobjective optimization, you get a set of solutions when your done that represent different trade-offs between the objectives. So I had one keyboard that was really good for typing my dissertation, one keyboard that was really good for typing my C++ code, and a bunch of keyboards in between. However, I punted completely on the constraints mentioned above, so my keyboards were all very screwy.

So anyway, that’s a pretty decent overview of how you can build your own customized keyboard layout using a bit of ingenuity to build a good keyboard model and some standard algorithms to attack the resulting optimization problem. If you try this yourself and come up with something interesting, drop me a line. I’d love to see what you came up with.