A Trifle Absurd
Matthew Morgan’s software notions
Finding First-Draft Mode
30 October 2004 at 23.54 • in General
If programming is a creative act, like making a painting or a poem, then maybe programmers can use this ubiquitous creative advice: Blurt it out, then edit later. Separate the blue-sky, first-draft work of dreaming things up from the detail-focused engineering work of making things solid.
Some call this exploratory programming. You play with ideas to see where they go, using the most malleable medium you can get your hands on. Then, once the ideas are sketched out, you can hone the program into a solid and efficient form.
I want to have that mindset when I’m programming and writing. I have a hard time adopting it, though: I keep leaping to the editing stage prematurely. I could blame prior experience — programming atop ossified code, writing last-second papers for class assignments — but regardless of the cause, I need to break out of my edit-it-first mentality.
On the writing front, National Novel Writing Month may be just what I need. The idea is brilliantly crazy: write a 50,000-word novel during the thirty days of November. Of course, you can’t possibly write the required 1,700 or so words per day if you’re dropping into edit mode; the blistering pace means you have to stay in first-draft mode to have any chance of succeeding. (Successful NaNoers also employ a few dirty tricks, like not using contractions…)
I’m going to give NaNoWriMo a shot, in hopes of conquering the premature-editing impulse. If I’m lucky, the lessons of NaNoWriMo will carry over into my programming as well.
Avoiding Overdesign
28 October 2004 at 23.58 • in General
Linus Torvalds on beginning a project: “… So start small, and think about the details. Don’t think about some big picture and fancy design. If it doesn’t solve some fairly immediate need, it’s almost certainly over-designed.”
This advice is echoed from all directions: start software small and simple, then grow it incrementally. I’m still trying to internalize it, though. As this recent entry attests, I’m prone to get tied up planning for the what-ifs of the future rather than making progress in the present.
But there may be hope for me yet. This past week, I’ve assembled a functional item-category database out of simple data structures, and I’m ready to start hooking it up to a web servlet. Will the database be fast enough to handle thousands of items? No — but it will work just fine with dozens, and that’s all I need right now.
Editor-Hunting
25 October 2004 at 16.41 • in General
I’ve been using SciTE as my editor of choice: it’s simple and fast, and supports a number of languages. Unfortunately, though, it can’t do decent Scheme autoindentation, so I’m forced to look elsewhere.
One downside of picking Scheme is that it really narrows the pool of available editors. Most editors’ autoindentation is too simple-minded to handle normal Lisp and Scheme indenting. The only choices (that I know of) in the cross-platform open-source department are Emacs, vi, DrScheme, and now Eclipse.
Emacs isn’t an option for me: I’ve found it annoying ever since I first encountered it thirteen years ago. I recently took GNU Emacs and XEmacs for a spin to see if anything had changed, but still couldn’t find it in my heart to like ’em. Just one of those things, I guess.
In college I embraced vi as the anti-Emacs choice. But frankly, it’s the editor of the past. I might fire it up to edit a couple of lines in a config file, but for serious code editing, I want more.
DrScheme is a good Scheme editor, even if you’re writing in a non-PLT dialect. But I expect to be coding in a mix of languages (e.g. Scheme, JavaScript, C) and I’d like an editor that can handle more than just Scheme.
Which brings me to Eclipse, the editor/IDE/platform aiming to go beyond Emacs in kitchen-sink universality. Despite its ghastly startup time, it’s usable, and editing itself is snappy. And Dominique Boucher is working on a Scheme-editing plugin now in alpha, which already handles indentation well. It’s a tempting choice.
Keeping it Simple
24 October 2004 at 22.04 • in General
The other day I realized I needed to make my simple database thread-safe. I worked out a locking scheme without much trouble, but in the process slid down the slippery slope to “What I should do is use a Real Database, then I won’t have to worry about all this stuff.” I even found myself reading API docs for SQLite and Berkeley DB with an eye towards making Scheme bindings for them.
But then I heard whispers — Knuth, Hoare, and Floyd, no doubt — saying “Premature optimization is the root of all eeee-vil.” Why am I trying to optimize database performance before even building a complete prototype? Instead, I need to keep it simple, so I can get a basic, functional system up and running and grow it from there.
The way to keep it simple is to keep it all in ordinary Scheme data structures for now, and worry about leaping the chasm of scalability when I come to it. If I build the right abstractions now, it should be (relatively) easy to swap out components later.
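A minimal sketch of that approach, written in Python rather than Scheme purely for illustration. The class name `ItemStore` and its methods are hypothetical, not the actual Trifle code; the point is only that callers see a small interface while the storage behind it is plain dictionaries and sets guarded by a single lock.

```python
import threading
from collections import defaultdict


class ItemStore:
    """Hypothetical item-category store backed by plain dicts and sets."""

    def __init__(self):
        # One coarse lock: simple and safe, not fast.
        self._lock = threading.Lock()
        self._items = {}                      # item id -> item data
        self._categories = defaultdict(set)   # category -> set of item ids

    def add_item(self, item_id, data, categories=()):
        with self._lock:
            self._items[item_id] = data
            for category in categories:
                self._categories[category].add(item_id)

    def get_item(self, item_id):
        with self._lock:
            return self._items.get(item_id)

    def items_in(self, category):
        with self._lock:
            # Return a sorted copy so callers never touch the live set.
            return sorted(self._categories.get(category, ()))
```

If the store ever needs a real database behind it, only the bodies of these three methods should have to change, assuming callers never reach past the interface.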
Gambit Scheme: First Impressions
21 October 2004 at 15.31 • in General
I’ve been putting Gambit Scheme 4.0 through its paces these last few days, and I really like what I see. Gambit doesn’t have the full range of libraries that PLT Scheme does, but it does have a good selection of I/O primitives and a straightforward C FFI. (And did I mention it’s fast?)
Writing a trivial HTTP server was trivial indeed, thanks to Gambit’s networking primitives. The built-in function open-tcp-server returns a port of ports: that is, the values returned by reading the port are themselves ports, each of which represents an individual client connection. So, a simple server loop can just read the next connection port from the server port and spawn a new thread for that connection.
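The shape of that server loop can be sketched in Python, with a listening socket standing in for Gambit's port of ports. This is an analogy, not Gambit code: `open_server`, `serve`, and `handle_client` are names made up for the example, and the handler ignores the request and returns a fixed response.

```python
import socket
import threading


def handle_client(conn):
    """Answer any request with a fixed plain-text response, then close."""
    with conn:
        conn.recv(4096)  # read and ignore the request; fine for a toy
        body = b"hello\n"
        conn.sendall(
            b"HTTP/1.0 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"\r\n" + body
        )


def open_server(host="127.0.0.1", port=0):
    """Create a listening socket; port 0 asks the OS for any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


def serve(server_sock, max_connections=None):
    """Take the next connection from the server and give it its own thread."""
    served = 0
    while max_connections is None or served < max_connections:
        conn, _addr = server_sock.accept()  # the next client connection
        threading.Thread(
            target=handle_client, args=(conn,), daemon=True
        ).start()
        served += 1
```

Calling `serve(open_server(port=8080))` would run the loop indefinitely; the `max_connections` parameter exists only so the loop can be stopped in a test.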
Gambit uses ports quite a bit. For instance, open-vector-pipe creates a pair of ports for bidirectional communication between two threads (or two of anything, really). String, vector, and byte ports, and variations of them, are all provided. Even directories are represented as ports.
Pervasive ports and lightweight threads make Gambit a natural choice for message-passing concurrency, and indeed, Marc Feeley (Gambit’s creator) and Martin Larose collaborated on a (now-defunct?) Erlang-to-Scheme compiler that targeted Gambit.
Intuition
17 October 2004 at 21.36 • in General
You know, I think I was looking for technical reasons to legitimize my choice of Scheme. I seized on Gambit’s release as such a reason, but in reality, my decision was based on gut feeling. Either Scheme or Python would have worked; I just happen to like Scheme better.
In geek-land, we often act as though any technical choice must be defended on the basis of technical merit. It just seems so solid to point to real, objective reasons for a choice. Intuitive, aesthetic choices, on the other hand, seem too vague and squishy — and so we grasp at technical arguments to back up our intuitive choice. I wonder how much air we could clear if we owned up to our intuition as intuition rather than trying to disguise it in a haze of technical argument.
Not that technical qualities don’t matter. They do, of course, matter very much. I only got to the point of choosing between Scheme and Python after evaluating the technical qualities of each and deciding that either would work. But in the end, I had to, in my wife’s (repeated) words, “just pick one”.
The Language Dilemma Resolved
14 October 2004 at 13.35 • in General
Here’s how I went about choosing a language at last:
I ruled out Haskell fairly quickly. It’s a fascinating language, but its lack of libraries is compounded by my own lack of Haskell experience. If my goal is to deliver a working application as soon as possible, Haskell isn’t the right choice, as I’m sure I’d spend half my time trying to reorient my thinking (and/or deciphering type errors). Perhaps after I’ve played around with Haskell for a while longer, I’ll be ready to use it in a project.
Python looked like the boring-but-sensible choice. There aren’t many Xs for which “Python is the best language for X”, but there are a whole lot of Ys for which “Python is a pretty good language for Y”. In fact, Python’s list of Ys is probably as long as that of any other language. Sure, its good points derive from popularity, but they’re still good points: lots of people know the language, and lots of open-source libraries are being written for it. Given all that, Python felt like the choice I should make, even if my heart was leaning towards Scheme.
Perhaps I was just searching for any “logical” reason to pick Scheme instead, but in any case, I have one: the long-awaited Gambit-C 4.0 Scheme compiler is in public beta at last. (Thanks to Dominique Boucher for spreading the news.) It’s fast. It’s concurrency-oriented. It’s cross-platform. It’s open source. And it has tipped the balance to Scheme.
Haskell, Scheme, or Python?
11 October 2004 at 11.28 • in General
It’s time to make a final language choice for Trifle and start cranking out the code. I’ve done experiments in several languages, and the three that stand out are Haskell, Scheme, and Python.
Haskell has a lovely syntax and type system, and its lazy evaluation strategy leads to elegant programs. It’s potentially the fastest language of the three, being both statically typed and natively compiled. The community is very small, though, which means there are few available libraries, and few potential collaborators should Trifle grow beyond a one-man project.
Scheme is clean and extensible, and there are multiple good implementations. It too has a small user base, but there are more libraries available than for Haskell (at least in PLT Scheme). Still, “more” doesn’t mean “a lot”.
Python has libraries — lots of them. It’s widely available on hosted servers, and lots of folks program in it. As a language, it’s not bad, though a little too imperative for my tastes. Right now it’s probably the slowest of these three, but lots of smart Pythoneers are working to change that.
So, where does that leave me? For the language itself, I’d pick Haskell; for the libraries, I’d pick Python; for the middle of the road, I’d pick Scheme. Ultimately, though, I just need to pick one.
Scheduling with Schelog
7 October 2004 at 14.58 • in General
A potential client does a lot of scheduling each week, matching up people, equipment, and places given various constraints like “this place needs five people”, “this person can’t work on Mondays”, “this place needs someone with skill X”, et cetera. Right now, she does this by hand in Excel, but she’d love to have a program do it for her.
This sounded like a job for Prolog, but I didn’t want to bother actually learning Prolog itself. Instead, I decided to try Dorai Sitaram’s Schelog, an embedding of Prolog in Scheme. Thanks to the magical call/cc, it really is an embedding: Prolog relations and operators can be freely mixed with ordinary Scheme code. (I’m sure it blows up if you use set!, though.)
I was able to get a proof-of-concept up and running without much trouble. Then I started to hatch plots about delivery, with an Excel macro calling out to a compiled Scheme program (using Chicken as the compiler ’cause it’s continuation-friendly). I had to admit, though, that it would be a lot simpler all around if I could just do it all in Excel.
This provoked brief but horrible visions of trying to write a Prolog interpreter in VBA. Recovering from that, I realized that I don’t need all of Prolog — all I need is depth-first search with backtracking. Even in VBA, that should be tolerable, right? Famous last words, I know…
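For a sense of how little machinery plain depth-first search with backtracking needs, here is a rough sketch in Python (not VBA, and not the client's real data model). The slot tuples, the `people` dictionary layout, and the `max_days` field are all assumptions invented for the example.

```python
def assign(slots, people, chosen=()):
    """Depth-first search with backtracking over people for each slot.

    slots:  list of (place, day, skill) tuples, one person needed per slot
    people: dict name -> {"days": set, "skills": set, "max_days": int}
    Returns a list of names parallel to slots, or None if nothing fits.
    """
    if len(chosen) == len(slots):
        return list(chosen)              # every slot is filled
    place, day, skill = slots[len(chosen)]
    for name, info in people.items():
        if day not in info["days"] or skill not in info["skills"]:
            continue
        days_worked = [slots[i][1] for i, n in enumerate(chosen) if n == name]
        if day in days_worked or len(days_worked) >= info["max_days"]:
            continue                     # already busy that day, or at the cap
        result = assign(slots, people, chosen + (name,))
        if result is not None:
            return result
        # Otherwise fall through and try the next person: that is the backtrack.
    return None


people = {
    "ann": {"days": {"mon", "tue"}, "skills": {"cook", "drive"}, "max_days": 1},
    "bob": {"days": {"mon"}, "skills": {"cook"}, "max_days": 5},
}
slots = [("cafe", "mon", "cook"), ("depot", "tue", "drive")]
schedule = assign(slots, people)
```

In this made-up case the search first tries ann for Monday, finds that nobody is then left for Tuesday, backs up, and settles on bob for Monday and ann for Tuesday.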
Update: I started wondering if I even needed backtracking search, and came up with a simple algorithm to select people for place-day-skills slots using sorting and filtering. That handles the simple case where each person has a certain set of days they can and can’t work, but breaks when there are people who can work any particular day, but no more than x days per week. Solving that in general would require backtracking, but in this particular case it may be as easy as picking slots for unlimited-time people first, then giving limited-time folks the leftovers.
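The non-backtracking idea might look something like the following Python sketch. It is a guess at the approach described, not the actual algorithm: people with no weekly cap are sorted to the front, and each slot takes the first person who passes the filters. The data layout and the use of `None` for "no cap" are assumptions made for the example.

```python
def greedy_assign(slots, people):
    """Fill each slot with the first eligible person; never backtrack.

    people: dict name -> {"days": set, "skills": set, "max_days": int or None}
    A max_days of None means the person has no weekly limit.
    Returns a list parallel to slots; an unfilled slot gets None.
    """
    # Sort so that uncapped people come first (False sorts before True).
    order = sorted(people, key=lambda n: people[n]["max_days"] is not None)
    days_taken = {name: set() for name in people}
    result = []
    for place, day, skill in slots:
        pick = None
        for name in order:
            info = people[name]
            cap = info["max_days"]
            if (day in info["days"] and skill in info["skills"]
                    and day not in days_taken[name]
                    and (cap is None or len(days_taken[name]) < cap)):
                pick = name
                break
        if pick is not None:
            days_taken[pick].add(day)
        result.append(pick)
    return result


staff = {
    "cat": {"days": {"mon", "tue"}, "skills": {"cook"}, "max_days": 1},
    "dan": {"days": {"mon"}, "skills": {"cook"}, "max_days": None},
}
week = greedy_assign([("cafe", "mon", "cook"), ("cafe", "tue", "cook")], staff)
```

Here dan, who has no cap, takes Monday, which leaves cat's single day free for Tuesday. Trying cat first would have spent her one day on Monday and left Tuesday empty, and that ordering problem is the one the heuristic tries to dodge. It is still only a heuristic, so it can miss schedules that a backtracking search would find.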
Haskell Performance
7 October 2004 at 08.46 • in General
The Haskell community seems to be all in a tizzy about performance lately, with email flying about (like this message and the thread following) and code being optimized for the new Great Computer Language Shootout. The irony of this for me is that I started learning Haskell a few months ago after being surprised at how well it did in the Shootout.
André Pang makes two main points in his well-worth-reading Haskell performance missive: First, that you can make a Haskell program run really fast, but you have to be an expert, and the resulting program will be an ugly kludge. Second, that space leaks are an insidious problem in Haskell, because they show up all the time and are hard to find and fix.
I’m not so worried about the first issue. After all, if the Shootout results are accurate (which is debatable), Haskell is in the same neighborhood as Java when it comes to performance. Sure, people gripe about Java’s performance all the time — but lots of people have found a use for it. And Haskell is generally faster than the usual-suspect scripting languages.
The bigger issue in my mind is space leakage. Every few days, there’s somebody asking on the Haskell mailing lists for help debugging and/or fixing a space leak. It seems like an inevitable consequence of laziness: somewhere, somehow, a bunch of thunks will build up in memory when they shouldn’t.
The usual prescription for a space leak is a healthy dose of ‘seq’ and strictness annotations. You wind up with a lazy program with some strict subparts. What if you turned that around? What if you had a strict language with laziness annotations (like a few ML variants)? Speed and space usage would be more predictable, but you could still use laziness where it made sense. If a language like that existed, with Haskell’s syntax and type system, I’d be using it…
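To make "strict by default, lazy by annotation" concrete, here is a small sketch in Python, which is itself a strict language. The `Lazy` class is a hypothetical helper written for this example, not a standard library feature: wrapping a computation in it is the explicit laziness annotation, and everything else is evaluated eagerly.

```python
class Lazy:
    """An explicitly delayed, memoized computation (a hand-made thunk)."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._thunk()
            self._thunk = None   # drop the closure so it can be collected
            self._done = True
        return self._value


def strict_sum(numbers):
    """Ordinary strict accumulation: no thunks pile up in memory."""
    total = 0
    for n in numbers:
        total += n               # evaluated immediately, one int at a time
    return total


# Laziness only where it is asked for: nothing runs until force().
report = Lazy(lambda: strict_sum(range(1_000_000)))
```

The trade-off is the mirror image of Haskell's: the delayed parts are visible in the source, so it is easier to see where memory might be held, but the programmer has to remember to ask for laziness.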