Archive for August, 2008

a coder's bookshelf

August 30th, 2008

What is this obsession people have with books? They put them in their houses - like they're trophies. What do you need it for after you read it?
- Jerry Seinfeld

I think it's because reading a book takes a lot of effort, and we want to get credit for it. Reading a big book takes considerably longer than anything else you might do for "fun". And then you can point to it and say, "look, this is what I know".

I have a bunch of computer books, a lot of them from college, that I'll probably never toss out even though I'm unlikely to ever re-read them. Meanwhile, I can do what a lot of people are doing and put them on display. "Look, I must be really clever, I have all these books!"

Frankly that's all they're good for after I'm done with them.

tahple or twople?

August 21st, 2008

The word tuple is used quite a lot in computing. That's what database people call a row in a table. It's also what several programming languages call a structure where the fields are ordered but not named.
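
As a rough illustration of that second meaning, here is a small sketch in python (picked only for the example; the names and sample values are made up). A plain tuple knows its fields by position alone, while a named tuple, shown for contrast, adds names on top of the same ordering.

```python
from collections import namedtuple

# A plain tuple: fields are identified by position only.
row = ("Ada", "Lovelace", 1815)
first_name = row[0]
birth_year = row[2]

# Unpacking relies on the same ordering.
first, last, year = row

# For contrast, a named tuple keeps the ordering but adds field names.
Person = namedtuple("Person", ["first", "last", "year"])
person = Person(*row)

print(first_name, birth_year)
print(person.last, person.year)
```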

It seems to be one of those words that is hard to translate, so other languages often use the English word. And yet there is some confusion about pronunciation. Some say tahple, some say twople. As far as I know there is no dispute about the spelling, it's tuple. So where do you get twople from that?

I think having a lot of exceptions to the obvious pronunciation is bad for language. There are words that are fancy or interesting enough to perhaps deserve it, but tuple isn't one of them. So I'm going to keep saying tahple.

Beautiful code

August 16th, 2008

I don't remember who mentioned this book or where. I seem to remember it being mentioned by several people. For one reason or another I decided to order it, and I've finally gotten around to it.

"Beautiful code" is a compilation of 30-something case studies, each chapter written by a different contributing author, describing code or systems they found beautiful. I suppose it is subjective how wide your definition of "beautiful code" is, but some authors describe architectures rather than code, which isn't quite what I'd expect. To me "code" is generally something that happens at the statement/function level, otherwise you call it "design" or "architecture".

The case studies are extremely diverse, you have everything from kernel code to high level systems. As I'm not a kernel hacker I have to say I didn't understand much of the chapter on Linux drivers, but then I get the feeling I'll never grok c types without a mentor or something, the Hungarian notation style variable naming tells me little about their meaning. There's a FreeBSD chapter on filesystem layering, and that's fairly straightforward, then there's a Solaris chapter on thread handling which is interesting, but the code unfortunately is less instructive to me than is the prose (the author's fascination with sewage is also mildly disturbing).

You'll find the code examples in a variety of languages, some familiar (c++, haskell, java, python, ruby), some not directly familiar but partly or mostly understandable (c, c#, javascript, perl, scheme), and some foreign (elisp, fortran, matlab, visual basic). There are two chapters showing implementations of python datastructures (in c) that I found quite interesting, one from the standard library (dictionaries), the other from NumPy (n-dimensional arrays).

It turns out this book is more interesting than I expected. Some of the chapters I'm just not in a position to understand, but many of them are well written and interesting to delve into. I successfully killed 4-5 hours of time in flight and at the airport with it, which is better mileage than I get out of books on tape. What I really like about it is that it's a book for hackers in the trade -- it's a book that shows you stuff, not one that tries to teach you. Which means you get right to the point without the obligation to introduce and prepare you for what you're about to read. It's a lot more like reading a blog.

So then there's the question, is the code that these supposed masters of the trade write more beautiful than yours and mine? Well, not necessarily. In some of the examples presented it's the design that's supposed to make it beautiful, not the code itself. And try as you might to imagine how an expert will apply untold levels of voodoo to problems you and I would love to solve better, most of the time they don't. I guess there isn't all that much hidden magic out there.

Norwegian is the best language, yo

August 14th, 2008

Quick, what's the most important quality a foreign language can have? If you said "easy to use" you'd be right. All other concerns are trumped, because other values of a language can never be appreciated unless you can learn it first. And apparently Norwegian ranks first on ease of learning for speakers of English (fun to know). The ranking is of course highly unofficial, but what the heck.

Exhibit A:

Scandinavian verbs have some of the easiest conjugation you can find in Europe. Present tense is made by adding an -r to the verb, regardless of who's doing it. That gives us:

ha - to have

jeg har - I have
du har - you have
han har - he has
vi har - we have

Such simplicity is brilliant (and unheard of).

The full rationale is here. A few selected gems follow.

Norwegians understand 88% of the spoken swedish language
understand 73% of the spoken danish language

Swedes understand 48% of the spoken norwegian language
understand 23% of the spoken danish language

Danes understand 69% of the spoken norwegian language
understand 43% of the spoken swedish language

Norwegians understand 89% of the written swedish language
understand 93% of the written danish language

Swedes understand 86% of the written norwegian language
understand 69% of the written danish language

Danes understand 89% of the written norwegian language
understand 69% of the written swedish language.

Hah, suckers! More succinctly:

"Norwegian is Danish spoken in Swedish"

Norwegian + phonology - vocabulary = Swedish

Norwegian - phonology + vocabulary = Danish

long passwords are evil

August 12th, 2008

I'm writing this partly in response to Jürgen's post a week or so back about passwords. Of course, he's not the only one to advocate long passwords, a lot of people are doing that these days in the name of security. Today's sad reality is that if your password is not "test" or "password" you are more secure than most people.

I do think, however, that any idea for improvement should stand to be evaluated on usability. After all, my first loyalty is to the user in me. Failing to do that produces wide adoption of bad ideas like captchas that are directly hostile to users. (Incidentally, that's why so many people who build systems for others build them badly. The implications of using it every day never sink in.)

Short passwords have too little entropy, therefore they are easy to break. Granted. So the response is "use long passwords", or better yet "not passwords, pass phrases". Such as oh bugger, my cat has cancer. With or without the spaces and punctuation it makes a perfectly acceptable password in terms of length. But tell me now who is willing to actually type these monstrosities?
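
To put rough numbers on the entropy argument, here is a back-of-the-envelope sketch. It computes an upper bound, assuming every character is picked uniformly at random from a fixed alphabet. That assumption is generous for an English phrase, whose words are far more predictable, and the function name, the 26-letter alphabet and the two lengths are only illustrative.

```python
import math

def max_entropy_bits(length, alphabet_size):
    """Upper bound on entropy in bits, assuming each character is
    chosen uniformly and independently from the alphabet."""
    return length * math.log2(alphabet_size)

# Assumed alphabet: 26 lowercase letters (illustrative only).
short_bits = max_entropy_bits(8, 26)    # an 8-character password
phrase_bits = max_entropy_bits(30, 26)  # a 30-character pass phrase

print(f"8 characters:  at most {short_bits:.1f} bits")
print(f"30 characters: at most {phrase_bits:.1f} bits")
```

The bound grows linearly with length, which is the whole case for pass phrases. A real phrase made of dictionary words sits well below it.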

The evil of password typing is reduced by our methods to avoid typing them all the time. Use public keys with ssh, never type the password again. Save passwords in the browser, avoid typing those. It's a fabulous usability gimmick.

But short passwords, bad for security, are great for another closely related purpose: being able to actually type them in. If you have a short password you don't need much practice to be able to type it. It's a sort of sweet spot between usability and security, more secure than nothing, not too painful to type if you have to. My password input rate might be something like 98%. I rarely fail to log in. But with pass phrases of 29 characters like the one above, how confident would you be? You don't see what you're typing either, just echo characters at best. I expect the likelihood of typing it correctly falls dramatically, maybe to as low as 75-80% for the average user, at the average point on the learning curve for typing it (does not apply to hackers with stellar typing skills yadayadayada). If you're doing something once, 80% is pretty good odds. But if you're doing it every day, it's no longer odds, it's a statistical average. Imagine if those were your parking odds. One in five times you fail to maneuver through the opening of your garage, I don't think you'd be happy.
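
One simple way to see why length hurts is to treat each keystroke as an independent trial. This is a sketch under that assumption, not a measurement: the 99% per-keystroke accuracy is a made-up figure, and the function name is only for illustration.

```python
def success_rate(per_key_accuracy, length):
    """Chance of typing every character correctly, assuming each
    keystroke is an independent trial (a simplifying assumption)."""
    return per_key_accuracy ** length

ACCURACY = 0.99  # assumed per-keystroke accuracy, not a measured figure

for length in (8, 29):
    rate = success_rate(ACCURACY, length)
    # Failed logins per year if you type the password once a day.
    failures_per_year = 365 * (1 - rate)
    print(f"{length:2d} chars: {rate:.0%} success, "
          f"about {failures_per_year:.0f} failed logins a year")
```

With these assumed numbers a 29-character phrase comes out at roughly 75%, the low end of the guess above. The same model puts an 8-character password at about 92%, so take it as an illustration of the trend and not as a fit to real typing.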

I tested myself on cancer cat just now, 6/10. On a sentence I've never typed before. And that's while seeing the characters on the screen.

And then there's the chance that you'll forget it, or remember it wrong, switch a character in your mind, use the wrong case. It's hard to estimate how likely that is, but with long passwords it seems rather likely. Inputting passwords is not an approximation, it has to be exact. And it's not just one of those phrases you have to remember *exactly*, you need one for every distinct password you keep.

Security is a social problem, not a technical one. If you force people to use long passwords they struggle to input (for christ's sake, they *already* use post-its on the monitor), we will just embrace ways of avoiding passwords all the more. Passwordless ssh is great, but if I'm using every trick in the book to avoid typing my long password, I haven't had enough practice typing it when I actually have to type it.

That is, if I even remember it correctly. And I somehow doubt sysadmins will give you more tries to type a long password than they currently give you, 3 tries or whatever it is. And then you're locked out.
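
For a feel of how often that happens, here is a rough sketch of the lockout odds. It assumes the attempts are independent and equally likely to succeed, which is optimistic if you are misremembering the phrase instead of mistyping it. The success rates are the ones floated in this post, and the function name is only illustrative.

```python
def lockout_probability(success_rate, max_tries=3):
    """Chance of failing every allowed attempt in a row, assuming the
    attempts are independent and equally likely to succeed."""
    return (1 - success_rate) ** max_tries

# 98% (short password), 80% (long phrase), 60% (my 6/10 test).
for rate in (0.98, 0.80, 0.60):
    print(f"success rate {rate:.0%}: "
          f"locked out {lockout_probability(rate):.2%} of the time")
```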

It's the perfect anti-security. The bad guys have a shot at my account (but they have to be pretty clever), but I myself am locked out.