Archive for 2010

what is comprehension really?

December 11th, 2010

When we're talking about learning a language we seem to present an idealized picture of both what the language is like and what our learning is like. We imagine a script, which is written on paper, and which is recited with clear enunciation and proper pacing by a voice actor on our language tapes. This creates the impression that these two things are somehow equivalent, that the written text and the spoken message are two encodings that have the same content. This is the ideal case.

But as we all know, language is not ideal. There is a reason we have expressions like "what did you say" that we use with people who speak our own language perfectly well. Our communication is lossy, i.e. the sum total of what is received is less than what was transmitted. To make up the difference we have to reconstruct the message, and our reconstruction may not be accurate. Comprehension is the ability to reconstruct successfully, most of the time.

In order to belabor this point a bit, let's use an example message from an article:

Liu, he said to sustained clapping, "has exercised his civil rights. He has done nothing wrong. He must be released."

This is the ideal message, the message that was intended for you. If you speak English then this is perfectly clear to you. But when it becomes degraded in communication, what degree of error can you cope with?

What if the writer had excluded the commas?

Liu he said to sustained clapping "has exercised his civil rights. He has done nothing wrong. He must be released."

What if there were no quote marks?

Liu he said to sustained clapping has exercised his civil rights. He has done nothing wrong. He must be released.

No periods?

Liu he said to sustained clapping has exercised his civil rights He has done nothing wrong He must be released

No capitalization?

liu he said to sustained clapping has exercised his civil rights he has done nothing wrong he must be released

No spaces?

liuhesaidtosustainedclappinghasexercisedhiscivilrightshehasdonenothingwronghemustbereleased

Now, think about it. When you hear speech in a language that you don't understand, does it have punctuation in it, does it even have spaces? I would suggest that speech is actually much more similar to the last version than to the first one. Sure, there is vocal "punctuation" in the audio, with things like tone of voice and pacing, but it's not as standardized and clear as punctuation in writing. Names are not capitalized. There are no parentheses. There are no quote marks (only the bandwidth heavy "and I quote"/"end of quote", which gets quite tiring to hear if there are many quotes).
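To make the degradation steps concrete, here is a minimal Haskell sketch that applies them one after another to the example message. The function names (`dropCommas`, `stages` and so on) are made up for this illustration. The point is only that every step is a one-way street: nothing in the output says where the deleted characters used to be.

```haskell
import Data.Char (isSpace, toLower)

-- The example message from the article, as quoted earlier in the post.
original :: String
original = "Liu, he said to sustained clapping, \"has exercised his civil rights. He has done nothing wrong. He must be released.\""

-- Each step discards one kind of information (names are illustrative).
dropCommas, dropQuotes, dropPeriods, lowercase, dropSpaces :: String -> String
dropCommas  = filter (/= ',')
dropQuotes  = filter (/= '"')
dropPeriods = filter (/= '.')
lowercase   = map toLower
dropSpaces  = filter (not . isSpace)

-- The message after 0, 1, 2, ... steps, in the same order as the text.
stages :: String -> [String]
stages message =
  scanl (\text step -> step text) message
        [dropCommas, dropQuotes, dropPeriods, lowercase, dropSpaces]
```

Loading this in GHCi and running `mapM_ putStrLn (stages original)` prints the six versions walked through earlier, from the fully punctuated one down to the run-together one.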

And the above example only demonstrates information loss. It doesn't even begin to address errors introduced by faulty communication (mispronounced words, words spoken with the emphasis in the wrong place, words spoken with a foreign accent etc.), yet your brain must be prepared to deal with those too.

Spoken language is never like the original message. Some time ago, a friend recommended a book to me, one that is available on the author's website but which for some reason doesn't have any punctuation (I don't know whether that applies only to the online version; it would surprise me if the print version were the same). The only things capitalized are names. I first tried to read this a few months ago, but I found it too tough. I came back to it recently and this time I made more progress. That's when it dawned on me how well this illustrates how we idealize and underestimate language.

In reading this book the brain is forced to attempt something similar to what I try to do here with a little syntax highlighting:

noi abbiamo le pive nel sacco Malva è sconvolta ma Cocco non molla entriamo e la facciamo lo stesso in quanti siamo dice dobbiamo farla lo stesso tanto ormai non abbiamo più niente da perdere grida e cosi convinciamo gli altri a fare lo stesso l'assemblea entriamo tutti insieme e ci mettiamo in un'aula vuota del pianterreno è un minuto che siamo dentro e non abbiamo ancora cominciato a dire una parola che arriva Mastino sbraitando cosa fate qui tu tu e tu siete tutti quanti sospesi passate in presidenza uno alla volta e esce lasciando la porta aperta Scilla dà un calcio alla porta e poi la barrica ci spingiamo davanti due banchi restiamo un momento in silenzio dobbiamo fare qualcosa ci guardiamo negli occhi ma non sappiamo cosa fare ci sentiamo in trappola
Noi abbiamo le pive nel sacco. Malva è sconvolta, ma Cocco non molla. “Entriamo e la facciamo lo stesso in quanti siamo”, dice. “Dobbiamo farla lo stesso, tanto ormai non abbiamo più niente da perdere”, grida. E così convinciamo gli altri a fare lo stesso. L'assemblea entriamo tutti insieme e ci mettiamo in un'aula vuota del pianterreno. È un minuto che siamo dentro e non abbiamo ancora cominciato a dire una parola che arriva Mastino sbraitando “Cosa fate qui tu, tu e tu? Siete tutti quanti sospesi. Passate in presidenza uno alla volta”. E esce lasciando la porta aperta. Scilla dà un calcio alla porta e poi la barrica. Ci spingiamo davanti due banchi. Restiamo un momento in silenzio. Dobbiamo fare qualcosa. Ci guardiamo negli occhi, ma non sappiamo cosa fare. Ci sentiamo in trappola.

Is that the canonical version? No, it's not. It's my best attempt at reconstruction. I'm not certain that this is correct; I'm not even certain that the version on the website matches the printed version (there seem to be quite a few typos). But ultimately there is no final answer: all language comprehension is heuristic and hypothesis-based.
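The "hypothesis based" part can be shown in miniature. The sketch below, in Haskell, takes a run-together string and lists every way of cutting it into words from a known vocabulary. The vocabulary and the function name `segmentations` are invented for the illustration, and a real listener obviously does far more than prefix matching. Even so, the toy version already has to admit more than one candidate reading.

```haskell
import Data.List (isPrefixOf)

-- Toy vocabulary, invented for this illustration.
vocabulary :: [String]
vocabulary = ["he", "has", "done", "nothing", "no", "thing", "wrong"]

-- Every way of cutting a run-together string into words from the given
-- vocabulary. Each result is one hypothesis about what was said.
segmentations :: [String] -> String -> [[String]]
segmentations _     ""   = [[]]
segmentations vocab text =
  [ word : rest
  | word <- vocab
  , not (null word)
  , word `isPrefixOf` text
  , rest <- segmentations vocab (drop (length word) text)
  ]
```

`segmentations vocabulary "hehasdonenothingwrong"` returns two readings: "he has done nothing wrong" and "he has done no thing wrong". Picking between them takes knowledge that is not in the signal itself.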

free will or not

September 13th, 2010

One of the classic topics for debate among us humans is the dilemma of free will. In a way I'm surprised that it comes up as much as it does, because I don't really think it's that interesting a question, in the sense that I don't see how we're making any progress on it.

It's as if everyone is convinced that we have free will, and yet... there is absolutely no empirical basis to think so. Never has anyone had the chance to go back in time and make a different decision. So why do we seem to think it must be so?

Here's the thing. It's my impression that much of the debate on free will is shaped by an unwillingness to really accept the premise of the question and take it to its logical conclusion.

All too often I've heard people argue things like "well if there is no free will, then people cannot be held responsible for their actions, so we should let everyone get away with it". What do you mean "let"? Here's the problem: our form of expression is basically based on the premise of free will. So to even discuss it without committing a fallacy one has to be careful.

The whole "ethical problem" of determinism seems to me nothing more than a false dilemma. If the actor has no free will to commit a crime or not, then why are we debating the problem of "deciding" how to respond if we have no freedom of choice? If the actor has no choice, then neither do we: there is nothing to decide, there is no problem to solve. Whatever happens is purely a matter of inevitability, however much it may seem otherwise. If the crime committed was deterministic, then our after-the-fact discussion is deterministic and whatever action we will take is also deterministic. The only way there is a dilemma is if the other guy's actions were determined and yours somehow aren't. But that's not how the question is defined.

The free will topic also enters the religious domain often, and there too people make the same mistake. As in, if god is all powerful and all knowing, then he knows your future, thus your future is decided, thus "why are you praying to him hoping he will change his mind?" Wrong. If the future is decided then he has *already decided* that you will pray to him and that you will thank him etc, so your apparent gratitude to him is nothing else than him having decided to "program you" (if you will) to thank him.

things I hate about haskell

September 4th, 2010

As the title for this blog entry popped into my head today I realized I had been silently writing this for five years. I want to stress one word in the title and that's the word "I". It might be that what I have to say is no more insightful than people who don't like python because of the whitespace, or people who can't get over the parens in lisp. But I happen to believe that a subjective point of view is valid, because someone, somewhere had a reaction to something and it's crooked trying to pretend that "the system is immune to such faults".

My complaints have little to do with the big ideas in haskell, and those ideas can just as well be realized in another language. In fact, the way things are going, it's likely that haskell's bag of tricks will be the smörgåsbord for many a language to choose from. F# from Microsoft, clojure making waves, and lambdas that have reached even java. As James Hague wrote, functional programming went mainstream years ago.

Elitism

I don't mean elitism in the sense that you sometimes hear about lisp mailing lists, that the people are hostile to newbies with a harsh rtfm culture prevailing. I haven't met any nasty haskell people, it's the culture where the elitism is encoded.

I don't think I have to explain that if you, as a software engineer, meet with a client for whom you are building a product then you don't insist that the conversation be held in terms of concurrency primitives or state diagrams. If you want to get anything done, you have to speak the language of the customer. And that's a general principle: if you are the more expert party in the field that the conversation concerns, then it's your responsibility to bridge the gap. If you don't understand your customer, then you have a problem. And if you do understand him, but you don't care, then there's a word for that... oh yes, elitism.

What I mean by elitism in haskell is the belief that "what we have is so great that we're doing you a favor if we let you be a part of it." Trying to learn haskell is something like being a fly on the wall at a congress of high priests. There is theological jargon flying all around and if you happen to make out some of it, we'll let you live, that's how nice we are. There seems to be a tacit assumption in place that if you touch haskell then either a) you have been brought up in the faith or b) you have scrubbed your soul clean of the sins of your former life. That is to say if you're comfortable coding in lambda calculus then you'll have a smooth run, but if you code in any of the top 10 most widely used languages then good luck to you. "Enough already, do I have to spell it out for you: forget everything you know about programming and imprint these ideas on your brain."

Here, I'm pleased to mention Simon Peyton Jones who makes an effort to speak the language of his audience. And when you read his papers you get the impression that he's saying "okay, so maybe you're not a gray beard in this field, but I'll try to explain this so that you get something out of it". Alas, Simon is miles ahead of the pack.

Still, it's changing over time. There seems to be a new generation of haskell coders who don't live in that ivory tower and actually do sympathize with their fellow man. (To the mild chagrin of the high priests, one imagines.) The first time I really knew that haskell was on the right track in the language ecosystem evolution was the appearance of Real World Haskell. It's the first book that I knew about that wasn't for the in-group, with a provocative title that seems to say "the language isn't about you crazy puritans in the programming language lab, it's about people trying to solve their everyday problems". Since then I have seen further evidence of this development, notably Learn You a Haskell, a "port" of Why's Poignant Guide to Ruby, the first book about Ruby that really took the newbie programmer seriously (and I'm pretty sure did wonders for Ruby adoption).

Math envy

One of the earliest things I ever read about haskell was "the neat thing about haskell is that functions look almost like functions in math". I remember thinking "why is that supposed to be a selling point?" Unless you are actually implementing math in code (which is a pretty minuscule part of the programming domain), who cares? As I would discover later, it was a sign of things to come.

I read a couple of books recently about the craft of software engineering that are written in the form of interviews with famous coders. I recall that on several occasions there would be a passage more or less like this "but when I went back to look at the code I had written back then I was ashamed that I had used so many single character variable names". Exactly how many more articles and books have to get written until people stop writing code like this:

(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c
m >=> n = \x -> do { y <- m x; n y }

Is this an entry in a code obfuscation competition? (Is there some way you could obfuscate this further?) Why does reading code in haskell have to be some sort of ill-conceived exercise in symbol analysis where you have to try to infer the meaning of a variable based on the position of the parameter? I'll be honest, I have no memory for these kinds of things; halfway down the page I have long since forgotten what all the letters stand for. Why on earth wasn't it written like this?

(>=>) :: Monad tmon => (tx -> tmon ty) -> (ty -> tmon tz) -> tx -> tmon tz
f1 >=> f2 = \x -> do { y <- f1 x; f2 y }

Maybe that's not the optimal convention, but it's far better than the original. Almost anything is. (It's still pretty minimalistic by the standards of most languages.) Haskell prides itself on its type system. As a programmer you have that type information, so for goodness sake use it in your naming.
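For anyone who still can't tell from either version what `>=>` is for, a small usage example may help more than the definition does. This is a sketch using the real `(>=>)` from `Control.Monad` and `readMaybe` from `Text.Read`. The two steps and their names (`parseNumber`, `hundredDividedBy`, `shareOfHundred`) are hypothetical, chosen only so that each one can fail.

```haskell
import Control.Monad ((>=>))
import Text.Read (readMaybe)

-- Two hypothetical steps that can each fail, so both return Maybe.
parseNumber :: String -> Maybe Int
parseNumber = readMaybe

hundredDividedBy :: Int -> Maybe Int
hundredDividedBy 0       = Nothing
hundredDividedBy divisor = Just (100 `div` divisor)

-- (>=>) feeds the result of the first step into the second and stops at the
-- first failure.
shareOfHundred :: String -> Maybe Int
shareOfHundred = parseNumber >=> hundredDividedBy
```

`shareOfHundred "4"` gives `Just 25`, while `shareOfHundred "0"` and `shareOfHundred "four"` both give `Nothing`. With names on the steps, the operator reads as "do this, then that, unless something went wrong".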

This kind of thing is normal in math: pick the next unused letter in the alphabet (or even better, the Greek alphabet!) and you have yourself a variable name. It's horrible there and it's horrible here.

Syntax optimized for papers

If you read about the history of haskell this is not really all that surprising. Before haskell there was a community of people doing research on topics in functional languages, but to do that they had to invent little demonstration languages all the time. So eventually they decided to work on a common language they could all use and that became haskell.

Haskell snippets look nice in papers no doubt. But how practical is this syntax?

main = do putStrLn "Who are you?"
          name <- getLine
          case M.lookup name errorsPerLine of
               Nothing -> putStrLn "I don't know you"
               Just n  -> do putStr "Errors per line: "
                             print n

Is your editor set up for haskell, does it know to indent this way? If not you're going to suffer. Haskell breaks with the ancient convention of using tabs for indentation, so if what follows a do or a case doesn't line up with a tab stop, you're out of luck. Unless your editor supports haskell specifically, that is. So all you people using some random editor: bite me.
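For what it's worth, the layout rule is optional: the language also accepts explicit braces and semicolons, and inside them indentation carries no meaning. It is not the style most haskell code is written in, but a sketch of the same program written that way looks like this. `errorsPerLine` is not defined in the snippet, so the map and its values here are invented for illustration, as is the name `askName`.

```haskell
import qualified Data.Map as M

-- Invented sample data; the original snippet does not show this definition.
errorsPerLine :: M.Map String Double
errorsPerLine = M.fromList [("alice", 0.5), ("bob", 2.0)]

-- Same logic as the layout-based version, but with explicit braces and
-- semicolons, so the indentation shown is purely cosmetic.
askName :: IO ()
askName = do {
    putStrLn "Who are you?";
    name <- getLine;
    case M.lookup name errorsPerLine of {
        Nothing -> putStrLn "I don't know you";
        Just n  -> do { putStr "Errors per line: "; print n }
    }
}
```

To run it as a program rather than from GHCi, add `main = askName`. Whether this is an improvement is another matter: you trade the alignment puzzle for a pile of punctuation.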

Because of the offside rule, and haskell's obsession with pattern matching/case statements, all your code ends up in the bottom right of the screen in functions past a certain size.

And haskell is obsessed with non-letter operator names, because letters are so tiresome, right? Haskell people love to author papers in latex and generate pretty pdfs where all the operators have been turned into nice symbols (and Latin letters become Greek letters, more math envy). Sometimes when you're reading the paper you don't even know how to type it into an editor. It's almost like coding haskell is also an exercise in typesetting. "Yes, it's ascii, but just think how lovely it's gonna look in a document!"

"How a programmer reads your resume"

August 13th, 2010

tl;dr: Sometimes stereotypes are true.

I came across this rather excellent comic about how people see your resume depending on who they are. After glancing over it and appreciating it as one of my ~5-10 daily interweb funnies, I looked over it again and noticed that it's eerily accurate.

Positives

  1. Has written a compiler or OS for fun.
    That'd be a yes.
  2. Resume compiled from latex.
    Actually, from hand made xsl to latex. There was a time when I was all excited about single source publishing, so that's what I did here. Xml to html/pdf/txt. (Last time I checked the latex->html bridge was seriously lacking anyway.)
  3. Contributes to open source software.
    Check.
  4. Has written compiler or OS for class.
    Check.
  5. Has blog discussing programming topics.
    You're here.
  6. President of programming/robotics/engineering club.
    Nope.
  7. Participated in programming/robotics/engineering contest.
    Nope.
  8. Internship at Google or Microsoft.
    They know where to find me.
  9. Has written non-trivial programs in dynamic languages (perl/python/ruby).
    Mhm.
  10. Knows 3 or more programming languages.
    Right.
  11. Previous position demonstrates similar skills.
    Not really.
  12. Has internship.
    Has.
  13. Founded a company.
    Only a pretend company, and we haven't been active for about 10 years.
  14. Personal web page uses Rails, PHP or Asp.Net.
    Been meaning to switch from PHP to Python, but there's just no pressing need for it.
  15. Email address at own domain.
    Not since 2005.
  16. Has modified programs in dynamic languages (perl/python/ruby).
    That's how I started out with dynamic web stuff in 1999, found perl scripts and tried to mod them without breaking them.
  17. Has personal web page.
    Welcome.
  18. High grades, top of class, etc.
    Meh.

Neutrals

  1. Won scholarship.
    Have never applied for one.
  2. Lists job at fast food chain.
    Haven't had the pleasure.

Negatives

  1. Looks kind of drunk in facebook picture.
    One of [apparently] few specimens in the human race who don't find unending ecstasy in alcohol.
  2. Has Ph.D.
    Not so much.
  3. Generic cover letter.
    Might be tempted.
  4. Mentions skills in Excel/Word.
    Over my dead body etc.
  5. Spelling or grammar errors on resume.
    My typing is a bit dodgy, but I tend to proofread.
  6. Resume font too small.
    Let's hope not.
  7. All programming experience in class.
    Nah.
  8. Knows only 1 programming language.
    Once upon a time.
  9. Resume more than three pages long.
    I try to make it in two.
  10. Includes irrelevant objective section.
    Never knew what the point was of that one.
  11. Took certification course in a technology.
    Never occurred to me.
  12. Low grades in relevant courses.
    Nah.
  13. Lists visual basic experience first.
    Don't have any.
  14. Topless in facebook picture.
    Only by mistake.
  15. Resume uses combination of tabs and spaces to indent sections.
    I'm clean, narc. (Actually, if you use Tab in vim with wildmenu, it's tricky to type a tab without a space first, because it will try to auto complete the current token. Haven't tried to fix that yet.)

I timebox and so can you

August 10th, 2010

Axiom: SRS (a spaced repetition system) is by far the most effective vocabulary learning method I've ever seen.
Corollary: There is no way I could have learned nearly as much vocabulary with my usual laid back attitude.

Don't get me wrong, I'm happy spending most of my life propping up the idea that "if a word wants me bad enough it will find a way to attach itself". It's a modest reward for scarce effort and I like it that way. There is so much more worth doing in life than learning lots of words. But there are times when a quick uptake of vocabulary is pretty crucial, namely in the opening stages of a new language. That's when you put in a lot of effort and, consequently, when seeing results matters a whole lot. But it matters not only for motivation. It also greatly impacts the quality of your early learning process if you can absorb the core vocabulary quickly.

I found this out last year when I was starting on Italian and I realized I had learned lots of not-all-that-interesting-but-important words that would have taken me several times longer without SRS. To my good fortune, I knew about Anki and I had read enough plaudits to try it.

Still, there is a problem. Anki may be effective, but I wouldn't call it fun. In fact, it's awfully tedious. So much so that even though I appreciate how helpful it is, most days I just can't persuade myself to click the icon that launches it. I get little thrill from returning to the same words that I saw yesterday and couldn't remember then. Plus the interaction itself is highly tedious; clicking those buttons like it's some kind of psychological survey, trying each time to pick the most appropriate choice.

Making it more fun

Alright, how? Khatzumoto writes about timeboxing and SRS tirelessly, and after reading through most of that I was ready to try it out. The idea is to go from "man this is boring, how much longer?" to "I only have 5 minutes, how much can I get done?" and it sounds like it's never going to work. And yet... okay, have you ever stayed at a nice place on vacation for just a bit too long, so much so you get bored? The idea is to leave wanting more, it's basic showmanship. Timeboxing, believe it or not, adds that element of urgency to the mix. You give yourself 5 minutes for Anki and that means you only have 5 minutes, however many decks you have.

Yeah, it's weird. But here's what it looks like to me. Before timeboxing I would start up Anki, gaze at all the decks I have and all the cards that are up for review and sigh. "What a pain it's gonna be to review all that." Now, all of a sudden, I have a different reaction. "Alright, I have all these decks, which one do I most want?" Then I start on one and keep going for a while, but not for too long since I also want to cover other ground. The 5 minutes is almost up and I still want to get more done so it ends up being 7 minutes. 7 minutes, which prior to timeboxing seemed like a century of Anki.

Decks - how to plan them out?

You could just put everything in one giant deck, but I don't like that idea. I did that at first and I found out that I like to have some notion of the context where the information was from. Is it from a textbook, a vocabulary list on a particular topic, from reading or watching stuff (i.e. passive learning) or what? That gives me a choice; I can pay close attention to some vocabulary set that's important right now. And if a particular deck is just annoying me I can remove it completely.

It's also a way to manage my morale. If I review lots of cards from a tough deck and I can't remember anything (i.e. the thing that makes me unhappy), I can counter that with an easy deck where I win easy points.
