# 6

Personal Blog

There has been previous discussion on LW about how to quickly determine whether someone might be good at programming. This is relevant because programming is currently a good career field that can be relatively easy to enter, and because programming-style thinking is often relevant to LW topics (e.g. decision theory). In light of this I've created the following test, based on my memory of a test from an interview process for a programming job. It attempts to test common low-level programming concepts such as sequence, assignment, indirection, and recursion, in a way that doesn't require any previous programming experience (although previous experience will likely make it easier).

This test is aimed at getting a quick clear positive, so the fact that someone does poorly on it doesn't mean they can't become a programmer (ie I'd guess it's likely to generate false negatives rather than false positives). This test is obviously lacking scientific validation, and is probably too short, but I'd like to start somewhere.

I'd like to invite both programmers and non-programmers to take the test for comparison. It should only take about 5 minutes. If you do the test, please also take the short poll in the comments for feedback and calibration purposes, regardless of what result you got.

-----  Test begins below  -----

This is a 1-question algorithmic thinking exercise that should take less than 5 minutes.

Pen and paper is required. There should be no prerequisites beyond basic arithmetic.

First, write down the following sequence of numbered boxes. You will be writing numbers in some of the boxes more than once, so either use a pencil or make the boxes big enough to cross out and replace numbers.

```
  1    2    3    4    5    6    7    8
[  ] [  ] [  ] [  ] [  ] [  ] [  ] [  ]
```

Following is a sequence of numbered steps. Do the steps in the order they are numbered (unless instructed otherwise). Note that "write a number in a box" means "cross out the previous number and write the new number".

1. Write 1 in box 3, 2 in box 6, 9 in box 4, 1 in box 5, 5 in box 8, and 0 in the remaining boxes.

2. In box 4, write the sum of the number in box 3 and the number in box 5.

3. In both boxes 2 and 5, write the number in box 8 minus the number in box 6.

4. Write 1 in the box whose number is in box 3.

5. In box 3, write the sum of the number in box 3 and the number in box 4.

6. In the box whose number is in box 6, write the sum of the number that's in the box whose number is in box 4, and the number that's in box 5.

7. Do step 2 again, then continue directly on to step 8.

8. Do step 4 again, but this time with box 4 instead of box 3, then continue directly to step 9.

9. The final result is the number that is in the box whose number is the number that is in the box whose number is equal to 2 plus the number that is in box 4. End of test.

--------------

Expected Results: http://pastebin.com/wA6xDxVb
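The steps above transcribe almost mechanically into code, which is a handy way to check your work if the pastebin link goes dead. Below is a minimal sketch in Python (my own transcription, not the author's), with the boxes as a list indexed 1 through 8; the final state and result it prints are simply what falls out of following the instructions literally.

```python
# Sketch of the box test as a program. box[0] is unused so that
# "box n" in the instructions is literally box[n].
box = [None] + [0] * 8

def step2():
    # "In box 4, write the sum of the number in box 3 and the
    # number in box 5." (Reused by step 7.)
    box[4] = box[3] + box[5]

def step4(src=3):
    # "Write 1 in the box whose number is in box `src`": one level
    # of indirection, like a write through a pointer. (Step 8
    # reuses this with src=4.)
    box[box[src]] = 1

# Step 1
box[3], box[6], box[4], box[5], box[8] = 1, 2, 9, 1, 5
step2()                              # Step 2
box[2] = box[5] = box[8] - box[6]    # Step 3
step4()                              # Step 4
box[3] = box[3] + box[4]             # Step 5
box[box[6]] = box[box[4]] + box[5]   # Step 6
step2()                              # Step 7: do step 2 again
step4(src=4)                         # Step 8: step 4 with box 4
result = box[box[2 + box[4]]]        # Step 9: double indirection

print("boxes:", box[1:])
print("result:", result)
```

Note that only steps 4, 8, and 9 involve indirection; everything else is plain assignment and arithmetic.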

Thanks for taking the test! Don't forget to answer the poll in the comments too.

I'd also appreciate any feedback on the test, both on whether you think it's going in the right direction and on any specific improvements that could be made.

edit: As some commenters have pointed out, there was a previous attempt at such a test that you may have heard of: http://www.eis.mdx.ac.uk/research/PhDArea/saeed/

However, it seems that further investigation found that their test, while better than nothing, wasn't very accurate. The test given in this post takes a different approach.


I'm in the "experienced programmer" category. I answered the question correctly and quickly (and without inventing special notations for those repeated indirections). I found it unpleasant -- it reminded me of this "literacy test". Even if this turns out to be effective in predicting who will make a good programmer, I'd hesitate to use it for that purpose, for fear of putting people off unnecessarily.

Well, there's a big difference here in that "failing" the test doesn't prevent anyone from doing anything. The idea is more to encourage people who seem to find this sort of thing natural.

Oh yes, I wasn't suggesting it's evil in the sort of way that test was! That's just what my brain pattern-matched it to.

I'm forced to remind myself that that test was not actually designed to be a literacy test.

It includes riddles/illusions (Paris in the the spring for example), irrelevant terminology ("bisect"?) and unnecessary arbitrary things like knowing the order of the letters in the alphabet. If you became literate chiefly by reading...

Correct. Not an actual literacy test but a tool of oppression. (For a less blatant example at a much higher level, see "Jewish problems".)

I suggest that the history of this sort of thing is part of why the response to "hey, it turns out black people do worse than white people on IQ tests" is often to suggest that there's something very, very wrong with the tests. I mention this only because it's a topic that comes up every now and then on LW.

[EDITED to add: I should reiterate that I'm not suggesting any such sinister motive in the present case!]

[-][anonymous]7y 1

It includes riddles/illusions ([redacted] for example)

You might want to rot13 that, in case people are considering taking the test themselves.

[-][anonymous]7y 1

unnecessary arbitrary things like knowing the order of the letters in the alphabet

I wouldn't consider that that unnecessary and arbitrary -- I guess most people in jobs requiring literacy need to sort a list alphabetically or look something up in an alphabetic list at some point in their life, especially back then before electronic computers.

Okay fair, that makes sense. But then, why not have the test just say "write down the letters of the alphabet, in order", rather than being tricky. Plenty of very literate people still need to sing the mnemonic song in order to recall the order.

Oh wait, no, the being tricky is testing to see if people are literate enough to understand the fiddly details of the question. Still, I'd say testing that separately from alphabet skills is more efficient etc.

Wow.. that literacy test is something else. I would have thought they would have been slightly more circumspect about the fact that this was just a way of disenfranchising black voters. But no, instead they come up with a test that is obviously just designed to be a giant "Fuck you, nigger". It's not just that the test is unreasonably hard, it's that the questions -- plus the absurdly strict grading criteria -- look like there were specifically chosen to signal unreasonable hardness (if that makes sense).

St. Rev on twitter believes that the test is likely a hoax due to the way it was formatted vs. other tests at the time. There was still likely some test that was aimed at disenfranchisement, but given the lack of evidence that it was real, I'd say he might be right about this particularly unbelievable one being a fake.

Maybe it's a hoax, but I'm not sure the formatting proves that. The website itself mentions that the version of the test posted there is a "word-processed transcript of an original". The original is here. Was this guy referring to the original or the transcript when he made his point about formatting?

[-][anonymous]7y -1

The test looked like it measured intelligence, literacy, and ability to follow rules. How is that biased against blacks?

There was sufficient ambiguity in many of those instructions to let the pass/fail distinction come down to whatever the test's grader wanted it to be. I bet the folks grading those tests weren't too big on equal rights. At least twenty of those questions were reasonable (if we assume the need for a test of this sort in the first place), but a few were pernicious. Given that even a single wrong answer disqualified you, you don't need many evil questions to make for an evil test.

The function of such tests was to invoke the grandfather clause — anyone whose grandfather could vote was not required to pass the test; thus, native-born whites were largely exempt.

(That is the original meaning of the term "grandfather clause", by the way.)

Grandfather clauses were declared unconstitutional in 1915, so this particular test would not have a grandfather clause exemption.

The function of such tests was to invoke the grandfather clause

Are you sure? The Slate article did not mention that as the function. According to the test:

This test is to be given to anyone who cannot prove a fifth grade education.

The Slate article doesn't mention grandfather clauses either, instead saying:

The literacy test—supposedly applicable to both white and black prospective voters who couldn’t prove a certain level of education but in actuality disproportionately administered to black voters—was a classic example of one of these barriers.

The Slate article did not mention that as the function.

The Wikipedia article I linked discusses this; this one mentions Louisiana's literacy test specifically.

The tests were never intended to verify mental competence or education; the grandfather clauses make this clear. Their purpose was to provide a pretext for disfranchisement of former slaves and the descendants of slaves.

Really. The past sucks.

"History is a nightmare from which I am trying to awake."

Oops. You are right.

The literacy test—supposedly applicable to both white and black prospective voters who couldn’t prove a certain level of education but in actuality disproportionately administered to black voters—was a classic example of one of these barriers.

Blacks may have had an unremarkable failure rate, but if (proportionally) more of them were tested, then (proportionally) more of them would have failed.

Because it was disproportionately administered to blacks, and you failed if you got a single question wrong.

The test is for people who "cannot prove a fifth grade education". I believe that over 80% of fifth grade students would fail this test -- either make at least one mistake in all those ambiguous-sounding questions, or fail the time limit. Actually, I would expect at least 30% of university students to fail.

In other words, the test pretends to be an equivalent of fifth grade, but in reality it is much more difficult. If you have two people with exactly equivalent knowledge and skills, and one of them has a "proof of fifth grade education" while the other one does not, the former does not have to pass the test, but the latter is eliminated with high probability.

Therefore the test is a "fuck you" for people who "cannot prove a fifth grade education", whoever it was in the given historical era.

[-][anonymous]7y 1

The test is for people who "cannot prove a fifth grade education". I believe that over 80% of fifth grade students would fail this test -- either make at least one mistake in all those ambiguous-sounding questions, or fail the time limit. Actually, I would expect at least 30% of university students to fail.

I think those figures are overly conservative. I doubt I could pass that test.

Agreed. I doubt I could pass that test (i.e. get every one right) against a fair examiner, given an hour, under no stress. And I'm pretty good at that sort of thing. Getting every question right is just too high a bar. In ten minutes against someone who has discretion to mark your answers on whim and wants you to fail? And your right to vote and self respect are tied up in it? No chance.

We're talking about a time when blacks were far more likely not to have proper schooling than whites. In some states a majority of blacks could not demonstrate a fifth grade education. The same was not true for whites. So literacy tests were disproportionately administered to blacks. Until 1915, even illiterate whites were exempted under a grandfather clause if they could demonstrate descent from someone eligible to vote in 1867 (before the 15th amendment -- prohibiting the denial of the right to vote based on race -- was ratified). These clauses were struck down by the Supreme Court in 1915, but they illustrate the real purpose of literacy tests -- the disenfranchisement of blacks.

The test itself is patently unfair. It tests far more than just basic literacy, and I do not see how it could be regarded as a proportionate substitute for a fifth grade education. You had to finish it in 10 minutes, and getting even one question wrong counted as failure. Under those conditions, I'm not sure even I could pass the test, and I'm pretty literate. Add to that the fact that many of the questions are ambiguously phrased, allowing multiple "correct" interpretations, and that grading was entirely at the discretion of local (white) officials.

[-][anonymous]7y 0

Then the test — and how it was graded and administered — got even more insidious. Check out question 21. It says: "Spell backwards, forwards". If a Black person spelled "backwards" but omitted the comma, he/she would be flunked. If a Black person spelled "backwards," he/she would be flunked. If a Black person asked why, he/she would be told either "you forgot the comma," or "you shouldn't have included the comma," or "you should have spelled 'backwards, forwards'". Any plausible response by a white person would be accepted, and so would any implausible response.

I'm a programmer and unlikely to actually do this test because it strikes me as pointless and boring. Possibly I'm atypical, but consider censorship bias nonetheless.

I forget the name, but when performing tests there's a desire to capture all of the relevant information and no irrelevant information. It's sort of like false positives vs false negatives, but I think what I'm really trying to get at is the idea that unless a test produces entirely random results, it's ultimately sorting people somehow. The question is "how much does that 'how' relate to programming potential?"

In the case of this test, I'd say it probably has a weak correlation with programming ability, and a stronger one for general reasoning ability.

I think this test is kind of backwards: it figures out if you can follow instructions, not if you can take a process you understand and turn it into instructions (i.e. an algorithm). Although I suppose the following-instructions part is a necessary prereq for the creation part. At the very least, I think you'll want to test the algorithm creation skill too. Then, if following is a prereq, you don't need to bother testing for it any more, as anyone who would have failed there will also be screened by the creation step.

(I'm a professional software engineer, sometimes)

You're asking people to execute a program, but you should be asking people to write a program.

Non-programmer, but have tried a little programming. Weirdly, I got the final answer right on the first pass, but the numbers in the boxes had some mistakes. I got the boxes right on a second pass.

It seems like a fair test of certain kinds of mental focus.

Whose, not who's.

Concretely, I have seen this style of test (for want of better terms, natural language code emulation) used as a screening test by firms looking to find non-CS undergraduates who would be well suited to develop code.

In as much as this test targets indirection, it is comparatively easy to write tests which target data driven flow control or understanding state machines. In such a case you read from a fixed sequence and emit a string of outputs. For a plausible improvement, get the user to log the full sequence of writes, so that you can see on which instruction things go wrong.

There also seem to be aspects of coding which are not simply being technically careful about the formal function of code. The most salient to me would be taking an informally specified natural language problem and reducing it to operations one can actually do. Algorithmic / architectural thinking seems at least as rare as fastidiousness about code.

This test doesn't seem too different from a US income tax form. Most people are able to complete US income tax forms correctly, so I wouldn't expect completing this test correctly to be strong evidence of programming inclinations.

An attempt at a short no-prerequisite test for programming inclination

Somebody else beat you to it and wrote papers about it. Maybe more than one somebody, but that's all I'm digging up for you.

Multiple papers on the topic are here: http://www.eis.mdx.ac.uk/research/PhDArea/saeed/

From the main page:

We (Saeed Dehnadi, Richard Bornat) have discovered a test which divides programming sheep from non-programming goats. This test predicts ability to program with very high accuracy before the subjects have ever seen a program or a programming language.

Draft paper which incorporates some of the criticism we got at Coventry mini-PPIG and the University of Kent in January 2006.

Abstract: All teachers of programming find that their results display a 'double hump'. It is as if there are two populations: those who can, and those who cannot, each with its own independent bell curve. Almost all research into programming teaching and learning have concentrated on teaching: change the language, change the application area, use an IDE and work on motivation. None of it works, and the double hump persists. We have a test which picks out the population that can program, before the course begins. We can pick apart the double hump. You probably don't believe this, but you will after you hear the talk. We don't know exactly how/why it works, but we have some good theories.

Now let's do some reductionism magic. Taboo "programming skills".

Which skill specifically is the one that some students have, and other students cannot be taught? If there are more skills like that, which is the most simple of them?

(If all we have is "here is a black box, we put students in, they come out in two groups", then it is not obvious whether we speak about properties of the students, or properties of the black box. Why not say: "Our teaching style of programming has a 'double hump'." instead?)

Which skill specifically is the one that some students have, and other students cannot be taught?

There is no particular, identifiable, atomic skill that they're calling "programming skills". Like any other performance or talent, it is made up of a jillion jillion component skills. And I don't see them claiming that any particular skill cannot be taught, only that it is less likely to be taught to some than others with a given amount of instruction.

They take the grade in class as a proxy for general programming skills. That has its own issues, but I'd expect it to have decent merit on population statistics.

I don't see any "magic" coming out of further reduction, here.

"Our teaching style of programming has a 'double hump'." instead?

Because they claim the empirical observation that the double hump is prevalent across the distribution of classes, not just in any particular class. Yes, maybe with a different teaching method, the bottom cluster could do better. Maybe I would have been a better basketball player than Kobe Bryant if someone had taught me differently as well. But they didn't. Oh well.

I recalled this story from years ago and tracked it down. Their main claim was that a particular test at the beginning of the course accurately predicted the outcome of the course in terms of their grade. Someone else mentioned that in the later papers, they say that their test is no longer predictive.

Their main claim was that a particular test at the beginning of the course accurately predicted the outcome of the course in terms of their grade. Someone else mentioned that in the later papers, they say that their test is no longer predictive.

It's possible that people are now sufficiently more used to computers that some elementary concepts (like that the computer responds to simple cues rather than having any understanding of what you mean) are much more common, so those concepts aren't as useful for filters.

My model is that there is something (I am not sure what it is) that is necessary for programming, but we don't know how to teach it. Maybe it is too abstract to articulate, or seems so trivial to those who already know it that they don't pay conscious attention to it. (Maybe it's multiple things.) Some people randomly "get it", either at the beginning of the class, or usually long before the class. Then when the class starts, those who "have it" can progress, and those who "don't have it" are stuck.

The study suggests that this something could be: expecting the same actions to have the same results consistently (even if the person is wrong about specific results of a specific action, because that kind of mistake can be fixed easily). Sounds plausible.

Assuming this is true (which is not certain, as the replications seem to fail), I would still describe it as a failure of the education process. There is a necessary prerequisite skill, and we don't teach it, which splits the class into those who got it from other sources and those who didn't. -- It would be the equivalent of not teaching small children the alphabet, and starting with reading whole words and sentences. The children who learned the alphabet at home would progress, the remaining children would be completely lost, we would observe the "double hump" and declare that the difference is probably innate.

The disappearance of the "double hump" (assuming that the original result was valid) could hint at improving the educational methods.

Even with perfect education, some people will be better and some will be worse. But there will be more people with partial success. -- To use your analogy, we would no longer have the situation where some people are basketball stars, and the remaining ones are unable to understand the rules of basketball; we would also have many recreational players. -- In programming, we would have many people able to do Excel calculations or very simple Python scripts.

If consistency really is the key, that would explain why aspies get it naturally, but it seems to me that this is a skill that can be trained... at worst, by using some exercise of giving students the same question a dozen times and expecting a dozen identical answers. Or something more meaningful than that, e.g. following some simple instructions consistently. It could be a computer game!

This is what I'm reacting to. IIRC a follow-up study showed their proposed test didn't work that well as a predictor, so this is a different angle on the problem.

My first instinct was to link to codinghorror as well, so I think it would (have) be(en) helpful to include the "what I'm reacting to" in your initial post.

Ok, I will.

We now report that after six experiments, involving more than 500 students at six institutions in three countries, the predictive effect of our test has failed to live up to that early promise.

And reading a little further than that...

The test does not very accurately predict levels of performance, but by combining the result of six replications of the experiment, five in UK and one in Australia, we show that consistency does have a strong effect on success in early learning to program but background programming experience, on the other hand, has little or no effect.

I've read about a test someone developed that was supposed to work fairly well. You give a list of short pseudocode problems and ask what values different variables have at the end. If they answer consistently, even if it's not what any actual programming language uses, they'll be able to program. If they answer inconsistently or refuse to answer (because x = x+1 is impossible), then they probably won't be a very good programmer.
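As a concrete illustration, a question in that style might look like the following. This is my own reconstruction of the format, not one of the actual test items, and it's in Python rather than the Java-like notation I believe the original test used:

```python
# "After these three lines run, what are the values of a and b?"
a = 10
b = 20
a = b

# Under ordinary copy semantics the answer is a == 20, b == 20.
# The claim described above is that the particular mental model
# matters less than consistency: someone who always answers as if
# "a = b" means swap, or add, is predicted to fare better than
# someone whose model changes from question to question.
print(a, b)
```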

I think you're referring to the test mentioned here.

That is sort of what this is referring to - apparently that test didn't work very well when they tried it more widely, so this is approaching the same problem from a different angle.

Silly me! I had the array exactly right but the result was wrong. I used the second box to find the value that I eventually added the value of box 4 to and got 7.

Poll for test takers:

Programming experience vs. whether you got the correct results (Here "experienced" means "professional or heavy user of programming" and "moderate" means "occasional user of programming"): [pollid:528]

Did you think this was fair as a quick test? [pollid:529]

I got this right, but ended up having to invent notation to keep track of the indirection in the last segment. I think it's likely a decent test of whether you're likely to quickly pick up an intuitive head for pointer math and a very basic variable name-value distinction, but it won't capture other forms of abstraction that're necessary for programming: loops, types, conditional branching, Boolean logic. You could probably get away with dropping conditionals (I get the impression they're fairly intuitive), but I've had trouble teaching the others in the past.

Has a bit of an old-school feel to it, too; I'd expect the results to correlate better with talent for C than they would with, say, Python.

I got this right, but ended up having to invent notation to keep track of the indirection in the last segment.

This is also the case for myself. I would be very impressed by anyone who did not have to do this.

The trick is to evaluate right to left.

I opted for doing this and also checking the answer once, as opposed to using notation.

I didn't. Instead, I just kept taking the least-condition-laden part of the instruction, replacing it with a number, and repeating the operation on the newly simplified sentence.

Ditto.

I didn't invent notation, but I did write

number whose number [redacted] box whose number = [redacted]

so that I could keep track as I worked from bottom to top.

I'm a new user with -1 karma who therefore can't vote, so I'll combat censorship bias like this:

Moderate programmer, correct

Yes

So, as of 2013-06-30 20:42 (UK summer time) it's 13:4 for "experienced programmers", 10:2 for "moderate programmers", and 7:2 for "non-programmers". The "moderate programmers" are beating the "experienced", and the "non-" are well within the margin of error.

Now, of course LW is a hive of scum and villainy^H^H^H^H^Hplace where even the non-programmers tend to be pretty programmery, and as Morendil points out the people who chose to take the test may be atypical somehow -- but, still, this looks to me like evidence that this isn't a very effective test for discriminating between people with aptitude for programming and people without.

Yes, I agree that the poll results aren't too encouraging. It might still be interesting to try the test on a more general population, but without a follow-up to see who eventually learned to program it would be hard to tell how accurate (if at all) it was.

I suspect this is more confusing because of the way it's written - especially the last step which I'd imagine is where most people are falling down - than because of it really being complicated.

Answered "moderate programmer, incorrect". I got the correct final answer but had 2 boxes incorrect. Haven't checked where I went wrong, although I was very surprised I had any errors, as back in grade school I got these things correct with near perfection. I learned programming very easily and have traditionally rapidly outpaced my peers, but I'm only just starting professionally and don't feel like an "experienced" programmer. As for the test, I suspect it will show some distinction but with very many false positives and negatives. There are too many uncovered aspects of what seems to make up a natural programmer. Also, it is tedious as hell, and I suspect that boredom will lead to recklessness will lead to false negatives, which aren't terrible but are still not good. May also lead to some selection effect.

The indirection syntax should be rewritten to be left to right. As it is, it's a good fit for C's idiosyncratic type syntax, but needlessly obtuse otherwise.

1. In the box whose number is in box 6, write the sum of the number that's in the box whose number is in box 4, and the number that's in box 5.

2. Take the number in box 5. Where n is the number in box 4, take the number in box n, and the number in box 5, and add them. Where m is the number in box 6, write the result in box m.
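The left-to-right version maps almost word for word onto code with named temporaries. A sketch in Python (the box contents here are arbitrary placeholder values, not a state from the actual test):

```python
box = [None, 0, 0, 0, 2, 5, 7, 0, 0]  # box[1]..box[8]; box[0] unused

n = box[4]               # "Where n is the number in box 4 ..."
total = box[n] + box[5]  # "... take the number in box n, and the
                         #  number in box 5, and add them."
m = box[6]               # "Where m is the number in box 6 ..."
box[m] = total           # "... write the result in box m."

print(box[1:])
```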

The indirection syntax should be rewritten to be left to right.

I don't actually think so. The final answer is simply f(g(h(x))), which is a perfectly normal thing to see in programming.

That said, I still think it's a bad test. It involves no reasoning whatsoever -- merely following instructions carefully. I'm a reasonably good programmer, but sometimes a bit sloppy (that's why I write tests). So, I ended up with the correct final answer but a wrong number in one of the boxes.

I wasn't sure whether to score myself as a moderate programmer or non-programmer. One CS class in college, three in Udacity, read GEB, use algorithmic approaches in day to day life, but not on a computer.

I'd say moderate.

I think there's a category missing between moderate and non-programmer. I got halfway through Eloquent JavaScript and maybe a tenth of the way through Learn Python the Hard Way. I have way, way less exposure to programming than palladias but almost infinitely more than the modal person. I suggest novice/glancing familiarity. I did the poll as moderate.

I see an alternative interpretation for the last question: The final result is (the number that is in the box whose number is the number that is in the box whose number is equal to 2) plus the number that is in box 4. (Might also be good to make a sharper distinction between the indices of the boxes and their contents.)

This might be a more enjoyable test (warning, game and time sink): http://armorgames.com/play/6061/light-bot-20

Coincidentally, there is another current attempt to use a LW poll to determine whether a simple test is useful for predicting success at programming-like jobs. Basically, it just asks you at what age you learned to touch type.

Answer at each step, in case you want to check where you went wrong, here: http://pastebin.com/upfP6AZn

I failed the first time at step 8... arrgghh, so close! I failed the second time on step 3, and got it right the third time. I am not a programmer. I completed the Python track on Codecademy about 3 months ago; before that I taught myself very basic MATLAB to manipulate matrices.

Person who has been trying to learn Python for a while now. I got it wrong the first time: at stage 3 I forgot to put the number in box 5, and that screwed up the rest.