Programming: Cascading Failure chains

Donald Hobson

tl;dr: This text would be much better if it was purely an example of a failure cascade & its explanation and if you left out the last paragraphs where you try to assign blame to specific parts of the system & how this might have been prevented. (I believe you don't really know what you (are trying to) talk about there.)

Let's work backwards, from the end back to the start.

First: Haskell, not Haskel.

The "this wouldn't have happened with Haskell" bit seems like a regretrospective coming from a narrow tunnel. I strongly suspect you're not actually using Haskell intensively, or you'd be aware of other problems, like the compiler / type system being too stupid to understand your logically correct construction and then having to use weird workarounds that just plain suck and that you subsequently drown in heaps of syntactic sugar to make them palatable. (I may be wrong on this; things might have improved a fair bit, I ragequit to Coq a long time ago…)

Also, Haskell still allows nonsense like let f x = f x in f "foo" (an endless loop) – you'll want to use Coq, Agda, or Idris with %default total – then you'll have to prove that your program will actually terminate. (But that still doesn't mean that it'll terminate before the heat death of the universe… only that it can't run forever. The one thing this does mean is that your functions are now safe to use in proofs, because you can no longer "prove" stuff by infinite regress.) With the extended capabilities of dependent types, you'll be able to construct almost anything that you can come up with… only it'll take years of practice to get fluent, and then for serious constructions it may still take days to explain what you're doing in ways that the compiler can understand. (Simple example: If you define addition on (unary) natural numbers (O, S O, S (S O), …) as O + m = m; (S n) + m = S (n + m), then the compiler will be able to figure out that a list of length 3 + n is S (S (S n)) long, but it won't have a clue about n + 3. This matters if you need a non-empty list (e.g. to get the first element) – in the 3 + n case, it can see that the list has length S <something> so it's non-empty, in the other case you'll first have to prove that addition is commutative and then show that n+3 = 3+n (by commutativity) = S <something> (by reduction)…[1] and that's for something as simple as getting an element from a list.) While this is very worthwhile for getting rock-solid software (and things are getting more and more usable / closer to being viable), it's far too slow if you just want to get something up and running that does stuff that's hopefully approximately correct. Nonetheless, I think it's worth spending some time trying to understand & work with this.[2]

Aside: I'm still occasionally using Coq, but >95% of what I'm writing I do in Lua… In years of Coq, I learned safe "shapes" of execution and now I'm mostly writing in Lua what I'd write in Coq, just without the proofs, and also with duck typing galore! If one function describes a certain "movement" of data, when the ways in which it interacts with your data are abstract enough, you can use it independently of types or particular shape of the inputs. While you should theoretically be able to explain that to e.g. Coq, that'd be really complicated and so, for all intents and purposes, it's practically un-typable… So I don't agree at all with your claim that "[t]he basic problem was that the programming language was dynamically typed[…]". You can use that reasonably safely, you just need a few years of training with some hardcore dependently typed language first… ;-)

Your brief interjection on hacking seems rather unmotivated. Your program doesn't provide any interesting computational capacity, there's not much to hack. Same thing goes for your claim that "All this means that most long and complex python program[s are] likely to be hackable." – the program can weigh megabytes or even gigabytes, as long as it doesn't consume any external inputs / doesn't interact with the outside world, there's no way for you to influence its behavior and so it'll be 100% hack proof. (In the other direction, a nc -l -p 2323|bash (don't run this!) or its equivalent in your language of choice is very short and provides full programmability to the outside world.) It's not about length at all (although, sure, in practice there's correlation), it's about the computational capacity provided to the input (and, by extension, the attacker.) You may be thinking that you are writing the program, therefore you are in control; but actually you are constructing an abstract machine on which the input (files, packets, …) will later run – the input is programming the abstract machine that is provided by your program, the branches taken are determined by the bytes read, and the attacker is in control – the only question is how much computational power do you provide? The LANGSEC people are on to something very good, but as some people show it's clearly not enough yet.[3]

The initial example was entertaining and I learned (or rather: became aware of) something new from that. (I technically knew that parentheses in Python do both grouping / precedence and tuple construction, but I wasn't actively aware of the resulting problem. So thank you for this!)

[1]: Actually, in this particular case it's enough to do a case analysis on n and see that even when n=0, you'll have the 3 items and so the list can't be empty. But in general, you'll rarely be that lucky and often have to do really tricky stuff…

[2]: If you want an intro to this stuff, look at the Software Foundations and then also Certified Programming with Dependent Types on how to not go crazy while writing all those proofs… or look at CompCert for some real-world software and be made aware of Mirage as (what I think is currently) the most likely path for this stuff to become relevant. (Also, if you're serious about working with dependent types, the stuff that Conor McBride does is awesome – understand that shift in perspective and you'll cut out years of pain.)

[3]: Now loop that back into the stuff about dependent types / functional programming, where stuff is often represented as trees with lots of pointers… you'll see that while that may be provably safe, it's also extremely brittle, which is another reason why a program written in Lua or C may be much more robust (and, counter-intuitively, ultimately safer) than one written in Coq…

Let's work backwards, from the end back to the start.

First: Haskell, not Haskel.

8

Programming: Cascading Failure chains

8

8

8