ryan_b's Shortform

by ryan_b, 6th Feb 2020, 24 comments

Had some illusions about the C language shattered recently.

I read an article from 2018 called C Is Not A Low Level Language from ACM Queue. The long and short of it is that C fails to get the programmer "close to the metal" in any meaningful sense because the abstract machine it uses doesn't faithfully represent anything about modern computer architecture. Instead, it hides the complexity of modern instruction sets and memory arrangements beneath the abstractions which modeled hardware very well in the 1970s.

I, like most people, thought C was the best way to program close to the hardware, aside from just writing in assembly or machine code directly. I had assumed, but never checked, that as hardware advanced the work on C development was accounting for this; it appears backwards compatibility won out. This puts us in a weird position where new hardware design and new C development are both constrained by trying to maintain compatibility with older C. In the case of hardware design, this means limiting the instruction sets of processors so they remain comprehensible to C; in the case of C development, it means an unyielding commitment to making people write code for the PDP-11 and relying ever more heavily on the compiler to do the real work.

The comments in the Reddit thread were, predictably, overwhelmingly focused on the semantics of the high/low level dichotomy, with half claiming Assembler is also a high level language and the other half flatly rejecting the premise of the article or playing defense about how useful C is. I feel this kind of thing misses the point, because what I liked about C is that it helped me think about what the machine was actually doing. Now I discover I wasn't thinking about what the machine I was working on was doing so much as thinking about the general types of things a machine is expected to do (have one processor, and memory, and disk, and IO). While this is clearly better than not knowing, it leaves what I thought was C's core advantage twisting in the wind.

I therefore did a search, assuming that if old faithful had failed, someone else had surely tried to fill the niche. Nothing presented itself in my searches, which together represent an hour or so of reading. Instead I learned something interesting about programming: it is entirely upward-oriented. The descriptions of every language billed either as being for embedded systems or as "one language to rule them all" advertised themselves entirely in terms of what kind of high-level abstractions they gave access to.

Hence the old saw: dogs cannot look up; computer scientists cannot look down.

Reflecting on the failure of another embedded language to provide exactly what I wanted - a way to reason about what the machine was doing - I looked a little further afield. I know of two candidates for thinking about this problem from different directions.

The first is Verilog (now SystemVerilog), which is a Hardware Description Language (HDL). This does pretty much exactly what I want in terms of reasoning about the machine, but it is for design purposes: you describe how the hardware works, then verify the description does what you want it to, and then eventually it is instantiated in actual, physical hardware. I am not sure how or even if it could be used on existing hardware to learn things about the hardware, or to optimize tasks.

The second is a thing called the Legion Programming System, out of Stanford. This comes from the other end of the scale, targeting High Performance Computing (HPC) applications. It is not a programming language per se; rather it is a model for application development. It identifies the exact same concerns I have about taking into account modern (and future) computing architecture, but it focuses on supercomputers and huge clusters of servers.

So the first option is still looking up the ladder of abstraction, just from beneath where I want to be; the second option mostly looks at high-scale hardware from the side. I suppose the thing that would make me happiest is something like the Legion Programming System but for robotics instead of HPC. The thing I would most be able to take advantage of personally is basically just a C fork with a new abstract machine and compiler that better account for modern architecture. Given how much work "basically" is doing there, this amounts to a completely different language that uses a C-like syntax, and it seems unlikely to appear if no one is making it already.

Does anyone know of a good tool for animating 3d shapes? I have a notion for trying to visualize changes in capability by doing something like the following:

  • A cube grows along the z axis according to some data
  • When a particular threshold is reached, a new cube branches off from that threshold, and extends in the x or y axis (away from the "trunk")
  • When a particular threshold is reached on that cube, the process repeats

In this way there would be kind of a twisted tree, and I could deploy our intuitions about trees as a way to drive the impact. I thought there would be something like a D3 for these kinds of manipulations, but I haven't found it yet.
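I haven't found the tool, but the geometry itself is easy to mock up. Below is a minimal sketch (assuming matplotlib's 3D axes are an acceptable stand-in for a real animation tool, with made-up growth data and a made-up threshold) that draws the trunk-and-branch line segments rather than animated cubes, and only handles branching off the trunk rather than the full recursive version:

```python
import numpy as np
import matplotlib.pyplot as plt

def tree_segments(data, threshold, branch_len=1.0):
    """Turn a 1-d series into line segments: a trunk growing along z,
    plus a branch along x or y each time another threshold is crossed."""
    segments = []
    z = 0.0
    crossings = 0
    for value in data:
        top = z + value
        segments.append(((0, 0, z), (0, 0, top)))  # the trunk keeps growing
        while top >= (crossings + 1) * threshold:
            crossings += 1
            base = (0, 0, crossings * threshold)
            # Alternate branch direction between +x and +y, away from the trunk.
            dx, dy = (branch_len, 0) if crossings % 2 else (0, branch_len)
            segments.append((base, (dx, dy, crossings * threshold)))
        z = top
    return segments

# Made-up growth data and threshold, purely for illustration.
rng = np.random.default_rng(0)
segments = tree_segments(rng.uniform(0.2, 1.0, size=20), threshold=3.0)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for (x0, y0, z0), (x1, y1, z1) in segments:
    ax.plot([x0, x1], [y0, y1], [z0, z1])
plt.show()
```

Recursing the same move on each branch, and animating the growth frame by frame, would be the obvious next steps; it's the animation part I still want a proper tool for.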

Is spaced repetition software a good tool for skill development or good practice reinforcement?

I was recently considering using an Anki prompt to do a mental move rather than to test my recall - for example, tense your muscles as though you were performing a deadlift. I don't actually have access to a gym right now, so I didn't get to put it into action immediately. Visualizing the movement as vividly as possible, and tensing the muscles as though the movement were being performed (even when not doing it), are common tricks reported by famous weightlifters.

But I happened across an article from Runner's World today which described an experiment where all they did was tell a group of runners the obvious things that everyone already knows about preventing injury. The experimental group saw ~13% fewer injuries.

This suggests to me that my earlier idea is probably a good one, even though it isn't memory per se. The obvious hitch is that what I am after isn't actually recall - it isn't as though runners forget that overtraining leads to injury if you were to ask them, and I have never forgotten how to do a deadlift.

Rather the question is how to make it as correct and instinctive as possible.

  • This feels like a physical analogue of my earlier notion about Ankifying the elements of a problem, so as to integrate it into my perspective and notice relevant information.
  • Maybe a better way to say this is using an Anki prompt to help respond to a physical prompt, that being the task itself.
  • A physical action performed instinctively in response to a physical task already has a name: muscle memory.

Gwern covers a bit of research here on when spacing does and doesn't work:

https://www.gwern.net/Spaced-repetition#subjects

 

Personally I've found the biggest problem with spaced repetition for skills and habits is that it's contextless. 

Adding the context from multiple skills with different contexts makes it take way more time, and not having the context makes it next to useless for learning the skills.

Personally I've found the biggest problem with spaced repetition for skills and habits is that it's contextless. 

Could you talk a bit more about this? My initial reaction is that I am almost exactly proposing additional value from using Anki to engage the skill sans context (in addition to whatever actual practice is happening with context).

I review Gwern's post pretty much every time I resume the habit; it doesn't look like it has been evaluated in connection with physical skills.

I suspect the likeliest difference is that the recall curve is going to be different from the practice curve for physical skills, and the curve for mental review of physical skills will probably be different again. These should be trivial to adjust if we knew what they were, but alas, I do not.

Maybe I could pillage the sports performance research? Surely they do something like this.

I review Gwern's post pretty much every time I resume the habit; it doesn't look like it has been evaluated in connection with physical skills.

 It is hard to find, but it's covered here: https://www.gwern.net/Spaced-repetition#motor-skills

My take is pretty similar to cognitive skills: It works well for simple motor skills but not as well for complex skills.

My initial reaction is that I am almost exactly proposing additional value from using Anki to engage the skill sans context (in addition to whatever actual practice is happening with context).

My experience is basically that this doesn't work. This seems to track with the research on skill transfer (which is almost always non-existent or has such a small effect that it can't be measured).

Ah, the humiliation of using the wrong ctrl-f inputs! But of course it would be lower level.

Well that's reason enough to cap my investment in the notion; we'll stick to cheap experiments if the muse descends.

I have some cards in my Anki collection that ask me to review dance moves, and I have found that it is helpful for making sure my body remembers how to do them.

I wonder how hard it would be to design a cost+confidence widget that would be easily compatible (for liberal values of easy) with spreadsheets.

I'm reading a Bloomberg piece about Boeing which quotes employees talking about the MAX as being largely a problem of choosing the lowest bidder. This is also a notorious problem in other places where there are rules which specify using the lowest cost contractor, like parts of California and many federal procurement programs. It's a pretty widespread complaint.

It would also be completely crazy for it to be any other way, for what feels like a simple reason: no one knows anything except the quoted price. There's no way to communicate confidence simply or easily. The lowest bid is easily transmitted through any spreadsheet, word document, or accounting software, quality of work be damned. Any attempt to even evaluate the work is a complicated one-off report, usually with poorly considered graphics, which isn't even included with the budget information so many decision makers are unlikely to ever see it.

So, a cost+confidence widget. On the simple end I suppose it could just be a re-calculation using a reference class forecast. But if I'm honest what I really want is something like a spreadsheet where each cell has both a numerical and graphical value, so I could hover my mouse over the number to see the confidence graph, or switch views to see the confidence graphs instead of the values.

Then if we're getting really ambitious, something like a time+confidence widget would also be awesome. Then when the time+confidence and cost+confidence values are all multiplied together, the output is something like a heat map of the two values, showing the distribution of project outcomes overall.
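As a sketch of the simple end, here is what a cost+confidence cell might look like if each cell just carried Monte Carlo samples instead of a single number. The class name, the lognormal spreads, and the bid figures below are all made up for illustration, not a claim about how the real widget should work:

```python
import numpy as np

class ConfidenceCell:
    """A spreadsheet-like cell that carries Monte Carlo samples instead of a
    single number, so every value comes with its own confidence distribution."""

    def __init__(self, samples):
        self.samples = np.asarray(samples, dtype=float)

    @classmethod
    def lognormal(cls, median, spread, n=10_000):
        # Crude reference-class-style estimate: a median with multiplicative spread.
        return cls(median * np.exp(np.random.normal(0.0, np.log(spread), n)))

    def __mul__(self, other):
        return ConfidenceCell(self.samples * other.samples)

    def summary(self):
        p10, p50, p90 = np.percentile(self.samples, [10, 50, 90])
        return f"p10={p10:,.0f}  p50={p50:,.0f}  p90={p90:,.0f}"

# Hypothetical bid: $2M median cost with 1.5x spread, 18 months with 1.4x spread.
cost = ConfidenceCell.lognormal(2_000_000, 1.5)
months = ConfidenceCell.lognormal(18, 1.4)

print(cost.summary())              # the hover-over view for a single cell
print((cost * months).summary())   # combined outcome; a 2D histogram over
                                   # (cost.samples, months.samples) is the heat map
```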

Are math proofs useful at all for writing better algorithms? I saw on Reddit recently that they proved Batchelor's Law in 3D, the core idea of which seems to be using stochastic assumptions to prove it cannot be violated. The Quanta article does not seem to contain a link to the paper, which is weird.

Batchelor's Law is the experimentally-observed fact that turbulence occurs at a specific ratio across scales, which is to say when you zoom in on a small chunk of the turbulence it looks remarkably like all of the turbulence, and so on. Something something fractals something.

Looking up the relationship between proofs and algorithms mostly goes to proofs about specific algorithms, and sometimes using algorithms as a form of proof; but what I am after is whether a pure-math proof like the above one can be mined for useful information about how to build an algorithm in the first place. I have read elsewhere that algorithmic efficiency is about problem information, and this makes intuitive sense to me; but what kind of information am I really getting out of mathematical proofs, assuming I can understand them?

I don't suppose there's a list somewhere that handily matches tricks for proving things in mathematics to tricks for constructing algorithms in computer science?

A proof may show that an algorithm works. If the proof is correct*, this may demonstrate that the algorithm is robust. (Though you really want a proof about an implementation of the algorithm, which is a program.)

*A proof that a service will never go down which relies on assumptions with the implication "there are no extreme solar storms" may not be a sufficient safeguard against the possibility that the service will go down if there is an extreme solar storm. Less extremely, perhaps low latency might be proved to hold, as long as the internet doesn't go down.


How are algorithms made, and how can proofs improve/be incorporated into that process?

Given a problem, you can try and solve it (1). You can guess (2). You can try (one or more) different things and just see if they work (3).


1 and 2 can come apart, and that's where checking becomes essential. A proof that the method you're using goes anywhere (fast) can be useful there.


Let's take a task:

Sorting. It can be solved by:

  • 1. Taking a smaller instance, solving that (and paying attention to process). Then extract the process and see how well it generalizes
  • 2. Handle the problem itself
  • 3. Do something. See if it worked.

2 and 3 can come apart:

At its worst, 3 can look like Bogosort. Though that process can be improved: look at the first two elements. Are they sorted? No: shuffle them. Look at the next two elements...

There are 4! = 24 permutations of 4 elements. The sorting so far (first pair ordered, second pair ordered) has eliminated all but six possibilities:

1, 2, 3, 4

1, 3, 2, 4

1, 4, 2, 3

2, 3, 1, 4

2, 4, 1, 3

3, 4, 1, 2

Now all that's needed is a method of shuffling that doesn't make things less orderly... And eventually Mergesort may be invented.
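For concreteness, the endpoint of that path is ordinary mergesort. A minimal textbook version (nothing specific to the discussion above, just the standard algorithm):

```python
def mergesort(xs):
    """Sort a list by splitting it in half, sorting each half, and merging."""
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    left, right = mergesort(xs[:mid]), mergesort(xs[mid:])
    merged, i, j = [], 0, 0
    # Merge: repeatedly take the smaller front element of the two sorted halves.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

assert mergesort([2, 4, 1, 3]) == [1, 2, 3, 4]
```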

In the extreme, 3 may be 'automated':

  • programs write programs, and test them to see if they do what's needed (or a tester gets a guesser thrown at it, to 'crack the password')
  • evolutionary algorithms

The post you linked to (algorithmic efficiency is about problem information) - the knowledge that method X works best when conditions Y are met, which is used in a polyalgorithmic approach? That knowledge might come from proofs.

Some notes about modelling DSA (decisive strategic advantage), inspired by Review of Soft Takeoff Can Still Lead to DSA. Relevant chunk of my comment on the post:

My reasoning for why it matters:

  • DSA relies on one or more capability advantages.
  • Each capability depends on one or more domains of expertise to develop.
  • A certain amount of domain expertise is required to develop the capability.
  • Ideas become more difficult in terms of resources and time to discover as they approach the capability threshold.

Now this doesn't actually change the underlying intuition of a time advantage very much; mostly I just expect that the '10x faster innovation' component of the example will be deeply discontinuous. This leads naturally to thinking about things like a broad DSA, which might consist of a systematic advantage across capabilities, versus a tall DSA, which would be more like an overwhelming advantage in a single, high-import capability.

I feel like identifying the layers at work here would be highly valuable. I could also easily see specifying a layer below domains, called fields, which would allow the lowest level to map to how we usually track ideas (by paper and research group); domain then covers the more applied engineering/technician area of development, and finally capability describes the thing-where-the-advantage-is.

After teasing out several example capabilities and building their lower levels, it starts to look sort of like a multidimensional version of a tech tree.

I am also interested in accounting for things like research debt. Interpretive labor is really important for the lateral movement of ideas; leaning on Daniel's post again for example, I propose that ideas pulled from the public domain would be less effectively used than those developed in-house. This could be treated as each idea having only fractional value, or as a time delay as the interpretive labor has to be duplicated in-house before the idea yields dividends.
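To make the layering concrete, here is a toy version of the fields → domains → capabilities structure, with research debt modeled as a fractional discount on work pulled in from outside. Everything here (the class names, the 0.5 discount, the example numbers) is a placeholder for illustration, not a claim about the right parameterization:

```python
from dataclasses import dataclass

@dataclass
class Field:
    """Lowest layer: tracked the way ideas usually are, by paper and research group."""
    name: str
    progress: float        # accumulated ideas, in arbitrary units
    in_house: bool = True  # ideas pulled from outside pay an interpretive-labor tax

@dataclass
class Domain:
    """Middle layer: the applied engineering/technician area of development."""
    name: str
    fields: list

    def expertise(self, external_discount=0.5):
        # Research debt as fractional value: outside work only counts partially.
        return sum(f.progress if f.in_house else f.progress * external_discount
                   for f in self.fields)

@dataclass
class Capability:
    """Top layer: the thing-where-the-advantage-is."""
    name: str
    domains: list
    threshold: float       # domain expertise required before the capability exists

    def unlocked(self):
        return all(d.expertise() >= self.threshold for d in self.domains)

# Hypothetical example: one capability resting on two domains.
guidance = Domain("guidance", [Field("control theory", 8), Field("ML", 5, in_house=False)])
propulsion = Domain("propulsion", [Field("materials", 12)])
print(Capability("precision delivery", [guidance, propulsion], threshold=10).unlocked())
```

The time-delay version of research debt would swap the discount for a lag before an external field's progress counts at all.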

Had some illusions about the C language shattered recently.
There's Assembly for those people who actually care about what the hardware is doing. The question is whether there can meaningfully be a language that is as low level as Assembly but which also provides higher abstractions to programmers.


Is there a reason warfare isn't modeled as the production of negative value?

The only economic analyses I have seen are of the estimating-cost-of-lost-production type, which I can only assume reflects the convention of converting everything to a positive value.

But it is so damned anti-intuitive!

I'm not sure what you're proposing - it seems confusing to me to have "production" of negative value.  I generally think of "production" as optional - there's a lower bound of 0 at which point you prefer not to produce it.

I think there's an important question of different entities doing the producing and capturing/suffering the value, which gets lost if you treat it as just another element of a linear economic analysis.  Warfare is somewhat external to purely economic analysis, as it is generally motivated by non-economic (or partly economic but over different timeframes than are generally analyzed) values.

You and everyone else; it seems I am the only one to whom the concept makes any intuitive sense.

But the bottom line is that the value of weapons is destruction, which is to say you are paying $X in order to take away $Y from the other side. Saying we pay $X to gain $Y value is utterly nonsensical, except from the perspective of private sector weapons manufacturers.
 

I agree that economic models are not optimal for war, but I see a significant problem where the way we think about war and the way we think about war preparation activities are treated as separate magisteria, and as a consequence military procurement is viewed in Congress as an economic stimulus rather than something of strategic import.

I agree that economic models are not optimal for war

Go a little further, and I'll absolutely agree.  Economic models that only consider accounting entities (currency and reportable valuation) are pretty limited in understanding most human decisions.   I think war is just one case of this.  You could say the same for, say, having children - it's a pure expense for the parents, from an economic standpoint.  But for many, it's the primary joy in life and motivation for all the economic activity they partake in.

But the bottom line is that the value of weapons is destruction.

Not at all.  The vast majority of weapons and military (or hobby/self-defense) spending are never used to harm an enemy.  The value is the perception of strength, and relatedly, the threat of destruction.  Actual destruction is minor.

military procurement is viewed in Congress as an economic stimulus

That Congress (and voters) are economically naïve is a distinct problem.  It probably doesn't get fixed by the additional naïveté of forcing negative-value concepts into the wrong framework.  If it can be fixed, it's probably by making the broken windows fallacy ( https://en.wikipedia.org/wiki/Parable_of_the_broken_window) less common among the populace.

The value is the perception of strength, and relatedly, the threat of destruction.  Actual destruction is minor.

The map is not independent of the territory, here. Few cities were destroyed by nuclear weapons, but no one would have cared about them if they couldn't destroy cities. Destruction is the baseline reality upon which perceptions of strength operate. The whole value of the perception of strength is avoiding actual destructive exchanges; destruction remains the true concern for the overwhelming majority of such spending.

The problem I see is that war is not distinct from economics except as an abstraction; they are in reality describing the same system. What this means is we have a partial model of one perspective of the system, and total negligence of another perspective of the system. Normally we might say not to let the perfect be the enemy of the good, but we're at the other end of the spectrum so it is more like recruiting the really bad to be an enemy of the irredeemably awful.

Which is to say that economic-adjacent arguments are something the public at large is familiar with, and their right-or-wrong beliefs are part of the lens through which they will view any new information and judge any new frameworks.

Quite separately I would find economics much more comprehensible if they included negatives throughout; as far as I can tell there is no conceptual motivation for avoiding them, it is mostly a matter of computational convenience. I would be happy to be wrong; if I could figure out the motivation for that, it would probably help me follow the logic better.

But the bottom line is that the value of weapons is destruction

The bottom line is protection, expansion, and/or survival; destruction is only an intermediate goal.