All of DanB's Comments + Replies

Visible Homelessness in SF: A Quick Breakdown of Causes

I don't buy the housing cost / homelessness causation. There are many poor cities in the US that have both low housing costs and high homelessness. This page mentions Turlock, CA, Stockton, CA, and Springfield, MA as among the top 15 places with the highest homelessness rates; a quick Zillow search indicates they all have a fair bit of cheap housing.

The relationship between homelessness and state-wide housing costs is probably caused by a latent variable: degree of urbanization. Cities are both more expensive and have more homelessness, and states vary w... (read more)

4Michael Thomas1mo
I also don’t buy that there is a causal relationship between high living costs and high homelessness. As a Bay Area resident, it’s pretty clear that people don’t go from “can’t afford a home” to “homeless”; they go from “can’t afford a home” to “resident of Boise”. Someone else made this point in a post I read recently, but I can’t remember where (maybe it was Bryan Caplan?). It extends the observation above as follows: the Bay Area (and other high-cost areas) are, all else being equal, desirable places to live. The main reason people don’t live here is that it is too expensive to have a decent quality of life. A homeless person, however, gets to live here (i.e., in a “nice” place) without facing the high housing costs, because they don’t pay them. That seems like a semi-plausible explanation of why high housing costs would be correlated with high homelessness without causing that homelessness.

On the state level, the correlation between urbanization and homelessness is small (R^2 = 0.13) and drops to zero when you control for housing costs, while the reverse is not true (R^2 of the residual = 0.56). States like New Jersey, Rhode Island, Maryland, Illinois, Florida, Connecticut, Texas, and Pennsylvania are among the most urbanized but have relatively low homelessness rates, while Alaska, Vermont, and Maine have higher homelessness despite being very rural. There's also, like, an obvious mechanism where expensive housing causes homelessness (... (read more)
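For readers unfamiliar with "controlling for" a variable, here is a minimal sketch of one common way to do it: regress the control out of both variables and correlate what is left. It runs on synthetic data, not on the state-level figures quoted above, and the comment does not say which procedure it used, so the method, the sample size, and every name and coefficient here are assumptions made for illustration.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def residuals(x, y):
    """Residuals of y after an ordinary least squares fit of y ~ a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def r_squared(x, y):
    """Squared Pearson correlation between x and y."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

def partial_r_squared(x, y, control):
    """R^2 between x and y after regressing `control` out of both."""
    return r_squared(residuals(control, x), residuals(control, y))

# Synthetic regions (hypothetical): housing cost drives homelessness,
# and urbanization is merely correlated with housing cost.
rng = random.Random(0)
n = 2000  # large sample so the noise in the estimates stays small
cost = [rng.gauss(0, 1) for _ in range(n)]
urban = [0.5 * c + rng.gauss(0, 1) for c in cost]
homeless = [1.0 * c + rng.gauss(0, 1) for c in cost]

raw_urban = r_squared(urban, homeless)
urban_given_cost = partial_r_squared(urban, homeless, cost)
cost_given_urban = partial_r_squared(cost, homeless, urban)

print(f"urbanization vs homelessness:          {raw_urban:.3f}")
print(f"  ...after controlling for cost:       {urban_given_cost:.3f}")
print(f"cost vs homelessness, given urbanized: {cost_given_urban:.3f}")
```

In data built this way, the urbanization signal vanishes once cost is controlled for, while the cost signal survives controlling for urbanization. That is the asymmetry the comment describes, though the sketch says nothing about whether real state data behaves the same way.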

Land Ho!

Copied from a previous comment on Hacker News

I wish you well and I hope you win (ed.: here I mean I hope the proposal is approved).

I am pessimistic though. I don't think people really understand how much current homeowners do not want additional housing to be built. It makes sense if you consider that the net worth of a typical homeowner is very substantially made up of a highly leveraged long position in real estate. If that position goes south - because of an increase in housing supply, or because of undesirable new people moving into the neighborhood - th... (read more)

7Steven Byrnes5mo
The selfish interest of homeowners and landowners is generally to prevent the creation of new homes and new land. The selfish interest of renters and businesses is generally the opposite. I suspect that New York City has an unusually high ratio of the latter to the former (in terms of political power). Obviously, people frequently vote against their selfish interests. But it's at least slightly relevant. Anyway, you brought it up.
I don’t entirely disagree with you, but I find this explanation confusing. Take an urban homeowner in a single-family-zoned neighborhood. They paid a large premium to buy it, and they must not like that. And if the land were upzoned, their property would suddenly be worth a lot more. If land gets upzoned (or in this case built) in a faraway location, it doesn’t particularly affect them. So I’m left to conclude, tentatively, that homeowners resist development for the actual reasons they give. They’re personally attached to their neighborhood. They don’t want to move. They like the way it looks, like the feeling of safety and intimacy, and don’t want to see that change too much. When they moved in, they were paying not only for the plot of land, but for the atmosphere of its surroundings, the local amenities, the view. They care about the environment, and don’t think we have adequate appreciation for its importance, or caution in appropriating it or tampering with it. Now, maybe they could get all that and more at the price of some short-term disruption that would, in the long run, make conditions better for almost everybody. But they don’t want to deal with that disruption, to the tune of quite a lot of money. It’s fine to say that the law is not an appropriate mechanism for enforcing the coordination feat of environmental protection or urban planning, or that you think these people have wrong opinions. But if you’re trying to model their true worldview, it seems more plausible to me that they’re revealing their honest preferences.
Uncontroversially good legislation
Answer by DanB, Jan 10, 2022 (score: -1)

End Social Security and Other Defined-Benefit Pension Schemes. They are intrinsically racist and sexist.

Consider two people, Alice and Bob. Alice is an Asian-American female, while Bob is an African-American male. From the point of view of Social Security, they are identical in every respect: they are the same age, they make the same contributions of the same amount on the same date, and retire at the same time. For the sake of argument, suppose they begin taking SS payments at age 70.

Given that Alice and Bob have made exactly equivalent contributions to ... (read more)

Why would you believe that to be uncontroversial?
Designing Low Upkeep Software

Having a budget where initial creation is essentially free (fun!) while maintenance is extremely expensive (drudgery!) is a dramatic exaggeration for most software development.

My feeling is that most software development has exactly the same cost parameters; the difference is just that BigTech companies have so much money they are capable of paying thousands of engineers handsome salaries, to do the endless drudgery required to keep the tech stacks working.

The SQLite devs pledge to support the product until 2050.

The payback is also very different at tech companies, or in any professional environment. I make something because I'm excited about it, and the payback is some combination of getting to use it and being glad other people can use it. When a company makes something, the payback is typically that people pay for it, or perhaps use it while looking at ads. This dramatically changes the incentives around adding features and generally changing it: you need to keep improving the product to compete with others that people might use instead. Once you need to keep the product live, in the sense that there are always multiple engineers spun up on it, the cost of switching to updated versions of dependencies is a small portion of the overall cost. And designing for minimal upkeep makes it harder to add new features.
I do tend to use SQLite when a flat file would work.
A Small Vacation

Thanks for the positive feedback and interesting scenario. I'd never heard of Birobidzhan.

Compositionality: SQL and Subways

Thanks for the tip about Kusto - it actually does look quite nice.

How will OpenAI + GitHub's Copilot affect programming?
Answer by DanB, Jul 01, 2021 (score: 3)

My prediction is that the main impact is to make it easier for people to throw together quick MVPs and prototypes. It might also make it easier for people to jump into new languages or frameworks.

I predict it won't impact mainstream corporate programming much. The dirty secret of most tech companies is that programmers don't actually spend that much time programming. If I only spend 5 hours per week writing code, cutting that time down to 4 hours while potentially reducing code quality isn't a trade anyone will really want to make.

Sympathy for the ferryman of Hades, or why we should keep Trump off Twitter

Why isn't this an argument for banning all politically powerful people from Twitter?

I think it is? That was kind of the implication that I read into it at least.
Or maybe all people and we just have bots arguing for retweets and likes? :)
Survey on cortical uniformity - an expert amplification exercise

One very important observation related to this issue is that we often observe specific cognitive deficits (e.g. people who can't use nouns), but those specific deficits are almost always related to a brain trauma (stroke, etc.). If there were significant cognitive logic coded into the genome, we should see specific cognitive deficits in otherwise healthy young people caused by mutations.

4Steven Byrnes1y
Good insight! Haven't seen that one before!
Utility Maximization = Description Length Minimization

I'm not sure exactly what you mean, but I'll guess you mean "how do you deal with the problem that there are an infinite number of tests for randomness that you could apply?"

I don't have a principled answer. My practical answer is just to use good intuition and/or taste to define a nice suite of tests, and then let the algorithm find the ones that show the biggest randomness deficiencies. There's probably a better way to do this with differentiable programming - I finished my PhD in 2010, before the deep learning revolution.

Utility Maximization = Description Length Minimization

In my PhD thesis I explored an extension of the compression/modeling equivalence that's motivated by Algorithmic Information Theory. AIT says that if you have a "perfect" model of a data set, then the bitstream created by encoding the data using the model will be completely random. Every statistical test for randomness applied to the bitstream will return the expected value. For example, the proportion of 1s should be 0.5, the proportion of 1s following the prefix 010 should be 0.5, etc. Conversely, if you find a "randomness deficiency", you have... (read more)
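As a rough illustration of the prefix tests mentioned above, here is a minimal sketch that measures the proportion of 1s following a given prefix and ranks a small suite of prefixes by how far they stray from 0.5. The function names and the ranking rule are assumptions for illustration, not the method from the thesis; a serious test would also account for how many samples each prefix gets.

```python
def proportion_of_ones_after(bits, prefix=""):
    """Fraction of 1s among the bits that immediately follow `prefix`.

    `bits` and `prefix` are strings of '0' and '1'. An empty prefix
    gives the overall proportion of 1s. Returns None if the prefix
    never occurs with a bit after it.
    """
    k = len(prefix)
    followers = [bits[i + k] for i in range(len(bits) - k)
                 if bits[i:i + k] == prefix]
    if not followers:
        return None
    return followers.count("1") / len(followers)

def largest_deficiencies(bits, prefixes):
    """Rank prefixes by absolute deviation of their proportion from 0.5."""
    scored = []
    for p in prefixes:
        prop = proportion_of_ones_after(bits, p)
        if prop is not None:
            scored.append((abs(prop - 0.5), p))
    return sorted(scored, reverse=True)

# A stream that passes the simplest test but is obviously not random.
stream = "01" * 50
print(proportion_of_ones_after(stream))          # overall proportion of 1s
print(proportion_of_ones_after(stream, "010"))   # proportion after "010"
print(largest_deficiencies(stream, ["", "0", "01", "010"]))
```

The alternating stream has exactly half 1s, so the unconditional test sees nothing wrong. The deficiency only shows up once the test conditions on a prefix, which is why a suite of tests is needed.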

Interesting framing. Do you have a unified strategy for handling the dimensionality problem with sub-exponentially-large datasets, or is that handled mainly by the initial models (e.g. hidden Markov, bigram, etc.)?
Fight Akrasia and Decision Fatigue with DIY Productivity Software

Cool concepts! What tech stack did you use? Was it painful to get the Facebook API working? 

1Lukas T1y
All my tools are just Windows desktop applications built on old technology, C# and Windows Forms, using a simple file for data storage. The Facebook API is extremely limited due to privacy considerations, IIRC it does not allow fetching a list of your friends. Therefore I just implemented everything using web automation.
The Rediscovery of Interiority in Machine Learning

Not a stupid question, this issue is actually addressed in the essay, in the section about interior modeling vs unsupervised learning. The latter is very vague and general, while the former is much more specific and also intrinsically difficult. The difficulty and preciseness of the objective make it much better as a goal for a research community.

2Steven Byrnes2y
Your reply seems to treat "unsupervised learning" and "self-supervised learning" as synonyms, but I don't think they are. Self-supervised is more specific. Things like clustering and dimensionality reduction and feature extraction are not examples of self-supervised learning, I don't think. Predicting the next few seconds of a video, or predicting the next word of a text, or deleting a word from the middle of a sentence and training your model to guess what it is, would all be examples of self-supervised learning.
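To make the distinction concrete, here is a minimal sketch of how the two text examples above turn raw sentences into (input, target) training pairs with no human labels. The function names and the "[MASK]" placeholder are assumptions chosen for illustration, not any particular library's API.

```python
def masked_word_examples(sentence, mask="[MASK]"):
    """One training pair per word: the sentence with that word hidden,
    and the hidden word as the target the model must guess."""
    words = sentence.split()
    examples = []
    for i, word in enumerate(words):
        masked = words[:i] + [mask] + words[i + 1:]
        examples.append((" ".join(masked), word))
    return examples

def next_word_examples(sentence):
    """One training pair per position: the words so far, and the next word."""
    words = sentence.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]

for pair in masked_word_examples("the cat sat"):
    print(pair)
for pair in next_word_examples("the cat sat"):
    print(pair)
```

In both cases the target comes from the data itself, which is what separates this from clustering or dimensionality reduction, where there is no prediction target at all.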
The Rediscovery of Interiority in Machine Learning

I started this essay last year, and procrastinated on completing it for a long time, until recently the GPT-3 announcement gave me the motivation to finish it up.

If you are familiar with my book, you will notice some of the same ideas, expressed with different emphasis. I congratulate myself a bit on predicting some of the key aspects of the GPT-3 breakthrough (data annotation doesn't scale; instead learn highly complex interior models from raw data).

I would appreciate constructive feedback and signal-boosting.

College advice for people who are exactly like me

I would add two ideas:

  • Try to find a good role model - someone who is similar to you in relevant respects, is a couple of years ahead of you, who has done something you think is awesome, and who you can talk to and observe to some extent. Bill Gates is probably not a good role model.
  • Try to form a realistic assessment of how important college actually is; people often err in imagining it to be more or less important than it is in reality (these errors seem to be correlated with social class). I would estimate that the 4 years of college are only modestly more important than other years of your life. What you do right after college is important. What you do when you're in your late 20s is important.
Extended Quote on the Institution of Academia

Holden is a smart guy, but he's also operating under a severe set of political constraints, since his organization depends so strongly on its ability to raise funds. So we shouldn't make too much of the fact that he thinks academia is pretty good - obviously he's going to say that.

This doesn't quite seem like an accurate description of the situation, given that his org is trying to give away billions of dollars. Don't disagree that it's in his interest to choose his words carefully though.
Yeah, when I listened to it, it struck me that he was describing things in a pretty glass-half-full sort of way. In general I'm impressed that he often manages to be both blunt and diplomatic.
The New Riddle of Induction: Neutral and Relative Perspectives on Color

Interesting analysis. I hadn't heard of Goodman before so I appreciate the reference.

In my view the problem of induction has been almost entirely solved by the ideas from the literature on statistical learning, such as VC theory, MDL, Solomonoff induction, and PAC learning. You might disagree, but you should probably talk about why those ideas prove insufficient in your view if you want to convince people (especially if your audience is up-to-date on ML).

One particularly glaring limitation with Goodman's argument is that it depends on natural l... (read more)

2[comment deleted]1y
In my view, "the problem of induction" is just a bunch of philosophers obsessing over the fact that induction is not deduction, and that you therefore cannot predict the future with logical certainty. This is true, but not very interesting. We should instead spend our energy thinking about how to make better predictions, and how we can evaluate how much confidence to have in our predictions. I agree with you that the fields you mention have made immense progress on that. I am not convinced that computer programs are immune to Goodman's point. AI agents have ontologies, and their predictions will depend on that ontology. Two agents with different ontologies but the same data can reach different conclusions, and unless they have access to their source code, it is not obvious that they will be able to figure out which one is right. Consider two humans who are both writing computer functions. Both the "green" and the "grue" programmer will believe that their perspective is the neutral one, and therefore write a simple program that takes light wavelength as input and outputs a constant color predicate. The difference is that one of them will be surprised after time t, when suddenly the computer starts outputting different colors from their programmer's experienced qualia. At that stage, we know which one of the programmers was wrong, but the point is that it might not be possible to predict this in advance.
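The two-programmer thought experiment can be sketched roughly as follows. The wavelength band, the value of t, and all names are hypothetical choices for illustration. The translation function uses Goodman's definition of "grue" (green if examined before t, blue otherwise).

```python
SWITCH_TIME = 100  # hypothetical time t from the thought experiment

def green_program(wavelength_nm):
    """The 'green' programmer's function: a constant predicate per wavelength.
    The 495-570 nm band is an approximate, assumed range for green light."""
    return "green" if 495 <= wavelength_nm <= 570 else "not green"

def grue_program(wavelength_nm):
    """The 'grue' programmer's function: equally simple in their own vocabulary."""
    return "grue" if 495 <= wavelength_nm <= 570 else "not grue"

def grue_in_green_terms(label, time):
    """Translate 'grue' into green/blue vocabulary: green before t, blue from t on."""
    if label != "grue":
        return label
    return "green" if time < SWITCH_TIME else "blue"

def programs_agree(wavelength_nm, time):
    """Do the two programs make the same prediction once translated?"""
    translated = grue_in_green_terms(grue_program(wavelength_nm), time)
    return green_program(wavelength_nm) == translated

print(programs_agree(530, SWITCH_TIME - 1))  # before t
print(programs_agree(530, SWITCH_TIME + 1))  # after t
```

Neither function mentions time, so inspecting the code does not reveal which programmer is right. All observations before t fit both programs equally well, and the disagreement only appears in the translation step.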