Background is in the preregistration post. This post will assume you’ve read that one.

First, the headline result: my main prediction was wrong. Paul’s predictions were also wrong. By the preregistered metric, the second hose improves performance by much less than either of us expected. That said, it looks like the experiment’s main metric mostly did not measure the thing it was intended to measure, which is why we were both so far off.

I did collect a bunch of data, which allows us to estimate the air conditioner’s performance in other ways. That analysis was not pre-registered and you should therefore be suspicious of it, but I nonetheless believe that it gives a more accurate view of the air conditioner’s performance. The main takeaway of that analysis is that the air conditioner performs about twice as well with two hoses as with one. Going by that analysis, both Paul's formula and my prediction were correct.

Experiment Setup

Here’s the air conditioner with the cardboard “second hose” attached:

 
The t-shirts are to cover gaps. It doesn’t actually need to be airtight.

I hung thermometers at each corner of the room, in the middle of each wall, and on the ceiling in the center of the room. I also placed one thermometer each in the inlet (i.e. the hole in the window covering through which the cardboard hose draws air), in the outlet (i.e. the hole in the window covering through which the other hose blows air), and outside on the balcony.

 
 
 
 

I ran a few different tests:

  • Air conditioner with and without the cardboard intake hose, on low fan
  • Air conditioner with and without the cardboard intake hose, on high fan
  • Air conditioner off (control)

For both the 1-hose and control tests, I left the inlet hole in the window covering open. (This was mainly to reduce infiltration from places other than outside.)

Assorted Notes:

  • I did try the experiment previously (about a month ago), and ran into issues which resulted in some minor changes to the experiment setup; info about that is here.
  • None of the experiment setup, thermometers, or the room were in direct sun.
  • The time for each test was mostly determined by when I had meetings scheduled, and when the temperatures seemed to stop changing.

Results

Data is here. Some numbers:

  • Outdoor temperature was 85-88°F (29.4 - 31.1°C) for most of the testing, though it dropped to 82°F (27.8°C) in the evening during the control test
  • Temperature difference between outdoor and indoor (higher is better), in each test:
    • Low fan: 20.6°F (11.4°C) with one hose, 22.7°F (12.6°C) with two hoses
    • High fan: 18.7°F (10.4°C) with one hose, 22.2°F (12.3°C) with two hoses
    • Control: 13.1°F (7.3°C)
  • Temperature variance across the room was fairly high (~2.5 - 3.0°F, or ~1.4 - 1.7°C), and consistent (i.e. measurements 30 minutes apart had similar relative temperature patterns)
  • Exhaust temperature was 98 - 100°F (36.7 - 37.8°C) with one hose, 112 - 119°F (44.4 - 48.3°C) with two
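The Fahrenheit and Celsius figures above mix two kinds of numbers: absolute temperatures (which need the 32° offset) and temperature differences (which only need the 5/9 scale factor). A minimal sketch of the two conversions, using the numbers from the list above; the function names are mine, not part of the original data sheet:

```python
def f_to_c(temp_f: float) -> float:
    """Convert an absolute Fahrenheit temperature to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def delta_f_to_c(delta_f: float) -> float:
    """Convert a Fahrenheit temperature *difference* to Celsius (no offset)."""
    return delta_f * 5.0 / 9.0


# Absolute outdoor temperatures quoted in the results.
outdoor_f = [85.0, 88.0, 82.0]
outdoor_c = [round(f_to_c(t), 1) for t in outdoor_f]

# Outdoor-minus-indoor differences quoted in the results.
differences_f = {
    "low fan, one hose": 20.6,
    "low fan, two hoses": 22.7,
    "high fan, one hose": 18.7,
    "high fan, two hoses": 22.2,
    "control": 13.1,
}
differences_c = {k: round(delta_f_to_c(v), 1) for k, v in differences_f.items()}

print(outdoor_c)
for label, value in differences_c.items():
    print(f"{label}: {differences_f[label]}°F = {value}°C")
```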

Analysis

My main prediction was that the outdoor-indoor temperature difference would be at least 50% greater with two hoses than with one hose, with my median estimate around 100% (i.e. a factor-of-two difference). That was definitely wrong: the actual number was 10% with fan on low, 19% with fan on high.
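For anyone who wants to check the 10% and 19% figures, here is a rough sketch of the preregistered metric computed from the outdoor-indoor differences in the Results section. The helper name and the 50% threshold variable are just illustrative labels for what the paragraph above describes:

```python
def percent_improvement(one_hose_delta: float, two_hose_delta: float) -> float:
    """Percent by which the two-hose outdoor-indoor difference exceeds the one-hose one."""
    return (two_hose_delta / one_hose_delta - 1.0) * 100.0


# Outdoor-indoor differences in °F, from the Results section.
low_fan = percent_improvement(one_hose_delta=20.6, two_hose_delta=22.7)
high_fan = percent_improvement(one_hose_delta=18.7, two_hose_delta=22.2)

PREDICTED_MINIMUM = 50.0  # my prediction: at least 50% greater with two hoses

print(f"low fan:  {low_fan:.0f}%")
print(f"high fan: {high_fan:.0f}%")
print("prediction met:", low_fan >= PREDICTED_MINIMUM or high_fan >= PREDICTED_MINIMUM)
```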

Paul’s prediction for the same number was 33-43%, though that was based on some very rough guesses about indoor, outdoor and exhaust temperatures. Using Paul’s formula with the actual indoor, outdoor and exhaust temperatures from the tests gives predictions anywhere from 75% to 180%, consistent with my own factor-of-two median guess. (The difference is mainly because the exhaust temperature was considerably lower than Paul had speculated. It really is a very shitty air conditioner.)

So experimentally, the difference came out way lower than anybody guessed. Why?

The result from the control test basically tells us the answer. With the air conditioner off, the room did not return to anywhere near outdoor temperature over the course of an hour. That implies some combination of:

  • Very slow equilibration, such that the other test results were probably also not near steady-state.
  • Indoor temperatures driven more by (cooler) temperatures in neighboring rooms, rather than outdoor temperature.

The high and consistent-over-time variance of temperatures in different locations within the room also points toward some combination of these two issues. I would guess the second issue is more important than the first, though I’m not confident in that.

One relatively simple way to correct for the problem: use the temperature from the control test as the baseline temperature, rather than using the outdoor temperature as baseline. If we do that, then with the fan on high the results are:

  • AC cools the room by 2.6°F (1.4°C) relative to control with one hose
  • AC cools the room by 5.1°F (2.8°C) relative to control with two hoses (despite the outdoor temperature being slightly higher during the two-hose test)

Two hose performs better by about a factor of two, consistent with both Paul’s formula (using the real indoor/outdoor/exhaust temperatures) and my guess.
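To make the effect of the baseline choice concrete, here is a small sketch comparing the two-hose to one-hose ratio under both baselines, for the high-fan tests. It only uses numbers already stated in this post; the variable names are illustrative:

```python
# High-fan results in °F.
# Baseline 1: outdoor temperature (the preregistered metric).
outdoor_baseline = {"one hose": 18.7, "two hoses": 22.2}
# Baseline 2: indoor temperature during the control test (AC off).
control_baseline = {"one hose": 2.6, "two hoses": 5.1}


def two_to_one_ratio(cooling: dict) -> float:
    """Ratio of two-hose cooling to one-hose cooling for a given baseline."""
    return cooling["two hoses"] / cooling["one hose"]


ratio_outdoor = two_to_one_ratio(outdoor_baseline)
ratio_control = two_to_one_ratio(control_baseline)

print(f"relative to outdoor: {ratio_outdoor:.2f}x")
print(f"relative to control: {ratio_control:.2f}x")
```

The same measurements give roughly 1.2x against the outdoor baseline and roughly 2x against the control baseline, which is the whole disagreement between the preregistered analysis and the post-hoc one.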

My Updates

On the Experiment

First and foremost: I made a confident wrong prediction, so there better be some substantial update from that. My main update from that is to put even more weight on “the specific metric you choose will inevitably fail to measure the thing you thought it would measure, especially on your first try”. That’s something I already knew on some level, I already considered it the main bottleneck to making prediction markets actually useful in practice, and in hindsight it’s embarrassing that I didn’t put more weight on it when making the air conditioner prediction.

Second: I’ve basically thrown out my pre-registration and done a bunch of analysis which disagrees with the preregistered analysis. I do think that was the right choice for maximal epistemic accuracy (and in fact is very often the right choice for maximal epistemic accuracy, because the specific metric you choose will inevitably fail to measure the thing you thought it would measure, especially on your first try). But it’s important to increment the counter in the back of one’s head when doing that, and occasionally go re-examine to see if one is systematically steering away from some undesired conclusion. Counter incremented.

Third: shortly after I put up the preregistration post, both Paul and ADifferentAnonymous left comments explaining where Paul’s formula came from, and I updated a lot toward that formula being a good model based on the thermodynamic argument they gave. (The main thing the formula leaves out is extra waste heat generated with one hose vs two, and I still had some uncertainty about that.) After seeing the (admittedly very rough) agreement between the formula and the performance-relative-to-control, I currently think the formula is basically correct.

Fourth: both the temperature-change-relative-to-control and the rough thermodynamic calculation (i.e. Paul’s formula) with real exhaust temperature point to two hose performing about twice as well as one. I think that’s probably about right (modulo some large error bars, of course, since none of this was super precise).

On One vs Two Hose

Now for the real claim of interest: that single hose air conditioners are “stupidly inefficient in a way which I do not think consumers would plausibly choose over the relatively-low cost of a second hose if they recognized the problems”.

That claim boils down to a cost-benefit analysis, and this whole experiment has been about the benefit. What about the cost side? Air conditioner hoses similar to the one my unit uses cost about $20 on Amazon. Presumably the actual cost-to-the-manufacturer is lower, especially if they’re shipping in a box with the rest of the air conditioner. So, the second hose adds at most $20 to a $300-500 air conditioner. It also adds a little bit more annoying fiddliness when setting up the AC.

That cost sounds like it is very obviously worth paying for a factor-of-two performance improvement. It would be very obviously worth paying for the second hose even if my estimates of performance improvement are quite far off; the performance improvement would have to be well below 30% before I’d even start to consider a second hose not-obviously-worthwhile. The marginal cost is just so small.
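A back-of-the-envelope sketch of that cost-benefit comparison. The "cooling per dollar" framing is a simplifying assumption of this sketch (it ignores electricity costs and setup hassle); the $20 and $300-500 figures are the ones quoted above, and the function name is made up for illustration:

```python
HOSE_COST = 20.0              # dollars, the upper bound quoted above
UNIT_PRICES = (300.0, 500.0)  # dollars, the price range quoted above


def cooling_per_dollar_ratio(performance_gain: float, unit_price: float,
                             hose_cost: float = HOSE_COST) -> float:
    """(Cooling per dollar with the second hose) / (cooling per dollar without).

    performance_gain is fractional, e.g. 1.0 for a factor-of-two improvement.
    A result above 1.0 means the second hose pays for itself on this crude metric.
    """
    cost_ratio = (unit_price + hose_cost) / unit_price
    return (1.0 + performance_gain) / cost_ratio


for price in UNIT_PRICES:
    cost_increase = HOSE_COST / price
    print(f"${price:.0f} unit: second hose adds {cost_increase:.1%} to the price")
    for gain in (0.10, 0.30, 1.00):
        ratio = cooling_per_dollar_ratio(gain, price)
        print(f"  {gain:.0%} better performance -> {ratio:.2f}x cooling per dollar")
```

On this metric the break-even performance gain is just the fractional price increase (about 4-7%), which is why the conclusion is not very sensitive to errors in the performance estimate.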

So, yeah, I do think that single hose air conditioners are stupidly inefficient in a way which I do not think consumers would plausibly choose over the relatively-low cost of a second hose if they recognized the problems.

On Civilizational Adequacy

The air conditioner was intended as an example in which a product is shitty in ways the large majority of consumers don’t notice, and therefore market pressures don’t fix it. Two further implications of our ability to actually find such an example:

  • It can’t be that rare for products to be shitty in ways the large majority of consumers don’t notice, otherwise we wouldn't have found one.
  • If there’s products where it takes an unusually-good-relative-to-the-populace understanding of technical topics to recognize major problems, then there’s probably products which have major problems nobody recognizes, because nobody yet knows the right technical topics well enough. Again, such cases probably aren't that rare.

I still think that such cases are not only “not rare”, but common. I still expect major problems which nobody is able to recognize, due to an insufficient understanding of the right technical topics, to be the main source of AI X-risk. And I still expect that opportunities to iterate will mostly not help us to directly fix such problems, for the same reason that markets don’t fix single hose air conditioners: people have to notice the problem in order for the feedback loop to fix it.

(Also, of course, the Department of Energy coming up with an utterly bullshit energy rating which makes single hose air conditioners look much less bad than they are is a metaphor for everything, and is very much the sort of thing I expect to generalize.)

… though at the same time, a counter has incremented in the back of my head, and I do have a slight concern that I’m avoiding evidence against the “people don’t notice major problems” model. I don’t actually think I’m updating incorrectly, at this point, but it’s a possibility which has risen to my conscious attention and I’m keeping an eye on it.

38 comments

I take fault with your primary conclusion, for the same reasons I gave in the first thread:

  1. You claim how little adding a 2nd hose would impact the system, without analyzing the actual constraints that apply to engineers building a product that must be shipped & distributed
  2. You still neglect the existence of insulating wraps for the hose which do improve efficiency, but are also not sold with the single-hose AC system, which lends evidence to my first point -- companies are aware of small cost items that improve AC system efficiency, but do not include them with the AC by default, suggesting that there is an actual price point / consumer market / confounding issue at play that prevents them doing so

The full posts, quoted here for convenience

I think one reason that this error occurs is that there's a mistaken assumption that the available literature captures all institutional knowledge on a topic, so if one simply spends enough time reading the literature, they'll have all requisite knowledge needed for policy recommendations. I realize that this statement could apply equally to your own claims here, but in my experience I see it happen most often when someone reads a handful of the most recently released research papers and from just that small sample of work tries to draw conclusions that are broadly applicable to the entire field.

Engineering claims are particularly suspect because institutional knowledge (often in the form of proprietary or confidential information held by companies and their employees) is where the difference between what is theoretically efficient and what is practically more efficient is found. It doesn't even need to be protected information though -- it can also just be that due to manufacturing reasons, or marketing reasons, or some type of incredibly aggravating constraint like "two hoses require a larger box and the larger box pushes you into a shipping size with much higher per-volume / mass costs so the overall cost of the product needs to be non-linearly higher than what you'd expect would be needed for a single hose unit, and that final per-unit cost is outside of what people would like to pay for an AC unit, unless you then also make drastic improvements to the motor efficiency, thermal efficiency, and reduce the sound level, at which point the price is now even higher than before, but you have more competitive reasons to justify it which will be accepted by a large enough % of the market to make up for the increased costs elsewhere, except the remaining % of the market can't afford that higher per-unit cost at all, so we're back to still making and selling a one-hose unit for them".

 

Concrete example while we're on the AC unit debate -- there's a very simple way to increase efficiency of portable AC units, and it's to wrap the hot exhaust hose with insulating duct wrap so that less of the heat on that very hot hose radiates directly back into the room you're trying to cool. Why do companies not sell their units with that wrap? Probably for one of any of the following reasons -- A.) takes up a lot of space, B.) requires a time investment to apply to the unit which would dissuade buyers who think they can't handle that complexity, C.) would cost more money to sell and no longer be profitable at the market's price point, D.) has to be applied once the AC unit is in place, and generally is thick enough that the unit is no longer "portable" which during market testing was viewed as a negative by a large % of surveyed people, or E.) some other equally trivial sounding reason that nonetheless means it's more cost effective for companies to NOT sell insulating duct wrap in the same box as the portable AC unit. 

Example of an AC company that does sell an insulating wrap as an optional add-on: https://www.amazon.com/DeLonghi-DLSA003-Conditioner-Insulated-Universal/dp/B07X85CTPX

EDIT: I want to make a meta point here, which is that I have not personally worked on ACs, but I have built & shipped multiple products to consumers, and the type of stupid examples I gave in the first AC post are not just made-up for fun. Engineers argue extensively in meetings about "how can we make product A better", and ideas get shot down for seemingly trivial reasons that basically come down to -- yes, in a vacuum, that would be better, but unfortunately, there's a ton of existing context like how large a truck is or what parts can actually be bought off the shelf that kneecap those ideas before they leave the design room. The engineers who designed the AC were not idiots, or morons, or clowns who don't understand thermodynamic efficiency. Engineering is about working around limitations. Those limitations do not have to be rooted in physics; society or infrastructure or consumer behavior around critical price points can all be just as real in terms of what it is feasible for a company to create. Just look at how many startups fail and the founder claims in a postmortem, "Yeah, our tech was way better, but unfortunately people wouldn't pay 10% more for it, even though it was AMAZING compared to our competitor. We just couldn't get them to switch."

EDIT 2: I'm pretty annoyed that you doubled-down on your conclusion even after admitting the actual efficiency difference was significantly less than expected, and then chose a different analysis to let you defend your original point anyway, so these edits might keep coming. Regarding market pressures, two-hose AC units do exist. Companies do sell them, and if consumers want to buy a two-hose AC unit, they can do so. But the presence of both one-hose AC units and two-hose AC units in the market tells us it is not winner-take-all and there is consumer behavior, e.g. around price or complexity, that prevents two-hose units from acquiring literally all market share. So until that changes, it will always be more rational for companies to sell one-hose AC units in addition to their two-hose AC unit, because otherwise they'd be leaving money on the floor by only servicing part of the consumer market. (EDIT 5: see also this post, which was itself a reply to AllAmericanBreakfast's reply on this thread here)

EDIT 3: Let's look at your math. Outdoor temp is 85-88 F, let's just take the average and call it 86.5 F. That's pretty hot. I'd definitely be uncomfortable in that scenario. How cold did the AC cool the rooms? You say on low fan it was 20.6 F degrees with one hose, 22.7 F with two hoses, and then on high fan, 18.7 F with one hose, and 22.2 F with two hoses. The control was 13.1 F. Looking at the control, that gives a room temperature of ~73.4 F. That is uncomfortably hot in my opinion. I keep my room temperature around 68-70 F ish. The internet tells me that this is within the window of a "comfortable room temperature" defined as 67-75 F[1], so I'm just a normal human, I guess. How well did the ACs accomplish that? With one hose, you got it down to ~66 F, and with two hoses, you had it down to about ~64 F. That is pretty cold in my mind -- I would not set my AC that low if it actually reached that temperature. What does this mean? The one hose unit literally did the job it was designed to do. With an incredibly hot outside temperature, that resulted in an uncomfortable indoor "control" temperature, the one-hose AC was able to lower the temperature to a comfortable, ideal range, and then go below that, showing it even has margin left over. But now you're saying that they should make the thing more expensive and optimize it for even greater efficiency because ... why!? It works!

EDIT 4: I will die on this hill. This is the problem with how the rationalist community approaches the concept of what it means to "make a rational decision" perfectly demonstrated in a single debate. You do not make a "rational decision" in the real world by reasoning in a vacuum. That is how you arrive at a hypothetically good action, but it is not necessarily feasible or possible to perform, so you always need to check your analysis by looking at real world constraints and then pick the action that is 1.) actually possible in the real world, and 2.) still has the highest expected value. Failing to do that is not more clever or more rational, it is just a bad, broken model for how an ideal, optimal agent would behave. An optimal agent doesn't ignore their surroundings -- they play to them, exploit them, use them. 

  1. ^

    I averaged the following lower / upper temperatures.

    Wikipedia: 64-75
    www.cielowigle.com: 68-72
    www.vivint.com: 68-76
    www.provicincialheating.ca: 68-76
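The arithmetic in EDIT 3 and its footnote can be reproduced roughly as follows. This is only a sketch of the commenter's calculation: it assumes, as the comment does, a single averaged outdoor temperature of 86.5 F for all tests, which the post's own data does not strictly support (the outdoor temperature dropped during the control test).

```python
# Comfortable-temperature ranges (lower, upper) in °F, from the footnote.
comfort_ranges_f = {
    "Wikipedia": (64, 75),
    "www.cielowigle.com": (68, 72),
    "www.vivint.com": (68, 76),
    "www.provicincialheating.ca": (68, 76),
}
lowers = [lo for lo, _ in comfort_ranges_f.values()]
uppers = [hi for _, hi in comfort_ranges_f.values()]
avg_lower = sum(lowers) / len(lowers)
avg_upper = sum(uppers) / len(uppers)

# Assumption made in the comment: one averaged outdoor temperature for every test.
outdoor_avg_f = (85.0 + 88.0) / 2.0

# Outdoor-minus-indoor differences in °F, from the post's Results section.
differences_f = {"control": 13.1, "low fan, one hose": 20.6, "low fan, two hoses": 22.7}
indoor_f = {label: round(outdoor_avg_f - d, 1) for label, d in differences_f.items()}

print(f"comfortable range: {avg_lower:.0f}-{avg_upper:.0f} F")
for label, temp in indoor_f.items():
    print(f"{label}: ~{temp} F indoors")
```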

... companies are aware of small cost items that improve AC system efficiency, but do not include them with the AC by default, suggesting that there is an actual price point / consumer market / confounding issue at play that prevents them doing so

Or it suggests that consumers would mostly not notice the difference in a way which meaningfully increased sales, just like I claim happens with the single-hose vs two-hose issue. For instance, I believe an insulating wrap would not change the SEER rating (because IIRC the rating measurements don't involve the hose), so consumers would not be able to recognize the impact on performance that way. That would explain why companies don't include them, despite the apparent low cost.

(Also in the case of hose insulation I think the effect size is much smaller than 1 vs 2 hose, so that impacts the cost-benefit too.)

"Yeah, our tech was way better, but unfortunately people wouldn't pay 10% more for it, even though it was AMAZING compared to our competitor. We just couldn't get them to switch."

This is also the sort of thing which I expect usually happens because the tech is way better in ways which aren't obvious, or whose value isn't obvious, to the person who makes the decision about whether to purchase.

You claim how little adding a 2nd hose would impact the system, without analyzing the actual constraints that apply to engineers building a product that must be shipped & distributed

Details like "it would bump the box up to another category" can't matter that much, because in the worst case you could just ship it separately and we already know that costs at most $20 because we can in fact get an AC hose on Amazon for $20.

And sure, I wouldn't be surprised if I were missing some key detail that made it $40 rather than $20 (maybe connectors?), but this just isn't plausibly going to be big enough to offset the huge effect size of a second hose.

So until that changes, it will always be more rational for companies to sell one-hose AC units in addition to their two-hose AC unit, because otherwise they'd be leaving money on the floor by only servicing part of the consumer market.

What exactly do you think you're objecting to here? I never said companies were failing to pursue profits. The point was to explain why the companies' profit-maximizing strategy is to sell a product which could cheaply be improved. The lack of ideal reasoning is in the consumers, not the companies.

In general, you seem to have largely lost track of what the posts are actually saying.

This experiment, for me, actually provides evidence for the way visibility to the consumer drives engineering decisions but in the opposite manner of how I think you intended it to.

It was obvious to you (visible to the consumer), prior to the experiment, that the two-hose model was superior to the one-hose model. Even after running a controlled experiment that demonstrated that both A/Cs were able to cool your apartment to a comfortable temperature, you described the one-hose model as "very shitty," showing that even this experiment was not powerful enough to convince you that both models would suit your needs just fine. Because it's so obvious to you that two-hose A/Cs are superior to one-hose A/Cs, you'll probably unnecessarily overpay for the two-hose model the next time you need an A/C.

Overall, we have another datapoint of a consumer projected to make a suboptimal purchasing decision, based on an inaccurate view of his own needs and the capabilities of the options on the market. This is exactly what you were worried about before you began the experiment, and so in this sense, the overall evidence provided by this scenario supports your conclusion about "form over substance" even more perhaps than it would have if the one-hose unit hadn't cooled your apartment effectively.

At the heart of many such complex A/C purchasing decisions, the market is recognizing a simple truth: customers are split on whether they want a cheaper one-hose or a more expensive two-hose unit. The market therefore supplies both. People who buy a model that breaks too quickly or doesn't cool effectively either return or replace their model with a different one, and move on to more important issues. This is a system that gets almost everybody what they need at a reasonable cost, without anybody having to think too hard about it.

That said, there are all kinds of other issues with the market that parallel the Potemkin Village and Godzilla metaphors that we're trying to articulate when we try to find metaphors for AI risk in the human world. Climate change and war are two obvious examples. We get the illusion of free choice and private property, but actually pump out all kinds of pollution that harms other people without their consent. We see big powerful states putting disfavored minority groups in concentration camps, with zero effective resistance. Even if we don't worry about AGI, a garden-variety AI that helps misaligned human agents do misaligned things better can help polluters profit from pollution and oppressors oppress more effectively. That harm can compound through the whole progression of AI to AGI.

Maybe the right move is to convince people that "alignment" is a broader issue than just AI, but that AI alignment is a key part of it. Government alignment, economic alignment, cultural alignment: these are framings that could be powerful, and more resonant and familiar, and be good ways to explain the issue and garner support. I wonder what would happen if we expanded on them?

We can even consider the proposed plan (add a 2nd hose and increase the price by $20) in the context of an actual company.

The proposed plan does not actually redesign the AC unit around the fact that we now have 2 hoses. It is "just" adding an additional hose.

Let's assume that the distribution of AC unit cooling effectively looks something like this graphic that I made in 3 seconds.


 

In this image, we are choosing to assume that yes, in fact, 2-hose units are more efficient on average than a 1-hose unit. We are also recognizing that perhaps there is some overlap. Perhaps there are especially bad 2-hose units, and especially good 1-hose units.

Based on all of the evidence, I'm going to say that the average 1-hose unit does represent the minimum efficiency needed for cooling in an average consumer's use-case -- i.e. it is sufficient for their needs.

When I consider what would make a 2-hose unit good or bad, I suspect it has a lot to do with how much of the design is built around the fact that there are 2-hoses.

In your proposal, we simply add a 2nd hose to a unit that was otherwise designed functionally as a 1-hose unit. Let's consider where that might be plotted on this graph.

I'm going to claim based on vague engineering intuition / judgment / experience that it goes right here.


 

If I am right about where this proposal falls against the competition, then here's what we've done:

  1. This is not a 1-hose unit any more. Despite it being more efficient than the average 1-hose units, and only slightly more expensive, consumers looking at 1-hose units (because they are concerned about cost) will not see this model. The argument that it is "only $20 more expensive" is irrelevant. Their search results are filtered, they read online that they wanted a one-hose unit, this product has been removed from their consideration.
  2. This is a bad 2-hose unit. It is at the bottom of the efficiency scale, because other 2-hose units were actually designed to take full advantage of the 2-hoses. They will beat you on efficiency, even if they cost more. Wirecutter will list this in the "also ran" when discussing 2-hose units, "So and so sells a 2-hose model, but it was barely more efficient than a 1-hose, we cannot recommend it".
  3. A consumer looking at 2-hose units is already selecting for efficiency over cost, so they will not buy the "just add another hose" 2-hose unit, since it is on the wrong end of the 2-hose distribution.
  4. You will acquire a reputation as the company that sells "cheap" products -- your unit is cheaper than other 2-hose units, but isn't better because it wasn't designed as a 2-hose unit, and it was torn apart by reviewers.
  5. Fixing this inefficiency requires actually designing around 2-hoses, which likely results in something like this


 

"Minimum viable", in the context of a "minimum viable product" or MVP, is a term in engineering that represents the minimal thing that a consumer will pay to acquire. This is a product that can actually be sold. It's not the literal worst in its category, and it has a clear supremacy over cheaper categories. This is also called table stakes. Reviewers will consider it fairly, consumers will not rage review it, etc.

However, it's probably also a lot more expensive than the hypothetical "only $20 more" that has been repeatedly stated.

Even in the scenario where a reviewer does consider the "just add another hose" model when viewing one-hose units, we've already established that the one-hose unit is cheaper (by $20! if it's a $200 unit, that's 10%), and that the average 1-hose unit is sufficient for some average use-case. Therefore the rational consumer choice is to buy the cheaper one-hose anyway, because it's irrational to pay more for efficiency that isn't needed![1][2]

  1. ^

    The exception here is some hypothetical consumer who knows, for a fact, that their unique situation requires a two-hose unit, e.g. they tried a one-hose unit already and it was insufficient.

  2. ^

    There's also an argument here that a rational option is to buy a 1-hose unit, and then if you need slightly more efficiency, just buy & wrap the 1-hose with insulation, as described here. This allows the consumer to purchase at the lower price point and then add efficiency if needed for the cost of the insulation. It's unclear to me that the "just add another hose" AC would still perform better than an insulated 1-hose.

As a concrete example of rational one-hosing, here in the Netherlands it rarely gets hot enough that ACs are necessary, but when it does a bunch of elderly people die of heat stroke. Thus, ACs are expected to run only several days per year (so efficiency concerns are negligible), but having one can save your life.

I checked the biggest Dutch-only consumer-facing online retailer for various goods (bol.com). Unfortunately I looked before making a prediction for how many one-hose vs two-hose models they sell, but even conditional on me choosing to make a point of this, it still seems like it could be useful for readers to make a prediction at this point. Out of 694 models of air conditioner labeled as either one-hose or two-hose, 3 are two-hose.

This seems like strong evidence that the market successfully adapts to actual consumer needs where air conditioner hose count is concerned.

I must admit I was surprised by the statistics here. It is true that if you only use the air conditioner a few days a year, the energy efficiency is not important. However, the cooling capacity is important. I think many people are using efficiency to mean cooling capacity above. Anyway, let's say the incremental cost of going from one hose to two hoses is $30. From working on Department of Energy energy efficiency rules, typically the marginal markup of an efficient product is less than the markup on the product overall (meaning that the incremental cost of just adding a hose is less than the $20 of buying it separately). It is true that with a smaller area for the air to come into the device with a hose, the velocity has to be higher, so the fan blades need to be made bigger (it typically is one motor powering two different fan blades on two sides, at least for window units). But then you could save money on the housing because the port is smaller. The incremental cost of motors is low. Then if the air conditioner cost $200 to start with, that would be 15% incremental cost. Then let's say the cooling capacity increased by 25% (I would say it actually does matter that a T-shirt was used, which would allow room air in instead of just outdoor air, so it probably would be higher than this). What this means is that the two hose actually has greater cooling capacity per dollar, so you should choose a small two hose even if you don't care about energy use at all. Strictly this is only true with no economies of scale, which is not a great assumption. But I think overall it will hold. Another case where this would break down is if a person were plugging and unplugging many times, but I don't think that's the typical person. So I suspect what is going on is that people don't realize that the cooling capacity of the one hose is actually reduced more than the cost, so they should just be getting a smaller capacity two hose unit (at lower initial cost and energy cost).
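A quick sketch of the capacity-per-dollar arithmetic in the paragraph above, using its assumed numbers ($30 incremental cost on a $200 unit, 25% more cooling capacity). These are the comment's illustrative figures, not measurements:

```python
base_price = 200.0        # dollars, assumed one-hose unit price
incremental_cost = 30.0   # dollars, assumed cost of going from one hose to two
capacity_gain = 0.25      # assumed fractional increase in cooling capacity

cost_increase = incremental_cost / base_price

# Cooling capacity per dollar of the two-hose unit relative to the one-hose unit.
capacity_per_dollar_ratio = (1.0 + capacity_gain) / (1.0 + cost_increase)

print(f"cost increase: {cost_increase:.0%}")
print(f"capacity per dollar, two hose vs one hose: {capacity_per_dollar_ratio:.3f}x")
```

Any ratio above 1.0 means a smaller two-hose unit would deliver the same cooling for less money, under the no-economies-of-scale caveat mentioned above.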

There is a broader question here of whether there should be energy efficiency regulations. If people were perfectly rational and had perfect information, we would not need them. But not only are the incremental costs of energy efficiency regulations found to be economically beneficial by the US Department of Energy (basically a good return on investment), but a retrospective study found that the actual incremental cost of meeting the efficiency regulations was about an order of magnitude lower than predicted by the Department of Energy! So I think there's a very strong case for energy efficiency regulations.

This is a great comment. The graphs helped a lot.

It was obvious to you (visible to the consumer), prior to the experiment, that the two-hose model was superior to the one-hose model. Even after running a controlled experiment that demonstrated that both A/Cs were able to cool your apartment to a comfortable temperature, you described the one-hose model as "very shitty," showing that even this experiment was not powerful enough to convince you that both models would suit your needs just fine. Because it's so obvious to you that two-hose A/Cs are superior to one-hose A/Cs, you'll probably unnecessarily overpay for the two-hose model the next time you need an A/C.

Obviously I disagree, but I'd still say that's a pretty fair interpretation.

I just want to say that I found this comment personally helpful.

This is the problem with how the rationalist community approaches the concept of what it means to "make a rational decision" perfectly demonstrated in a single debate. You do not make a "rational decision" in the real world by reasoning in a vacuum.

Something about this seems on point to me. Rationalists, in general, are much more likely to be mathematicians than (for instance) mechanical engineers. It does seem right to me that when I look around, I see people drawn to abstract analyses, very plausibly at the expense of neglecting contextualized details that are crucial for making good calls. This seems like it could very well be a bias of my culture.

For instance, it's fun and popular to talk about civilizational inadequacy, or how the world is mad. I think that is pointing at something true and important, but I wonder how much of that is basically overlooking the fact that it is hard to do things in the real world with a bunch of different stakeholders and confusing mistakes.

In a lot of cases, civilizational inadequacy can be the result of engineers (broadly construed) who understand that "the perfect is the enemy of the good", pushing projects through to completion anyway. The outcome is sometimes so muddled as to be worse than having done nothing, but also, shipping things under constraints, even though they could be much better on some axes, is how civilization runs.

Anyway, this makes me think that I should attempt to do more engineering projects, or otherwise find ways to operate in domains where the goal is to get "good enough", within a bunch of not-always crisply-defined constraints. 

First of all, I agree with the gist of your comment.

That is uncomfortably hot.

I...do not agree.  I keep my room temperature 72-74.

I keep my room temperature around 68-70 F ish. The internet tells me that this is actually the definition of a "comfortable room temperature"

Going from first four google results for "what is comfortable room temperature":

WHO according to wikipedia: 64-75

www.cielowigle.com: 68-72

www.vivint.com: 68-76

www.provicincialheating.ca: 68-76

 

Seems like both of our preferred temperatures are consistent with "normal human being".

I'll edit the range, and note that "uncomfortably hot" is my opinion. Rest of my analysis / rant still applies. In fact, in your case, you don't need the AC unit at all, since you'd be fine with the control temperature.

My current best guess is that the estimate in my initial comment was correct and that for a typical case the efficiency loss is roughly 25-30% (e.g. when cooling from 85 to 70 in CA). I stand by the claim that your original post's language was hyperbolic for an effect of this size, that your theoretical reasoning turned out to be wrong (e.g. you called my estimate "obviously ridiculous"), and that you are misunderstanding many portable AC customers' preferences and the main costs from a second hose.

Post-morteming my estimate, I made two errors that happened to cancel out. I overestimated the temperature of exhaust by looking at this online comment (though I also think your AC might be on the unusually bad end), and I overlooked a crucial consideration raised by denkenberger here that reduces the efficiency loss ~2x.

ETA: actually it seems like humidity is also quite a large consideration, maybe increasing the efficiency loss by 1.5x. So now my best guess is more like 35-40%, significantly higher than the 25-30% estimate.

The 25-30% result is roughly the impression I got by googling 1-hose vs 2-hose AC. I don't think the experiment results presented in this post meaningfully change my best guess. I could still easily imagine that my guess is wrong, though I may not reply to further comments.

This was actually a kind of fun test case for a priori reasoning. I think that I should have been able to notice the consideration denkenberger raised, but I didn't think of it. In fact when I started reading his comment my immediate reaction was "this methodology is so simple, how could the equilibrium infiltration rate end up being relevant?" My guess would be that my a priori reasoning about AI is wrong in tons of similar ways even in "simple" cases. (Though obviously the whole complexity scale is shifted up a lot, since I've spent hundreds of hours thinking about key questions.)

Note that some of the difference between these numbers comes from me stating (1-hose efficiency) / (2-hose efficiency) and John stating (2-hose efficiency) / (1-hose efficiency) in this post. This comment talks about (1-hose efficiency) / (2-hose efficiency) since those are the numbers we were discussing for most of the comment thread, including the initial comment that I'm reaffirming. Other ways of stating "25-30% efficiency loss" are: a 1-hose AC cools by 25-30% less than a 2-hose AC, or wastes 25-30% of the energy it uses, or requires 33-43% more energy than a 2-hose AC to achieve the same result.
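As a quick check on how these equivalent phrasings relate, here is a small sketch that converts an efficiency loss, defined as 1 - (1-hose efficiency) / (2-hose efficiency), into the extra energy a 1-hose unit needs for the same cooling. The function name is made up for this example.

```python
def extra_energy_fraction(efficiency_loss: float) -> float:
    """Extra energy a 1-hose unit needs to match a 2-hose unit.

    efficiency_loss is 1 - (1-hose efficiency) / (2-hose efficiency),
    so the 1-hose unit needs 1 / (1 - loss) times the energy.
    """
    return 1.0 / (1.0 - efficiency_loss) - 1.0


for loss in (0.25, 0.30):
    print(f"{loss:.0%} efficiency loss -> {extra_energy_fraction(loss):.0%} more energy")
```

This reproduces the 33-43% range quoted above for a 25-30% loss.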

I overlooked a crucial consideration raised by denkenberger here that reduces the efficiency loss ~2x.

Thanks, it looks like you are referring to the net infiltration flow rate impact on the building. But there was also the consideration of humidity, and I did not see any humidity measurements in the data, so we are not able to resolve that one. Humidity sensors are fairly cheap, but notoriously unreliable. But one could actually measure the amount of water condensed pretty accurately to get an idea how much of the cooling of the air conditioner is going to condensing water versus cooling air (sensibly).

I didn't know how to estimate this effect but I was guessing the total impact on the bottom line is much smaller than the factors of 2 from the other factors, at least in CA (though it's definitely another factor I overlooked). I'm comfortable treating 85 to 70 in CA as a typical use case to benchmark efficiency for a portable AC.

That guess is coming from the rough sense that dehumidifiers use much less energy than air conditioners. I don't know if that's right and reflects that dehumidifying is pretty cheap (at least in CA), or if dehumidifiers are just normally used for relatively small humidity changes, or if I'm wrong about relative energy use. I also have a sense that when I run an AC it just doesn't produce very much water (and that the energy cost is like ~0.6kWh per liter).

Actually this seems pretty non-trivial to estimate. Do you know reasonable ballpark figures?

If you want to geek out on this you can use a psychrometric chart. For instance, if outdoor air is 85F and 50% relative humidity (RH), that's an enthalpy of about 35 BTU/lb of dry air. Typical exit air conditions on the cool side of an air conditioner are ~50F and 100% RH, so ~20 BTU/lb of dry air. The dehumidification portion would be going to 85F and ~30% RH or ~29 BTU/lb of dry air, so ~40% of the heat removed is in the form of condensing water (latent). This means you would take the sensible part and multiply by about 1.7 to get the total load on the air conditioner. If you were not drawing in outdoor air, the latent load would be much lower. So overall I think you're right that in CA the humidity correction is not as big as the other factors.
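For anyone who wants to follow the latent/sensible split, here is a sketch of the arithmetic using the approximate chart readings quoted above. The enthalpy values are the rough figures from the comment, not values computed from a psychrometric library, and the variable names are only for this example.

```python
# Approximate enthalpies read off a psychrometric chart, in BTU per lb of dry air.
h_outdoor = 35.0   # ~85F, 50% RH (outdoor air)
h_supply = 20.0    # ~50F, 100% RH (air leaving the cold coil)
h_dried = 29.0     # ~85F, ~30% RH (same temperature as outdoor, moisture removed)

total_load = h_outdoor - h_supply      # all heat removed per lb of dry air
latent_load = h_outdoor - h_dried      # portion spent condensing water
sensible_load = total_load - latent_load

latent_fraction = latent_load / total_load
total_over_sensible = total_load / sensible_load

print(f"latent fraction: {latent_fraction:.0%}")
print(f"total load / sensible load: {total_over_sensible:.2f}")
```

With these readings, about 40% of the load is latent, and the total load is about 1.7 times the sensible part.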

I don't think I can follow your calculation. My version would be:

  • You are intaking hot wet outside air (wet from both high RH and high temp). You need to cool it and condense a bunch of water out of it. There’s some ratio that’s fixed by the humidity and temperature of the outside vs inside air. I think that's what you are saying is around 40%? I think actually the number you are giving isn't quite what this calculation needs, but I'll run with it anyway.
  • If all the heat was coming in from outside air (either before turning on AC or from infiltration), then you’d have a fixed ratio of latent to sensible heat removed, so the ratio wouldn’t depend on how much additional infiltration you caused, and we could just ignore humidity when thinking about the efficiency loss.
  • But in fact some of the heat is coming in from other channels. I guess the other big one is sunlight through windows. That heat doesn’t come with any more humidity. Extra infiltration from 1-hose AC increases how much latent heat you need to remove per unit of sensible heat, by increasing the relative importance of infiltrated air vs sunlight and other sources of heat. So if we just calculate how much extra sensible heat you have to remove, we'll underestimate the efficiency loss.
  • The total extra infiltrated heat is about 25% of what the AC removes. At equilibrium, that’s 25% of all the heat gain in the house. If 13% of heat gain is normally from infiltration, then replacing that with 75% normal heat and 25% new infiltration would increase the fraction of heat from infiltration all the way to 35%. (I was super wrong about the 13% going in, I was expecting 25-50%!)
  • So per unit of heating, you are also increasing the fraction of heat coming from infiltrated air by 22%.
  • For the heat coming from infiltration, the extra cost of dehumidifying is about 2/3 of the sensible heat removed. So per unit of sensible heat removed, you need to remove an additional 15% of a unit of latent heat.
  • If the AC exhaust was more humid than the inside, then this would be lower, but my sense is that AC exhaust is basically as dry as indoor air?
  • So the net effect would be to take you from 25% efficiency loss (ignoring humidity) up to roughly 40% efficiency loss, which is pretty huge.
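The chain of numbers in the bullets above can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation as stated in the list (13% baseline infiltration share, 25% extra infiltrated heat, latent cost of about 2/3 of sensible heat); it does not check whether the model itself is right, and the variable names are invented for illustration.

```python
# Inputs as stated in the bullets above; all are rough guesses, not measurements.
baseline_infiltration_share = 0.13   # fraction of heat gain normally from infiltration
extra_infiltration_share = 0.25      # extra infiltrated heat as a fraction of heat removed
latent_per_sensible = 2.0 / 3.0      # dehumidifying cost per unit of sensible heat (infiltrated air)
sensible_efficiency_loss = 0.25      # efficiency loss ignoring humidity

# Replace the heat gain with 75% "normal" heat plus 25% new infiltration.
new_infiltration_share = (
    (1.0 - extra_infiltration_share) * baseline_infiltration_share
    + extra_infiltration_share
)
share_increase = new_infiltration_share - baseline_infiltration_share

# Extra latent heat to remove per unit of sensible heat removed.
extra_latent = share_increase * latent_per_sensible
total_efficiency_loss = sensible_efficiency_loss + extra_latent

print(f"infiltration share: {new_infiltration_share:.0%}")
print(f"increase in share: {share_increase:.0%}")
print(f"extra latent per unit sensible: {extra_latent:.0%}")
print(f"total efficiency loss: {total_efficiency_loss:.0%}")
```

Simply adding the latent term to the 25% figure is the same shortcut the list takes; a more careful treatment might combine the two differently.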

That was a super confusing calculation, definitely beyond my pay grade. I assume I got a ton of numbers/calculations wrong, that there were much simpler ways to do it, and that this overall computation is likely to be conceptually confused in one or more ways. So I’d be pretty curious for your bottom line estimate or intuition about where it should have ended up.

(But I also understand if you want to stop talking about AC and put this thread to rest...)

I would say that is basically right. AC exhaust is about as humid as indoor air. The fraction of the heating load in the summer due to infiltration really does depend on how tight your building construction is. With the numbers Jeff was assuming for a very old house, infiltration would be a much larger percentage. There are some other sources of heat in a house that come with humidity, such as people and showers, but overall it is much less humidity than bringing in outdoor air (there is heat conduction through the walls, electricity use of lighting and appliances, etc.). So that might mean that it would take you from a 25% efficiency loss (ignoring humidity) up to a 35% efficiency loss, which is still a big deal. But I'm not sure if 85°F in California typically corresponds to 50% relative humidity.

This was actually a kind of fun test case for a priori reasoning. I think that I should have been able to notice the consideration denkenberger raised, but I didn't think of it. In fact when I started reading his comment my immediate reaction was "this methodology is so simple, how could the equilibrium infiltration rate end up being relevant?" My guess would be that my a priori reasoning about AI is wrong in tons of similar ways even in "simple" cases. (Though obviously the whole complexity scale is shifted up a lot, since I've spent hundreds of hours thinking about key questions.)

This idea -- that you should have been able to notice the issue with infiltration rates -- is what I've been questioning when I ask "what is the computational complexity of general intelligence" or "what does rational decision making look like in a world with computational costs for reasoning".

There is a mindset that people are simply not rational enough, and if they were more rational, they wouldn't fall to those traps. Instead, they would more accurately model the situation, correctly anticipate what will and won't matter, and arrive at the right answer, just by exercising more careful, diligent thought.

My hypothesis is that whatever that optimal "general intelligence" algorithm[1] is -- the one where you reason a priori from first principles, and then you exhaustively check all of your assumptions for which one might be wrong, and then you recursively use that checking to re-reason from first principles -- it is computationally inefficient enough that for most interesting[2] problems, it is not realistic to assume that it can run to completion in any reasonable[3] time with realistic computation resources, e.g. a human brain, or a supercomputer.[4] 

I suspect that the human brain is implementing some type of randomized vaguely-Monte-Carlo-like algorithm when reasoning, which is how people can (1) often solve problems in a reasonable amount of time[5], (2) often miss factors during a priori reasoning but understand them easily after they've seen it confirmed experimentally, (3) different people miss different things, (4) often if someone continues to think about a problem for an arbitrarily long period of time[6] they will continue to generate insights, and (5) often those insights generated from thinking about a problem for an arbitrarily long period of time are only loosely correlated[7].

In that world, while it is true that you should have been able to notice the problem, there is no guarantee on how much time it would have taken you to do so.

  1. ^

    The "God algorithm" for reasoning, to use a term that Jeff Atwood wrote about in this blog post. It describes the idea of an optimal algorithm that isn't possible to actually use, but the value of thinking about that algorithm is that it gives you a target to aim towards.

  2. ^

    The use of the word "interesting" is intended to describe the nature of problems in the real world, which require institutional knowledge, or context-dependent reasoning.

  3. ^

    The use of the word "reasonable" is intended to describe the fact that if a building is on fire and you are inside of it, you need to calculate the optimal route out of that burning building in a time period that is less than a few minutes in length in order to maximize your chance of survival. Likewise, if you are tasked to solve a problem at work, you have somewhere between weeks and months to show progress or be moved to a separate problem. For proving a theorem, it might be reasonable to spend 10+ years on it if there's nothing necessitating a more immediate solution.

  4. ^

    This is mostly based on an observation that for any scenario with, say, some fixed number of "obvious" factors influencing it, there are effectively arbitrarily many "other" factors that may influence the scenario, and the process of deterministically ordering an arbitrarily long list and then proceeding down the list from "most likely to impact the scenario" to "least likely to impact the scenario" to manually check if each "other" factor actually does matter has an arbitrarily high computational cost.

  5. ^

    Feel free to put "solve" in quotes and read this as "halt in a reasonable time" instead. Getting the correct answer is optional.

  6. ^

    Like mathematical proofs, or the thing where people take a walk and suddenly realize the answer to a question they've been considering.

  7. ^

    It's like the algorithm jumped from one part of solution space where it was stuck to a random, new part of the solution space and that's where it made progress.

My take is: you shouldn’t expect to get everything right when you try to reason about a moderately complicated system abstractly, no matter how smart you are. You’d like to have a lot of practice so that you can do your best, can get a sense for what kinds of things you tend to miss and how they change the bottom line, can better understand what the returns to thinking are typically like, and so on. This was a fun and unusually self-contained example, where we happened to miss an important and very clean consideration that can be appreciated with very little domain knowledge. (I think realistic cases are usually much more of a mess.)

In this case, I feel pretty confident that I would have noticed this consideration if I thought about the question for a few hours (and probably less), and I think that it would become obvious if you tried to write out your reasoning sufficiently carefully. But even if I spend hundreds of hours thinking about some issue with AI, I expect to miss all kinds of important and obvious-in-retrospect considerations in a roughly analogous way. (This is related to my view that verification is easier than generation.)

I don’t think that means we shouldn’t try to figure things out by thinking about them. Thinking about what’s going on is an important part of how to get to correct answers quickly and an important complement of empirical data (you need to think when empirical data is hard to come by, to help interpret history and the results of experiment, to prioritize experimentation, etc.).

I’m not sure if your comment is disagreeing with any of this. It sounds like we’re on the same page about the fact that exact reasoning is prohibitively costly, and so you will be reasoning approximately, will often miss things, etc.

Of course, I think even if you successfully notice every on-paper consideration, there are still likely to be messy facts about the real world that you either didn’t know or obviously had no hope of capturing in a model that’s simple enough to reason about. That said, I think that reasoning in practice is basically never purely in this regime (and if you do literally get to this regime for a question, in some sense you’ve probably spent too long thinking about the question relative to doing something else), so in practice wrong conclusions are almost always due to a combination of both "not knowing enough" and “not thinking hard enough” / “not being smart enough.”

I’m not sure if your comment is disagreeing with any of this. It sounds like we’re on the same page about the fact that exact reasoning is prohibitively costly, and so you will be reasoning approximately, will often miss things, etc.

I agree. The term I've heard to describe this state is "violent agreement". 

so in practice wrong conclusions are almost always due to a combination of both "not knowing enough" and “not thinking hard enough” / “not being smart enough.”

The only thing I was trying to point out (maybe more so for everyone else reading the commentary than for you specifically) is that it is perfectly rational for an actor to "not think hard enough" about some problem and thus arrive at a wrong conclusion (or correct conclusion but for a wrong reason), because that actor has higher priority items requiring their attention, and that puts hard time constraints on how many cycles they can dedicate to lower priority items, e.g. debating AC efficiency. Rational actors will try to minimize the likelihood that they've reached a wrong conclusion, but they'll also be forced to minimize or at least not exceed some limit on allowed computation cycles, and on most problems that means the computation cost + any type of hard time constraint is going to be the actual limiting factor.

Although even that, I think that's more or less what you meant by

in some sense you’ve probably spent too long thinking about the question relative to doing something else

In engineering R&D we often do a bunch of upfront thinking at the start of a project, and the goal is to identify where we have uncertainty or risk in our proposed design. Then, rather than spend 2 more months in meetings debating back-and-forth who has done the napkin math correctly, we'll take the things we're uncertain about and design prototypes to burn down risk directly.

I bought a single-hose AC unit. I knew two-hose units existed, and that a two-hose design intuitively seems to be the way to go for good thermodynamic reasons, but I did it anyway. This was mostly, as I remember, for four reasons:

  1. Apparent air-conditioner experts seemed to think the one-hose models worked OK and that hot air infiltration was a manageable problem. Especially for the cool-a-single-inhabited-room application.
  2. The one-hose models were significantly cheaper. This in turn translated to solving my problem significantly sooner, because I did not need to save for longer to afford it.
  3. I was already going with a portable unit rather than a window unit, because I anticipated needing to move it every day, and the one hose model had... one hose to hook up and unhook every time.
  4. I only recognized the need to buy an air-conditioner on hot days. On cool days, I did not feel like spending the money. So when I actually made my purchasing decision, I bought a plausible and available unit over the optimal unit, to ensure that a unit was bought.

On the design side, while clearly it would be better not to suck warm air into the room if you don't have to, the engineers are up against competing problems:

  1. You are working with a much smaller intake. Compare the intake on the back of your unit to the area of the output hose. Even your makeshift intake duct has a much higher area than the normal hoses. When the intake is smaller, you need faster flow, and more fan to make that happen.
  2. You need more internal ducting inside the unit. You can't just draw room air more or less straight over the heat exchanger like in a refrigerator, and then blow it out; you need to internally go from a small intake hose to a big heat exchanger to a small output hose, all without mixing with the room air, and you need another completely separate pathway for the room air.
  3. You still need a room air intake.
  4. You need an additional fan. The room air and the outdoor air can't be pushed by the same fan if they are isolated loops. The additional fan makes additional noise. A single-hose system can use one fan and divide the air for the two paths.
  5. You have to fit the intake into the window opening in addition to the exhaust. You can't usually use as much window as you are covering in your experimental setup; it has to fit in a much smaller area or people will reject the solution. My unit had two plastic pieces that nest and slide over each other to adapt to the width of your window, with an opening for the exhaust to snap into. If you tried to add another opening to that design, it would be obstructed at small window sizes, so you would either need to accept only fitting large windows, find some other way to adapt to windows of varying width, or make the hoses even narrower and the flow even faster.
  6. You are dumping the heat into warmer air. This is harder to do than dumping heat into cooler air. You might need a better heat exchanger.

As it was, my single-hose unit was bumping up against size, weight, cost, and noise limits. While it might be able to do more cooling per watt if given another hose, it might also then not meet other design constraints, and thus not actually solve my problem.

My tentative conclusion from looking at these results is that they're inconclusive because neither metric captures what consumers care about? As others have pointed out, the ability to cool down a room more than you need isn't necessarily impressive.

More relevant may be something like

  • days in year in which the one-hose model isn't good enough
  • amount of time you have to leave it on
  • level of noise it makes
  • energy consumption

(Note that I'm not saying there was an easy way to measure any of these.)

You could read the post's results as suggesting that a two-hose model performs better according to those metrics, but it's not trivially true.

Measuring energy consumption is cheap and easy with a $30 Kill-a-Watt: https://www.amazon.com/P3-P4400-Electricity-Usage-Monitor/dp/B00009MDBU/ref=sr_1_3?crid=2F9E51CGW3ZDS&keywords=kill+a+watt&qid=1656014636&sprefix=kill+a+watt%2Caps%2C107&sr=8-3

I propose a follow-up experiment to measure daily energy consumption, alternating hose configuration with the same set temperature. The previous experiment tried to answer "how much does maximum cooling power change between configurations," while here we would answer "how much does efficiency change between configurations."

Potential issues:

  1. If this causes your unit to run at different power levels, you would also capture any efficiency change based on the power level, but I would guess your unit regulates simply as on/off (check the (instantaneous) power consumption with the Kill-a-Watt to be sure).
  2. If one configuration works faster, it may not do as much work on the other side of the room before the unit senses "cool," and turns off.  Fans in the room increasing circulation will mitigate this.
  3. I don't know how many days you would need for good statistics to smooth over all the day-to-day environmental changes.
  4. You should also analyze weekend vs. weekday, and possibly exclude one.
  5. You will need to monitor that both configurations actually are up to the task.

I'm not sure I want to register an advance prediction, but if OP agrees to do this, I will at least put some thought in towards one.

One relatively simple way to correct for the problem: use the temperature from the control test as the baseline temperature, rather than using the outdoor temperature as baseline. If we do that, then with the fan on high the results are:

AC cools the room by 2.6°F (1.4°C) relative to control with one hose

AC cools the room by 5.1°F (2.8°C) relative to control with two hoses (despite the outdoor temperature being slightly higher during the two-hose test)

How do you get these numbers? Shouldn’t they be 5.6°F and 9.1°F respectively?

I took:

  • (last-measured average indoor temperature with 2 hoses (63.8°F)) - (last-measured average indoor temperature in control condition (68.9°F)), and
  • (last-measured average indoor temperature with 1 hose (66.3°F)) - (last-measured average indoor temperature in control condition (68.9°F))
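Spelled out as a small sketch, using only the three last-measured averages quoted above (the variable names are just for this example):

```python
# Last-measured average indoor temperatures (°F), as quoted above.
control_indoor = 68.9
one_hose_indoor = 66.3
two_hose_indoor = 63.8

# Cooling relative to the control run, ignoring outdoor temperature.
one_hose_cooling = control_indoor - one_hose_indoor
two_hose_cooling = control_indoor - two_hose_indoor
ratio = two_hose_cooling / one_hose_cooling

print(f"one hose: {one_hose_cooling:.1f}°F below control")
print(f"two hoses: {two_hose_cooling:.1f}°F below control")
print(f"two-hose / one-hose: {ratio:.2f}")
```

The ratio of about 2 is where the "twice as well" takeaway comes from; it depends on treating the control run as a fair baseline.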

There are a few different calculations one could reasonably do instead; I expect they yield qualitatively-similar numbers.

I’m referring to this:

Temperature difference between outdoor and indoor (higher is better), in each test:

  • Low fan: 20.6°F (11.4°C) with one hose, 22.7°F (12.6°C) with two hoses
  • High fan: 18.7°F (10.4°C) with one hose, 22.2°F (12.3°C) with two hoses
  • Control: 13.1°F (7.3°C)

How does that square with the other numbers you just gave…?

The numbers you quote were all relative to outdoor temperatures. The numbers I just gave ignore the outdoor temperature. They are different mainly because the outdoor temperature changed somewhat over the course of the day.

Since the outdoor temperature was lower in the control, ignoring it will inflate how much the two-hose unit outperforms by bringing the effect of both units closer to zero. If we assume the temperature difference the units and the control produce are approximately constant in this outdoor temperature range, then the difference to control would be 3.1ºC for the one hose unit and 5ºC for the two hose unit if the control outdoor temperature was the same, meaning two-hose only outperforms by ~60% with the fan on high, and merely ~30% with the fan on low.
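Here is a sketch of that alternative calculation, using the outdoor-minus-indoor differences (°C) quoted earlier in the thread and assuming, as the comment does, that each configuration's difference stays roughly constant over this outdoor temperature range. The helper name is hypothetical.

```python
# Outdoor-minus-indoor temperature differences (°C), from the post's summary.
control_delta = 7.3
deltas = {
    "low fan": {"one_hose": 11.4, "two_hose": 12.6},
    "high fan": {"one_hose": 10.4, "two_hose": 12.3},
}


def outperformance(one_hose_delta: float, two_hose_delta: float, control: float) -> float:
    """Fractional advantage of two hoses over one, both measured relative to control."""
    return (two_hose_delta - control) / (one_hose_delta - control) - 1.0


results = {
    fan: outperformance(d["one_hose"], d["two_hose"], control_delta)
    for fan, d in deltas.items()
}

for fan, advantage in results.items():
    print(f"{fan}: two-hose outperforms by {advantage:.0%}")
```

This gives roughly 60% on high fan and 30% on low fan, matching the figures in the comment above.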

I didn't even think to check this math, but now that I've gone and tried to calculate it myself, here's what I got:

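| | VALUE (°F) | INSIDE (°F) | Δ INSIDE (CONTROL) (°F) |
|---|---|---|---|
| AVERAGE OUTSIDE | 86.5 | | |
| AVERAGE ONE HOSE Δ | 19.65 | 66.85 | 6.55 |
| AVERAGE TWO HOSE Δ | 22.45 | 64.05 | 9.35 |
| CONTROL Δ | 13.1 | 73.4 | |

ΔTWO/ΔONE: 1.42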

EDIT: I see the issue. The parent post says that the control test was done in the evening, when the temperature was 82 F. So it's not even comparable at all, imo.

The parent post says that the control test was done in the evening, when the temperature was 82 F. So it's not even comparable at all, imo.

+1 to this criticism, that's a very valid problem which people should indeed be suspicious about, although "not even comparable at all" is overstating it (especially since we know what direction that problem should push).

The thermal time constant of a building is around a day, so you should really be running each of these tests for more than a day (and correcting for differences in ambient conditions). Basically, the control should exceed the average ambient temp because of solar and internal (e.g. electricity consumption) gains. And see my other comment about doing something about humidity removal. Then we might actually have something rigorous (based on doing an experiment with fairly expensive equipment, I still had error bars around +/-1°C, so I don't think you have very much confidence at this point).

Noting for myself: I didn't make an explicit prediction, but I emotionally expected John to be vindicated by this experiment. My emotional prediction was wrong, and that seems good to notice, even if I don't do much further reflection.

The air conditioner was intended as an example in which a product is shitty in ways the large majority of consumers don’t notice, and therefore market pressures don’t fix it.

But they do: among air-air heat pumps, dual hose air conditioners exist (but one hose versus two hoses is a huge gain in convenience), as do window air conditioners which are better (for efficiency; they cannot be installed in all windows), as do heat pumps with split indoor and outdoor units, which are much better (but more expensive). And ground-source heat pumps, which are better still, exist as well (but are still more expensive upfront and often not subsidized by utility companies and governments like air-air heat pumps are; but, like regulations on the units and on the people installing them, this depends on location and there are places where they are widely used for heating and air conditioning). And simple fans, which are not even air conditioners, also exist. The market offers the entire range of possible tradeoffs between efficiency, convenience, and cost. And different consumers are using products in this entire range.

… though at the same time, a counter has incremented in the back of my head, and I do have a slight concern that I’m avoiding evidence against the “people don’t notice major problems” model.

You are avoiding evidence against that model, but not in the way you think. It’s because you were looking at air conditioner ratings on Amazon, which gives you an impression of consumer preferences that is biased for convenience.

There are a lot of people using more efficient systems for air conditioning that they also use for heating. Searching for air conditioners on Amazon will give you a distorted picture because it selects against systems that are also meant for heating and systems that usually require professional installation—these are the most efficient systems, so searching on Amazon gives you a strong selection bias against efficiency and in favor of convenience. But that doesn’t mean that the majority of consumers don’t notice what products are more efficient. It’s just that Amazon search results for air conditioners aren’t representative of the market: the most efficient air conditioners aren’t marketed as air conditioners and consumers don’t purchase them on Amazon.

The most common product used by consumers worldwide, the ductless mini split, is highly efficient. In most circumstances it is likely cheaper to operate than geothermal because of lower installation and equipment costs. I think you're onto something here: consumers who need a temporary system sporadically, or who need the cheapest possible system, benefit from 1-hose. And if they need efficiency, air-air and geothermal are much more efficient.

2-hose is less convenient and more expensive, and marginally more efficient. Its niche is apparently just not very deep. Window units are superior to 2-hose in every stat except that they work with fewer types of windows and are more visible from the outside of a building.

You've convinced me of your main points.

I recently noticed another complication for this analysis.

I've got a single-hose AC that's closer to the window than yours, with some gaps near the hose that let some air in from the outside. That means I've accidentally set it up to partly act as if it had two hoses. At one point the AC reported its intake temperature at 86F, while a distant part of the room was 80F, versus around 95F outside. [The AC is nowhere near as effective as these numbers suggest. Mostly the room has enough insulation to delay the effects of the outside temperature by hours.]

Yup, I think that's plausibly an issue for my setup too, although I didn't park the AC too close to the window. At least we know what direction that problem should push: it should make the one-hose more effective in my test than it would be for someone with a better-sealed window.

Thanks for doing this, but this is a very frustrating result. Hard to be confident of anything based on it.

I don't think treating the 'control' result as a baseline is reasonable. My best-guess analysis is as follows:

Assume that dTin/dt = r ((Tout - C) - Tin)

where

  • Tin is average indoor temperature
  • t is time
  • r is some constant
  • Tout is outdoor temperature
  • C is the 'cooling power' of the current AC configuration. For the 'off' configuration we can assume this is zero.

r obviously will vary between configurations, but I have no better idea than pretending it doesn't so that we can solve for it in the control condition and then calculate C for the one-hose and two-hose conditions.
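
To make the role of C concrete: if Tout and C are held constant, the equation above is a linear ODE whose solution relaxes exponentially toward Tout - C. Here is a minimal Python sketch of that closed form. The function name and all the example numbers are made up for illustration; none of them come from the experiment.

```python
import math

def indoor_temp(t_hours, t_in0, t_out, r, c):
    """Closed-form solution of dTin/dt = r * ((Tout - C) - Tin).

    Assumes Tout and C stay constant over the interval, which the
    real experiment does not guarantee.
    """
    equilibrium = t_out - c  # temperature the room relaxes toward
    return equilibrium + (t_in0 - equilibrium) * math.exp(-r * t_hours)

# Hypothetical values, chosen only to illustrate the shape of the curve.
start_temp = 25.0   # indoor temperature at t = 0
outdoor = 35.0      # outdoor temperature
rate = 0.1          # r, per hour
cooling = 16.0      # C, in degrees

for hours in (0, 1, 5, 20, 100):
    print(hours, round(indoor_temp(hours, start_temp, outdoor, rate, cooling), 3))
```

Under this model, C is just how far below the outdoor temperature the room would eventually settle if the AC ran forever.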

Results?

Using the average temperature difference to approximate dTin/dt as constant, we get:

In the 'off' configuration: 0.5 hours * dTin/dt = 0.5 hours * r * (14 degrees) = 0.889 degrees

Giving r = 0.127 (degrees per degree-hour)

In one-hose: 1 hour * dTin/dt = 1 hour * r * (19.1111 - C) = 0.3333 degrees

Giving C = 16.486 degrees

In two-hose: 0.5 hours * dTin/dt = 0.5 hours * r * (22.944 - C) = -0.555 degrees

Giving C = 31.693 degrees

Also finding that the two-hose version has roughly double the cooling power!
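
For anyone who wants to check the arithmetic, here is a short Python sketch of the same calculation. The function names are my own, and the inputs are the rounded figures quoted above, so the outputs can differ from the quoted C values in the second decimal place.

```python
def fit_r(delta_tin, hours, gap):
    """Solve delta_tin = hours * r * gap for r, with C assumed to be zero.

    gap is the average (Tout - Tin) over the run.
    """
    return delta_tin / (hours * gap)

def cooling_power(delta_tin, hours, gap, r):
    """Solve delta_tin = hours * r * (gap - C) for C, given a fixed r."""
    return gap - delta_tin / (hours * r)

# Control run: AC off, so C = 0 and we can solve for r.
r = fit_r(delta_tin=0.889, hours=0.5, gap=14.0)

# Reuse the control r for both AC configurations (the big assumption here).
c_one_hose = cooling_power(delta_tin=0.3333, hours=1.0, gap=19.1111, r=r)
c_two_hose = cooling_power(delta_tin=-0.555, hours=0.5, gap=22.944, r=r)

print("r =", round(r, 3))
print("C one-hose =", round(c_one_hose, 2))
print("C two-hose =", round(c_two_hose, 2))
print("ratio =", round(c_two_hose / c_one_hose, 2))
```

The ratio comes out a bit under 2, consistent with "roughly double". Note that the ratio is sensitive to r: since r was fit from a single half-hour control run, a different r would shift both C values and the ratio between them.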

When I purchased an air conditioner recently, I paid extra for a fancy Midea U-shaped model. While this model is reportedly more energy efficient than a typical window AC, I chose it only because it was also quieter than a typical AC. I think I assumed the claims about energy efficiency were going to be overblown and not actually impact my bill in a visible way.

Surprisingly it really has a big impact, though I suspect the unit it replaced was particularly inefficient. If I had known I'd save more than $15 a month I would have prioritized it. (Electric costs have spiked recently in my area.)