Epistemic Status: This represents fairly early work. The terminology isn't at all set in stone.
Scholarship Status: I've spent several hours attempting to investigate this topic, and in the past have spent quite a while researching Information Theory and Applied Information Economics. I'm sure I'm missing very valuable literature, but I can't find it. Comments well appreciated.

Thanks to David Manheim and Nuño Sempere for comments on this post.

Introduction

As anyone who’s read an arduous textbook knows, learning comes with costs as well as benefits. As anyone who’s decided against reading at least one textbook knows, often the expected costs outweigh the expected benefits.

The interplay of the costs and benefits of information leads to an environment of efficient trade-offs. Tweets and infographics often do most of the work of popular nonfiction books. Lossy computational compression schemes offer good-enough quality for greatly reduced storage costs. Common verbal communication is typically far less precise and rigorous than formal philosophical proofs, but is far more practical.

As the old saying goes, “All models are wrong, but some are useful.” A cost-benefit approach would say, “Models represent trade-offs in situations where absolute accuracy doesn’t justify its cost.” I think this basic fact, that educational materials, models, theories, and language all represent trade-offs between accuracy and costs, should really be fairly obvious and commonly acknowledged by now.

Interestingly enough, fairly little work in Information Theory seems to have gone deep into modeling these trade-offs. It’s often assumed that information is either freely absorbed, or in some situations limited to a particular fixed communication channel. Some work on Value of Information analysis estimates how much one might be willing to pay for specific information, but it often makes very large assumptions that only hold in particular settings.

I’m interested in developing better intuitions and vocabulary around these tradeoffs. Here are some initial attempts. I’m sure I’m missing a lot of great work, but I’ve had problems finding it. If you have thoughts or recommended references, I’d very much appreciate you sharing them. 

A Very Simple Model

We’re going to begin with models of textbooks, not because these are particularly important, but because I think they are particularly easy to intuit. These models will later be extended to more interesting areas.

Say you begin reading an interesting textbook to study for a test in one month. This is one of your reading materials among several, so you don’t have time to read and fully process every single word in said book. 

You might begin by skimming the book, then proceed to read what seem like the most important sections. You’ll continue until you become convinced that your time would be better spent on different materials. More succinctly, you continue reading until the expected costs of marginal investment begin to outweigh the expected benefits.

This is a classic Microeconomics case of diminishing marginal utility. Below is a representation of the corresponding curve. 

Note that the expected benefit and expected cost axes use the same scale. Therefore, at every point on the diagonal between them (the Indifference Line), the expected benefit is equal to the expected cost, meaning that the total net value is 0.

As is true for situations of diminishing marginal utility, assuming you are rational, you’d aim to stop reading the book around when the tangent of the curve is at 45°, that is, when the marginal costs begin to exceed the marginal benefits.

We can draw a “Stopping Point” at this point. The vertical line from it down to the indifference line represents the net benefit, or “value”, achieved if learning stops here. This point is also sometimes called the “point of maximum yield.”
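
Here is a minimal sketch of this stopping rule. The benefit curve, its constants, and the grid of candidate costs are all hypothetical choices made for illustration; the only idea taken from the text is that you stop where net value (benefit minus cost) is highest.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Hypothetical benefit curve with diminishing marginal returns (an assumption,
-- not a curve from the post). Cost and benefit share the same scale.
expectedBenefit :: Double -> Double
expectedBenefit cost = 100 * (1 - exp (negate cost / 30))

-- Net value: the vertical distance from the curve down to the indifference line.
netValue :: Double -> Double
netValue cost = expectedBenefit cost - cost

-- Candidate amounts of cost to invest: 0, 1, ..., 100 (an arbitrary grid).
candidateCosts :: [Double]
candidateCosts = [0, 1 .. 100]

-- The stopping point is the candidate with the highest net value. On a concave
-- curve this is where marginal benefit stops exceeding marginal cost.
-- The list must be non-empty, since maximumBy fails on an empty list.
stoppingPoint :: [Double] -> Double
stoppingPoint = maximumBy (comparing netValue)

main :: IO ()
main = do
  let stop = stoppingPoint candidateCosts
  putStrLn ("Stop after investing: " ++ show stop)
  putStrLn ("Net value there:      " ++ show (netValue stop))
```

Searching over a grid rather than solving for the 45° tangent analytically keeps the sketch usable for curves that are only known at a few points.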

This book represents information, but the fact that there are known learning costs represents a complexity not typically mentioned in models of information. We can call bundles of information that require learning costs information assets.

Basic Functions

Information assets are items that represent trade-offs between information gained and cost for particular actors. We could imagine representing them with a few programming functions (represented here with Haskell-style type signatures).

information_acquisition_fn :: information_asset -> agent -> cost -> information
information_benefit_fn :: agent -> information -> expected_benefit

Alternatively, you could either combine these or skip the intermediate step. 

information_benefit_fn :: information_asset -> agent -> cost -> expected_information_benefit

One could get fancier by representing a space of potential ways of converting costs to information. The functions above assume that the best option is chosen, but sometimes you might want to model this extra complexity.

The details of what agents, costs, and information are, are abstracted here. I imagine that there are many potential definitions that would all work well enough to be interesting.

The important thing is that this main information_benefit_fn function should generally represent diminishing marginal utility, and should help us represent the most important aspect of information assets. 
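
To make the signatures concrete, here is one possible instantiation. Every type, field name, and curve shape below is a placeholder I am assuming for illustration (the text deliberately leaves agents, costs, and information abstract); the point is only that the composed function shows diminishing marginal returns.

```haskell
-- All types, fields, and curve shapes here are illustrative placeholders.
newtype Cost = Cost Double deriving (Show)
newtype Information = Information Double deriving (Show)
newtype ExpectedBenefit = ExpectedBenefit Double deriving (Show)

data Agent = Agent
  { learningRate :: Double -- how quickly this agent absorbs the asset
  , valuePerUnit :: Double -- how much each unit of information is worth to them
  }

newtype InformationAsset = InformationAsset { totalInformation :: Double }

-- Information gained saturates toward the asset's total as cost grows.
informationAcquisitionFn :: InformationAsset -> Agent -> Cost -> Information
informationAcquisitionFn asset agent (Cost c) =
  Information (totalInformation asset * (1 - exp (negate (learningRate agent * c))))

-- Benefit is assumed linear in information for this agent.
informationBenefitFn :: Agent -> Information -> ExpectedBenefit
informationBenefitFn agent (Information i) = ExpectedBenefit (valuePerUnit agent * i)

-- The combined form that skips the intermediate step.
expectedInformationBenefit :: InformationAsset -> Agent -> Cost -> ExpectedBenefit
expectedInformationBenefit asset agent =
  informationBenefitFn agent . informationAcquisitionFn asset agent

benefitValue :: ExpectedBenefit -> Double
benefitValue (ExpectedBenefit b) = b

exampleTextbook :: InformationAsset
exampleTextbook = InformationAsset 120

exampleStudent :: Agent
exampleStudent = Agent { learningRate = 0.05, valuePerUnit = 1 }

-- Expected benefit to the example student of spending a given cost on the textbook.
benefitAt :: Double -> Double
benefitAt c = benefitValue (expectedInformationBenefit exampleTextbook exampleStudent (Cost c))

main :: IO ()
main = mapM_ (\c -> putStrLn (show c ++ " -> " ++ show (benefitAt c))) [0, 10, 20, 40]
```

Wrapping each quantity in its own newtype keeps costs, information, and benefits from being mixed up by accident, even though all three are plain numbers in this sketch.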

We could of course also represent these functions as math equations, but I am more experienced in programming, and I also prefer thinking in code for things like this.

Comparisons of different information assets

 

Now, we can compare the marginal value curves of different information assets. The green line represents a good book, the pink one an infographic. As you can see, the infographic produces great returns for a short period of time, but then levels off, as there’s typically not all too much to learn from an infographic. In comparison, the good book takes longer to provide the same amount of value, but in this case winds up producing more in total.

The bad book, on the other hand, might have a lot of potential benefit, but has negative marginal value, so would never be started. Therefore, in this model, the bad book has high potential benefit, but zero expected value.

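| Asset       | Potential 100% Benefit | Benefit at Stopping Point | Cost at Stopping Point | Expected Value |
|-------------|------------------------|---------------------------|------------------------|----------------|
| Good Book   | 120                    | 70                        | 30                     | 40             |
| Bad Book    | 100                    | 0                         | x                      | 0              |
| Infographic | 30                     | 30                        | 5                      | 25             |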
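
The expected values in this table can be reproduced with a small sketch. The record and function names are hypothetical, and I am assuming that the "x" in the Bad Book row means the book is never started, so it has no stopping point and contributes zero value.

```haskell
data Asset = Asset
  { assetName :: String
  , potentialBenefit :: Double
  , stoppingPoint :: Maybe (Double, Double) -- (benefit, cost); Nothing if never started
  }

-- Expected value is benefit minus cost at the stopping point,
-- or zero for an asset that a rational reader never starts.
expectedValue :: Asset -> Double
expectedValue asset =
  case stoppingPoint asset of
    Nothing -> 0
    Just (benefit, cost) -> benefit - cost

-- The three rows of the table above.
tableAssets :: [Asset]
tableAssets =
  [ Asset "Good Book" 120 (Just (70, 30))
  , Asset "Bad Book" 100 Nothing
  , Asset "Infographic" 30 (Just (30, 5))
  ]

main :: IO ()
main = mapM_ report tableAssets
  where
    report asset = putStrLn (assetName asset ++ ": " ++ show (expectedValue asset))
```

Note that potential benefit plays no role in the result, which is the point of the bad book example: high potential benefit with zero expected value.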

What do you think most available information assets would look like, on this graph? Well, if costs include opportunity costs, in an environment with many potential information assets, the vast majority would represent zero value. This might look like the curves in the following diagram.

Benefit vs. Value

Let’s now introduce a new plot, one of benefit vs. value. This can be valuable because it hints at something like efficiency: how much of the benefit is realized as value? 100% efficiency is clearly the maximum, so at any particular benefit, the maximum potential value (assuming we might be able to have zero costs) is equal to that benefit. We’ll just go ahead and shade the area where value is greater than benefit as an “impossible zone”.

We’ll show blue dots for the stopping points of the information assets (the infographic and the book).
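
As a rough sketch, efficiency can be computed as value divided by benefit at the stopping point. The function names are hypothetical, the stopping-point numbers are the ones given for the good book and the infographic earlier, and the impossible-zone check assumes costs are never negative.

```haskell
-- Efficiency: the share of realized benefit that survives as value.
-- Returns Nothing when there is no positive benefit to divide by.
efficiency :: Double -> Double -> Maybe Double
efficiency benefit cost
  | benefit <= 0 = Nothing
  | otherwise = Just ((benefit - cost) / benefit)

-- With non-negative costs, value can never exceed benefit.
inImpossibleZone :: Double -> Double -> Bool
inImpossibleZone benefit value = value > benefit

main :: IO ()
main = do
  putStrLn ("Good book efficiency:   " ++ show (efficiency 70 30))
  putStrLn ("Infographic efficiency: " ++ show (efficiency 30 5))
  putStrLn ("Value 31 from benefit 30 impossible? " ++ show (inImpossibleZone 30 31))
```

On these numbers the infographic turns a larger share of its benefit into value than the book does, even though the book produces more value in total.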

One very obvious question we might have is how to produce information assets that will lie on different parts of this graph. To help answer that question, here’s a simple graph that’s filled in red to represent, very roughly, the expected costs of producing something in each part of it.

It’s normally more work to produce a book than it is to produce an infographic, especially if both must be positive expected utility. As the expected potential benefit increases, it becomes increasingly difficult to maintain a high efficiency. This is because within any particular topic, some subsets of information are typically much more valuable than other subsets. You’re kind of fighting multiple marginal utility diminishments; not only can a user skim to effectively compress each area, they can also prioritize the particularly valuable areas. 

The top right area is labeled “systems of delegation” to represent the sorts of information assets that I expect to exist here. For these, it’s important to point out that the costs of using information are not only time costs, but they can also be monetary. If you want to achieve a lot of value from available information about a medical condition, you can either read a lot about that condition, or defer to a professional who’s already analyzed all of the relevant knowledge. In some cases, these delegation systems can be automated, so can represent very low costs.

Fixed Costs vs. Variable Costs

Imagine we’re set on writing a textbook for the purpose of helping 500 students via informational expected value. There are clear fixed costs associated with writing the book. We probably have a spectrum of how much effort we can put into the book. We can rush it by simply transcribing our previous lectures without any cleanup, or we can spend a lot of time figuring it out from scratch. 

We’ll describe writing the book as the “fixed cost”, and the value that the students (n of them) receive when they read the book as the “fixed benefit”.

Again, we have a diminishing marginal utility curve. The aim is to find some convenient balance between costs and benefits. The total resulting value is the difference between the total value received from the students (note that this takes the costs and benefits for each student into account), and the costs spent on writing the book.

This curve might be a bit different from the ones above, because there might well be a period at the start where no textbook would justify its cost.[1] Maybe there’s an upfront cost for just having anything printed, that requires a substantial benefit to overcome. But eventually diminishing marginal utility will come into play.
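
A minimal sketch of this author-side trade-off follows. All of the numbers and curve shapes are made up for illustration: a per-student net value that saturates with writing effort, a flat upfront printing cost, and a constant cost per hour of writing. Only the structure (total student value minus the cost of writing) comes from the text.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Number of students, as in the example above.
studentCount :: Double
studentCount = 500

-- Hypothetical net value each student gets (their own costs already
-- subtracted), as a function of the hours the author puts into the book.
netValuePerStudent :: Double -> Double
netValuePerStudent effortHours = 40 * (1 - exp (negate effortHours / 200))

-- Hypothetical author cost: a flat upfront printing cost plus a cost per hour.
authorCost :: Double -> Double
authorCost effortHours = 1000 + 50 * effortHours

-- Total value: what all students gain, minus what writing the book costs.
totalValue :: Double -> Double
totalValue effortHours =
  studentCount * netValuePerStudent effortHours - authorCost effortHours

-- Effort levels to consider: 0, 10, ..., 1000 hours (an arbitrary grid).
effortOptions :: [Double]
effortOptions = [0, 10 .. 1000]

bestEffort :: Double
bestEffort = maximumBy (comparing totalValue) effortOptions

main :: IO ()
main = do
  putStrLn ("Total value with no effort: " ++ show (totalValue 0))
  putStrLn ("Best effort level (hours):  " ++ show bestEffort)
  putStrLn ("Total value at that level:  " ++ show (totalValue bestEffort))
```

With these made-up numbers, low effort levels have negative total value because of the upfront cost, and diminishing returns eventually make further effort not worth it.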

One important thing to consider here is that there are multiple stages at which fixed costs and benefits turn into variable costs and benefits, so there are many degrees of variability.

  • Information in a field gets converted into many books.
  • Each book has many readers.
  • Each reader will recall the information they have read many times.

At each point of the pipeline, authors or readers must make estimates to best spend fixed costs to gain later variable benefits. Most of these trade-offs will involve considerations of informational efficiencies. 

Information Asset Supply Chains

Let’s imagine that a new useful-but-not-groundbreaking medical discovery has been made. It begins with data and understanding in a particular scientist’s lab.

At this point, even if the scientist were to make this information public, it would be effectively useless to consumers. Consumers would need to discover, translate, and process this information. The costs of doing so would far exceed the expected benefits.

There are many situations where information assets must undergo several transformations or conversions before enabling utility in particular groups of consumers. We can attempt to make diagrams of this process, as shown below. This is the basic benefit vs. value graph shown a few times above, but with the addition of a slightly different y-axis going down. As stated earlier, consumers won’t be expected to attempt negative-value endeavors, so in order to represent the costs and benefits of these, we need to make conjectures. In this case, we can estimate the net value (which would be negative) were they to do the work necessary to obtain 20% of the benefit of these assets.

For example, it might take a somewhat average educated consumer 300 hours to learn how to analyze the results of an experiment themselves and take away 20% of the potential benefit they could get from it.

 

A medical experiment might result in “raw data”, which is particularly difficult to interpret. This is shown in the bottom right of the diagram. This raw data gets cleaned up and goes through several steps to eventually transform into an immediately valuable asset. 

Information assets below the zero-expected-value line basically represent unfinished or *capital goods*. Information assets at the starting point are raw goods. Assets above the line are finished goods.  
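
This classification can be sketched in a few lines. The stage names and net values below are invented for illustration, and I am assuming that an asset sitting exactly on the zero line counts as unfinished.

```haskell
data GoodKind = RawGood | CapitalGood | FinishedGood
  deriving (Show, Eq)

data Stage = Stage
  { stageName :: String
  , consumerNetValue :: Double -- estimated net value to a typical consumer
  }

-- The first stage in the chain is the raw good. Later stages are finished
-- goods if they sit above the zero-expected-value line, capital goods otherwise.
classifyChain :: [Stage] -> [(String, GoodKind)]
classifyChain stages = zipWith label [0 :: Int ..] stages
  where
    label position stage
      | position == 0 = (stageName stage, RawGood)
      | consumerNetValue stage > 0 = (stageName stage, FinishedGood)
      | otherwise = (stageName stage, CapitalGood)

-- A hypothetical chain for the medical discovery example.
exampleChain :: [Stage]
exampleChain =
  [ Stage "Raw lab data" (-300)
  , Stage "Cleaned dataset" (-120)
  , Stage "Journal article" (-40)
  , Stage "News summary" 5
  , Stage "Doctor's recommendation" 20
  ]

main :: IO ()
main = mapM_ print (classifyChain exampleChain)
```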

Information Asset Conversion Options

For a particular information asset on the above chart, we might imagine ways it could be transferred to different regions. Most common are moves toward the top-left. Moves toward the top-right quadrant, and occasionally toward the bottom-right quadrant, are possible too. Moves toward the bottom-left quadrant are only caused by accident or malice. Below we label a few clusters with common names for moves in different directions.

Translation: The information asset is converted into a format more amenable to a user; for example, from French to German. It’s likely that some information is lost, but hopefully fairly little.

Compression: Some information is lost. Sometimes it’s information that’s completely irrelevant to later users (“lossless” compression), but normally the lost information will represent at least some loss of potential benefit (“lossy” compression).

Summarization: A type of compression that typically aims to present a small fraction of the total information.

New context: Additional information is introduced to help contextualize the asset. This might help make the information asset easier to digest, and also might make it more valuable.

Accuracy verification: A small amount of information is introduced, in the form of a check on the accuracy of the information asset. This builds justified trust in the asset.

Introduction of new, messy data: New information is introduced, with some additional potential benefit. However, it initially makes the information asset cost more to use than it immediately returns.

Obfuscation: No data is lost, but it requires additional work to interpret. For example, maybe the data is removed from being publicly accessible, and now requires hunting down the right bureaucrat to access. 

Beneficiation

I've looked for some time for a good word to describe the process outlined in the Supply Chains section above. I think my favorite so far is beneficiation.

According to the Wikipedia page,

In the mining industry or extractive metallurgy, beneficiation is any process that improves (benefits) the economic value of the ore by removing the gangue minerals, which results in a higher grade product (ore concentrate) and a waste stream (tailings). There are many different types of beneficiation, with each step furthering the concentration of the original ore.

I believe the word came about by turning "benefit" into a verb.

I like "beneficiation" both because it seems very to the point and because it doesn't already have a narrower meaning around information.

Regarding information assets, we can define beneficiation as:

Beneficiation (information assets): Any process or action that assists in increasing the total value of an information asset.

Beneficiation here describes every part of the process of converting information into its final form, before it gets directly used for a decision. This includes:

  • Learning
  • Writing
  • Summarization
  • Data organization
  • Rewriting
  • Teaching
  • Automated data systems

I'm really not sure what the neatest mathematical models are for representing the value-add at each part of a beneficiation pipeline, but it feels like it should be rather simple. In theory, one should be able to identify bottlenecks, or particularly valuable or costly transition points.
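
One very simple candidate, offered as a sketch rather than a settled model: treat each step as having a cost and a before/after asset value, and define its net contribution as the value added minus the step's cost. The step names, the numbers, and the assumption that step costs and asset values share one unit are all hypothetical.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

data Step = Step
  { stepName :: String
  , stepCost :: Double    -- cost of performing this step
  , valueBefore :: Double -- asset value going into the step
  , valueAfter :: Double  -- asset value coming out of the step
  }

valueAdded :: Step -> Double
valueAdded step = valueAfter step - valueBefore step

netContribution :: Step -> Double
netContribution step = valueAdded step - stepCost step

-- The step that scores highest on some measure, if the pipeline is non-empty.
bestBy :: (Step -> Double) -> [Step] -> Maybe Step
bestBy _ [] = Nothing
bestBy score steps = Just (maximumBy (comparing score) steps)

-- A hypothetical pipeline with invented numbers.
examplePipeline :: [Step]
examplePipeline =
  [ Step "Data organization" 10 (-300) (-120)
  , Step "Writing" 20 (-120) (-40)
  , Step "Summarization" 5 (-40) 5
  , Step "Teaching" 15 5 20
  ]

main :: IO ()
main = do
  mapM_ (\s -> putStrLn (stepName s ++ ": net contribution " ++ show (netContribution s))) examplePipeline
  putStrLn ("Most valuable step: " ++ show (stepName <$> bestBy netContribution examplePipeline))
  putStrLn ("Costliest step:     " ++ show (stepName <$> bestBy stepCost examplePipeline))
```

Passing the scoring function as an argument lets the same helper find either the most valuable or the most costly transition point.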


Next Steps

This is fairly early and messy work. I'd be curious to clean up the terminology, and figure out how much further it can be taken while still being cost-effective.

Here are some questions I still have:

  1. Can we make decent cost-benefit charts or tables of common types of information? (Books, courses, computer recommender systems)
  2. Can we make decent cost-benefit charts or tables to represent tradeoffs among abstract models (Newtonian mechanics vs. Special Relativity, formal sentences vs. context-dependent speech, data compression methods, etc)?
  3. Can we come up with elegant categorizations to describe information that's simply not worth the necessary beneficiation costs? There's definitely a lot of very interesting information out there that's useful in theory, but the costs to make it net-valuable are likely to exceed the expected benefits of doing so.
  4. Can these sorts of models help direct us regarding what sorts of information assets we should emphasize going forward?
  5. Can information asset models be used to help describe human language and communication, both descriptively and prescriptively?

As I said before, this sort of issue seems highly general, important, and neglected, which really surprises me.

One area with related work is that of anytime algorithms. Anytime algorithms face clear trade-offs of computation time to accuracy, and thus can be modeled as a type of information asset. Justin Svegliato has recently done work on using diminishing marginal utility curves to make decisions around anytime algorithms. This post is a good summary, with some very similar diagrams to what I made above. 

Diminishing marginal utility curve of an anytime algorithm, by Justin Svegliato.

There's definitely other work out there that seeks to understand the costs vs. benefits of information, though from what I can tell, it is generally less focused on finding economics-style models. Laura Schulz did some relevant work, as shown in this lecture.

The Information Bottleneck method in Information Theory is also very related, along with other theoretical work on compression.

3 comments

This is neglected yet potentially highly important, thank you!

Update: 
I think some of the graphs could be better represented with upfront fixed costs.

When you buy a book, you pay for it via your time to read it, but you also have the fixed initial fee of the book.

This fee isn't that big of a deal for most books that you have a >20% chance of reading, but it definitely is for academic articles or similar.

I recently looked more into the phrase "All models are wrong", and found that it seems to be much more aligned with my thinking here than I expected. 

From Wikipedia, discussing earlier, related work by the author (George Box):


2.3  Parsimony
Since all models are wrong the scientist cannot obtain a "correct" one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity. 
2.4  Worrying Selectively
Since all models are wrong the scientist must be alert to what is importantly wrong. It is inappropriate to be concerned about mice when there are tigers abroad.

(bolding added to refer to the correct section)

https://en.wikipedia.org/wiki/All_models_are_wrong