Expansive translations: considerations and possibilities

by ozziegooen, 18th Sep 2020


A crowd probably best served by a wide variety of translations

TLDR: Language translation is a decent first step for written works, but the ideal looks more like an empathetic personal tutor. There’s a lot to do in between, both in the near term with human labor and in the longer term with Machine Learning.

Epistemic Status: I’m not an experienced researcher in this field. I’ve read a few audiobooks on language and thought about the area, but I’m sure I’m failing to reference many key papers and books. I’m fairly uncertain about all of this, so I suggest taking my opinion very lightly (if at all) and thinking through the issue yourself. If you know of other materials that I or other readers should know about, references would be appreciated.

Feedback Preferences: I’m sure I’m wildly wrong on many things. Feedback is highly appreciated. I will take little offense at rude comments, so go wild. That said, don’t expect long responses.

I’m not sure how to best write this, so I’ll divide things into a few vignettes. 

Some rough statements which I think are misguided:

“Our book is available in 20 countries, so is accessible to 3 Billion people.”

“Once GPT-n can translate perfectly between languages, everyone will be able to communicate with each other”

“I’m not going to try to rephrase or re-explain this (technical) book, because you really should just read it directly”

“I don’t see why you wrote up those concepts, they were explained in more detail in a previous post”

Retellings and their skeptics:

There’s been an interesting trend recently of books that retell the ideas of other, older books. See: 
How Proust Can Change Your Life
How Adam Smith Can Change Your Life
A Jane Austen Education: How Six Novels Taught Me About Love, Friendship, and the Things That Really Matter
And all the other books mentioned in this post

I haven’t seen a better term for these, so I’ll refer to these as “retellings”.  

Closer to our community, Robert Wiblin has recently posted a piece on “Ugh fields”, which acts as a retelling of LessWrong posts from 10 years ago.

To some audiences, these retellings are not only a waste of time, but lossy explanations of superior sources. What if readers stop at the retellings and skip the sources, and are left with false impressions? Clearly the solution is to point readers to the source and skip the in-between. Maybe add a minor amount of context, but to be fair, the people with the time to attempt this are generally not capable of doing a good enough job to not cause net harm.

But why stop there? Language translation is also a lossy process. Not only are languages famously challenging to translate, but sometimes substantive modifications are introduced. The Spanish Harry Potter translation changed a pet from a toad to a turtle.[1] Perhaps the ideal is to ask that people learn every language they want to read content in, to ensure they are not misled by deceitful translations.

<end strawmanning>

An expansive view of translation:

I’m going to stop here and get to postulates:

1. There’s a fuzzy line between language translation and retelling. 

Just because two people speak English doesn’t mean they think in the same words. There’s a whole lot that goes into retelling that’s different from modifying the language.

2. There’s a fuzzy line between language translation and linguistic variety translation.

Linguistics has the concept of varieties; languages are one type, but so are dialects, styles, and several other terms I honestly wasn’t familiar with before working on this piece. Just as there can be translation between languages, it makes total sense to have translation between these varieties as well. Google Translate already has support for a few specific dialects, but stops there.

3. There’s a fuzzy line between linguistic varieties, inferential distance, and worldviews.

Even if a translation matches a reader’s exact preferred language, dialect, register, lexicon, and style, that reader could still be left with a gap in inference (or education) and worldview. With regard to inferential distance, specific topics could be expanded upon or contracted for different audiences. With regard to worldview (my quick word for “comprehensive set of beliefs”), topics could be discussed in the way that best fits a given worldview, even if there is some level of exclusion. The topics could also be presented with evidence for how they fit into one’s worldview.

4. Given that there are fuzzy lines between all of the above, it seems fair to conclude that translations along dimensions other than “language” are quite reasonable.

I think we can consider “retellings” as equivalent to a liberal definition of “translation”. How Proust Can Change Your Life can be viewed as a translation of Proust’s work for a specific cluster of modern audiences. This would indicate that we may have far too few works like this, not too many. Perhaps we could use a “How Proust Changed My Life, as a 60th-percentile-in-Math 10th Grader in Saint Mill’s Academy.”[2]

5. Even if the same effective message is recreated, there are aspects of its delivery that matter.

If Bill Gates rewrote Superintelligence in his own words, it would be a big deal, even if the writing was only just as effective as that of Superintelligence. The fact that Bill Gates both took the time to write such a work, and took on the risk and opportunity cost of publicizing it, is a valuable signal. (This is the point I’m least excited about, but I wanted to include it for completeness.)

Why are modern translations so narrow?

So, if translation makes sense outside of language translation, why do book publishers stop at language translation? 

The obvious answer is cost, but I think some less obvious answers are tradition and the difficulty of categorization. It would seem weird to have a specific translation of Harry Potter where the characters all spoke in a specific Tumblr vernacular or where Harry Potter grew up with Amish parents. Shakespeare’s plays use Early Modern English, but it would seem juvenile for us to read them in Modern English. If Wikipedia were to attempt a new language edition geared toward “analytical philosophers”, I’m sure any definition and separation would be met with a fair bit of controversy.

Things are changing. First, some of us are okay being weird if the costs are justified. Most novel ideas seem weird at first, but there are still groups pushing them forward.

Second, Machine Learning is progressing rapidly. It seems possible that if ML could succeed in language translation, it could later get to completely personalized translation. Imagine a system where when you land on a Wikipedia page, it translates it into a version optimized for you at that time. The examples change to things in your life, and any concepts difficult for you get explained in detail. It would be like a highly cognitively empathetic personal teacher.

Expansive Translations and Power

I think we can call what I’m referring to “expansive translations”, or “highly expansive translations”, as distinct from narrow or language translations.

One intellectual criticism of expansive translations is that they could be used by powerful actors to manipulate culture in their favor. Expansive translation is a powerful tool and any power increase in malicious hands could produce disastrous outcomes. Perhaps The Crusades could have been avoided if religious figures weren’t allowed to stray from the original texts. Expansive translations allow for censorship when controlled by authorities. I think the crux here comes down to a more fundamental opinion on the potential of technology and intellectual progress. This gets messy, so I’m going to table this for now and return if there are readers who care about it.

Grab Bag of Related Thoughts

  • There are probably thousands of books and what amounts to billions of hours of teaching to explain the same small sets of religious teachings. I.e., I’m sure one could come up with subsets of Christian thought on which thousands of books and hours of local teaching (religious sermons and similar) were focused. I imagine that similar educational endeavors should expect proportional costs.
  • Whenever any new fad comes to Silicon Valley, it seems like everyone has to re-explain it. Search “What is Bitcoin?” or “What is Intermittent Fasting?” for examples. I remember being amused at digital currency magazines that would include multiple articles to define Bitcoin, in the same magazine. This might appear wasteful, and I’m sure it often is, but it seems to serve a bunch of purposes that are hard to get around.
  • From above, the “Chesterton’s Fence” thing to do here seems to be “have lots of explanations for different audiences”. This could mean things like having articles explaining even seemingly simple concepts on LessWrong and the EA Forum, for those audiences. Perhaps we should have our own articles describing intermittent fasting or Bitcoin.
  • In popular media, communicators seem to specialize in audiences more than topics. A “golf magazine” really writes anything of interest to a specific “golf interested community”, rather than everything about golfing to a broad set of audiences.
  • If one has a message they want to be spread widely, it would be near impossible to personally advocate it to all of these groups as well as existing communicators do. It could be better to try to partner with or encourage the existing communicators.
  • CFAR used the term “murphyjitsu” instead of “pre-mortem”, even though they are the same thing (I think). They knew this, but did it because “murphyjitsu” was preferable to their community. This used to really annoy me because I was worried that it would further disconnect them from the literature, but I’ve grown to support the decision. As long as the link between the two is clear enough, the benefit of using a custom term seems much greater than the cost.
  • Right now one of the greatest challenges to expansive translation seems to be poor terminological coordination and capabilities. For example, Ayn Rand originally wanted to use the name “existentialism” for her work, but later changed it to “objectivism” as “existentialism” was already taken.[3] Perhaps it would have been better for her to call it “existentialism”, which would auto-translate to “existentialism(2)” wherever there could be confusion.
  • I imagine it could be highly valuable to be able to experiment publicly with terminology. Right now definitional work that touches existing fields feels like stepping on their toes, but lots of important terms are a mess between academic fields. It would be great to have experimenters iterate and test out a bunch of options in limited settings, but in a systematic and intentional way.
  • I’m a big fan of YouTube summaries. I’ve gained almost all of my knowledge from textbooks (which are summaries of other sources), Wikipedia, and other non-source teaching methods. Asking that people read all the original sources is not at all a scalable solution for growing fields.
  • All of the reinterpretations of Shakespeare’s plays are other good examples of retellings. Maybe I should have started this post with those instead of those trite pop-lit examples.

[1] https://harrypotter.fandom.com/wiki/Trevor

[2] This brings to mind “Chicken Soup for the X Soul”.

[3] https://en.wikipedia.org/wiki/Ayn_Rand#Atlas_Shrugged_and_Objectivism


Note: The image at the top is by Taylor Heery and was posted on Unsplash. I used it because the New York subway system is what I think of when I imagine a bunch of people with very different backgrounds meeting each other. Most speak English, but they are diverse in a wide variety of ways that could hypothetically benefit from a large set of customized translations.