rvnnt

Requesting feedback/advice: what Type Theory to study for AI safety?

Thanks for the suggestions!

Software Foundations in particular looks highly apposite, and in a useful format, to boot.

TaPL was indeed one of the books I was considering reading. (I was hesitating between that and Lambda Calculus and Combinators: an Introduction.) Your recommendation gave it another point in my mental ranking.

Based on the little I understood of MIRI’s papers (the first of which also prompted me to read their paper on Vingean Reflection), they seem interesting, but currently inaccessible to me. Added an edit/update to my post.

The space of all possible algorithms one could run on three-digit-addition-strings like "218+375" seems rather vast. Could it be that what GPT3 is doing is something like

Obviously this is just wild, vague speculation; but to me it intuitively seems like it would at least sort of answer your question. What do you think? (Could GPT3 be doing something like the above?)

(To a human, it might feel like [the actual algorithm for addition] is a glaringly obvious candidate. But under something like a noisy simplicity prior over all possible string-manipulation algorithms, [the actual algorithm for addition] maybe starts looking like just one of the more conspicuous needles in a haystack?)
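
For concreteness, here is roughly what I mean by [the actual algorithm for addition], written purely as a string manipulation on inputs like "218+375". This is only a minimal sketch: the function name `add_strings` is made up for illustration, and I am not claiming that GPT3 implements anything like it.

```python
def add_strings(expr: str) -> str:
    """Grade-school addition on a string like "218+375", column by column.

    Hypothetical helper for illustration only.
    """
    left, right = expr.split("+")

    # Pad the shorter operand with leading zeros so the columns line up.
    width = max(len(left), len(right))
    left, right = left.zfill(width), right.zfill(width)

    digits = []
    carry = 0
    # Walk the columns from right to left, carrying into the next column.
    for a, b in zip(reversed(left), reversed(right)):
        total = int(a) + int(b) + carry
        digits.append(str(total % 10))
        carry = total // 10

    # A leftover carry becomes a new leading digit.
    if carry:
        digits.append(str(carry))

    return "".join(reversed(digits))


print(add_strings("218+375"))
```

Even this "obvious" procedure needs several interacting pieces (column alignment, right-to-left traversal, carry propagation), which is part of why it might not stand out much among all the other string manipulations that fit the training data about as well.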