Is the predicted cost for GPT-N parameter improvements based on the "classical Transformer" architecture? Recent variants like the Performer should require substantially less compute, and therefore cost less.
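To make the intuition concrete, here is a rough back-of-the-envelope sketch (my own assumption, not from the post) comparing the attention FLOPs of standard softmax attention, which scales as O(L²·d) in sequence length L, against a Performer-style linear attention using r random features, which scales as O(L·r·d). The sizes and the feature count `r` are illustrative guesses, and constant factors and MLP layers are ignored:

```python
# Rough FLOP estimates for a single attention layer (constants and MLP layers ignored).

def softmax_attention_flops(seq_len, d_model):
    # QK^T plus attention-weighted V: both cost O(L^2 * d)
    return 2 * seq_len**2 * d_model

def performer_attention_flops(seq_len, d_model, n_features):
    # FAVOR+ avoids the L x L matrix via random-feature maps: O(L * r * d)
    return 2 * seq_len * n_features * d_model

L, d, r = 2048, 768, 256  # illustrative sizes; r = number of random features (assumed)
ratio = softmax_attention_flops(L, d) / performer_attention_flops(L, d, r)
print(ratio)  # the advantage grows linearly with sequence length (here L / r = 8.0)
```

So at longer contexts the gap widens further, which is why the cost extrapolation could look quite different for such architectures.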
but indeed human utility functions will have to be aggregated in some manner
I do not see why that should be the case. Assuming virtual heavens, why couldn't each individual's personal preferences be fulfilled?
It seems pretty undeniable to me from these examples that GPT-3 can reason to some extent.
However, it can't seem to do it consistently.
Maybe it's analogous to people with mental or neurological conditions who have periods of clarity and periods of confusion?
If we can find a way to isolate the pattern of activity in GPT-3 that corresponds to reasoning, we might be able to enforce that state permanently.