Unfortunately this post hasn't gotten much attention, but I do think it offers some value to the conversation. Its main weakness is that it's very hypothetical, and it would have been better if I could have made it more formal and concrete. Also, I could probably have written it better.
I sent the post to @abramdemski and @Gordon Seidoh Worley; Abram said it looks totally correct, and Gordon said it's thinking in the right direction.
I think this post makes an important observation which hasn't been made elsewhere (at least on LW) - even if any worldview has to make assumptions, we should (all else being equal) prefer worldviews that are more flexible about which assumptions they make. Ideally, we should have a worldview that doesn't require any specific assumption at all.
For specific beliefs, we should look for many ways to justify them which don't all rest on the same set of assumptions. Both for ourselves, to know we don't depend on that assumption, and for others, so when they tell us "Ok, but you assume X", we can suggest instead a different justification which rests on a different assumption.
If your question is whether an axiom of PA can be replaced by an equivalent statement which can serve as a replacement axiom and prove the old axiom as a theorem, then the answer is yes, and in a very boring way. Every mathematical statement has tons of interchangeable equivalent forms, like adding "and 1=1" to it. Then the new version proves the old version and all that jazz.
If your question is whether we should believe in PA more because it can arise from many different sets of axioms, then I'm not sure it's meaningful. By the previous point, of course PA can arise from tons of different sets of axioms; but also, why should we care about "believing" in PA? We care only whether PA's axioms imply this or that theorem, and that's an objective question independent of any belief.
If your question is whether we can have a worldview independent of any assumptions at all, the answer is that we can't. The toy example of math shows it clearly: if you have no axioms, you can't prove any theorems. You have latitude in choosing axioms, but you can't dispense with them completely.
re PA: I mean a statement that we would intuitively see as non-trivially different, not merely the same statement with "and 1=1" added.
re 3: I'm not asking if we can have a worldview independent of any assumptions at all. I'm asking if we can have a worldview independent of any specific assumptions. That is, among the various sets of assumptions we can use to justify our worldview, there's no assumption that's common to all sets (i.e., always required to justify the worldview).
Then I guess you need to quantify "intuitively see as non-trivially different". For example, take any axiom A in PA, and any theorem T that's provable in PA. Then A can be replaced by a pair of axioms: 1) T, 2) "T implies A". Is that nontrivial enough? And there's an unlimited amount of obfuscatory tricks like that, which can be applied in sequence. Enough to confuse your intuition when you're looking at the final result.
No, I'm not talking about obfuscatory tricks of this sort. If you could replace A with T without also adding "T implies A" as an axiom, and still be able to prove A (and everything else you could have proved before, and nothing else you couldn't prove before), that would be the sort of thing I'm talking about. Unfortunately I don't know how to state this more precisely.
But perhaps an analogy could help - there's a bunch of logic gates, and it turns out that some of them (NOR and NAND) can create every other logic gate if arranged correctly, while others can't. So let's say I build some circuit out of NANDs and someone blames me for "relying" on NANDs. In that case, I could show him that an equivalent circuit can be made out of NORs. I think before you learn the fact that both logic gates are universal, this would be a non-intuitive substitution.
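The substitution in this analogy can be checked by brute force. Below is a minimal Python sketch (the function names are my own, purely illustrative) that builds NAND using only NOR gates and NOR using only NAND gates, then compares truth tables over all four input combinations.

```python
from itertools import product


def nand(a: bool, b: bool) -> bool:
    return not (a and b)


def nor(a: bool, b: bool) -> bool:
    return not (a or b)


# NOT and AND built from NOR alone, then NAND on top of them.
def not_from_nor(a: bool) -> bool:
    return nor(a, a)


def and_from_nor(a: bool, b: bool) -> bool:
    return nor(not_from_nor(a), not_from_nor(b))


def nand_from_nor(a: bool, b: bool) -> bool:
    return not_from_nor(and_from_nor(a, b))


# The mirror image: NOT and OR built from NAND alone, then NOR.
def not_from_nand(a: bool) -> bool:
    return nand(a, a)


def or_from_nand(a: bool, b: bool) -> bool:
    return nand(not_from_nand(a), not_from_nand(b))


def nor_from_nand(a: bool, b: bool) -> bool:
    return not_from_nand(or_from_nand(a, b))


def equivalent(f, g) -> bool:
    """True if f and g agree on every pair of boolean inputs."""
    return all(f(a, b) == g(a, b) for a, b in product([False, True], repeat=2))


print(equivalent(nand, nand_from_nor))  # NAND rebuilt from NOR gates only
print(equivalent(nor, nor_from_nand))   # NOR rebuilt from NAND gates only
```

Since each gate can be rebuilt from the other, any circuit made of NANDs can be rewritten gate by gate into one made of NORs, so neither gate is something the circuit strictly "relies" on.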
When I talked to Abram he said that there's probably many different formulations of number theory which logicians have proven equivalent to PA. I tried to quickly look for something like that to give as an example, but I'm really out of my depth (if I wasn't I would have included such an example from the start).
I mean, consider a trick like replacing axioms {A, B} with {A or B, A implies B, B implies A}. Of course it's what you call an "obvious substitution": it requires only a small amount of Boolean reasoning. But showing that NOR and NAND can express each other also requires only a small amount of Boolean reasoning! To my intuition there doesn't seem to be any clear line between these cases.
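To make the "small amount of Boolean reasoning" concrete, here is a rough Python sketch that treats A and B as plain propositional variables (a simplifying assumption; real PA axioms are first-order statements, not atoms) and checks that both axiom sets are satisfied by exactly the same truth assignments. The helper names `implies` and `models` are mine.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    return (not p) or q


def models(axioms):
    """Return the set of (A, B) assignments that satisfy every axiom."""
    return {
        (a, b)
        for a, b in product([False, True], repeat=2)
        if all(axiom(a, b) for axiom in axioms)
    }


# Original axiom set {A, B}.
original = [lambda a, b: a, lambda a, b: b]

# Rewritten axiom set {A or B, A implies B, B implies A}.
rewritten = [
    lambda a, b: a or b,
    lambda a, b: implies(a, b),
    lambda a, b: implies(b, a),
]

print(models(original) == models(rewritten))
```

Both sets have the single model where A and B are both true, so in this propositional toy they are interchangeable, even though no axiom of one set appears in the other.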
I could be drawing too long a bow, but this seems to recall the distinction Marvin Minsky makes between Logic and Common-Sense thinking. Logic is a single "thin" chain of true or false propositions; if any single link in the chain is false, the whole chain collapses. Commonsense, in his parlance, is less discrete: we can have degrees of belief in any part of a chain, and some parts of the chain will be deeper and stronger than others.
He also greatly admired a passage in Aristotle's De Anima that shows how a single object can be represented in multiple ways, which Minsky saw as being very significant to operating in the world.
"Thus the essence of a house is assigned in such a formula as ‘a shelter against destruction by wind, rain, and heat'; the physicist would describe it as 'stones, bricks, and timbers'; but there is a third possible description which would say that it was that form in that material with that purpose or end. Which, then, among these is entitled to be regarded as the genuine physicist? The one who confines himself to the material, or the one who restricts himself to the formulable essence alone? Is it not rather the one who combines both in a single formula?"
Am I conflating different things by saying this reads as similar to the idea of favoring Cross-Entropy rather than the shortest program?
Minsky extended the idea of multiple representations to what he called Papert's Principle - that how we administer and use these multiple representations together, or when we opt for one and exclude others, is the most important part of 'mental growth'.
Some of the most crucial steps in mental growth are based not simply on acquiring new skills, but on acquiring new administrative ways to use what one already knows.
Returning to replacing axioms and how this relates to Minsky's ideas about multiple representations, take for example making an omelette. I may use a stone bench-top, a tiled backsplash, a spoon, or any sort of 'hard' surface to crack the egg. The "crack the egg" part of the process/recipe stays the same, with the same anticipated result, but it becomes replaced by mental representations about the perceived hardness of many different objects.
Does any of this seem relevant or have I made some crude, tenuous connections?
Epistemic status: hand-waving but confident.
I was thinking about reflective reasoning and "strange loops through the meta level" and it led me to this intuition that even if you have to make some assumptions to support your beliefs, if your beliefs can be justified by various different sets of assumptions, and not just one specific set of assumptions, then it gives those beliefs more credence.
What do I mean by justification by different sets of assumptions? Let's take Peano arithmetic (PA) as an example (though any other formal system can work for this purpose as well).
If (1) is correct, then it's not necessary to accept that axiom as long as you're willing to accept its replacement (and assuming that you're fine with PA but just don't like assuming stuff, you shouldn't have a specific objection to the replacement statement either).
If (2) is correct, then the same is true for any PA axiom.
If (3) is correct, then the same is true for any set of PA axioms. Which means if you think PA really does prove only correct things, you don't actually have to accept the axioms as mere assumptions, because you can prove them from their substitutes, which you already accept.
(Question: Is this actually true about Peano arithmetic?)
So although at any moment you'd be using axioms to define what you're talking about and prove statements, none of them could be said to be required and permanent assumptions. There would be no assumptions you can be blamed for always assuming.
To bring it back to reflective reasoning, it would match the intuition that no belief, not even the most fundamental ones like inductive and Occamian priors, is beyond scrutiny under the full power of our reasoning, and therefore none can be said to be merely assumed.
This also reminds me of @So8res' cross-entropy idea - that instead of just prioritizing the hypothesis which has the shortest code, we should prioritize the hypothesis which has many different short codes.
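As a toy illustration of the difference between the two scoring rules, here is a small Python sketch. It rests on my own assumptions rather than on anything stated in the original idea: each code of length n bits contributes a weight of 2^-n, and the hypotheses and code lengths are made up for the example.

```python
def shortest_code_weight(code_lengths):
    """Score a hypothesis by its single shortest code only."""
    return 2.0 ** -min(code_lengths)


def all_codes_weight(code_lengths):
    """Score a hypothesis by summing the weight of every code that yields it."""
    return sum(2.0 ** -n for n in code_lengths)


# Hypothetical code lengths in bits (invented numbers).
hypotheses = {
    "h_single": [10],            # one short code and nothing else
    "h_many": [11, 11, 11, 12],  # slightly longer codes, but several of them
}

for name, lengths in hypotheses.items():
    print(name, shortest_code_weight(lengths), all_codes_weight(lengths))
```

Under the shortest-code rule `h_single` wins, because 10 bits beats 11. Under the sum over all codes `h_many` wins, because its several near-shortest codes together outweigh the single 10-bit code.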
In the same way, the more assumptions a belief requires, and the more complex these assumptions are, the more we discount that belief. But we should also look at how many different sets of assumptions can support that belief, and prioritize beliefs which can rely on more sets of assumptions rather than on ones that require very specific assumptions.
This also applies to beliefs supported by circular justifications. If a belief requires a very specific circle, it's less likely to be true than if it can fit in many different circles.
If there's no particular set of assumptions my worldview permanently depends on, if it can spring from many and various different sets of assumptions, then it's a stronger worldview. Even if it still requires assumptions or even circular reasoning.
I think this can be a step towards, or a part of, a solution to the regress problem or even The Problem of the Criterion, because it alleviates the need to permanently rely on just one criterion.
This post only shows an intuition, so I would love to see someone formalise it.