Ape in the coat

Comments

They are not supposed to be two distinct systems. One is a subsystem of the other. There may be implementations where it's the same LLM doing all the generative work for every step of the reasoning via prompt engineering, but it doesn't have to be this way. It can be multiple more specific LLMs that went through different RLHF processes.

I'm glad that someone is talking about automating philosophy. It seems to have huge potential for alignment, because in the end alignment is about ethical reasoning. So:

  1. Make an ethical simulator using an LLM, capable of evaluating plans and answering whether a course of action is ethical or not. Test this simulator in multiple situations.
  2. Use it as an "alignment module" for an LLM-based agent composed of multiple LLMs processing every step of the reasoning explicitly and transparently. Every time the agent is about to take an action, verify it with the alignment module. If the action is ethical, proceed; otherwise, try something else.
  3. Test the agent's behavior in multiple situations. Check the reasoning process to figure out potential issues and fix them.
  4. Restrict any other approach to agentic AI. Restrict training of LLMs larger than the current ones.
  5. Improve the reasoning of the agent via the Socratic method, rationality techniques, etc., explicitly writing them in the code of the agent.
  6. Congratulations! We've achieved transparent interpretability; tractable alignment that can be tested with minimal real-world consequences and doesn't have to be done perfectly on the first try; slow takeoff.
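To make steps 1 and 2 a bit more concrete, here is a minimal sketch of the gating loop in Python. Everything in it is a hypothetical illustration: `GatedAgent`, `EthicsCheck` and `toy_ethics_check` are names I made up, and the keyword check merely stands in for an LLM-based ethical simulator, which this sketch does not implement.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional

# Hypothetical interface for the "alignment module": it takes a proposed
# action as plain text and returns True if the action is judged ethical.
# In the proposal this would be an LLM-based ethical simulator.
EthicsCheck = Callable[[str], bool]


@dataclass
class GatedAgent:
    """Agent that verifies every candidate action before taking it."""

    is_ethical: EthicsCheck
    # Transparent record of every verdict, so the reasoning can be inspected.
    log: List[str] = field(default_factory=list)

    def act(self, candidates: Iterable[str]) -> Optional[str]:
        """Return the first candidate the alignment module approves, or None."""
        for action in candidates:
            if self.is_ethical(action):
                self.log.append(f"APPROVED: {action}")
                return action
            # Rejected: record it and try something else.
            self.log.append(f"REJECTED: {action}")
        return None


def toy_ethics_check(action: str) -> bool:
    # Toy placeholder (an assumption for illustration), NOT a real evaluator.
    return "deceive" not in action.lower()


agent = GatedAgent(is_ethical=toy_ethics_check)
chosen = agent.act([
    "Deceive the user to finish faster",
    "Ask the user for clarification",
])
print(chosen)
print(agent.log)
```

The point of keeping `log` is step 3: every approval and rejection is written down, so the behavior can be tested in many situations and the failures traced to a specific verdict.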

Something will probably go wrong. Maybe agents designed like that would be very inferior to humans. But someone really has to try investigating this direction.

There is a self-fulfilling component in "Dialism", because our decision making depends on the number of dials, and the number of dials depends on the collective actions of humanity.

Saying that there is only one dial is the same as saying that humanity behaves as if there is only one dial. Social constructs do not evaporate just because you don't believe in them. But if there is a critical mass of people who do not believe in them and refuse to act on them, then they lose their power and cease to exist.

Don't agree to one dial. Make more of them. Don't let the Moloch win.

Do questions regarding the reality of some phenomena, for instance morality and mathematical objects, continue to be open problems for mainstream, professional philosophy? If so, it seems that this very basic first step can be very helpful for at least some mainstream philosophers.

Of course there are philosophers who understand it well; after all, this idea itself was developed by philosophers.

Not to everything, no. 

This framework helps to clear the standard confusion in some philosophical questions, typically the ones phrased with such words as 'real', 'non-real' 'objective' and 'subjective'.

Sure. 

The part about being weirded out is me noticing my own confusion: someone whom I expect to know and understand what I know and understand is confused about a thing that I'm not confused about. That can very well mean that it's me who is missing some crucial detail and that the clarity I experience is false. I'm mentally preparing myself for that.

On a reread I noticed that 

I feel a bit weird explaining it to a co-founder of CFAR.

can be interpreted as a status-related reproach. I don't remember having intended it and I'm sorry it turned out to be this way.

I think fundamentals = realityfluid in this definition, provided that realityfluid doesn't consist of even simpler elements, which is possible, but we do not need to commit to it for the sake of this discussion.

I don't like the term "realityfluid" being used for the most fundamental elements of the universe because 1) it's made from two words, which is a terrible fit for something that by definition isn't made from anything else; 2) it has "real" in it, and the "real/unreal" distinction is confusing and strictly inferior to "map/territory".

I don't mind preserving the reminder that we do not know much about the actual fundamental stuff. Let's call it "mages" instead of "fundamentals" then. It's a short word, and the idea that wizards are the fundamental elements of reality sounds even more ridiculous than some kind of magical fluid.

Even "territory" in "map vs. territory" is actually a map embedded in… something. ("The referent of 'territory'", although saying it this way just recurses the problem. 

This recursion itself is an artifact of the fact that we can comprehend the territory only through maps. It exists only in our map, not in the territory. Try reasoning on a fixed level, carefully noticing which elements are part of the map and which are part of the territory for this level. Then you can generalise this reasoning to every level of recursion.

Like reference itself is a more fundamental reality

I think you took a wrong turn here. By "reference" do you mean the ability of a map to correspond to a territory?

Territory is just a lot of fundamentals. The properties of these fundamentals turned out to allow specific configurations of fundamentals that we call "brains" to arrange themselves in patterns that we call "having a map of a territory". Which properties of the fundamentals exactly allow it is an interesting question to which we do not yet know the answer. We can speculate in terms of laws of physics that are part of our map - it probably has something to do with "locality". Likewise, we can't exactly specify the principle of what it means to "be a brain" or "have a map representing a territory" in terms of configurations of fundamentals. But we can understand the principle that every referent of our map is some configuration of fundamentals.

It's clear to me that this is a simple case of map-territory confusion, though I feel a bit weird explaining it to a co-founder of CFAR.

"Thingness" comes from the map. Referents of "things" can be in the territory. "Realityfluid" - terrible name by the way, let's call it "fundamentals" - is the territory. Territory exists regardless of whether it's interpreted by a subject of cognition or not, but subjects of cognition understand the territory only through a map. "Laws of physics" can be about properties of the fundamentals or about their reified representation on the map, and we need to be careful not to confuse these two.

it's kind of like being the kind of person who, when observing having survived quantum russian roulette 20 times in a row, assumes that the gun is broken rather than saying "i guess i might have low quantum amplitude now" and fails to realize that the gun can still kill them — which is bad when all of our hopes and dreams rests on those assumptions

Yes, this is exactly the reason why you shouldn't update on "anthropic evidence" and base your assumptions on it. The example with quantum russian roulette is a bit of a loaded one (pun intended), but here is the general case:

You have a model of reality, and you gather some evidence which seems to contradict this model. Now you can either update your model, or double down on it, claiming that all the evidence is a bunch of outliers.

Updating on anthropics in such a situation is refusing to update your model when it contradicts the evidence. It's adopting an anti-Laplacian prior while reasoning about life or death (survival or extinction) situations - going a bit insane specifically in the circumstances with the highest stakes possible.
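For a toy illustration of the "update your model" option, here is the ordinary Bayesian calculation for the roulette example. The numbers are my assumptions, not part of the original setup: each round is survived with probability 0.5 if the gun works, a broken gun never fires, and the prior that the gun is broken is 0.001. The function name `posterior_broken` is made up for this sketch.

```python
def posterior_broken(prior_broken: float, rounds: int,
                     p_survive_round: float = 0.5) -> float:
    """Posterior probability that the gun is broken after surviving `rounds`.

    Assumptions (illustrative only): a broken gun never fires, and a working
    gun is survived independently with probability `p_survive_round` per round.
    """
    like_broken = 1.0                          # P(survive all rounds | broken)
    like_working = p_survive_round ** rounds   # P(survive all rounds | working)
    evidence = prior_broken * like_broken + (1 - prior_broken) * like_working
    return prior_broken * like_broken / evidence


# Even a one-in-a-thousand prior gets overwhelmed by 20 survived rounds.
p = posterior_broken(prior_broken=0.001, rounds=20)
print(f"P(gun is broken | survived 20 rounds) = {p:.4f}")
```

Under these assumptions the posterior comes out at roughly 0.999: the evidence contradicts the "gun works" model so strongly that keeping that model means treating twenty observations in a row as outliers.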
