Jan_Kulveit

My current research interests:
- alignment in systems which are complex and messy, composed of both humans and AIs?
- actually good mathematized theories of cooperation and coordination
- active inference
- bounded rationality

Research at the Alignment of Complex Systems Research Group (acsresearch.org), Centre for Theoretical Studies, Charles University in Prague. Formerly a research fellow at the Future of Humanity Institute, Oxford University.

Previously I was a researcher in physics, studying phase transitions, network science and complex systems.

Posts

99 You should go to ML conferences

24 The Living Planet Index: A Case Study in Statistical Pitfalls

1mo

50 Announcing Human-aligned AI Summer School

2mo

69 InterLab – a toolkit for experiments with multi-agent interactions

6mo

41 Box inversion revisited

9mo

36 Snapshot of narratives and frames against regulating AI

9mo

86 We don't understand what happened with culture enough

10mo

75 Elon Musk announces xAI

179 Talking publicly about AI risk

146 The self-unalignment problem

Comments

You should go to ML conferences

Jan_Kulveit2d10-2

I'm skeptical of the 'wasting my time' argument.

A stance like 'going to poster sessions is great for young researchers, I don't do it anymore and just meet friends' is high-status, so, on priors, I would expect people to adopt it more often than is optimal.

Realistically, a poster session is ~1.5h, maybe 2h including skimming to decide what to look at. It is relatively common for people in AI to spend many hours per week digesting the news on Twitter. I really doubt the per-hour efficiency of following Twitter is better than that of poster sessions approached intentionally. (Obviously, aimlessly wandering between endless rows of posters is approximately useless.)

You should go to ML conferences

Jan_Kulveit3d20

Corrected!

The last era of human mistakes

Jan_Kulveit3d90

I broadly agree with this - we tried to describe a somewhat similar set of predictions in Cyborg periods.

List of Collective Intelligence Projects

Jan_Kulveit25d50

Surprised you haven't heard about any facilitated communication tools.

LLM Generality is a Timeline Crux

Jan_Kulveit1mo345

A few thoughts:
- actually, these considerations mostly increase uncertainty and variance about timelines; if LLMs are missing some magic sauce, it is possible that smaller systems with the magic sauce could be competitive, and we could get really powerful systems sooner than Leopold's lines predict
- my take is that one important thing which makes current LLMs different from humans is the gap described in Why Simulator AIs want to be Active Inference AIs; while that post intentionally avoids having a detailed scenario part, I think the ontology it introduces is better for thinking about this than scaffolding
- not sure if this is clear to everyone, but I would expect the discussion of unhobbling to be one of the places where Leopold would need to stay vague so as not to breach OpenAI confidentiality agreements; for example, if OpenAI was putting a lot of effort into making LLM-like systems better at agency, I would expect he would not describe specific research and engineering bets

TsviBT's Shortform

Jan_Kulveit1mo20

Agreed we would have to talk more. I think I mostly get the homunculi objection. Don't have time now to write an actual response, so here are some signposts:
- part of what you call agency is explained by a roughly active-inference style of reasoning
-- some types of "living" systems are characterized by having boundaries between themselves and the environment (boundaries mostly in the sense of separation of variables)
-- maintaining the boundary leads to a need to model the environment
-- modelling the environment introduces a selection pressure toward approximating Bayes
- the other critical ingredient is boundedness
-- in this universe, negentropy isn't free
-- this introduces a fundamental tradeoff / selection pressure for any cognitive system: length isn't free, bitflips aren't free, etc.
(--- downstream of that is compression everywhere, abstractions)
-- empirically, the cost/returns function for scaling cognition usually hits diminishing returns, leading to minds where it's not effective to grow the single mind further
--- this leads to the basin of convergent evolution I call "specialize and trade"
-- empirically, for many cognitive systems, there is a general selection pressure toward modularity
--- I don't know all the reasons for that, but one relatively simple one is 'wires are not free'; if wires are not free, you get colocation of computations, like brain regions or industry hubs
--- other possibilities are selection pressures from CAP theorem, MVG, ...
(modularity also looks a bit like box-inverted specialize and trade)

So, in short, I think I agree with the spirit of "If humans didn't have a fixed skull size, you wouldn't get civilization with specialized members", and my response is that there seems to be an extremely general selection pressure in this direction. If cells were able to just grow in size and it was efficient, you wouldn't get multicellulars. If code bases were able to just grow in size and it was efficient, I wouldn't get a myriad of packages on my laptop; it would all be just kernel. (But even if it was just kernel, it seems modularity would kick in and you would still get the 'distinguishable parts' structure.)

TsviBT's Shortform

Jan_Kulveit1mo00

That's why solving hierarchical agency is likely necessary for success

Former OpenAI Superalignment Researcher: Superintelligence by 2030

Jan_Kulveit2mo8768

(crossposted from twitter) Main thoughts:
1. Maps pull the territory
2. Beware what maps you summon

Leopold Aschenbrenner's series of essays is a fascinating read: there is a ton of locally valid observations and arguments. A lot of the content is the type of stuff mostly discussed in private. Many of the high-level observations are correct.

At the same time, my overall impression is that the set of maps sketched pulls toward existential catastrophe, and this is true not only for the 'this is how things can go wrong' part, but also for the 'this is how we solve things' part. Leopold is likely aware of this angle of criticism, and deflects it with 'this is just realism' and 'I don't wish things were like this, but they most likely are'. I basically don't buy that claim.

The Alignment Problem No One Is Talking About

Jan_Kulveit2mo30

You may be interested in 'The self-unalignment problem' for some theorizing: https://www.lesswrong.com/posts/9GyniEBaN3YYTqZXn/the-self-unalignment-problem

Examples of Highly Counterfactual Discoveries?

Jan_Kulveit3mo134

Mendel's Laws seem counterfactual by about ~30 years, based on partial re-discovery taking that much time. His experiments are technically something which someone with basic maths could have done at basically any time in the last few thousand years