Malentropic Gizmo

Posts

1 · Malentropic Gizmo's Shortform · 2y · 13

Comments

The Origami Men
Malentropic Gizmo · 1mo · 10

Great!

Generalized Hangriness: A Standard Rationalist Stance Toward Emotions
Malentropic Gizmo · 4mo · 78

That is surprising. We often used the word in high school ~10 years ago and I'm not even a native speaker. Example

Malentropic Gizmo's Shortform
Malentropic Gizmo · 5mo · 10

What's up with glowfic.com? Getting a lot of 500 responses...

EDIT: I thought someone here might know, because of the dath ilan stories, and I have no clue where else to ask, but never mind if it's too unrelated to this site.

When you downvote, explain why
Malentropic Gizmo · 9mo · 51

I think I can! 

When I write, I am constantly balancing brevity (and aesthetics generally) with clarity. Unfortunately, I sometimes gravely fail at the latter without noticing. Your above comment immediately informs me of this mistake.

Level up your spreadsheeting
Malentropic Gizmo · 1y · 10

Thank you for this! Your companion piece instantly solved a problem I was having with my diet spreadsheet!

The Closed Eyes Argument For Thirding
Malentropic Gizmo · 2y · 20

Yes, I basically agree: My above comment is only an argument against the most popular halfer model. 

However, in the interest of sparing readers' time, I have to mention that your model doesn't assign a probability to 'today is Monday' or to 'today is Tuesday'. If they want to see your reasoning for this choice, they should start with the post you linked second rather than the post you linked first.

D&D.Sci: The Mad Tyrant's Pet Turtles [Evaluation and Ruleset]
Malentropic Gizmo · 2y · 60

I had to use keras backend's switch function for the automatic differentiation to work, but basically yes.
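
As a rough illustration (this is not the actual loss from the solution, just made-up placeholders), the idea is that K.switch lets a piecewise penalty stay usable under automatic differentiation inside a custom loss:

```python
from tensorflow.keras import backend as K

def piecewise_loss(y_true, y_pred):
    # Hypothetical asymmetric penalty: under-prediction is punished more than
    # over-prediction. K.switch chooses element-wise between the two branches
    # while keeping the whole expression differentiable w.r.t. y_pred.
    diff = y_pred - y_true
    under_penalty = 3.0 * K.square(diff)  # assumed weight for guessing too low
    over_penalty = 1.0 * K.square(diff)   # assumed weight for guessing too high
    return K.mean(K.switch(diff < 0, under_penalty, over_penalty), axis=-1)

# model.compile(optimizer="adam", loss=piecewise_loss)
```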

D&D.Sci: The Mad Tyrant's Pet Turtles [Evaluation and Ruleset]
Malentropic Gizmo · 2y · 40

I enjoyed the exercise, thanks! 

My solution for the common turtles was setting up the digital cradle such that the mind forged inside was compelled to serve my interests (I wrote a custom loss function for the NN). For the vampire one I used 0.5*segments + x, where x was chosen to give the best average gp result on the example vampire population. Annoyingly, I don't remember what I changed between my previous and my current solution, but the previous one was much better 🥲

Looking forward to the next challenge!

Malentropic Gizmo's Shortform
Malentropic Gizmo · 2y · 10

Random Musing on Autoregressive Transformers resulting from Taelin's A::B Challenge

Let's model an autoregressive transformer as a Boolean circuit or, for simpler presentation, an n-ary circuit with m inputs and 1 output.

Model the entire system the following way: given some particular m-length starting input:

  1. the circuit calculates the output token (/integer) from the input
  2. the calculated output token is appended to the end of the input
  3. the first token of the input is deleted
  4. go to 1

It's easy to see that, strictly speaking, this system is not very powerful computationally: we have a finite number of possible tokens (n) and a finite-length context window (m), so there are only finitely many possible states (n^m), therefore our model is at most as powerful as a finite state machine (it's pretty much equivalent in its behaviour to a regular grammar containing only A → aB rules).
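
A minimal sketch of this loop, with the "circuit" stood in for by an arbitrary toy function over the current window (the vocabulary size n, window length m, and the function itself are illustrative assumptions, not any real model):

```python
def step(circuit, window):
    """One iteration: compute the next token, append it, drop the first token."""
    next_token = circuit(tuple(window))
    return window[1:] + [next_token]

def run(circuit, start_window, steps):
    """Run the loop for a number of steps, collecting the emitted tokens."""
    window = list(start_window)
    outputs = []
    for _ in range(steps):
        window = step(circuit, window)
        outputs.append(window[-1])
    return outputs

# Toy example with n = 4 possible tokens and m = 3 window slots: since there
# are only n**m = 64 possible windows, the trajectory must eventually cycle,
# which is exactly the finite-state-machine point above.
toy_circuit = lambda w: (sum(w) + 1) % 4
print(run(toy_circuit, [0, 1, 2], 10))
```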

However, real-life computers also have finite memory, yet we never let that bother us!

How should we manually design our circuit so that, with an appropriate selection of the initial input, we can solve the widest range of problems?

I think one very straightforward solution is to simply emulate a computer with random-access memory in the following way:

  • Select some fixed instruction set with k instructions, and from our n tokens choose k to correspond to these k instructions.
  • Select another k tokens from the remaining ones to denote that the given instruction is under execution.
  • Design the circuit such that if the emulated computer's memory is M_t (an m-element vector, with M_{t,i} being the i-th token) after the execution of the t-th instruction, then our circuit should compute the following tokens (including the starting input): M_{0,1}, M_{0,2}, ..., M_{0,m}, M_{1,1}, M_{1,2}, ..., M_{1,m}, M_{2,1}, ...

This can be done efficiently with relatively few circuit nodes and relatively low depth, but I don't want to write down the construction.
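
Still, to make the intended behaviour concrete, here is a minimal sketch of the token stream such a circuit would produce, memory snapshot by memory snapshot (the "instruction step" below, incrementing cell 0, is a made-up stand-in rather than a real instruction set):

```python
def emulate_token_stream(initial_memory, steps):
    """Yield M_{0,1..m}, M_{1,1..m}, ... one token per circuit invocation."""
    memory = list(initial_memory)          # M_0, i.e. the starting input
    for _ in range(steps + 1):
        yield from memory                  # emit the current snapshot token by token
        memory = memory.copy()
        memory[0] = (memory[0] + 1) % 256  # toy stand-in for executing one instruction

print(list(emulate_token_stream([7, 0, 3], steps=2)))
# -> [7, 0, 3, 8, 0, 3, 9, 0, 3]
```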

It's interesting to see that actual large autoregressive transformers on human language seem to fit this model more and more closely:

  1. With GPT-3 (possibly GPT-2), it was shown that after an instruction is given in the initial prompt, the transformer can execute that instruction in its continuation (e.g. "translate this French sentence to English, French: Je mange une pomme, English:"). This corresponds to having a fixed instruction set in the above model (where the instruction set is in plain English instead of single tokens).
  2. With ChatGPT-3.5, and even more with newer models, it was shown that chain-of-thought prompting works well for solving more complex problems than asking for a solution immediately. I think the newest models often don't even require an explicit instruction to break their reasoning down into steps; they often do so anyway. I expect this behaviour to become more and more common as newer models get smarter and encounter more and more transformer/human interactions in their training sets. This corresponds to iteratively calculating M_1, M_2, ... according to the given instructions. However, at this point, the instructions and subsequent "memory snapshots" are all in the transformer's context window.
  3. Might we expect this to change? Will future models be able to notice when the initial prompt or some still-relevant previous data is about to exit the context window, autonomously regenerate it, and subsequently pick up the calculation where they left off? I expect they will! What do you think?
The Solution to Sleeping Beauty
Malentropic Gizmo · 2y* · 10

> No she does not. And it's easy to see if you actually try to formally specify what is meant here by "today" and what is meant by "today" in regular scenarios. Consider me calling your bluff about being ready to translate to first order logic at any moment.

I said that I can translate the math of probability spaces to first order logic, and I explicitly pointed to the fact that our conversation can NOT be translated to first order logic as proof that it is not about math; rather, it's about philosophy. Please, reread that part of my previous comment.

> And frankly, it baffles me that you think that you need to explain that it's possible to talk about math using natural language, to a person who has been doing it for multiple posts in a row.

That is not what I explained and I suggest you reread that part. Here it is again:

> This whole conversation isn't about math. It is about philosophy. Math is proving theorems in various formal systems. If you are a layman, I imagine you might find it confusing that you can encounter mathematicians who seem to have conversations about math in common English. I can assure you that every mathematician in that conversation is able to translate their comments into the simple language of the given formal system they are working in, they are just simply so much of an expert that they can transmit and receive the given information more efficiently by speaking on a higher level of abstraction.
>
> It is not possible to translate the conversation that we're having to a simple formal system as it's about how we should/can model some aspect of reality (which is famously dirty and complicated) with some specific mathematical object.

The structure of my argument here is the following: 

  1. Math is about concepts in formal systems, therefore an argument about math can be expressed in some simple, formal language
  2. We are having an argument which can't be translated to a formal system.
  3. Therefore, we are not arguing about math.

> The more I post about anthropics the clearer it becomes that I should've started with posting about probability theory 101. My naive hopes that average LessWrong reader is well familiar with the basics and just confused about more complicated cases are crushed beyond salvation.

Ah yes, clearly, the problem is that I don't understand basic probability theory. (I'm a bit sad that this conversation happened to take place with my pseudonymous account.) In my previous comment, I explicitly tried to preempt your confusion about seeing the English word 'experiment' with my paragraph (the part of it that you, for some reason, did not quote), specifically by linking a wiki which only contains the mathematical part of 'probability' and not the philosophical interpretations that are commonly paired with it, but alas, it didn't matter.

> > In particular, Beauty, when awoken, has a certain credence in the statement "Today is Monday."
>
> No she does not. And it's easy to see if you actually try to formally specify what is meant here by "today" and what is meant by "today" in regular scenarios. Consider me calling your bluff about being ready to translate to first order logic at any moment.

If you are not ready to accept that people have various levels of belief in the statement "Today is Monday" at all times, then I don't think this conversation can go anywhere, to be honest. This is an extremely basic fact about reality.

EDIT: gears, in the first part you selected, I'm answering an accusation of bluffing in a matter-of-fact way; how is that too combative? Also, feel free to chime in at any point, it is an open forum after all.
