Templarrr — LessWrong

LESSWRONG
LW

least by my eyes even when they have relatively good taste they all reliably have terrible taste and even the samples people say are good are not good

We can get a lot here if we remember that much of "good writing" centers on not repeating yourself in various forms (words, phrases, structures, etc.), and current models are absolutely terrible at that. If we could add temporary negative weights to terms already used in the answer, weights that decay to zero over time, we could incentivise LLMs to use a wider variety of language.
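A rough sketch of what such a temporary negative weight could look like, as a toy over a dictionary of token logits. Everything here is an assumption for illustration: the function name `penalized_logits`, the `penalty` and `window` parameters, and the choice of linear decay are all hypothetical, and this is not how any particular model or library implements it.

```python
def penalized_logits(logits, history, penalty=2.0, window=50):
    """Return a copy of `logits` with recently used tokens pushed down.

    logits:  dict mapping token -> raw score for the next position
    history: list of tokens already emitted, oldest first
    penalty: size of the negative weight right after a token is used (assumed)
    window:  number of steps after which the weight has decayed to zero (assumed)
    """
    adjusted = dict(logits)
    now = len(history)
    for position, token in enumerate(history):
        age = now - position  # 1 for the most recently emitted token
        # Linear decay: full strength when fresh, exactly zero after `window` steps.
        strength = penalty * max(0.0, 1.0 - age / window)
        if token in adjusted:
            adjusted[token] -= strength  # repeated uses stack their penalties
    return adjusted


# Usage: "great" was just used, so its synonyms now score higher than it does.
scores = {"great": 1.0, "good": 1.0, "fine": 1.0}
fresh = penalized_logits(scores, ["great"])
# Once enough other tokens have passed, the penalty is gone again.
stale = penalized_logits(scores, ["great"] + ["filler"] * 100)
```

The same idea would extend to phrases or structures by penalizing n-grams instead of single tokens, at the cost of more bookkeeping.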

AI #108: Straight Line on a Graph

Templarrr7mo10

engineer, honestly

First I thought this was hilarious, as in "we really just want an engineer FFS", but then I checked.

Engineer, honesTY. As in "engineer to research and improve models' honesty".

Fun With GPT-4o Image Generation

Templarrr7mo00

Fun safety hiccup: the image generator persistently refuses to draw a hand touching the blade of a sword, regardless of how safe the context is. The hand can hover over it, be close to it, or touch the guard, but not the blade. I barely got it to touch a blade by invoking the Mordhau and medieval fencing manuals, and even then it was just one hand on the blade, when it should have been both.

No trouble making it work with a wooden toy sword though, but that defeated the entire point of the picture.

Monthly Roundup #28: March 2025

Templarrr7mo21

Would this even be legal in Germany? No wonder Europe is falling behind.

Case study "how to make your post much worse in a single sentence".

There's literally nothing they describe that requires active face recognition (the only part that could be a problem in Europe).
Most of the office spaces use personalized electronic key cards.
Office systems KNOW who just entered.

Solving a non-existent problem in a harder-than-necessary and illegal-in-some-places way can be fun, but it isn't as much of a dunk on others as the author believes it to be. Without the last part it was a fun experiment by a fellow tech person; with it ...

AI #100: Meet the New Boss

Templarrr9mo10

Which means, in turn, that you must (for that to make any sense) be using the AI in its non-aligned state to align itself and solve all those other problems

Strongly disagree on this.

The text doesn't imply this at all. "While doing it" doesn't mean you will be using AI; it just means that during development your team uncovers a lot of corner cases, knowledge, and skills that weren't available to them before they started, which is how most engineering projects are done.

You may have a general plan, but it is expected that you will come up with the details as your knowledge of the area expands.

Monthly Roundup #26: January 2025

Templarrr9mo10

Pointless busywork is bad.

100%. The problem usually hides in people confusing "I don't (understand/agree with) the point of something" with "something is pointless".

AI #93: Happy Tuesday

Templarrr11mo10

what the median essay, story, or response to the assignment will look like so they can avoid and transcend it all

Obligatory joke about how terrible our education is, that half of the scores are below median!

AI #89: Trump Card

Templarrr1y10

they’re 99% sure are AI-generated, but the current rules mean they can’t penalise them.
The issue is proving it.

That is very much not the issue. The issue is that academia has spent the last few hundred years making sure papers are written in the most inhuman way possible. No human being ever talks the way whitepapers are written. "We can't distinguish whether this was written by a machine or by a human who is really good at pretending to be one" can hardly be called a problem when that style was heavily encouraged for centuries. Also a fun reverse-Turing-test situation.

Occupational Licensing Roundup #1

Templarrr1y10-1

Two things to note.

First, I feel like putting every occupation in the same pile and deciding whether you are for or against licensing isn't helpful. I personally don't need a licensed lawnmower, but I would very much prefer a licensed doctor. The cost of a mistake differs a lot between the two occupations and can serve as a threshold for which jobs should require a license.

Second, there should be a difference between doing a thing to yourself (an argument can even be made that we shouldn't have any limits here), doing things for free for your friends/relatives with their full knowledge of your skill level and experience (most non-life-threatening things can probably be allowed here), and selling your craft for money.

AI #87: Staying in Character

Templarrr1y63

llms don’t work on unseen data

Unfortunately I hear this quite often, sometimes even from people who should know better.

A lot of them confuse this with the thing that actually exists: "supervised ML models (of which an LLM is just a particular type) tend to work much worse on data outside the training distribution". If you train your model to determine the volume of apples, oranges, melons, and other round-y shapes, it will work quite well on any round-y shape, including all kinds of unseen ones. But it will suck at predicting the volume of a box.

You don't need the model to see every single game of chess; you just need the new situations to be within the distribution built from the massive training data, and they most often are.

A real out-of-distribution example in this case would be to train it only on chess and then ask for the next best move in checkers (relatively easy OOD: same board, same type of game) or Minecraft.
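The round-shape example can be made concrete with a toy sketch. The assumptions are loud here: the "model" is a one-parameter least-squares fit of volume against width cubed, the training set is perfect spheres only, and the "box" is a cube. All names are made up for illustration, and nothing about it resembles how an LLM is actually trained.

```python
import math


def sphere_volume(diameter):
    return math.pi * diameter ** 3 / 6


def cube_volume(side):
    return side ** 3


# Training data: only round shapes, with widths 1..10.
train_widths = [float(d) for d in range(1, 11)]
features = [w ** 3 for w in train_widths]
targets = [sphere_volume(w) for w in train_widths]

# Least-squares fit through the origin: volume ~ coef * width^3.
coef = sum(x * y for x, y in zip(features, targets)) / sum(x * x for x in features)


def predict(width):
    return coef * width ** 3


def relative_error(predicted, actual):
    return abs(predicted - actual) / actual


# A width never seen in training, but still a round shape: in distribution.
unseen_sphere_error = relative_error(predict(7.5), sphere_volume(7.5))
# Same width, different kind of object: out of distribution.
box_error = relative_error(predict(7.5), cube_volume(7.5))

print(f"unseen sphere: {unseen_sphere_error:.2%}, box: {box_error:.2%}")
```

The unseen sphere comes out essentially exact, while the box is underestimated by roughly 48%, because the fit has only ever learned the sphere's constant of pi/6.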
