alexlyzhov

Comments

Covid 1/7: The Fire of a Thousand Suns

Suppose some variant like the SA one is vaccine-evading and some people will have to vaccinate a second time with an adapted vaccine. What are our priors for the safety of vaccinating repeatedly this way (either with the same or different delivery methods)? If we have two vaccines that are pretty safe, are side effects of vaccinating with the first one and then vaccinating with the second, similar one almost surely on the order of side effects from using just one kind of vaccine?

DALL-E by OpenAI

This is the link to Yudkowsky's discussion of concept merging, with the triangular lightbulb example: https://intelligence.org/files/LOGI.pdf#page=10

Generated lightbulb images: https://i.imgur.com/EHPwELf.png

DALL-E by OpenAI

Given that the details in generated objects are often right, you can use super-resolution neural models to upscale the images to whatever size you need.

DALL-E by OpenAI

On prior work: they cited X-LXMERT (Sep 2020) and TReCS (Nov 2020) in the blogpost. These seem to be the baselines.
https://arxiv.org/abs/2011.03775
https://arxiv.org/abs/2009.11278

The quality of objects and scenes there is far below that of the new model. They are often just garbled and don't look quite right.

But more importantly, the best they could sometimes understand from the text is something like "a zebra is standing in the field", i.e. the object and the background; everything else was lost. With this model, you can actually use many more language features for visualization: specifying spatial relations between objects, specifying attributes of objects in a semantically precise way, camera view, time and place, rotation angle, printing text on objects, and introducing text-controllable recolorations and reflections. I may be mistaken, but I don't think I have seen a convincing demonstration of any of these capabilities in open-domain image+text generation before.

One evaluation drawback that I see is they haven't included any generated human images in the blogpost besides busts. Because of this, there's a chance scenes with humans are of worse quality, but I think they would nevertheless be very impressive compared to prior work, given how photorealistic everything else looks.

I'm not sure what accounts for this performance, but it may well mostly be more parameters (2-3 orders of magnitude more compared to previous models?) plus more and better data (that new dataset of image-text pairs they used for CLIP?).

How to eradicate the desire to check time-wasting sites

Great approach. I use it in a slightly different way - I have a rule that each time I open a website from a list, I have to report it to my assistant, and I have to report a good enough reason. I also use website blockers on all platforms as an additional cost (Block Site on Chrome, Screen Timer on Android). But website blockers don't work that well on their own - I sometimes have to visit those websites for legitimate reasons and so I have to disable a blocker, and after a while I slip and the bar for disabling them gets too low.

Commitment and credibility in multipolar AI scenarios

Super thoughtful post!

I get the feeling that I'm more optimistic about post-hoc interpretability approaches working well in the case of advanced AIs. I'm referring to the ability of an advanced AI in the form of a super large neural network-based agent to take another super large neural network-based agent and verify its commitment successfully. I think this is at least somewhat likely to work by default (i.e. scrutinizing advanced neural network-based AIs may be easier than obfuscating intentions). I also think this may potentially not require that much information about the training method and training data.

I previously thought that this doesn't matter in practice because of the possibility of self-modification and successor agents. But I now think that, at least in some range of potential situations, verifying the behavior of a neural network is enough for credible commitment when an agent pre-commits to using this neural network, e.g. via a blockchain.

Also, are you sure that the fact that people can't simulate nematodes fits well in this argument? I may well be mistaken but I thought that we do not really have neural network weights for nematodes, we only have the architecture. In this case it seems natural that we can't do forward passes.

Where is human level on text prediction? (GPTs task)

I agree that the difference in datasets between 1BW and PTB makes precise comparisons impossible. Also, the "human perplexity = 12" figure on 1BW is not measured directly. It is extrapolated from the authors' constructed "human judgement score" metric, using the values of both that score and perplexity for pre-2017 language models, and the authors note that the extrapolation is unreliable.
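
To illustrate why this kind of extrapolation is fragile, here is a minimal sketch. It assumes a linear relationship between the score and log-perplexity, which may not be the functional form the authors actually fitted, and every number in it is made up for illustration rather than taken from the paper. The function names are hypothetical as well.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def extrapolate_perplexity(model_perplexities, model_scores, human_score):
    """Estimate the perplexity that would correspond to `human_score`.

    Assumption (not from the paper): log-perplexity is linear in the score.
    """
    log_ppl = [math.log(p) for p in model_perplexities]
    slope, intercept = fit_line(model_scores, log_ppl)
    return math.exp(slope * human_score + intercept)


# Made-up illustrative data: four hypothetical language models.
model_perplexities = [120.0, 80.0, 50.0, 35.0]
model_scores = [0.20, 0.35, 0.50, 0.60]

# Hypothetical human score, well outside the range covered by the models.
human_score = 1.0

estimate = extrapolate_perplexity(model_perplexities, model_scores, human_score)
print(f"extrapolated 'human' perplexity: {estimate:.1f}")
```

Because the human score lies far outside the range covered by the models, the estimate depends almost entirely on the assumed functional form, so a different but equally plausible fit could give a very different number.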

Agentic Language Model Memes
"With enough iterations, we could end up with a powerful self replicating memetic agent with arbitrary goals and desires coordinating with copies and variations of itself to manipulate humans and gain influence in the real world."

I initially felt cold towards the whole article, but now I mostly agree.

The goals of text agents might be programmable by humans directly (consider the economic pressure towards creating natural language support agents / recommendation systems / educators / etc). Prompts in their current form 1) only have significant influence over a short text window after the prompt and 2) only cause likely text continuations to emerge (whereas to achieve your goal you might want to produce a text that has low probability conditional on the prompt). Prompts could be replaced by specific programs by modifying the processes of training and inference. For example, additional sources of self-supervision can be incorporated (debate, or consistency losses).

"The closest analogues are probably modern social media memes."

I would name chain letters as the closest analogue. Another one is computer viruses (because humans design viruses with a goal in mind, and then viruses might achieve these goals and self-replicate).