The comparison lines (dotted) have completely arbitrary y-intercepts. You should only take the slope seriously.
The building block concept was just something that I found intuitive; it's not backed by rigorous research or intense thinking. The building blocks could just as easily be called tasks, traits, or other concepts that relate to psychological findings. You should really think of this post as something I found worth sharing without doing a lot of background reading.
Thanks for all the good references!
I'm not sure either, to be fair. My friend with aphantasia says it doesn't make that much of a practical difference for her. But it's hard to compare since we don't know the counterfactual.
I'm generally pretty uncertain about how large the differences are, but some discussions have led me to believe that they are bigger than I expected. At some point I was just like "Wait, you can't rotate the shape in your head?" or "What do you mean, you feel music?".
I think there are a ton of interesting questions to dive into. Probably a lot have already been answered by psychologists. I think the independence question is very interesting as well.
Thanks! Fixed it.
I would expect the results to be better on, let's say, PaLM. I would also expect it to base more of its answers on content than form.
I think there are a ton of experiments in the direction of natural story plots that one could run, and I would be interested in seeing them tested. The reason we started with relatively basic toy problems is that they are easier to control. For example, it is quite hard to tell whether the model learned based on form or content in a natural story context.
Overall, I expect there to be many further research projects and papers in this direction.
Thank you for the feedback. I will update the post to be clearer on imitative generalization.
I think the claim you are making is correct but it still misses a core point of why some people think that Bayes nets are more interpretable than DL.
a) Complexity: a neural network is technically already a Bayes net: it has nodes that represent variables, directed edges, and no cycles (see the small sketch below this list). However, when people compare Bayes nets to NNs, I think they usually mean a smaller Bayes net that somehow "captures all the important information" of the NN.
b) Ontology: When people look at an NN they usually don't know what any particular neuron or circuit does... (read more)
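As a minimal sketch of point a) (an illustration, not part of the original argument): a tiny feedforward network, written out as a directed graph, already has the structure of a Bayes net, i.e. nodes for variables, directed edges, and no cycles.

```python
# Minimal sketch: a 2-layer feedforward network as a directed acyclic graph.
# Each neuron is a node (variable); each weight is a directed edge.
import networkx as nx

g = nx.DiGraph()
inputs = ["x1", "x2"]
hidden = ["h1", "h2", "h3"]
outputs = ["y"]

# fully connect input -> hidden and hidden -> output
g.add_edges_from((i, h) for i in inputs for h in hidden)
g.add_edges_from((h, o) for h in hidden for o in outputs)

# True: structurally this is already a Bayes-net-style DAG
print(nx.is_directed_acyclic_graph(g))
```

The point of contention is therefore not whether an NN *is* a DAG, but whether a much smaller, human-labeled DAG can capture the same information.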
I strongly agree. There are two claims here. The weak one is that, if you hold complexity constant, directed acyclic graphs (DAGs; Bayes nets or otherwise) are not necessarily any more interpretable than conventional NNs because NNs are DAGs at that level. I don't think anyone who understands this claim would disagree with it.
But that is not the argument being put forth by Pearl/Marcus/etc. and arguably contested by LeCun/etc.; they claim that in practice (i.e., not holding anything constant), DAG-inspired or symbolic/hybrid AI approaches like Neural Causa... (read more)
Thanks for taking the time. I now understand all of your arguments and am convinced that most of my original criticisms are wrong or inapplicable. This has greatly increased my understanding of and confidence in AI safety via debate. Thank you for that. I updated the post accordingly. Here are the updated versions (copied from above):
Re complexity:
Update 2: I misunderstood Rohin’s response. He actually argues that, in cases where a claim X breaks down into claims X1 and X2, the debater has to choose which one is more effective to attack, i.e. it... (read more)
Thank you for the detailed responses. You have convinced me of everything but two questions. I have updated the text to reflect that. The two remaining questions are (copied from text):
On complexity: There was a second disagreement about complexity. I argued that some debates actually break down into multiple necessary conditions, e.g. if you want to argue that you played Fortnite, you have to show that it is possible to play Fortnite and then that it is plausible that you played it. The pro-Fortnite debater has to show both claims while the anti... (read more)
Thanks for your detailed comment. Let me ask some clarifications. I will update the post afterward.
Assumption 1:
I understand where you are going but the underlying path in the tree might still be very long, right? The not-Fortnite-debater might argue that you couldn't have played Fortnite because electricity doesn't exist. Then the Fortnite-debater has to argue that it does exist, right?
Furthermore, I don't see why it should just be one path in the tree. Some arguments have multiple necessary conditions/burdens. Why do I not have to prove... (read more)
Thank you for your expertise and criticism.
Just to be clear, we still think nuclear is good, and we explicitly frame it as a high-risk, high-reward hedge against the failure of renewables. Furthermore, we very explicitly say that nuclear is better than all fossil alternatives, which also implies that we would be in favor of having more nuclear.
I think our overall framing is more like: get away from fossil fuels, a bit more into nuclear and much more into renewables. If you think that is not clear from the text, we should definitely clarify this. Let me know how you understood it.
We discuss this in our post. We think it is a plausible story, but it's not clear whether it is decisive. Most people who are critical of the status quo don't see regulation as the primary issue, but I'm personally sympathetic to that view.
Unfortunately, we didn't find a good source for that. However, given that fossil fuels usually don't need storage and that solar+batteries are dropping exponentially in price, we think both options should be cheaper. But good estimates would be very welcome.
solar+batteries are dropping exponentially in price
Pulling the data from this chart from your source:
...and fitting[1] an exponential trend with offset[2], I get:
(Pardon the very rough chart.)
This appears to be a fairly good fit[3], and results in the following trend/formula:
This is an exponentially-decreasing trend... but towards a decidedly positive horizontal asymptote.
This essentially indicates that we will get minimal future scaling, if any. $37.71/MWh is already within the given range.
For reference, he... (read more)
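For anyone who wants to reproduce this kind of fit, here is roughly how an exponential trend with a constant offset can be estimated. The data points below are made-up placeholders, not the actual chart values (which aren't reproduced in this comment), so only the method is meant to carry over.

```python
# Sketch: fit y = a * exp(-b * x) + c, where c is the horizontal asymptote.
# The (year, $/MWh) points are hypothetical placeholders for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def exp_with_offset(x, a, b, c):
    return a * np.exp(-b * x) + c

years = np.array([2010, 2012, 2014, 2016, 2018, 2020])
price = np.array([350.0, 250.0, 180.0, 120.0, 80.0, 60.0])

x = years - years[0]  # rescale so the exponent stays well-conditioned
popt, _ = curve_fit(exp_with_offset, x, price, p0=(300.0, 0.2, 40.0))
a, b, c = popt

print(f"fit: y = {a:.1f} * exp(-{b:.3f} * (year - {years[0]})) + {c:.1f}")
print(f"horizontal asymptote (long-run price floor): ~${c:.2f}/MWh")
```

The key feature is the offset parameter c: an exponential decay toward a positive c implies the price approaches a floor rather than zero, which is what drives the "minimal future scaling" conclusion above.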
Random side note: GPT-3 seems to be able to generate decent one-liners.
Generate one-liners describing the dangers of AI. An example is "Inventing machines that are smarter than us is playing with fire."
1. Machines that are smarter than us could easily become our masters.
2. If we're not careful, AI could spell the end of humanity as we know it.
3. AI could be used to create weapons of mass destruction that could devastate the planet.
4. AI could be used to create powerful robots that could enslave humans.
5. AI could be used to create artificial intellig... (read more)
Thanks for the clarification!
Right, but Reichenbach's principle of common cause doesn't tell you anything about how they are causally related? They could just be some nodes in a really large, complicated causal graph. So I agree that we can assume causality somehow, but we are much more interested in what the graph looks like, right?
You state in the first comment that they can be given causal justification. As far as I understand, you argue via covariances above. Can you elaborate on what this causal justification is?
I'm very interested in a collaboration!! Let's switch to DMs for calls and meetings.
Adding to what Jaime already said: I think there are two things we might care about when thinking about FLOP.
1. We could care about the theoretical number of FLOP a method might use independent of architecture, exact GPU, etc. This might be useful to compare methods independent of which exact setup is the current flavor of the month. It might also be useful to compare current methods against methods from 5 years ago or 5 years from now. Then the setup we describe in the post seems to be the best fit.
2. We could care about the actual FLOP count that a... (read more)
We will add a second blog post in which we discuss how accurate this rule is under different conditions. It looks like it depends on many factors such as batch size, type of parameters, depth, etc.
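To make the architecture-independent option (point 1 above) concrete, here is a rough sketch of such a count. The exact rule from the post isn't reproduced in this comment, so this uses the commonly cited C ≈ 6·N·D approximation (roughly 6 FLOP per parameter per training token, forward plus backward pass) as a stand-in assumption.

```python
# Sketch of an architecture-independent training-FLOP estimate.
# Assumes the common C ≈ 6 * N * D approximation; the actual rule discussed
# in the post may differ (batch size, parameter types, depth, etc. matter).
def estimate_training_flop(n_parameters: float, n_tokens: float) -> float:
    return 6.0 * n_parameters * n_tokens

# Example: a hypothetical 1B-parameter model trained on 20B tokens
print(f"{estimate_training_flop(1e9, 2e10):.2e} FLOP")  # ~1.2e+20
```

The actual FLOP count on a given GPU setup (point 2) would instead be estimated from wall-clock time, number of accelerators, peak throughput, and utilization, which is why the two numbers can come apart.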
This is definitely a possibility and one we should take seriously. However, I would estimate that the scenario of "says it suffers as deception" needs more assumptions than "says it suffers because it suffers". Using Occam's razor, I'd find the second one more likely. The deception scenario could still dominate an expected value calculation but I don't think we should entirely ignore the first one.
It's definitely not optimal. But our goal with these questions is to establish whether GPT-3 even has a consistent model of suffering. If it answers these questions randomly, it seems more likely to me that it does not have the ability to suffer than if it answered them very consistently.
I think I might make a short version when I feel like I have more practical experience with some of the things I'm suggesting. Maybe in half a year or so. Thanks for the kind feedback and suggestions :)
Thank you for the valuable feedback. I'll try to improve that more in future posts and add more paragraphs here.
I don't think Whoop influenced my rhythm that much. But I had quite a steady rhythm already, so it wasn't too much of a problem. I think I mostly profited from Whoop by having an assessment of my sleep quality. This made it much easier for me to decide how to prioritize my work, e.g. when I should do intellectually intense stuff and when light work.
I haven't tried any of your other suggestions, e.g. CBT, so I cannot comment on them. But thank you for sharing your experiences.
We're currently looking deeper into how we can extrapolate this trend. Our preliminary, high-uncertainty estimate is that it is more likely to slow down than speed up over the foreseeable future.