Critiques of prominent AI safety labs: Conjecture

Omega.

Cross-posted from the EA Forum. See the original here. Internal linking has not been updated for LW due to time constraints and will take you back to the original post.

In this series, we consider AI safety organizations that have received more than $10 million per year in funding. There have already been several conversations and critiques around MIRI (1) and OpenAI (1,2,3), so we will not be covering them. The authors include one technical AI safety researcher (>4 years experience), and one non-technical community member with experience in the EA community. We’d like to make our critiques non-anonymously but believe this will not be a wise move professionally speaking. We believe our criticisms stand on their own without appeal to our positions. Readers should not assume that we are completely unbiased or don’t have anything to personally or professionally gain from publishing these critiques. We’ve tried to take the benefits and drawbacks of the anonymous nature of our post seriously and carefully, and are open to feedback on anything we might have done better.

This is the second post in this series and it covers Conjecture. Conjecture is a for-profit alignment startup founded in late 2021 by Connor Leahy, Sid Black and Gabriel Alfour, which aims to scale applied alignment research. Based in London, Conjecture has received $10 million in funding from venture capitalists (VCs), and recruits heavily from the EA movement. We shared a draft of this document with Conjecture for feedback prior to publication, and include their response below. We also requested feedback on a draft from a small group of experienced alignment researchers from various organizations, and have invited them to share their views in the comments of this post.

We would like to invite others to share their thoughts in the comments openly if you feel comfortable, or contribute anonymously via this form. We will add inputs from there to the comments section of this post, but will likely not be updating the main body of the post as a result (unless comments catch errors in our writing).

Key Takeaways

For those with limited knowledge and context on Conjecture, we recommend first reading or skimming the About Conjecture section.

Time to read the core sections (Criticisms & Suggestions and Our views on Conjecture) is 22 minutes.

Criticisms and Suggestions

We think Conjecture’s research is low quality (read more).
- Their posts don’t always make assumptions clear, don’t make it clear what evidence base they have for a given hypothesis, and evidence is frequently cherry-picked. We also think their bar for publishing is too low, which increases the signal to noise ratio. Conjecture has acknowledged some of these criticisms, but not all (read more).
- We make specific critiques of examples of their research from their initial research agenda (read more).
- There is limited information available on their new research direction (cognitive emulation), but from the publicly available information it appears extremely challenging and so we are skeptical as to its tractability (read more).
We have some concerns with the CEO’s character and trustworthiness because, in order of importance (read more):
- The CEO and Conjecture have misrepresented themselves to external parties multiple times (read more);
- The CEO’s involvement in EleutherAI and Stability AI has contributed to race dynamics (read more);
- The CEO previously overstated his accomplishments in 2019 (when an undergrad) (read more);
- The CEO has been inconsistent over time regarding his position on releasing LLMs (read more).
We believe Conjecture has scaled too quickly before demonstrating they have promising research results, and believe this will make it harder for them to pivot in the future (read more).
We are concerned that Conjecture does not have a clear plan for balancing profit and safety motives (read more).
Conjecture has had limited meaningful engagement with external actors (read more):
- Conjecture lacks productive communication with external actors within the TAIS community, often reacting defensively to negative feedback and failing to address core points (read more);
- Conjecture has not engaged sufficiently with the broader ML community, we think they would receive valuable feedback by engaging more. We’ve written more about this previously (read more).

Our views on Conjecture

We would advise against working at Conjecture, given their history of low quality research, concerns with the CEO’s character and trustworthiness and the leadership team’s lack of research experience (read more).
We would advise Conjecture to avoid unilateral engagement with important stakeholders and strive to represent their place in the TAIS ecosystem accurately because they have misrepresented themselves multiple times (read more).
We do not think that Conjecture should receive additional funding before addressing key concerns because of the reasons cited above (read more).
We encourage the TAIS and EA community members and organizations reflect to what extent they want to legitimize Conjecture until Conjecture addresses these concerns (read more).

About Conjecture

Funding

Conjecture received (primarily via commercial investment) roughly $10 million in 2022. According to them, they’ve received VC backing from Nat Friedman (ex-CEO of GitHub), Patrick and John Collison (co-founders of Stripe), Daniel Gross (investor and cofounder of a startup accelerator), Andrej Karpathy (ex-OpenAI), Sam Bankman-Fried, Arthur Breitman and others. We are not aware of any later funding rounds, but it’s possible they have raised more since then.

Outputs

Products

Verbalize is an automatic transcription model. This is a B2C SaaS product and was released in early 2023. Our impression is that it's easy to use but no more powerful than existing open-source models like Whisper, although we are not aware of any detailed empirical evaluation. We do not think the product has seen commercial success yet, as it was released recently. Our estimate is that about one third of Conjecture’s team are actively working on developing products.

Alignment Research

Conjecture studies large language models (LLMs), with a focus on empirical and conceptual work. Mechanistic interpretability was a particular focus, with output such as the polytope lens, sparse autoencoders and analyzing the SVD of weight matrices, as well as work more broadly seeking to better understand LLMs, such as simulator theory.

They have recently pivoted away from this agenda towards cognitive emulation, which is reminiscent of process-based supervision. Here is a link to their full research agenda and publication list. Due to their infohazard policy (see below), some of their research may not have been publicly released.

Infohazard policy

Conjecture developed an infohazard policy in their first few months and shared it publicly to encourage other organizations to publish or adopt similar policies. They say that while many actors were “verbally supportive of the policy, no other organization has publicly committed to a similar policy”.

Governance outreach

We understand that CEO Connor Leahy does a lot of outreach to policymakers in the UK, and capabilities researchers at other prominent AI companies. He’s also appeared on several podcasts (1, FLI (1,2,3,4), 3, 4, 5) and been interviewed by several journalists (1, 2, 3, 4, 5, 6, 7, 8).

Incubator Program

Adam Shimi ran an incubator called Refine in 2022, whose purpose was to create new independent conceptual researchers and help them build original research agendas. Based on Adam’s retrospective, it seems like this project wasn’t successful at achieving its goals and Adam is now pursuing different projects.

Team

The Conjecture team started as a team of 4 employees in late 2021 and have grown to at least 22 employees now (according to their LinkedIn), with most employees joining in 2022.

Their CEO, Connor Leahy, has a technical background (with 2 years of professional machine learning experience and a Computer Science undergrad) and partially replicated GPT-2 in 2019 (discussed in more detail below). Their Chief of Staff, has experience with staffing and building team culture from her time at McKinsey, and has similar experience at Meta. Their co-founder Gabriel Alfour has the most relevant technical and scaling experience as the CEO of Marigold,^[1] a firm performing core development on the Tezos cryptocurrency infrastructure with over 30 staff members.

Two individuals collectively publishing under the pseudonym janus published simulator theory, one of Conjecture's outputs that we understand the TAIS community to have been most favorable towards. They left Conjecture in late 2022. More recently, many researchers working on mechanistic interpretability left the team after Conjecture's pivot towards cognitive emulation. Those departing include Lee Sharkey, the lead author on the sparse autoencoders post and a contributor to the polytope lens post.

Conjecture in the TAIS ecosystem

Conjecture staff are frequent contributors on the Alignment Forum and recruit heavily from the EA movement. Their CEO has appeared on a few EA podcasts (including several times on the FLI podcast). Some TAIS researchers are positive about their work. They fiscally sponsor two TAIS field-building programs, MATS and ARENA, in London (where they are based).

Their team also spent a month in the Bay Area in 2022 (when many TAIS researchers were visiting through programs like MLAB, SERI MATS and on independent grants). Conjecture made an effort to build relationships with researchers, decisionmakers and grantmakers, and were actively fundraising from EA funders during this period. 3-4 Conjecture staff regularly worked out of the Lightcone Offices, with a peak of ~11 staff on a single day. The largest event run by Conjecture was an EA Global afterparty hosted at a Lightcone venue, with a couple hundred attendees, predominantly TAIS researchers.

Criticisms and Suggestions

Low quality research

General thoughts on Conjecture’s research

We believe most of Conjecture’s publicly available research to date is low-quality compared to the average ML conference paper. A direct comparison is difficult as some Conjecture members have prioritized releasing small, regular updates; however our impression is that even combining these they would at best meet the much lower bar of a workshop paper.

As we discuss below, Conjecture does not present their research findings in a systematic way that would make it accessible for others to review and critique. Conjecture’s work often consists of isolated observations that are not built upon or adequately tested in other settings. We recommend Conjecture focus more on developing empirically testable theories, and also suggest they introduce an internal peer-review process to evaluate the rigor of work prior to publicly disseminating their results. Conjecture might also benefit from having researchers and reviewers work through (although not rigidly stick to) the Machine Learning Reproducibility Checklist.

These limitations may in part be because Conjecture is a young organization with a relatively inexperienced research team, a point they have readily acknowledged in retrospectives and when criticized on research quality. However, taking their youth and inexperience into account, we still think their research is below the bar for funding or other significant support. When we take into account the funding that Conjecture has (at least $10M raised in their last round), we think they are significantly underperforming standard academic research labs (see our discussion on this in the Redwood post; we are significantly more excited about Redwood’s research than Conjecture). We believe they could significantly improve their research output by seeking out mentorship from more experienced ML or alignment researchers, and recommend they do this in the future.

Initial research agenda (March 2022 - Nov 2022)

Conjecture’s initial research agenda focused on interpretability, conceptual alignment and epistemology. Based on feedback from Conjecture, our understanding is that Conjecture is now much more excited about their new research direction in cognitive emulation. We discuss this new direction in the following section. However, as an organization's past track record is one of the best predictors of their future impact, we believe it is important to understand Conjecture's previous approach.

To Conjecture's credit, they acknowledged a number of mistakes in their retrospective. For example, they note that their simulators post was overinvested in, and "more experienced alignment researchers who have already developed their own deep intuitions about GPT-like models didn’t find the framing helpful." However, there are several issues we identify (such as lack of rigor) that are not discussed in the retrospective. There are also issues discussed in the retrospective where Conjecture leadership comes to the opposite conclusion to us: for example, Conjecture writes that they "overinvested in legibility and polish" whereas we found many of their posts to be difficult to understand and evaluate.

We believe three representative posts, which Conjecture leadership were excited by as of 2022 Q3, were: janus’s post on simulators, Sid and Lee's post on polytopes, and their infohazard policy. These accomplishments were also highlighted in their retrospective. Although we find these posts to have some merit, we would overall assess them as having limited impact. Concretely, we would evaluate Redwood's Indirect Object Identification or Causal Scrubbing papers as both more novel and scientifically rigorous. We discuss their infohazard policy, simulators and polytopes post in turn below.

Their infohazard policy is a fairly standard approach to siloing research, and is analogous to structures common in hedge funds or classified research projects. It may be positive for Conjecture to have adopted such a policy (although it introduces risks of concentrating power in the CEO, discussed in the next section), but it does not provide any particular demonstration of research capability.

The simulators and polytopes posts are both at an exploratory stage, with limited empirical evidence and unclear hypotheses. Compared to similar exploratory work (e.g. the Alignment Research Center), we think Conjecture doesn’t make their assumptions clear enough and have too low a bar for sharing, reducing the signal-to-noise ratio and diluting standards in the field. When they do provide evidence, it appears to be cherry picked.

Their posts also do not clearly state the degree of belief they have in different hypotheses. Based on private conversations with Conjecture staff, they often appear very confident in their views and results of their research despite relatively weak evidence for them. In the simulators post, for example, they describe sufficiently large LLMs as converging to simulators capable of simulating “simulacra”: different generative processes that are consistent with the prompt. The post ends with speculative beliefs that they stated fairly confidently that took the framing to an extreme (e.g if the AI system adopts the “superintelligent AI persona” it’ll just be superintelligent).

We think the framing was overall helpful, especially to those newer to the field, although it can also sometimes be confusing: see e.g. these critiques. The framing had limited novelty: our anecdotal impression is that most researchers working on language model alignment were already thinking along similar lines. The more speculative beliefs stated in the post are novel and significant if true, but the post does not present any rigorous argument or empirical evidence to support them. We believe it’s fine to start out with exploratory work that looks more like an op-ed, but at some point you need to submit your conjectures to theoretical or empirical tests. We would encourage Conjecture to explicitly state their confidence levels in written output and make clear what evidence base they do or do not have for a given hypothesis (e.g. conceptual argument, theoretical result, empirical evidence).

New research agenda (Nov 22 - Present)

Conjecture now has a new research direction exploring cognitive emulation. The goal is to produce bounded agents that emulate human-like thought processes, rather than agents that produce good output but for alien reasons. However, it’s hard to evaluate this research direction as they are withholding details of their plan due to their infohazard policies. On the face of it, this project is incredibly ambitious, and will require huge amounts of effort and talent. Because of this, details on how they will execute the project are important to understanding how promising this project may be. We would encourage Conjecture to share some more technical detail unless there are concrete info-hazards they are concerned about. In the latter case we would suggest sharing details with a small pool of trusted TAIS researchers for external evaluation.

CEO’s character and trustworthiness

We are concerned by the character and trustworthiness of Conjecture's CEO, Connor Leahy. We are also concerned that Connor has demonstrated a lack of attention to rigor and engagement with risky behavior, and that he, along with other staff, have demonstrated an unwillingness to take external feedback (see below).

Although this section focuses on the negatives, there are of course positive aspects to Connor's character. He is clearly a highly driven individual, who has built a medium-sized organization in his early twenties. He has shown a willingness to engage with arguments and change his mind on safety concerns, for example delaying the release of his GPT-2 replication. Moreover, in recent years Connor has been a vocal public advocate for safety: although we disagree in some cases with the framing of the resulting media articles, in general we are excited to see greater public awareness of safety risks.^[2]

The character of an organization’s founder and CEO is always an important consideration, especially for early-stage companies. Moreover, we believe this consideration is particularly strong in the case of Conjecture:

Conjecture engages in governance outreach that involves building relationships between government actors and the TAIS community, and there are multiple accounts of Conjecture misrepresenting themselves.
As the primary stakeholder & CEO, Connor will be responsible for balancing incentives to develop capabilities from stakeholders (see below).
Conjecture's infohazard policy has the consequence of heavily centralizing power to the CEO (even more so than a typical tech company). The policy mandates projects are siloed, and staff may be unaware of the details (or even the existence) of significant fractions of Conjecture's work. The CEO is Conjecture's "appointed infohazard coordinator" with "access to all secrets and private projects" – and thus is the only person with full visibility. This could substantially reduce staff's ability to evaluate Conjecture's strategy and provide feedback internally. Additionally, if they don’t have the full information, they may not know if Conjecture is contributing to AI risk.^[3] We are uncertain the degree to which this is a problem given Conjecture's current level of internal secrecy.

Conjecture and their CEO misrepresent themselves to various parties

We are generally worried that Connor will tell the story that he expects the recipient to find most compelling, making it challenging to confidently predict his and Conjecture's behavior. We have heard credible complaints of this in their interactions with funders. One experienced technical AI safety researcher recalled Connor saying that he will tell investors that they are very interested in making products, whereas the predominant focus of the company is on AI safety.

We have heard that Conjecture misrepresent themselves in engagement with the government, presenting themselves as experts with stature in the AIS community, when in reality they are not. We have heard reports that Conjecture's policy outreach is decreasing goodwill with policymakers. We think there is a reasonable risk that Connor and Conjecture’s actions may be unilateralist and prevent important relationships from being formed by other actors in the future.

Unfortunately we are unable to give further details about these incidents as our sources have requested confidentiality; we understand this may be frustrating and acknowledge it is difficult for Conjecture to substantively address these concerns.

We would recommend Connor be more honest and transparent about his beliefs, plans and Conjecture’s role in the TAIS ecosystem. We also recommend the Conjecture introduce a strong, robust governance structure (see below).

Contributions to race dynamics

We believe that Connor Leahy has contributed to increasing race dynamics and accelerating capabilities research, through supporting the creation of Stability AI through founding EleutherAI. EleutherAI is a community research group focused on open-source AI research founded in 2020. Under Connor's leadership, their plan was to build and release large open-source models to allow more people to work on important TAIS research that is only possible on pretrained LLMs. At the time, several members of the TAIS community, including Dan Hendrycks (founder of CAIS), privately warned Connor and EleutherAI that it would be hard to control an open source collective.

Stability AI

Stability AI brands themselves as an AGI lab and has raised $100M to fund research into and training of large, state-of-the-art models including Stable Diffusion.^[4] The addition of another AGI focused lab is likely to further exacerbate race dynamics. Stability is currently releasing the majority of the work they create as open-source: this has some benefits, enabling a broader range of researchers (including alignment researchers) to study these models. However, it also has significant drawbacks, such as making potential moratoriums on capabilities research much harder (if not impossible) to enforce. To our knowledge, Stability AI has not done much algorithmic advancement yet.

EleutherAI was pivotal in the creation of Stability AI. Our understanding is that the founder of Stability AI, Emad Mostaque, was active on the EleutherAI Discord and recruited much of his initial team from there. On the research side, Stability AI credited EleutherAI as supporting the initial version of Stable Diffusion in August 2022, as well as their most recent open-source language model release StableLM in April 2023. Emad (in Feb 2023) described the situation as: “Eleuther basically split into two. Part of it is Stability and the people who work here on capabilities. The other part is Conjecture that does specific work on alignment, and they're also based here in London.”

Stability AI continues to provide much of EleutherAI’s compute and is a sponsor of EluetherAI, alongside Nat Friedman (who also invested in Conjecture). Legally, Stability AI directly employed key staff of EleutherAI in a relationship we believe was similar to fiscal sponsorship. We understand that EleutherAI have recently transitioned to employing staff directly via their own non-profit entity (Connor and Emad sit on the board).

EleutherAI

EleutherAI is notable for having developed open-source LLMs such as GPT-NeoX. In the announcement post in February 2022, they claimed that "GPT-NeoX-20B is, to our knowledge, the largest publicly accessible pretrained general-purpose autoregressive language model, and we expect it to perform well on many tasks."

We do not think that there was much meaningful alignment output from EleutherAI itself during Connor’s tenure – most of the research published is capabilities research, and the published alignment research is of mixed quality. On the positive side, EleutherAI’s open-source models have enabled some valuable safety research. For example, GPT-J was used in the ROME paper and is widely used in Jacob Steinhardt’s lab. EleutherAI is also developing a team focusing on interpretability, with their initial work includes developing the tuned lens in a collaboration with FAR AI and academics from Boston and Toronto.

Connor’s founding and management of EleutherAI indicates to us that he was overly optimistic about rapidly growing a community of people interested in language models and attracting industry sponsorship translating into meaningful alignment research. We see EleutherAI as having mostly failed at its goals of AI safety, and instead accelerated capabilities via their role in creating Stability.ai and Stable Diffusion.

In particular, EleutherAI's supporters were primarily interested in gaining access to state-of-the-art LLM capabilities with limited interest in safety. For example, the company Coreweave provided EleutherAI with compute and then used their models to sell a LM inference API called GooseAI. We conjecture that the incentive to please their sponsors, enabling further scale-up, may have contributed to EleutherAI's limited safety output.

We feel more positively about Conjecture than early-stage EleutherAI given Conjecture's explicit alignment research focus, but are concerned that Connor appears to be bringing a very similar strategy to Conjecture as to EleutherAI: scaling before producing tangible alignment research progress and attracting investment from external actors (primarily investors) with opposing incentives that they may not be able to withstand. We would encourage Conjecture to share a clear theory of change which includes safeguards against these risks.

To be clear, we think Conjecture's contribution to race dynamics is far less than that of OpenAI or Anthropic, both of which have received funding and attracted talent from the EA ecosystem. We would assess OpenAI as being extremely harmful for the world. We are uncertain on Anthropic: they have undoubtedly contributed to race dynamics (albeit less so than OpenAI), but have also produced substantial safety research. We will discuss Anthropic further in an upcoming post, but in either case we do not think that AGI companies pushing forward capabilities should exempt Conjecture or other organizations from criticisms.

Overstatement of accomplishments and lack of attention to precision

In June 2019, Connor claimed to have replicated GPT-2 while he was an undergraduate. However, his results were inaccurate and his 1.5B parameter model was weaker than even the smallest GPT-2 series model.^[5] He later admitted to these mistakes, explaining that his metric code was flawed and that he commingled training and evaluation datasets. Additionally, he said that he didn’t evaluate the strength of his final model, only one halfway through training. He said the reason he did this was because “I got cold feet once I realized what I was sitting on [something potentially impressive] and acted rashly.”^[6] We think this points to a general lack of thoughtfulness for making true and accurate claims.

We don’t want to unfairly hold people’s mistakes from their college days against them – many people exaggerate or overestimate (intentionally or not) their own accomplishments. Even a partial replica of GPT-2 is an impressive technical accomplishment for an undergraduate, so this project does attest to Connor's technical abilities. It is also positive that he admitted his mistake publicly. However, overall we do believe the project demonstrates a lack of attention to detail and rigor. Moreover, we haven’t seen signs that his behavior has dramatically changed.

Inconsistency over time regarding releasing LLMs

Connor has changed his stance more than once regarding whether to publicly release LLMs. Given this, it is difficult to be confident that Conjecture's current approach of defaulting to secrecy will persist over time.

In July 2019, Connor released the source code used to train his replica along with pretrained models comparable in size to the already released GPT-2 117M and and 345M models. The release of the training code seems hasty, enabling actors with sufficient compute but limited engineering skills to train their own, potentially superior, models. At this point, Connor was planning to release the full 1.5B parameter model to the public, but was persuaded not to.^[7] In the end, he delayed releasing the model to Nov 13 2019, a week after OpenAI released their 1.5B parameter version, on his personal GitHub.

In June 2021 Connor changed his mind and argued that releasing large language models would be beneficial to alignment as part of the team at EleutherAI (see discussion above). In Feb 2022, EleutherAI released an open-source 20B parameter model, GPT-NeoX. Their stated goal, endorsed by Connor in several places, was to "train a model comparable to the biggest GPT-3 (175 billion parameters)" and release it publicly. Regarding the potential harm of releasing models, we find Connor's arguments plausible – whether releasing open-source models closer to the state-of-the-art is beneficial or not remains a contested point. However, we are confident that sufficiently capable models should not be open-sourced, and expect strong open-source positive messaging to be counterproductive. We think EleutherAI made an unforced error by not at least making some gesture towards publication norms (e.g. they could have pursued a staggered release giving early access to vetted researchers).

In July 2022, Connor shared Conjecture’s Infohazard Policy. This policy is amongst the most restrictive at any AI company – even more restrictive than what we would advocate for. To the best of our knowledge, Conjecture's Infohazard Policy is an internal policy that can be overturned by Connor (acting as chief executive), or by a majority of their owners (of whom Connor as a founder will have a significant stake). Thus we are hesitant to rely on Conjecture’s Infohazard Policy remaining strictly enforced, especially if subject to commercial pressures.

Scaling too quickly

We think Conjecture has grown too quickly, from 0 to at least 22 staff from 2021 to 2022. During this time, they have not had what we would consider to be insightful or promising outputs, making them analogous to a very early stage start-up. This is a missed opportunity: their founding team and early employees include some talented individuals who, given time and the right feedback, might well have been able to identify a promising approach.

We believe that Conjecture’s basic theory of change for scaling is:

1) they’ve gotten good results relative to how young they are, even though the results themselves are not that insightful or promising in absolute terms, and

2) the way to improve these results is to scale the team so that they can test out more ideas and get more feedback on what does and doesn’t work.

Regarding 1) we think that others of similar experience level – and substantially less funding – have produced higher-quality output. Concretely, we are more excited about Redwood’s research than Conjecture (see our criticisms of Conjecture’s research), despite being critical of Redwood’s cost-effectiveness to date.^[8] Notably, Redwood drew on a similar talent pool to Conjecture, largely hiring people without prior ML research experience.

Regarding 2), we disagree that scaling up will improve their research quality. In general, the standard lean startup team advice is that it’s important to keep your team small while you are finding product-market fit or, in Conjecture's case, developing an exciting research agenda. We think it’s very likely Conjecture will want to make major pivots in the next few years. Rapid growth will make it harder for them to pivot. With growing scale, more time will be spent on management, and it will be easier to get people locked into the wrong project or create dynamics where people are more likely to defend their pet projects. We can't think of examples where scale up has taken place successfully before finding product-market fit.

This growth would be challenging to manage in any organization. However, in our opinion alignment research is more challenging to scale than a traditional tech start-up due to the weaker feedback loops: it's much harder to tell if your alignment research direction is promising than whether you've found product-market fit.

Compounding this problem, their founding team Connor, Sid and Gabriel have limited experience in scaling research organizations. Connor and Sid's experience primarily comes from co-founding EleutherAI, a decentralized research collective: their frustrations with that lack of organization are part of what drove them to found Conjecture. Gabriel has the most relevant experience.

Conjecture appeared to have rapid scaling plans, but their growth has slowed in 2023. Our understanding is that this slow-down is primarily due to them being unable to raise adequate funding for their expansion plans.

To address this problem, we would recommend that Conjecture:

Freeze hiring of junior staff until they identify scalable research directions that they and others in the alignment community are excited by. Conjecture may still benefit from making a small number of strategic hires that can help them manage their current scale and continue to grow, such as senior research engineers and people who have experience managing large teams.
Consider focusing on one area (e.g. technical research) and keeping other teams (e.g. product and governance) lean, or even consider whether they need them.
While we don’t think it’s ideal to let go of staff, we tentatively suggest Conjecture consider whether it might be worth making the team smaller to focus on improving their research quality, before growing again.

Unclear plan for balancing profit and safety motives

According to their introduction post, they think being a for-profit company is the best way to reach their goal because it lets them “scale investment quickly while maintaining as much freedom as possible to expand alignment research.” We think this could be challenging in practice: scaling investment requires delivering results that investors find impressive, as well as giving investors some control over the firm in the form of voting shares and, frequently, board seats.

Conjecture has received substantial backing from several prominent VCs. This is impressive, but since many of their backers (to our knowledge) have little interest in alignment, Conjecture will be under pressure to develop a pathway to profitability in order to raise further funds.

Many routes to developing a profitable AI company have significant capabilities externalities. Conjecture’s CEO has indicated they plan to build "a reliable pipeline to build and test new product ideas" on top of internal language models. Although this seems less bad than the OpenAI model of directly advancing the state-of-the-art in language models, we expect demonstrations of commercially viable products using language models to lead to increased investment in the entire ecosystem – not just Conjecture.

For example, if Conjecture does hit upon a promising product, it would likely be easy for a competitor to copy them. Worse, the competitor might be able to build a better product by using state-of-the-art models (e.g. those available via the OpenAI API). To keep up, Conjecture would then have to either start training state-of-the-art models themselves (introducing race dynamics), or use state-of-the-art models from competitors (and ultimately provide revenue to them).

Conjecture may have good responses to this. Perhaps there are products which are technically intricate to develop or have other barriers to entry making competition unlikely, and/or where Conjecture's internal models are sufficient. We don’t have reason to believe Verbalize falls into this category as there are several other competitors already (e.g. fireflies.ai, otter.ai, gong.io). We would encourage Conjecture to share any such plans they have to simultaneously serve two communities (for-profit VCs and TAIS), with sometimes conflicting priorities, for review with both sets of stakeholders.

Our impression is that they may not have a solid plan here (but we would invite them to share their plans if they do). Conjecture was trying to raise a series B from EA-aligned investors to become an alignment research organization. This funding round largely failed, causing them to pivot to focus more on VC funding. Based on their past actions we think it’s likely that they may eventually hit a wall with regards to product development, and decide to focus on scaling language models to get better results, contributing to race dynamics. In fairness to Conjecture, we would consider the race risk of Conjecture to be much smaller than that of Anthropic, which operates at a much bigger scale, is scaling much more rapidly, and has had more commercial success with its products.

It's not uncommon that people and orgs who conceive of or present themselves as AIS focused end up advancing capabilities much more than safety. OpenAI is perhaps the most egregious case of this, but we are also concerned about Anthropic (and will write about this in a future post). These examples should make us suspect that by default Conjecture's for-profit nature will end up causing it to advance capabilities, and demand a clear and detailed plan to avoid this to be convinced otherwise.

In addition to sharing their plans for review, we would recommend that Conjecture introduce robust corporate governance structures. Our understanding is that Conjecture is currently structured as a standard for-profit start-up with the founders controlling the majority of voting shares and around a third of the company owned by VCs. This is notably worse than OpenAI LP, structured as a "capped-profit" corporation with non-profit OpenAI, Inc. the sole controlling shareholder.^[9] One option would be for Conjecture to implement a "springing governance" structure in which given some trigger (such as signs that AGI is imminent, or that their total investment exceeds some threshold) its voting shares become controlled by a board of external advisors. This would pass governance power, but not financial equity, to people who Conjecture considers to be a good board – rather than being controlled wholly by their founding team.

Limited meaningful engagement with external actors

Lack of productive communication between TAIS researchers and Conjecture staff

We know several members of the EA and TAIS community who have tried to share feedback privately with Conjecture but found it very challenging. When negative feedback is shared, members of the Conjecture team sometimes do not engage meaningfully with it, missing the key point or reacting defensively. Conjecture leadership will provide many counter-arguments, none of which address the core point, or are particularly strong. This is reminiscent of the Gish gallop rhetorical technique, which can overwhelm interlocutors as it’s very difficult to rebut each counter-argument. Some Conjecture staff members also frequently imply that the person giving the criticism has ulterior motives or motivated reasoning.

It can be hard to hear criticism of a project you are passionate about and have invested considerable time in, so it’s natural that Conjecture staff are defensive over their work. However, we would recommend that Conjecture staff and especially leadership make an effort to constructively engage in criticism, seeking to understand where the critique is coming from, and take appropriate steps to correct misunderstandings and/or resolve the substance of the critique.

Lack of engagement with the broader ML community

Conjecture primarily disseminates their findings on the Alignment Forum. However, many of their topics (particularly interpretability) are at least adjacent to active research fields, such that a range of academic and industry researchers could both provide valuable feedback on Conjecture's research and gain insights from their findings.

Conjecture is not alone in this: as we wrote previously, we also think that Redwood could engage further with the ML community. Conjecture has not published any peer-reviewed articles, so we think they would benefit even more than Redwood from publishing their work and receiving external feedback. We would recommend Conjecture focus on developing what they consider to be their most insightful research projects into a conference-level paper, and hiring more experienced ML research scientists or advisors to help them both effectively communicate their research and improve rigor.

Our views on Conjecture

We are genuinely concerned about Conjecture’s trustworthiness and how they might negatively affect the TAIS community and the TAIS community’s efforts to reduce risk from AGI. These are the main changes we call for, in rough order of importance.

We would advise against working at Conjecture

Given Conjecture's weak research track record, we expect the direct impact of working at Conjecture to be low. We think there are many more impactful places to work, including non-profits such as Redwood, CAIS and FAR; alignment teams at Anthropic, OpenAI and DeepMind; or working with academics such as Stuart Russell, Sam Bowman, Jacob Steinhardt or David Krueger. Note we would not in general recommend working at capabilities-oriented teams at Anthropic, OpenAI, DeepMind or other AGI-focused companies.

Additionally, Conjecture seems relatively weak for skill building, since their leadership team is relatively inexperienced and also stretched thin due to Conjecture's rapid scaling. We expect most ML engineering or research roles at prominent AI labs to offer better mentorship than Conjecture. Although we would hesitate to recommend taking a position at a capabilities-focused lab purely for skill building, we find it plausible that Conjecture could end up being net-negative, and so do not view Conjecture as a safer option in this regard than most competing firms.

In general, we think that the attractiveness of working at an organization that is connected to the EA or TAIS communities makes it more likely for community members to take jobs at such organizations even if this will result in a lower lifetime impact than alternatives. Conjecture's sponsorship of TAIS field building efforts may also lead new talent, who are unfamiliar with Conjecture's history, to have an overly rosy impression of them.

We would advise Conjecture to take care when engaging with important stakeholders and represent their place in the TAIS ecosystem accurately

We are concerned that Conjecture has misrepresented themselves to various important stakeholders, including funders and policymakers. We think there is a reasonable risk that Connor and Conjecture’s outreach to policymakers and media is alarmist and may decrease the credibility of x-risk. These unilateral actions may therefore prevent important relationships from being formed by other actors in the future. This risk is further exacerbated by Connor’s unilateralist actions in the past, Conjecture’s overall reluctance to take feedback from external actors, and their premature and rapid scaling.

We do not think that Conjecture should receive additional funding before addressing key concerns

We have substantial concerns with the organization’s trustworthiness and the CEO’s character. We would strongly recommend that any future funding from EA sources be conditional on Conjecture putting in place a robust corporate governance structure to bring them at least on par with other for-profit and alignment-sympathetic firms such as OpenAI and Anthropic.

Even absent these concerns, we would not currently recommend Conjecture for funding due to the lack of a clear impact track record despite a considerable initial investment of $10mn. To recommend funding, we would want to see both improvements in corporate governance and some signs of high-quality work that the TAIS community are excited by.

Largely we are in agreement with the status quo here: so far Conjecture has largely been unsuccessful fundraising from prominent EA funders, and where they have received funding it was for significantly less than their initial asks.

We encourage TAIS and EA community members to consider to what extent they want to legitimize Conjecture until Conjecture addresses these concerns

Conjecture has several red flags and a weak track record for impact. Although the TAIS and EA community have largely refrained from explicit endorsements of Conjecture (such as funding them), there are a variety of implicit endorsements. These include tabling at EA Global career fairs, Lightcone hosting Conjecture events and inviting Conjecture staff, field-building organizations such as MATS and ARENA working with Conjecture as a fiscal sponsor,^[10] as well as a variety of individuals in the community (mostly unaware of these issues) recommending Conjecture as a place to work.

To clarify, we think individuals should still read and engage with Conjecture's research where they judge it to be individually worth their time. We also welcome public debates involving Conjecture staff, such as the one between Paul Christiano and Gabriel Alfour. Our goal is not to shun Conjecture, but to avoid giving them undue influence until their research track record and governance structure improves.

We recognize that balancing these considerations can be tricky, which is why our main recommendation is to encourage people to spend time actively reflecting on how they want to engage with Conjecture in light of the information we present in this post (alongside other independent sources).

Appendix

Communication with Conjecture

We shared a draft of this post with Conjecture to review, and have included their full response (as they indicated they would post it publicly) below. We thank them for their engagement and made several minor updates to the post in response, however we disagree with several key claims made by Conjecture in their response. We describe the changes we made, and where we disagree, in the subsequent section.

Conjecture’s Reply

Hi,

Thank you for your engagement with Conjecture’s work and for providing us an opportunity to share our feedback.

As it stands, the document is a hit piece, whether intentional or not. It is written in a way such that it would not make sense for us to respond to points line-by-line. There are inaccuracies, critiques of outdated strategies, and references to private conversations where the details are obscured in ways that prevent us from responding substantively. The piece relies heavily on criticism of Connor, Conjecture CEO, but does not attempt to provide a balanced assessment: there are no positive comments written about Connor along with the critiques, and past mistakes he admitted to publicly are spun as examples of “low-integrity” behavior. Nuanced points such as the cost/benefit of releasing small open source models (pre-Chinchilla) are framed as “rash behavior,” even when you later write that you find Connor’s arguments “plausible.” Starting from this negative frame does not leave room for us to reply and trust that an object-level discussion will proceed.

We also find it surprising to see that most of the content of the piece is based on private discussions and documents shared between Conjecture, ~15 regrantors, and the FTX Future Fund team in August 2022. The piece does not disclose this context. Besides the fact that much of that information is outdated and used selectively, the information has either been leaked to the two anonymous authors, or one of the authors was directly involved in the regranting process. In either case, this is a violation of mutual confidentiality between Conjecture and regrantors/EA leadership involved in that channel.

We don’t mind sharing our past plans and discussions now and would be happy to publish the entire discussions from the Slack channel where those conversations took place (with consent of the other participants). However, it is a sad conclusion of that process that our openness to discussing strategy in front of regrantors formed the majority set of Bay Area TAIS leadership opinions about Conjecture that frame us as not open, despite these conversations being a deeper audit than pretty much any other TAIS organization.

We’d love to have a productive conversation here, but will only respond in detail if you reframe this post from a hit piece to something better informed. If your aim is to promote coordination, we would recommend asking questions about our plans and beliefs, focusing on the parts that do not make sense to you, and then writing your summary. Conjecture’s strategy is debatable, and we are open to changing it - and have done so in the past. Our research is also critiqueable: we agree that our research output has been weak and have already written about this publicly here. But as described above, this post doesn’t attempt to engage with Conjecture’s current direction.

Going further, if the aim of your critique is to promote truth-seeking and transparency, we would gladly participate in a project about creating and maintaining a questionnaire that all AI orgs should respond to, so that there is as little ambiguity in their plans as possible. In our posts we have argued for making AI lab’s safety plans more visible, and previously ran a project of public debates aimed at highlighting cruxes in research disagreements. Conjecture is open to our opinion being on the record, so much so that we have occasionally declined private debates with individuals who don’t want to be on record. This decision may contribute to some notion of our “lack of engagement with criticism.”

—

As a meta-point, we think that certain strategic disagreements between Conjecture and the Bay Area TAIS circles are bleeding into reputational accusations here. Conjecture has been critical of the role that EA actors have played in funding and supporting major AGI labs historically (OAI, Anthropic), and critical of current parts of the EA TAIS leadership and infrastructure that continue to support the development of superintelligence. For example, we do not think that GPT-4 should have been released and are concerned at the role that ARC’s benchmarking efforts played in safety-washing the model. These disagreements in the past have created friction, and we’d hazard that concerns about Conjecture taking “unilateral action” are predicted on this.

Instead of a more abstract notion of “race dynamics,” Conjecture’s main concern is that a couple of AI actors are unabashedly building superintelligence. We believe OpenAI, Deepmind, and Anthropic are not building superintelligence because the market and investors are demanding it. We believe they are building superintelligence because they want to, and because AGI has always been their aim. As such, we think you’re pointing the finger in the wrong direction here about acceleration risks.

If someone actually cares about curtailing “the race”, their best move would be to push for a ban on developing superintelligence and strongly oppose the organizations trying to build it. Deepmind, OpenAI, and Anthropic have each publicly pushed the AI state of the art. Deepmind and OpenAI have in their charters that they want to build AGI. Anthropic’s most recent pitch deck states that they are planning to train an LLM orders of magnitude larger than competitors, and that “companies that train the best 2025/26 models will be too far ahead for anyone to catch up in subsequent cycles,” which is awfully close to talking about DSA. No one at the leadership of these organizations (which you recommend people work at rather than Conjecture) have signed FLI's open letter calling for a pause in AI development. Without an alignment solution, the reasonable thing for any organization to do is stop development, not carve out space to continue building superintelligence unimpeded.

While Conjecture strongly disagrees with the strategies preferred by many in the Bay Area TAIS circles, we’d hope that healthy conversations would reveal some of these cruxes and make it easier to coordinate. As written, your document assumes the Bay Area TAIS consensus is superior (despite being what contributed largely to the push for ASI), casts our alternative as “risking unilateral action,” and deepens the rift.

—

We have a standing offer to anyone to debate with us, and we’d be very happy to discuss with you any part of our strategy, beliefs about AI risks, and research agenda.

More immediately, we encourage you to rewrite your post as a Q&A aimed at asking for our actual views before forming an opinion, or at a minimum, rewrite your post with more balance and breathing room to hear our view. As it stands, this post cleaves the relationship between part of the TAIS ecosystem and Conjecture further and is unproductive for both sides.

Given the importance of having these conversations in the open, we plan to make this reply public.

Thanks for your time and look forward to your response,

Conjecture C-Suite

Brief response and changes we made

Conjecture opted not to respond to our points line-by-line and instead asked us to rewrite the post as a Q&A or “with more balance and breathing room to hear our view.” While we won’t be rewriting the post, we have made changes to the post in response to their feedback, some of which are outlined below.

Conjecture commented that the tone of the post was very negative, and in particular there was a lack of positive comments written about Connor. We have taken that feedback into consideration and have edited the tone to be more neutral & descriptive (with particular attention to the section on Connor). Conjecture also noted that Connor admitted to some of his mistakes publicly. We had previously linked to Connor’s update post on the partial GPT-2 replication, but we edited the section to make it more clear that he did acknowledge his mistake. They also pointed out that we framed the point on releasing models “as “rash behavior,” even when you later write that you find Connor’s arguments “plausible.” We’ve changed this section to be more clear.

They say “this post doesn’t attempt to engage with Conjecture’s current direction.” As we write in our section on their cognitive emulation research, there is limited public information on their current research direction for us to comment on.

They believe that “most of the content of the piece is based on private discussions and documents shared between Conjecture, ~15 regrantors, and the FTX Future Fund team in August 2022.” This is not the case: the vast majority (90+%) of this post is based on publicly available information and our own views which were formed from our independent impression of Conjecture via conversations with them and other TAIS community members. We think the content they may be referring to is:

One conversation that we previously described in the research section regarding Conjecture's original research priorities. We have removed this reference.
One point providing quantitative details of Conjecture's growth plans in the scaling section, which we have removed the details of.
The section on how Conjecture and their CEO represent themselves to other parties. This information was not received from those private discussions and documents.

They say they wouldn’t mind “sharing our past plans and discussions now and would be happy to publish the entire discussions from the Slack channel where those conversations took place (with consent of the other participants).” We welcome and encourage the Conjecture team to share their past plans publicly.

They note that “Conjecture is open to our opinion being on the record, so much so that we have occasionally declined private debates with individuals who don’t want to be on record. This decision may contribute to some notion of our 'lack of engagement with criticism.'" This is not a reason for our comment on their lack of engagement. They mentioned they have “a standing offer to anyone to debate with us”. We appreciate the gesture, but do not have capacity to engage in something as in-depth as a public debate at this time (and many others who have given feedback don’t either).

Conjecture points out the role “EA actors have played in funding and supporting major AGI labs historically (OAI, Anthropic)”, that our “document assumes the Bay Area TAIS consensus is superior … casts our alternative as “risking unilateral action”, and that “these disagreements in the past have created friction, and we’d hazard that concerns about Conjecture taking “unilateral action” are predicted on this.” We outline our specific concerns on unilateralist action, which don’t have to do with Conjecture’s critiques of EA TAIS actors, here. Examples of disagreements with TAIS actors that they cite include:

Conjecture being critical of the role EA actors have played in funding/supporting major AGI labs.
EA TAIS leadership that continue to support development of AGI.
They don’t think GPT-4 should have been released.
They are concerned that ARC’s benchmarking efforts might have safety-washed GPT-4.

We are also concerned about the role that EA actors have and potentially continue to play in supporting AGI labs (we will cover some of these concerns in our upcoming post on Anthropic). We think that Conjecture’s views on ARC are reasonable (although we may not agree with their view). Further, many other EAs and TAIS community members have expressed concerns on this topic, and about OpenAI in particular. We do not think holding this view is particularly controversial or something that people would be critical of. Views like this did not factor into our critique.

Finally, they propose that (rather than critiquing them), we should push for a ban on AGI and oppose organizations trying to build it (OpenAI, DM & Anthropic). While we agree that other labs are concerning, that doesn’t mean that our concerns about Conjecture are erased.

Notes

Gabriel Alfour is still listed as the CEO on Marigold's website: we are unsure if this information is out of date, or if Gabriel still holds this position. We also lack a clear understanding of what Marigold's output is, but spent limited time evaluating this. ↩︎

In particular, Connor has referred to AGI as god-like multiple times in interviews (CNN, Sifted). We are skeptical if this framing is helpful. ↩︎

Employee retention is a key mechanism by which tech companies have been held accountable: for example, Google employees' protest over Project Maven led to Google withdrawing from the project. Similarly, the exodus of AIS researchers from OpenAI to found Anthropic was partly fueled by concerns that OpenAI was contributing to AI risk. ↩︎

Stable Diffusion is a state-of-the-art generative model with similar performance to OpenAI’s DALL-E. They are open-source and open-access - there are no restrictions or filters, so you're not limited by what restrictions a company like OpenAI might apply. This means that people can use the model for abusive behavior (such as deepfakes) ↩︎

Connor reports a WikiText2 perplexity of 43.79 for his replica. This is considerably worse than the 18.34 perplexity achieved by GPT-2 1.5B on this dataset (reported in Table 3 of Radfort et al), and substantially worse than the perplexity achieved by even the smallest GPT-2 117M of 29.41. It is slightly worse than the previously reported state-of-the-art prior to the GPT-2 paper, of 39.14 (reported in Table 2 of Gong et al). Overall, it’s a substantial accomplishment, especially for an undergraduate who built the entire training pipeline (including data scraping) from scratch, but is far from a replication. ↩︎

Here is the full text from the relevant section of the article: “model is not identical to OpenAI’s because I simply didn’t have all the details of what they did … [and] the samples and metrics I have shown aren’t 100% accurate. For one, my metric code is flawed, I made several rookie mistakes in setting up accurate evaluation (let train and eval data mix, used metrics whose math I didn’t understand etc), and the model I used to generate the samples is in fact not the final trained model, but one about halfway through the training. I didn’t take my time to evaluate the strength of my model, I simply saw I had the same amount of hardware as OpenAI and code as close to the paper as possible and went with it. The reason for this is a simple human flaw: I got cold feet once I realized what I was sitting on and acted rashly.” ↩︎

This was in part due to conversations with OpenAI and Buck Shlegeris (then at MIRI) ↩︎

Redwood and Conjecture have received similar levels of funding ↩︎

Anthropic has a public benefit corporation structure, with reports that it includes a long-term benefit committee of people unaffiliated with the company who can override the composition of its board. Overall we have too little information to judge whether this structure is better or worse than OpenAI’s, but both seem better than being a standard C-corporation. ↩︎

Conjecture has been active in running or supporting programs aimed at AI safety field-building. Most notably, they ran the Refine incubator, and are currently fiscally sponsoring ARENA and MATS for their London based cohort. We expect overall these programs are net-positive, and are grateful that Conjecture is contributing to them. However, it may have a chilling effect: individuals may be reluctant to criticize Conjecture if they want to be part of these sponsored programs. It may also cause attendees to be more likely than they otherwise would to work for Conjecture. We would encourage ARENA and MATS to find a more neutral fiscal sponsor in the UK to avoid potential conflicts of interest. For example, they could hire staff members using employer-of-record services such as Deel or Remote. If Conjecture does continue fiscally sponsoring organizations, we would encourage them to adopt a clear legal separation between Conjecture and fiscally sponsored entities along with a conflict-of-interest policy to safeguard the independence of the fiscally sponsored entities. ↩︎

(cross-commented from EA forum)

I personally have no stake in defending Conjecture (In fact, I have some questions about the CoEm agenda) but I do think there are a couple of points that feel misleading or wrong to me in your critique.

1. Confidence (meta point): I do not understand where the confidence with which you write the post (or at least how I read it) comes from. I've never worked at Conjecture (and presumably you didn't either) but even I can see that some of your critique is outdated or feels like a misrepresentation of their work to me (see below). For example, making recommendations such as "freezing the hiring of all junior people" or "alignment people should not join Conjecture" require an extremely high bar of evidence in my opinion. I think it is totally reasonable for people who believe in the CoEm agenda to join Conjecture and while Connor has a personality that might not be a great fit for everyone, I could totally imagine working with him productively. Furthermore, making a claim about how and when to hire usually requires a lot of context and depends on many factors, most of which an outsider probably can't judge.
Given that you state early on that you are an experienced member of the alignment community and your post suggests that you did rigorous research to back up these claims, I think people will put a lot of weight on this post and it does not feel like you use your power responsibly here.
I can very well imagine a less experienced person who is currently looking for positions in the alignment space to go away from this post thinking "well, I shouldn't apply to Conjecture then" and that feels unjustified to me.

2. Output so far: My understanding of Conjecture's research agenda so far was roughly: "They started with Polytopes as a big project and published it eventually. On reflection, they were unhappy with the speed and quality of their work (as stated in their reflection post) and decided to change their research strategy. Every two weeks or so, they started a new research sprint in search of a really promising agenda. Then, they wrote up their results in a preliminary report and continued with another project if their findings weren't sufficiently promising." In most of their public posts, they stated, that these are preliminary findings and should be treated with caution, etc. Therefore, I think it's unfair to say that most of their posts do not meet the bar of a conference publication because that wasn't the intended goal.
Furthermore, I think it's actually really good that the alignment field is willing to break academic norms and publish preliminary findings. Usually, this makes it much easier to engage with and criticize work earlier and thus improves overall output quality.
On a meta-level, I think it's bad to criticize labs that do hits-based research approaches for their early output (I also think this applies to your critique of Redwood) because the entire point is that you don't find a lot until you hit. These kinds of critiques make it more likely that people follow small incremental research agendas and alignment just becomes academia 2.0. When you make a critique like that, at least acknowledge that hits-based research might be the right approach.

3. Your statements about the VCs seem unjustified to me. How do you know they are not aligned? How do you know they wouldn't support Conjecture doing mostly safety work? How do you know what the VCs were promised in their private conversations with the Conjecture leadership team? Have you talked to the VCs or asked them for a statement?
Of course, you're free to speculate from the outside but my understanding is that Conjecture actually managed to choose fairly aligned investors who do understand the mission of solving catastrophic risks. I haven't talked to the VCs either, but I've at least asked people who work(ed) at Conjecture.

In conclusion:
1. I think writing critiques is good but really hard without insider knowledge and context.
2. I think this piece will actually (partially) misinform a large part of the community. You can see this already in the comments where people without context say this is a good piece and thanking you for "all the insights".
3. The EA/LW community seems to be very eager to value critiques highly and for good reason. But whenever people use critiques to spread (partially) misleading information, they should be called out.
4. That being said, I think your critique is partially warranted and things could have gone a lot better at Conjecture. It's just important to distinguish between "could have gone a lot better" and "we recommend not to work for Conjecture" or adding some half-truths to the warranted critiques.
5. I think your post on Redwood was better but suffered from some of the same problems. Especially the fact that you criticize them for having not enough tangible output when following a hits-based agenda just seems counterproductive to me.

(cross-posted from EAF)

Some clarifications on the comment:
1. I strongly endorse critique of organisations in general and especially within the EA space. I think it's good that we as a community have the norm to embrace critiques.
2. I personally have my criticisms for Conjecture and my comment should not be seen as "everything's great at Conjecture, nothing to see here!". In fact, my main criticism of leadership style and CoEm not being the most effective thing they could do, are also represented prominently in this post.
3. I'd also be fine with the authors of this post saying something like "I have a strong feeling that something is fishy at Conjecture, here are the reasons for this feeling". Or they could also clearly state which things are known and which things are mostly intuitions.
4. However, I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings and do not throw our epistemics out of the window in the process.
5. My main problem with the post is that they make a list of specific claim with high confidence and I think that is not warranted given the evidence I'm aware of. That's all.

(cross-posted from EAF, thanks Richard for suggesting. There's more back-and-forth later.)

I'm not very compelled by this response.

It seems to me you have two points on the content of this critique. The first point:

I think it's bad to criticize labs that do hits-based research approaches for their early output (I also think this applies to your critique of Redwood) because the entire point is that you don't find a lot until you hit.

I'm pretty confused here. How exactly do you propose that funding decisions get made? If some random person says they are pursuing a hits-based approach to research, should EA funders be obligated to fund them?

Presumably you would want to say "the team will be good at hits-based research such that we can expect a future hit, for X, Y and Z reasons". I think you should actually say those X, Y and Z reasons so that the authors of the critique can engage with them; I assume that the authors are implicitly endorsing a claim like "there aren't any particularly strong reasons to expect Conjecture to do more impactful work in the future".

The second point:

Your statements about the VCs seem unjustified to me. How do you know they are not aligned? [...] I haven't talked to the VCs either, but I've at least asked people who work(ed) at Conjecture.

Hmm, it seems extremely reasonable to me to take as a baseline prior that the VCs are profit-motivated, and the authors explicitly say

We have heard credible complaints of this from their interactions with funders. One experienced technical AI safety researcher recalled Connor saying that he will tell investors that they are very interested in making products, whereas the predominant focus of the company is on AI safety.

The fact that people who work(ed) at Conjecture say otherwise means that (probably) someone is wrong, but I don't see a strong reason to believe that it's the OP who is wrong.

At the meta level you say:

I do not understand where the confidence with which you write the post (or at least how I read it) comes from.

And in your next comment:

I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings and do not throw our epistemics out of the window in the process.

But afaict, the only point where you actually disagree with a claim made in the OP (excluding recommendations) is in your assessment of VCs? (And in that case I feel very uncompelled by your argument.)

In what way has the OP failed to say true things? Where should they have had more uncertainty? What things did they present as facts which were actually feelings? What claim have they been confident about that they shouldn't have been confident about?

(Perhaps you mean to say that the recommendations are overconfident. There I think I just disagree with you about the bar for evidence for making recommendations, including ones as strong as "alignment researchers shouldn't work at organization X". I've given recommendations like this to individual people who asked me for a recommendation in the past, on less evidence than collected in this post.)

I'm not going to crosspost our entire discussion from the EAF.

I just want to quickly mention that Rohin and I were able to understand where we have different opinions and he changed my mind about an important fact. Rohin convinced me that anti-recommendations should not have a higher bar than pro-recommendations even if they are conventionally treated this way. This felt like an important update for me and how I view the post.

(crossposted from the EA Forum)

We appreciate your detailed reply outlining your concerns with the post.

Our understanding is that your key concern is that we are judging Conjecture based on their current output, whereas since they are pursuing a hits-based strategy we should expect in the median case for them to not have impressive output. In general, we are excited by hits-based approaches, but we echo Rohin's point: how are we meant to evaluate organizations if not by their output? It seems healthy to give promising researchers sufficient runway to explore, but $10 million dollars and a team of twenty seems on the higher end of what we would want to see supported purely on the basis of speculation. What would you suggest as the threshold where we should start to expect to see results from organizations?

We are unsure where else you disagree with our evaluation of their output. If we understand correctly, you agree that their existing output has not been that impressive, but think that it is positive they were willing to share preliminary findings and that we have too high a bar for evaluating such output. We've generally not found their preliminary findings to significantly update our views, whereas we would for example be excited by rigorous negative results that save future researchers from going down dead-ends. However, if you've found engaging with their output to be useful to your research then we'd certainly take that as a positive update.

Your second key concern is that we provide limited evidence for our claims regarding the VCs investing in Conjecture. Unfortunately for confidentiality reasons we are limited in what information we can disclose: it's reasonable if you wish to consequently discount this view. As Rohin said, it is normal for VCs to be profit-seeking. We do not mean to imply these VCs are unusually bad for VCs, just that their primary focus will be the profitability of Conjecture, not safety impact. For example, Nat Friedman has expressed skepticism of safety (e.g. this Tweet) and is a strong open-source advocate, which seems at odds with Conjecture's info-hazard policy.

We have heard from multiple sources that Conjecture has pitched VCs on a significantly more product-focused vision than they are pitching EAs. These sources have either spoken directly to VCs, or have spoken to Conjecture leadership who were part of negotiation with VCs. Given this, we are fairly confident on the point that Conjecture is representing themselves differently to separate groups.

We believe your third key concern is our recommendations are over-confident. We agree there is some uncertainty, but think it is important to make actionable recommendations, and based on the information we have our sincerely held belief is that most individuals should not work at Conjecture. We would certainly encourage individuals to consider alternative perspectives (including expressed in this comment) and to ultimately make up their own mind rather than deferring, especially to an anonymous group of individuals!

Separately, I think we might consider the opportunity cost of working at Conjecture higher than you. In particular, we'd generally evaluate skill-building routes fairly highly: for example, being a research assistant or PhD student in academia, or working in an ML engineering position in an applied team at a major tech company. These are generally close to capabilities-neutral, and can make individuals vastly more productive. Given the limited information on CogEm it's hard to assess whether it will or won't work, but we think there's ample evidence that there are better places to develop skills than Conjecture.

We wholeheartedly agree that it is important to maintain high epistemic standards during the critique. We have tried hard to differentiate between well-established facts, our observations from sources, and our opinion formed from those. For example, the About Conjecture section focuses on facts; the Criticisms and Suggestions section includes our observations and opinions; and Our Views on Conjecture are more strongly focused on our opinions. We'd welcome feedback on any areas where you feel we over-claimed.

(cross-posted from EAG)

Meta: Thanks for taking the time to respond. I think your questions are in good faith and address my concerns, I do not understand why the comment is downvoted so much by other people.

1. Obviously output is a relevant factor to judge an organization among others. However, especially in hits-based approaches, the ultimate thing we want to judge is the process that generates the outputs to make an estimate about the chance of finding a hit. For example, a cynic might say "what has ARC-theory achieve so far? They wrote some nice framings of the problem, e.g. with ELK and heuristic arguments, but what have they ACtUaLLy achieved?" To which my answer would be, I believe in them because I think the process that they are following makes sense and there is a chance that they would find a really big-if-true result in the future. In the limit, process and results converge but especially early on they might diverge. And I personally think that Conjecture did respond reasonably to their early results by iterating faster and looking for hits.
2. I actually think their output is better than you make it look. The entire simulators framing made a huge difference for lots of people and writing up things that are already "known" among a handful of LLM experts is still an important contribution, though I would argue most LLM experts did not think about the details as much as Janus did. I also think that their preliminary research outputs are pretty valuable. The stuff on SVDs and sparse coding actually influenced a number of independent researchers I know (so much that they changed their research direction to that) and I thus think it was a valuable contribution. I'd still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.
3. (copied from response to Rohin): Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments. In this case, the VCs I'm aware of (not all of which are mentioned in the post and I'm not sure I can share) actually seem fairly aligned for VC standards to me. Furthermore, the way I read the critique is something like "Connor didn't tell the VCs about the alignment plans or neglects them in conversation". However, my impression from conversation with (ex-) staff was that Connor was very direct about their motives to reduce x-risks. I think it's clear that products are a part of their way to address alignment but to the best of my knowledge, every VC who invested was very aware about what their getting into. At this point, it's really hard for me to judge because I think that a) on priors, VCs are profit-seeking, and b) different sources said different things some of which are mutually exclusive. I don't have enough insight to confidently say who is right here. I'm mainly saying, the confidence of you surprised me given my previous discussions with staff.
4. Regarding confidence: For example, I think saying "We think there are better places to work at than Conjecture" would feel much more appropriate than "we advice against..." Maybe that's just me. I just felt like many statements are presented with a lot of confidence given the amount of insight you seem to have and I would have wanted them to be a bit more hedged and less confident.
5. Sure, for many people other opportunities might be a better fit. But I'm not sure I would e.g. support the statement that a general ML engineer would learn more in general industry than with Conjecture. I also don't know a lot about CoEm but that would lead me to make weaker statements than suggesting against it.

Thanks for engaging with my arguments. I personally think many of your criticisms hit relevant points and I think a more hedged and less confident version of your post would have actually had more impact on me if I were still looking for a job. As it is currently written, it loses some persuasion on me because I feel like you're making too broad unqualified statements which intuitively made me a bit skeptical of your true intentions. Most of me thinks that you're trying to point out important criticism but there is a nagging feeling that it is a hit piece. Intuitively, I'm very averse against everything that looks like a click-bait hit piece by a journalist with a clear agenda. I'm not saying you should only consider me as your audience, I just want to describe the impression I got from the piece.

(cross-posted from EAF)

appreciate you sharing your impression of the post. It’s definitely valuable for us to understand how the post was received, and we’ll be reflecting on it for future write-ups.

1) We agree it's worth taking into account aspects of an organization other than their output. Part of our skepticism towards Conjecture – and we should have made this more explicit in our original post (and will be updating it) – is the limited research track record of their staff, including their leadership. By contrast, even if we accept for the sake of argument that ARC has produced limited output, Paul Christiano has a clear track record of producing useful conceptual insights (e.g. Iterated Distillation and Amplification) as well as practical advances (e.g. Deep RL From Human Preferences) prior to starting work at ARC. We're not aware of any equally significant advances from Connor or other key staff members at Conjecture; we'd be interested to hear if you have examples of their pre-Conjecture output you find impressive.

We're not particularly impressed by Conjecture's process, although it's possible we'd change our mind if we knew more about it. Maintaining high velocity in research is certainly a useful component, but hardly sufficient. The Builder/Breaker method proposed by ARC feels closer to a complete methodology. But this doesn't feel like the crux for us: if Conjecture copied ARC's process entirely, we'd still be much more excited about ARC (per-capita). Research productivity is a product of a large number of factors, and explicit process is an important but far from decisive one.

In terms of the explicit comparison with ARC, we would like to note that ARC Theory's team size is an order of magnitude smaller than Conjecture. Based on ARC's recent hiring post, our understanding is the theory team consists of just three individuals: Paul Christiano, Mark Xu and Jacob Hilton. If ARC had a team ten times larger and had spent close to $10 mn, then we would indeed be disappointed if there were not more concrete wins.

2) Thanks for the concrete examples, this really helps tease apart our disagreement.

We are overall glad that the Simulators post was written. Our view is that it could have been much stronger had it been clearer which claims were empirically supported versus hypotheses. Continuing the comparison with ARC, we found ELK to be substantially clearer and a deeper insight. Admittedly ELK is one of the outputs people in the TAIS community are most excited by so this is a high bar.

The stuff on SVDs and sparse coding [...] was a valuable contribution. I'd still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.

This sounds similar to our internal evaluation. We're a bit confused by why "3 people in two weeks" is the relevant reference class. We'd argue the costs of Conjecture's "misses" need to be accounted for, not just their "hits". Redwood's team size and budget are comparable to that of Conjecture, so if you think that causal scrubbing is more impressive than Conjecture's other outputs, then it sounds like you agree with us that Redwood was more impressive than Conjecture (unless you think the Simulator's post is head and shoulders above Redwood's other output)?

Thanks for sharing the data point this influenced independent researchers. That's useful to know, and updates us positively. Are you excited by those independent researchers' new directions? Is there any output from those researchers you'd suggest we review?

3) We remain confident in our sources regarding Conjecture's discussion with VCs, although it's certainly conceivable that Conjecture was more open with some VCs than others. To clarify, we are not claiming that Connor or others at Conjecture did not mention anything about their alignment plans or interest in x-risk to VCs (indeed, this would be a barely tenable position for them given their public discussion of these plans), simply that their pitch gave the impression that Conjecture was primarily focused on developing products. It's reasonable for you to be skeptical of this if your sources at Conjecture disagree; we would be interested to know how close to the negotiations those staff were, although understand this may not be something you can share.

4) We think your point is reasonable. We plan to reflect this recommendation and will reply here when we have an update.

5) This certainly depends on what "general industry" refers to: a research engineer at Conjecture might well be better for ML skill-building than, say, being a software engineer at Walmart. But we would expect ML teams at top tech companies, or working with relevant professors, to be significantly better for skill-building. Generally we expect quality of mentorship to be one of the most important components of individuals developing as researchers and engineers. The Conjecture team is stretched thin as a result of rapid scaling, and had few experienced researchers or engineers on staff in the first place. By contrast, ML teams at top tech companies will typically have a much higher fraction of senior researchers and engineers, and professors at leading universities comprise some of the best researchers in the field. We'd be curious to hear your case for Conjecture as skill building; without that it's hard to identify where our main disagreement lies.

I agree with Conjecture's reply that this reads more like a hitpiece than an even-handed evaluation.

I don't think your recommendations follow from your observations, and such strong claims surely don't follow from the actual evidence you provide. I feel like your criticisms can be summarized as the following:

Conjecture was publishing unfinished research directions for a while.
Conjecture does not publicly share details of their current CoEm research direction, and that research direction seems hard.
Conjecture told the government they were AI safety experts.
Some people (who?) say Conjecture's governance outreach may be net-negative and upsetting to politicians.
Conjecture's CEO Connor used to work on capabilities.
One time during college Connor said that he replicated GPT-2, then found out he had a bug in his code.
Connor has said at some times that open source models were good for alignment, then changed his mind.
Conjecture's infohazard policy can be overturned by Connor or their owners.
They're trying to scale when it is common wisdom for startups to try to stay small.
It is unclear how they will balance profit and altruistic motives.
Sometimes you talk with people (who?) and they say they've had bad interactions with conjecture staff or leadership when trying to tell them what they're doing wrong.
Conjecture seems like they don't talk with ML people.

I'm actually curious about why they're doing 9, and further discussion on 10 and 8. But I don't think any of the other points matter, at least to the depth you've covered them here, and I don't know why you're spending so much time on stuff that doesn't matter or you can't support. This could have been so much better if you had taken the research time spent on everything that wasn't 8, 9, or 10, and used to to do analyses of 8, 9, and 10, and then actually had a conversation with Conjecture about your disagreements with them.

I especially don't think your arguments support your suggestions that

Don't work at Conjecture.
Conjecture should be more cautious when talking to media, because Connor seems unilateralist.
Conjecture should not receive more funding until they get similar levels of organizational competence than OpenAI or Anthropic.
Rethink whether or not you want to support conjecture's work non-monetarily. For example, maybe think about not inviting them to table at EAG career fairs, inviting Conjecture employees to events or workspaces, and taking money from them if doing field-building.

(1) seems like a pretty strong claim, which is left unsupported. I know of many people who would be excited to work at conjecture, and I don't think your points support the claim they would be doing net-negative research given they do alignment at Conjecture.

For (2), I don't know why you're saying Connor is unilateralist. Are you saying this because he used to work on capabilities?

(3) is just absurd! OpenAI will perhaps be the most destructive organization to-date. I do not think your above arguments make the case they are less organizationally responsible than OpenAI. Even having an info-hazard document puts them leagues above both OpenAI and Anthropic in my book. And add onto that their primary way of getting funded isn't building extremely large models... In what way do Anthropic or OpenAI have better corporate governance structures than Conjecture?

(4) is just... what? Ok, I've thought about it, and come to the conclusion this makes no sense given your previous arguments. Maybe there's a case to be made here. If they are less organizationally competent than OpenAI, then yeah, you probably don't want to support their work. This seems pretty unlikely to me though! And you definitely don't provide anything close to the level of analysis needed to elevate such hypotheses.

Edit: I will add to my note on (2): In most news articles in which I see Connor or Conjecture mentioned, I feel glad he talked to the relevant reporter, and think he/Conjecture made that article better. It is quite an achievement in my book to have sane conversations with reporters about this type of stuff! So mostly I think they should continue doing what they're doing.

I'm not myself an expert on PR (I'm skeptical if anyone is), so maybe my impressions of the articles are naive and backwards in some way. This is something which if you think is important, it would likely be good to mention somewhere why you think their media outreach is net-negative, ideally pointing to particular things you think they did wrong rather than vague & menacing criticisms of unilateralism.

From my perspective 9 (scaling fast) makes perfect sense since Conjecture is aiming to stay "slightly behind state of the art", and that requires engineering power.

I'm pretty skeptical they can achieve that right now using CoEm given the limited progress I expect them to have made on CoEm. And in my opinion of greater importance than "slightly behind state of the art" is likely security culture, and commonly in the startup world it is found that too-fast scaling leads to degradation in the founding culture. So a fear would be that fast scaling would lead to worse info-sec.

However, I don't know to what extent this is an issue. I can certainly imagine a world where because of EA and LessWrong, many very mission-aligned hires are lining up in front of their door. I can also imagine a lot of other things, which is why I'm confused.

(cross-posted from the EA Forum)

Regarding your specific concerns about our recommendations:

1) We address this point in our response to Marius (5th paragraph)

2) As we note in the relevant section: “We think there is a reasonable risk that Connor and Conjecture’s outreach to policymakers and media is alarmist and may decrease the credibility of x-risk.” This kind of relationship-building is unilateralist when it can decrease goodwill amongst policymakers.

3) To be clear, we do not expect Conjecture to have the same level of “organizational responsibility” or “organizational competence” (we aren’t sure what you mean by those phrases and don’t use them ourselves) as OpenAI or Anthropic. Our recommendation was for Conjecture to have a robust corporate governance structure. For example, they could change their corporate charter to implement a "springing governance" structure such that voting equity (but not political equity) shift to an independent board once they cross a certain valuation threshold. As we note in another reply, Conjecture’s infohazard policy has no legal force, and therefore is not as strong as either OpenAI or Anthropic’s corporate governance models. As we’ve noted already, we have concerns about both OpenAI and Anthropic despite having these models in place: Conjecture doesn’t even have those, which makes us more concerned.

I responded to a very similar comment of yours on the EA Forum.

To respond to the new content, I don't know if changing the board of conjecture once a certain valuation threshold is crossed would make the organization more robust (now that I think of it, I don't even really know what you mean by strong or robust here. Depending on what you mean, I can see myself disagreeing about whether that even tracks positive qualities about a corporation). You should justify claims like those, and at least include them in the original post. Is it sketchy they don't have this?

We have heard that Conjecture misrepresent themselves in engagement with the government, presenting themselves as experts with stature in the AIS community, when in reality they are not.

What does it mean for Conjecture to be "experts with stature in the AIS community"? Can you clarify what metrics comprise expertise in AIS -- are you dissatisfied with their demonstrated grasp of alignment work, or perhaps their research output, or maybe something a little more qualitative?

Basically, this excerpt reads like a crisp claim of common knowledge ("in reality") but the content seems more like a personal judgment call by the author(s).

Hi TurnTrout, thanks for asking this question. We're happy to clarify:

'experts': We do not consider Conjecture at the same level of expertise as [edit] alignment leaders and researchers at other organizations such as Redwood, ARC, researchers at academic labs like CHAI, and the alignment teams at Anthropic, OpenAI and DeepMind. This is primarily because we believe their research quality is low.
'with stature in the AIS community': Based on our impression (from conversations with many senior TAIS researchers at a range of organizations, including a handful who reviewed this post and didn't disagree with this point) of the TAIS community, Conjecture is not considered a top alignment research organization within the community.

We do not consider Conjecture at the same level of expertise as other organizations such as Redwood, ARC, researchers at academic labs like CHAI, and the alignment teams at Anthropic, OpenAI and DeepMind. This is primarily because we believe their research quality is low.

This isn't quite the right thing to look at IMO. In the context of talking to governments, an "AI safety expert" should have thought deeply about the problem, have intelligent things to say about it, know the range of opinions in the AI safety community, have a good understanding of AI more generally, etc. Based mostly on his talks and podcast appearances, I'd say Connor does decently well along these axes. (If I had to make things more concrete, there are a few people I'd personally call more "expert-y", but closer to 10 than 100. The AIS community just isn't that big and the field doesn't have that much existing content, so it seems right that the bar for being an "AIS expert" is lower than for a string theory expert.)

I also think it's weird to split this so strongly along organizational lines. As an extreme case, researchers at CHAI range on a spectrum from "fully focused on existential safety" to "not really thinking about safety at all". Clearly the latter group aren't better AI safety experts than most people at Conjecture. (And FWIW, I belong to the former group and I still don't think you should defer to me over someone from Conjecture just because I'm at CHAI.)

One thing that would be bad is presenting views that are very controversial within the AIS community as commonly agreed-upon truths. I have no special insight into whether Conjecture does that when talking to governments, but it doesn't sound like that's your critique at least?

Hi Erik, thanks for your points, we meant to say "at the same level of expertise as alignment leaders and researchers other organizations such as...". This was a typo on our part.

As a person not affiliated with Conjecture, I want to record some of my scattered reactions. A lot of upvotes on such a post without substantial comments seems... unfair?

On one hand, it is always interesting to read something like that. Many of us have pondered Conjecture, asking ourselves whether what they are doing and the way they are doing it make sense. E.g. their infohazard policy has been remarkable, super-interesting, and controversial. My own reflections on that have been rather involved and complicated.

On the other hand, when I am reading the included Conjecture response, what they are saying there seems to me to make total sense (if I were in an artificial binary position of having to fully side with the post or with them, I would have sided with Conjecture on this). Although one has to note that their https://www.conjecture.dev/a-standing-offer-for-public-discussions-on-ai/ is returning a 404 at the moment. Is that offer still standing?

Specifically, on their research quality, the Simulator theory has certainly been controversial, but many people find it extremely valuable, and I personally tend to recommend it to people as the most important conceptual breakthrough of 2022 (in my opinion) (together with the notes I took on the subject) . It is particularly valuable as a deconfusion tool on what LLMs are and aren't, and I found that framing the LLM-related problems in terms of properties of simulation runs and in terms of sculpting and controlling the simulations is very productive. So I am super-greatful for that part of their research output.

On the other hand, I did notice that the authors of that work and Conjecture had parted ways (and when I noticed that I told myself, "perhaps I don't need to follow that org all that closely anymore, although it is still a remarkable org").

I think what makes writing comments on posts like this one difficult is that the post is really structured and phrased in such a way as to make this a situation of personal conflict, internal to the relatively narrow AI safety community.

I have not downvoted the post, but I don't like this aspect, I am not sure this is the right way to approach things...

Apologies for the 404 on the page, it's an annoying cache bug. Try to hard refresh your browser page (CMD + Shift + R) and it should work.

I am afraid, this is a more persistent problem (or, perhaps, it comes and goes, but I am even trying browsers I don't normally use (in addition to hard reload on those I do normally use), and it still returns 404).

I'll be testing this further occasionally... (You might want to check whether anyone else who does not have privileged access to your systems is seeing it at the moment; some systems like, for example, GitHub often show 404 to people who don't have access to an actually existing file instead of showing 403 as one would normally expect.)

Thanks for commenting and sharing your reactions Mishka.

Some quick notes on what you've shared:

Although one has to note that their https://www.conjecture.dev/a-standing-offer-for-public-discussions-on-ai/ is returning a 404 at the moment. Is that offer still standing?

In their response to us they told us this offer was still standing.

A lot of upvotes on such a post without substantial comments seems... unfair?

As of the time of your comment, we believe there were about 8 votes and 30 karma and the post had been up a few hours. We are not sure what voting frequency is on LW (e.g. we're not sure if this is higher or lower than average?) but if it's higher, some hypotheses (we'd love to hear inputs from folks who have upvoted without a comment):

Some people are supportive of criticism in general, and may have upvoted to support more critical discussion (even though they may disagree with object level comments)
Some people who upvoted may already agree with the views of this post (e.g. some of the upvoters could be our reviewers)
Some people may have upvoted so this post gets more attention / discussion so they could see what others think of it
Some folks may have upvoted for now and might come back to the post to leave more substantive comments when they have time

I think what makes writing comments on posts like this one difficult is that the post is really structured and phrased in such a way as to make this a situation of personal conflict, internal to the relatively narrow AI safety community.

I have not downvoted the post, but I don't like this aspect, I am not sure this is the right way to approach things...

If understanding correctly, we think what you're saying is that because there are many claims in this post, it seems suboptimal that people can't indicate that via post-level voting.

We think this is a great point. We'd love to see an option for people to agree/disagree with specific claims on posts to provide a more nuanced understanding of where consensus lies. We think it's very plausible that some of our points will end up being much more controversial than others. (if you wanted to add separate comments for specific claims that people could vote on, we'd love to see that and would be happy to add a note to the top-level post encouraging folks to do so)

Our hope is that folks can comment with areas of disagreement to start a discussion on those points.

we think Conjecture [...] have too low a bar for sharing, reducing the signal-to-noise ratio and diluting standards in the field. When they do provide evidence, it appears to be cherry picked.

This is an ironic criticism, given that this post has very low signal-to-noise quality and when it does provide evidence, it's obviously cherry-picked. Relatedly, I am curious whether you used AI to write many parts of this post because the style is reminiscent and it reeks of a surplus of cognitive labor put to inefficient use, and seems to include some confabulations. A large percentage of the words in this post are spent on redundant, overly-detailed summaries.

I actually did not mind reading this style, because I found intriguing, but if typical lesswrong posts were like this it would be annoying and harm the signal-to-noise ratio.

Confabulation example:

(The simulators) post ends with speculative beliefs that they stated fairly confidently that took the framing to an extreme (e.g if the AI system adopts the “superintelligent AI persona” it’ll just be superintelligent).

This is... not how the post ends, nor is it a claim made anywhere in the post, and it's hard to see how it could even be a misinterpretation of anything at the end of the post.

Your criticisms of Conjecture's research are vague statements that it's "low quality" and "not empirically testable" but you do not explain why. These potentially object-level criticisms are undermined from an outside view by your exhaustive, one-sided nitpicking of Connor's character, which gives the impression that the author is saying every possible negative thing they can against Conjecture without regard for salience or even truth.

Having known some of Conjecture's founders and their previous work in the context of "early-stage EleutherAI", I share some^[1] of the main frustrations outlined in this post. At the organizational level, even setting aside the departure of key researchers, I do not think that Conjecture's existing public-facing research artifacts have given much basis for me to recommend the organization to others (aside from existing personal ties). To date, only^[2] a few posts like their one on the polytope lens and their one on circumventing interpretability were at the level of quality & novelty I expected from the team. Maybe that is a function of the restrictive information policies, maybe a function of startup issues, maybe just the difficulty of research. In any case, I think that folks ought to require more rigor and critical engagement from their future research outputs^[3].

^{^}
I didn't find the critiques of Connor's "character and trustworthiness" convincing, but I already consider him a colleague & a friend, so external judgments like these don't move the needle for me.
^{^}
The main other post I have in mind was their one on simulators. AFAICT the core of "simulator theory" predated (mid-2021, at least) Conjecture, and yet even with a year of additional incubation, the framework was not brought to a sufficient level of technical quality.
^{^}
For example, the "cognitive emulation" work may benefit from review by outside experts, since the nominal goal seems to be to do cognitive science entirely inside of Conjecture.

I think the critique of Redwood Research made a few valid points. My own critique of Redwood would go something like:

they hired too few support staff to keep their primary researchers well supported and happy, and thus had unnecessarily high turnover
they hired too high a proportion of junior researchers, in an unsettled phase of life without high likelihood of sticking with a current job, again contributing to too much turnover and to a lack of researchers who knew what to expect from a workplace and how to maintain their work-life balance.

Not much of a critique, honestly. A reasonable mistake that a lot of start-ups led by young inexperienced people would make, and certainly something fixable. Also, they have longer AGI timelines than me, and thus are not acting with what I see as sufficient urgency. But I don't think that it's necessarily fair for me to critique orgs for having their own well-considered opinions on this different from my own. I'm not even sure if them having my timelines would improve their output any.

This critique on the other hand seems entirely invalid and counterproductive. You criticize Conjecture's CEO for being... a charismatic leader good at selling himself and leading people? Because he's not... a senior academic with a track record of published papers? Nonsense. Expecting the CEO to be the primary technical expert seems highly misguided to me. The CEO needs to know enough about the technical aspects to be able to hire good technical people, and then needs to coordinate and inspire those people and promote the company. I think Connor is an excellent pick for this, and your criticisms of him are entirely beside the point, and also rather rude.

Conjecture, and Connor, seem to actually be trying to do something which strikes at the heart of the problem. Something which might actually help save us in three years from now when the leading AI labs have in their possession powerful AGI after a period of recursive self-improvement by almost-but-not-quite-AGI. I expect this AGI will be too untrustworthy to make more than very limited use of. So then, looking around for ways to make use of their newfound dangerous power, what will they see? Some still immature interpretability research. Sure. And then? Maybe they'll see the work Conjecture has started and realize that breaking down the big black magic box into smaller more trustworthy pieces is one of the best paths forward. Then they can go knocking on Conjecture's door, collect the research so far, and finish it themselves with their abundant resources.

My criticism of their plan is primarily: you need even more staff and more funding to have a better chance of this working. Which is basically the opposite of the conclusion you come to.

As for the untrustworthiness of their centralized infohazard policy... Yeah, this would be bad if the incentives were for the central individual to betray the world for their own benefit. That's super not the case here. The incentive is very much the opposite. For much the same reason that I feel pretty trusting of the heads of Deepmind, OpenAI, and Anthropic. Their selfish incentives to not destroy themselves and everyone they love are well aligned with humanity's desire to not be destroyed. Power-seeking in this case is a good thing! Power over the world through AGI, to these clever people, clearly means learning to control that untrustworthy AGI... thus means learning how to save the world. My threat model says that the main danger comes from not the heads of the labs, but the un-safety-convinced employees who might leave to start their own projects, or outside people replicating the results the big labs have achieved but with far fewer safety precautions.

I think reasonable safety precautions, like not allowing unlimited unsupervised recursive self-improvement, not allowing source code or model weights to leave the lab, sandbox testing, etc can actually be quite effective in the short term in protecting humanity from rogue AGI. I don't think surprise-FOOM-in-a-single-training-run-resulting-in-a-sandbox-escaping-superintelligence is a likely threat model. I think a far more likely threat model is foolish amatuers or bad actors tinkering with dangerous open source code and stumbling into an algorithmic breakthrough they didn't expect and don't understand and foolishly releasing it onto the web.

I think putting hope in compute governance is a very limited hope. We can't govern compute for long, if at all, because there will be huge reductions in compute needed once more efficient training algorithms are found.

You criticize Conjecture's CEO for being... a charismatic leader good at selling himself and leading people? Because he's not... a senior academic with a track record of published papers? Nonsense. Expecting the CEO to be the primary technical expert seems highly misguided to me.

Yeah this confiused me a little too. My current job (in soil science) has a non academic boss, and a team of us boffins, and he doesn't need to be an academic, because its not his job, he just has to know where the money comes from, and how to stop the stakeholders from running away screaming when us soil nerds turn up to a meeting and start emitting maths and graphs out of our heads. Likewise the previous place I was at, I was the only non PhD haver on technical staff (being a 'mere' postgrad) and again our boss wasn't academic at all. But he WAS a leader of men and herder of cats, and cat herding is probably a more important skill in that role than actually knowing what those cats are taking about.

And it all works fine. I dont need an academic boss, even if I think an academic boss would be nice. I need a boss who knows how to keep the payroll from derailing, and I suspect the vast majority of science workers feel the same way.

Note that we don't criticize Connor specifically, but rather the lack of a senior technical expert on the team in general (including Connor). Our primary criticisms of Connor don't have to do with his leadership skills (which we don't comment on this at any point in the post).

I'm confused about the disagree votes. Can someone who disagree-voted say which of the following claims they disagreed with:
1. Omega criticized the lack of a senior technical expert on Conjecture's team.

2. Omega's primary criticisms of Connor doesn't have to do with his leadership skills.

3. Omega did not comment on Connorship's leadership skills at any point in the post.

Beren Millidge is not a senior technical expert?

Nathan Helm-Burger' used a different notion of "leadership" (like a startup CEO) to criticise the post and Omega responded to it by saying something about "management" leadership, which doesn't respond to Nathan's comment really.

Ah I see. Hmm, if I say "Yesterday I said X," people-who-talk-like-me will interpret contextless disagreement with that claim as "Yesterday I didn't say X" and not as "X is not true." Perhaps this is a different communication norm from LW standards, in which case I'll try to interpret future agree/disagree comments in that light.

I agree from quickly looking at Beren's LinkedIn page that he seems like a technical expert (I don't know enough about ML to have a particularly relevant inside-view about ML technical expertise).

I think the (perhaps annoying) fact is that LW readers aren't a monolith and different people interpret disagreement votes differently.

BTW, from the comment to the EA forum cross-post, I discovered that Beren reportedly left Conjecture very recently. That's indeed a negative update on Conjecture for me (maybe not as much as he specifically left but rather that this indicates a high turnover rate), but regardless, this doesn't apply to the inference made by Omega in this report, along the lines that "Conjecture's research is iffy because they don't have senior technical experts and don't know what are they doing", because this wasn't true until very recently and probably still isn't true (overwhelmingly likely there are other technical experts who are still working at Conjecture), so this doesn't invalidate or stain the research that has been done and published previously.

Interestingly, the reception on the EA Forum is more positive (154 net karma at 136 votes), compared to here (24 net karma at 105 votes).

(cross-posted from EAF, thanks Richard for suggesting. There's more back-and-forth later.)

I'm not very compelled by this response.

It seems to me you have two points on the content of this critique. The first point:

I think it's bad to criticize labs that do hits-based research approaches for their early output (I also think this applies to your critique of Redwood) because the entire point is that you don't find a lot until you hit.

The second point:

Your statements about the VCs seem unjustified to me. How do you know they are not aligned? [...] I haven't talked to the VCs either, but I've at least asked people who work(ed) at Conjecture.

Hmm, it seems extremely reasonable to me to take as a baseline prior that the VCs are profit-motivated, and the authors explicitly say

We have heard credible complaints of this from their interactions with funders. One experienced technical AI safety researcher recalled Connor saying that he will tell investors that they are very interested in making products, whereas the predominant focus of the company is on AI safety.

The fact that people who work(ed) at Conjecture say otherwise means that (probably) someone is wrong, but I don't see a strong reason to believe that it's the OP who is wrong.

At the meta level you say:

I do not understand where the confidence with which you write the post (or at least how I read it) comes from.

And in your next comment:

I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings and do not throw our epistemics out of the window in the process.

(crossposted from the EA Forum)

We appreciate your detailed reply outlining your concerns with the post.

(cross-posted from EAF)

appreciate you sharing your impression of the post. It’s definitely valuable for us to understand how the post was received, and we’ll be reflecting on it for future write-ups.

2) Thanks for the concrete examples, this really helps tease apart our disagreement.

The stuff on SVDs and sparse coding [...] was a valuable contribution. I'd still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.

4) We think your point is reasonable. We plan to reflect this recommendation and will reply here when we have an update.

I agree with Conjecture's reply that this reads more like a hitpiece than an even-handed evaluation.

Conjecture was publishing unfinished research directions for a while.
Conjecture does not publicly share details of their current CoEm research direction, and that research direction seems hard.
Conjecture told the government they were AI safety experts.
Some people (who?) say Conjecture's governance outreach may be net-negative and upsetting to politicians.
Conjecture's CEO Connor used to work on capabilities.
One time during college Connor said that he replicated GPT-2, then found out he had a bug in his code.
Connor has said at some times that open source models were good for alignment, then changed his mind.
Conjecture's infohazard policy can be overturned by Connor or their owners.
They're trying to scale when it is common wisdom for startups to try to stay small.
It is unclear how they will balance profit and altruistic motives.
Sometimes you talk with people (who?) and they say they've had bad interactions with conjecture staff or leadership when trying to tell them what they're doing wrong.
Conjecture seems like they don't talk with ML people.

I especially don't think your arguments support your suggestions that

Don't work at Conjecture.
Conjecture should be more cautious when talking to media, because Connor seems unilateralist.
Conjecture should not receive more funding until they get similar levels of organizational competence than OpenAI or Anthropic.
Rethink whether or not you want to support conjecture's work non-monetarily. For example, maybe think about not inviting them to table at EAG career fairs, inviting Conjecture employees to events or workspaces, and taking money from them if doing field-building.

For (2), I don't know why you're saying Connor is unilateralist. Are you saying this because he used to work on capabilities?

From my perspective 9 (scaling fast) makes perfect sense since Conjecture is aiming to stay "slightly behind state of the art", and that requires engineering power.

(cross-posted from the EA Forum)

Regarding your specific concerns about our recommendations:

1) We address this point in our response to Marius (5th paragraph)

I responded to a very similar comment of yours on the EA Forum.

We have heard that Conjecture misrepresent themselves in engagement with the government, presenting themselves as experts with stature in the AIS community, when in reality they are not.

Basically, this excerpt reads like a crisp claim of common knowledge ("in reality") but the content seems more like a personal judgment call by the author(s).

Hi TurnTrout, thanks for asking this question. We're happy to clarify:

'experts': We do not consider Conjecture at the same level of expertise as [edit] alignment leaders and researchers at other organizations such as Redwood, ARC, researchers at academic labs like CHAI, and the alignment teams at Anthropic, OpenAI and DeepMind. This is primarily because we believe their research quality is low.
'with stature in the AIS community': Based on our impression (from conversations with many senior TAIS researchers at a range of organizations, including a handful who reviewed this post and didn't disagree with this point) of the TAIS community, Conjecture is not considered a top alignment research organization within the community.

We do not consider Conjecture at the same level of expertise as other organizations such as Redwood, ARC, researchers at academic labs like CHAI, and the alignment teams at Anthropic, OpenAI and DeepMind. This is primarily because we believe their research quality is low.

Hi Erik, thanks for your points, we meant to say "at the same level of expertise as alignment leaders and researchers other organizations such as...". This was a typo on our part.

As a person not affiliated with Conjecture, I want to record some of my scattered reactions. A lot of upvotes on such a post without substantial comments seems... unfair?

I have not downvoted the post, but I don't like this aspect, I am not sure this is the right way to approach things...

Apologies for the 404 on the page, it's an annoying cache bug. Try to hard refresh your browser page (CMD + Shift + R) and it should work.

Thanks for commenting and sharing your reactions Mishka.

Some quick notes on what you've shared:

Although one has to note that their https://www.conjecture.dev/a-standing-offer-for-public-discussions-on-ai/ is returning a 404 at the moment. Is that offer still standing?

In their response to us they told us this offer was still standing.

A lot of upvotes on such a post without substantial comments seems... unfair?

Some people are supportive of criticism in general, and may have upvoted to support more critical discussion (even though they may disagree with object level comments)
Some people who upvoted may already agree with the views of this post (e.g. some of the upvoters could be our reviewers)
Some people may have upvoted so this post gets more attention / discussion so they could see what others think of it
Some folks may have upvoted for now and might come back to the post to leave more substantive comments when they have time

I think what makes writing comments on posts like this one difficult is that the post is really structured and phrased in such a way as to make this a situation of personal conflict, internal to the relatively narrow AI safety community.

I have not downvoted the post, but I don't like this aspect, I am not sure this is the right way to approach things...

If understanding correctly, we think what you're saying is that because there are many claims in this post, it seems suboptimal that people can't indicate that via post-level voting.

Our hope is that folks can comment with areas of disagreement to start a discussion on those points.

we think Conjecture [...] have too low a bar for sharing, reducing the signal-to-noise ratio and diluting standards in the field. When they do provide evidence, it appears to be cherry picked.

I actually did not mind reading this style, because I found intriguing, but if typical lesswrong posts were like this it would be annoying and harm the signal-to-noise ratio.

Confabulation example:

(The simulators) post ends with speculative beliefs that they stated fairly confidently that took the framing to an extreme (e.g if the AI system adopts the “superintelligent AI persona” it’ll just be superintelligent).

This is... not how the post ends, nor is it a claim made anywhere in the post, and it's hard to see how it could even be a misinterpretation of anything at the end of the post.

^{^}
I didn't find the critiques of Connor's "character and trustworthiness" convincing, but I already consider him a colleague & a friend, so external judgments like these don't move the needle for me.
^{^}
The main other post I have in mind was their one on simulators. AFAICT the core of "simulator theory" predated (mid-2021, at least) Conjecture, and yet even with a year of additional incubation, the framework was not brought to a sufficient level of technical quality.
^{^}
For example, the "cognitive emulation" work may benefit from review by outside experts, since the nominal goal seems to be to do cognitive science entirely inside of Conjecture.

I think the critique of Redwood Research made a few valid points. My own critique of Redwood would go something like:

they hired too few support staff to keep their primary researchers well supported and happy, and thus had unnecessarily high turnover
they hired too high a proportion of junior researchers, in an unsettled phase of life without high likelihood of sticking with a current job, again contributing to too much turnover and to a lack of researchers who knew what to expect from a workplace and how to maintain their work-life balance.

My criticism of their plan is primarily: you need even more staff and more funding to have a better chance of this working. Which is basically the opposite of the conclusion you come to.

You criticize Conjecture's CEO for being... a charismatic leader good at selling himself and leading people? Because he's not... a senior academic with a track record of published papers? Nonsense. Expecting the CEO to be the primary technical expert seems highly misguided to me.

2. Omega's primary criticisms of Connor doesn't have to do with his leadership skills.

3. Omega did not comment on Connorship's leadership skills at any point in the post.

Beren Millidge is not a senior technical expert?

I agree from quickly looking at Beren's LinkedIn page that he seems like a technical expert (I don't know enough about ML to have a particularly relevant inside-view about ML technical expertise).

I think the (perhaps annoying) fact is that LW readers aren't a monolith and different people interpret disagreement votes differently.

Interestingly, the reception on the EA Forum is more positive (154 net karma at 136 votes), compared to here (24 net karma at 105 votes).

12

12

Key Takeaways

About Conjecture

Funding

Outputs

Products

Alignment Research

Governance outreach

Incubator Program

Team

Conjecture in the TAIS ecosystem

Criticisms and Suggestions

Low quality research

General thoughts on Conjecture’s research

Initial research agenda (March 2022 - Nov 2022)

New research agenda (Nov 22 - Present)

CEO’s character and trustworthiness

Conjecture and their CEO misrepresent themselves to various parties

Contributions to race dynamics

Overstatement of accomplishments and lack of attention to precision

Inconsistency over time regarding releasing LLMs

Scaling too quickly

Unclear plan for balancing profit and safety motives

Limited meaningful engagement with external actors

Lack of productive communication between TAIS researchers and Conjecture staff

Lack of engagement with the broader ML community

Our views on Conjecture

We would advise against working at Conjecture

We would advise Conjecture to take care when engaging with important stakeholders and represent their place in the TAIS ecosystem accurately

We do not think that Conjecture should receive additional funding before addressing key concerns

We encourage TAIS and EA community members to consider to what extent they want to legitimize Conjecture until Conjecture addresses these concerns

Appendix

Communication with Conjecture

Conjecture’s Reply

Brief response and changes we made

Notes

12

12