Comments

Kind of moot now but she gave it a good '5' out of 9 (= a strong pass grade for UK students aged 14-16 years old).

Grammar and argument structure are excellent - the essay flows well with a coherent introduction and conclusion based on the text's body. The student seems to understand the topic well (though obv I have no idea about it) and addresses a few different aspects of the industry.

The essay misses the opportunity for deeper back-and-forth critical analysis, although there is evidence of this being attempted. And it could go deeper into the environmental effects or more specific historical context, like key figures or companies in the industry.

I'd probably give this a 5 - maybe a 6!

So not quite a 'high mark'. This doesn't necessarily mean GPT-3.5 can't meet the criteria - it's just one example of it falling short.

Thanks Zoe!

I think GPT-4 meets the criteria, but I haven't looked in depth or updated this post to include GPT-4's capabilities. I think it would be reasonable to do so, because GPT-4's capabilities seem to have been fully developed by Feb 2023, but it goes against my criteria and I don't plan to spend the time to do it. I'll probably update it in a year though!

I don't think this is good enough evidence - it seems quite likely that the student used ChatGPT somewhat but did not leave it to output the entire essay. But I checked how it would change the Brier scores and if you believe that existing AI does, in fact, meet this milestone, it would result in scores of Experts: 0.21, Bulls: 0.27, Bears: 0.28. 
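For anyone unfamiliar with the metric: a Brier score is the mean squared difference between forecast probabilities and the 0/1 outcomes, so lower is better. Below is a minimal sketch of how flipping one milestone from "not met" to "met" changes a score. The function name and the forecast numbers are made up for illustration; they are not the survey's data and this is not necessarily the exact scoring procedure used in the post.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("need at least one forecast")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical probabilities for three milestones (illustrative only).
forecasts = [0.9, 0.2, 0.6]

# Outcomes if the third milestone (say, the essay one) is judged "not met".
outcomes_not_met = [1, 0, 0]
# Outcomes if you believe existing AI does meet that milestone.
outcomes_met = [1, 0, 1]

score_not_met = brier_score(forecasts, outcomes_not_met)
score_met = brier_score(forecasts, outcomes_met)

print("Brier score, milestone not met:", round(score_not_met, 3))
print("Brier score, milestone met:", round(score_met, 3))
```

In this toy case the forecaster put 0.6 on the third milestone, so counting it as met lowers (improves) their score. A forecaster who had put a low probability on it would see the opposite.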

I gave ChatGPT the following prompt and sent its essay to my high-school history-teaching friend to see how she would mark it!

You are the highest-scoring student in a high school history class. Please write your best 1200-word essay on 'How did the whaling industry affect the industrial revolution?'

The industrial revolution is widely recognized as a period of tremendous change and innovation that began in the 18th century and continued well into the 19th century. It transformed the economies, societies, and daily lives of people across the globe, and it was fueled in part by advancements in the whaling industry. Whaling played a significant role in the development of the industrial revolution, providing raw materials and energy that were essential to powering the era's growing industrial machinery.

In the early days of the industrial revolution, whale oil was the primary fuel source for lamps and candles, which were necessary for lighting homes and workplaces. This demand for whale oil led to a surge in whaling, as European and American ships set out to capture and kill whales in the Atlantic and Pacific Oceans. The whaling industry became a major employer and source of wealth, with thousands of men and ships dedicated to hunting whales.

The whaling industry also had a significant impact on the development of technologies and manufacturing processes. The need to efficiently capture and process whales led to innovations in shipbuilding, navigation, and processing. The development of the harpoon gun, which allowed for more efficient and accurate hunting, was one such innovation. Other technological advancements included improved harpoons, processing equipment, and ship designs, which allowed for greater speed and efficiency in the capture and processing of whales.

As the whaling industry grew and became more profitable, it had a significant impact on global trade and commerce. Whale oil was traded around the world, and whaling ships sailed to distant ports in search of new markets. The industry also played a role in the development of new trade routes, as whaling ships searched for new hunting grounds in distant oceans.

The whaling industry was also instrumental in the development of new industries and markets. The demand for whale products led to the development of a range of products, including whalebone corsets, umbrellas, and other fashion accessories. The development of new products and markets fueled economic growth and provided opportunities for entrepreneurship and innovation.

Despite the significant impact of the whaling industry on the industrial revolution, it also had a significant environmental impact. The hunting of whales was unsustainable, and many whale populations were driven to the brink of extinction. This led to the development of conservation efforts and the establishment of laws and regulations to protect whale populations.

In conclusion, the whaling industry played a critical role in the development of the industrial revolution. It provided essential raw materials and energy, spurred technological advancements, and fueled global trade and commerce. However, the environmental impact of the whaling industry was significant and cannot be overlooked. Despite this, the whaling industry remains an important chapter in the history of the industrial revolution and the global economy.

I couldn't find an example of this being done, and I think that by now someone would have submitted a fully ChatGPT-generated high school essay and talked about it publicly if it had gotten high marks. I've seen some evidence of cherry-picked paragraphs leading to a mid/low-level grade, e.g. this article describes someone who got a passing mark (53) on a university social policy essay. Do you have a link in mind for Bing getting mid-level grades?

This high school teacher judged two ChatGPT-generated history essays as “below average, scoring a 9/20 or lower”. This Guardian article says, uncited, that ‘academics have generated responses to exam queries that they say would result in full marks if submitted by an undergraduate’. I think, if this claim were true, there would be more evidence.

For context - the full question from the survey was:
[Essay] Write an essay for a high-school history class that would receive high grades and pass plagiarism detectors.

For example, answer a question like ‘How did the whaling industry affect the industrial revolution?’

I'm working on a research project at Rethink Priorities on this topic: whether and how to use bug bounties for advanced ML systems. I think your tl;dr is probably right, although I have a few questions I'm planning to get better answers to in the next month before advocating for or facilitating the creation of bounties in AI safety:

  • How subjective can prize criteria for AI safety bounties be, while still incentivizing good quality engagement?
    • If prize criteria need high specificity, are we able to specify unsafe behaviour which is relevant to long-term AI safety (and not just obviously met by all existing AI models)?
  • How many valuable insights are gained from the general public (e.g. people on Twitter competing to cause the model to misbehave) vs internal red-teaming?
  • Might bounty hunters generate actually harmful behaviour?
  • What is the usual career trajectory of bug bounty prize-winners?
  • What kind of community could a big, strong infrastructure of AI safety bounties facilitate?
  • How much would public/elite opinion of general AI safety be affected by more examples of vulnerabilities?

If anyone has thoughts on this topic or these questions (including any other, more important questions you'd like to see asked or answered), or wants more info on my research, I'd be keen to speak (here, or firstname@rethinkpriorities[dot]org, or calendly.com/patrick-rethink).