This bit is curious:
I apologize for the mistake in my previous response. Upon reviewing the datasheets for the MAX4936 and MAX4937
As I understand it, ChatGPT does not have internet access beyond being able to chat with you. Therefore it did not "review the datasheets". Its apparent self-awareness is no more reliable than its factual reliability.
Yep! That's something that I wrote in my original writeup:
Even when it claims to do so, [ChatGPT] doesn’t consult a datasheet or look up information — it’s not even connected to the internet! Therefore, what seems like “reasoning” is really pattern recognition and extrapolation, providing what is most likely to be the case based on training data. This explains its failures in well-defined problem spaces: statistically likely extrapolation becomes wholly untrue when conditioned on narrow queries.
My last comment about "self-awareness seems to be 100%" was a (perhaps non-obvious) joke; mainly that at least it is trained to recommend that it shouldn't be trusted blindly. But even this is a conclusion that isn't arrived at via "awareness" or "reasoning" in the traditional sense — again, it's just training data and machine learning.
I've been doing similar things in my day-to-day work, like making stuff in CSS/Bootstrap or Excel, and in my hobbies, like mucking about in Twine or VCV Rack, and have noticed:
If you treat it almost like a student and inform it of the errors/consequences of whatever it suggested, it's often surprisingly good at correcting the error. But this is where the differences in how much it "understands" domains like "CSS" vs. "Twine's Harlowe 3.3.4 macro format" become easier to see: it seems much more likely to make up functions and features of Harlowe that resemble things from more popular languages.
For whatever reason, it's really fun to engage it on things you have expertise in and correct it and/or rubber duck off of it. It gives you a weird child of expertise and outsider art.
This is a supplement to an article I wrote elsewhere, providing the complete transcript of an interaction with ChatGPT.
I evaluated ChatGPT's performance by tasking it with real-world problems I encounter in my work as an applications engineer.
Main takeaway: ChatGPT seems to perform well in general troubleshooting scenarios, even in domains which require considerable background knowledge. Failures occur with narrow queries that require specific technical data.
General Troubleshooting
ChatGPT can offer suggestions to solve common engineering issues, even regarding specific integrated circuit parts. As long as the solution space is large, it generates good recommendations for diagnosing and solving problems.
Narrow Prompts
Defined issues requiring a specific solution generate believable, in-depth answers which fail in their specificity. These errors are difficult to spot because the text conforms to what we expect from a truly intelligent response. (See commentary between prompts for the errors.)
Errors: In reality, the operating voltages and output currents are consistent between the two parts. Also, both devices come in TQFN packages (not SOT23 or SOIC) and have 56 pins (not 5 or 8).
Another datasheet-related question on a different part, just for good measure:
Error: The maximum data rate is 0.5 Mbps, not 1 Gbps.
Fabricating References
I hoped that I could ask ChatGPT for references which would allow me to decide whether I could trust its output. This failed spectacularly.
Actually, the datasheet does not contain this information. Additionally, when I informed ChatGPT of the error, it simply updated its guess to a new assumption about the wiring.
* Note: The link has since changed, but at the time, this was a valid link to the DS1402D datasheet.
Meta-Questions of Information Validity
We already know why ChatGPT hallucinates, but I wanted to see what reasons it would provide:
It may be factually unreliable, but its self-awareness seems to be 100%.