Impressions from base-GPT-4?
I wonder whether some people here have had a chance to play with base-GPT-4 (access is granted very selectively, for research purposes) and wouldn't mind sharing some of their impressions. I know that some people have been playing with it, but I've never seen a discussion of impressions and lessons learned from that. I realize access to this model is quite nontrivial to get, but some access is given, and I think it would be super-interesting for many people here to hear this kind of conversation...
I think this just means that one needs to spend more time constructing good test coverage (probably with the help of the agents involved).
282 unit tests do not sound like nearly enough for something like SQLite (Google AI thinks that the original SQLite release had tens of thousands of tests, that the current number of tests is in the millions, and that fuzzers run through about a billion test mutations each day).
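For context on how test counts scale like that: a single property-based test can generate a large number of concrete cases on every run, so a small number of *written* tests can translate into much larger effective coverage. Here's a minimal sketch using Python's `hypothesis` library (my choice of tool, not one named in the thread), round-tripping arbitrary strings through an in-memory SQLite database:

```python
# A minimal sketch of property-based testing with the `hypothesis` library.
# One property like this generates hundreds of concrete cases per run,
# which is roughly how hand-written tests turn into fuzzer-scale coverage.
import sqlite3

from hypothesis import given, strategies as st


@given(st.text())
def test_text_round_trips_through_sqlite(value):
    # Store an arbitrary Unicode string and check it comes back unchanged.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v TEXT)")
    conn.execute("INSERT INTO t VALUES (?)", (value,))
    (result,) = conn.execute("SELECT v FROM t").fetchone()
    conn.close()
    assert result == value
```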
I don’t think one needs that much for proof-of-concept work, but the famous recent port of the JustHTML library by Simon Willison was made possible by html5lib-tests having about 9,200 test cases. Perhaps that’s the ballpark number of tests one...