I had read a bit about this here and there, but was not planning on messing with these Large Language Model "AIs" myself. However, I was explicitly asked to do so by a professional contact who wanted me to evaluate them. I ran three tests, which produced significantly different results.
One thing I asked ChatGPT to do was to explain Aristotle's ethical theory. The answer it gave was plausible at about the college level, or even at the grad school level for people who weren't specialists. The mistakes it made are mistakes that even ethicists who haven't actually studied Aristotle closely might make: for example, it claimed that Aristotle's virtues are means between two extremes. I've heard even trained philosophers make that error, because it's very close to what Aristotle does say; it's just not quite right. I decided that wasn't a good test for ChatGPT, though, because it's too easy for the kind of model it is: if it's just mapping out what experts have said about Aristotle and regurgitating it in a slightly reordered format, that's what you'd expect. Actually understanding and being able to apply the knowledge, as humans do, that's hard. ChatGPT doesn't have to understand; it just has to know that there are very frequent connections between various words that imply that using those words together in the commonly encountered order is correct.
So the next thing I asked it to do was to diagnose a problem with a 2007 Jeep Wrangler Rubicon that I had just finished resolving. It involved a poltergeist-like failure of multiple electrical systems. The answer it gave was wrong but plausible: it started with the assumption that there could be multiple system failures and walked through how to diagnose possible issues with each in turn. In fact the problem was that the ECM had gone bad, which I told it. It said that was also a possible cause of the multiple failures I described, and said it was too complex for me to fix so I should take the Jeep to a shop. I told it the shop had refused the job because the ECM was discontinued, and therefore they couldn't get parts from an authorized source. It offered four ways to obtain a functional discontinued ECM, all of which were plausible, but cautioned me that it was too complex to try to fix without substantial technical knowledge.
In fact, it was the easiest car repair I've ever done: I bought a refurbished ECM from Flagship One, and just dropped it in. You do have to know which numbers are the right ones so you order exactly the right thing, and you have to take care to have it programmed to the right VIN, which you can do yourself if you buy the diagnostic software from Alfa Romeo (a sister brand of Jeep these days, under the parent company Stellantis). But FS1 will be happy to do it for you, if you send them your VIN. Once you get the right part there are only three bolts and three electrical connections.
A plausible reason it might have been thought difficult, which ChatGPT did not mention when I asked why it thought the repair was difficult, is that the ECM is normally located against the firewall. Getting to it is already potentially a pain. This particular Jeep, however, has had it relocated to an easily accessed space further forward. That's something ChatGPT couldn't possibly know, and didn't; but it didn't know that I ought to have worried about the firewall issue either.
So it was wrong on several points, but the answer would still have been useful if I had been someone who knew little about car repair. It's not terrible even with physical technology, because a lot has been published online in various help fora.
The third example was actually terrible, though, so I'll post it separately.
2 comments:
I found the ChatAI bot totally incompetent with arithmetic involving negative powers of ten and the prefixes (atto-, pico-, femto-) associated with those numbers. It said an attometer-sized ball was too big for a picometer-sized box...
The AI also tended to repeat the most commonly issued instructions for an MS-Windows feature, even for apps where the feature doesn't exist: "Look for the menu icon on the upper right corner, click, and choose 'Print' from the drop-down list..." Yeah, well, not in "Media Player V12".
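For anyone who wants to check the prefix arithmetic from the comment above, here is a minimal sketch. The exponents are the standard SI values; the function names `fits_inside` and `size_ratio` are made up for illustration, and the "fit" test simply compares the two widths.

```python
# Standard SI prefix exponents: one <prefix>meter is 10**exponent meters.
SI_EXPONENTS = {"pico": -12, "femto": -15, "atto": -18}


def fits_inside(ball_prefix: str, box_prefix: str) -> bool:
    """True if a ball one <ball_prefix>meter across is no wider than
    a box one <box_prefix>meter wide (hypothetical helper)."""
    return SI_EXPONENTS[ball_prefix] <= SI_EXPONENTS[box_prefix]


def size_ratio(box_prefix: str, ball_prefix: str):
    """How many times wider the box is than the ball (hypothetical helper)."""
    return 10 ** (SI_EXPONENTS[box_prefix] - SI_EXPONENTS[ball_prefix])


print(fits_inside("atto", "pico"))  # True
print(size_ratio("pico", "atto"))   # 1000000
```

So an attometer-sized ball is a million times narrower than a picometer-sized box, the opposite of what the bot claimed.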
I've had some pretty strange results from ChatGPT; for instance, in a review of 'The Caine Mutiny', it said that Captain Queeg was (a) a highly competent naval officer and (b) an intellectual. Regarding Goethe's 'Faust', it was insistent that Faust wound up going to Hell. Some reviews have included characters who appeared nowhere in the subject book.
I got to the top of the waitlist for the Bing version last year, and my initial impression is that it's a lot better.