Turing test? LMAO.
I simply asked it to recommend a supermarket in the nearest larger city.
It came up with a name and listed a few of its qualities. Easy, I thought. Then I found out that no supermarket by that name exists. It was all made up.
You could argue that humans lie, too. But only when they have a reason to lie.
That’s not what LLMs are for. That’s like hammering a screw and being irritated it didn’t twist in nicely.
The Turing test is designed to see if an AI can pass for human in a conversation.
I’m pretty sure that I could ask a human that question in a normal conversation.
The idea of the Turing test was to have a way of telling humans and computers apart. It is NOT meant for putting some kind of ‘certified’ badge on that computer, and …
That’s not what LLMs are for.
…and you can’t cry ‘foul’ if I decide to use a question for which your computer was not programmed :-)
It wasn’t programmed for any questions. It was trained, hehe.
Each conversation lasted a total of five minutes. According to the paper, which was published in May, the participants judged GPT-4 to be human a shocking 54 percent of the time. Because of this, the researchers claim that the large language model has indeed passed the Turing test.
That’s no better than flipping a coin and we have no idea what the questions were. This is clickbait.
While I agree it’s a relatively low percentage, not being sure and having people pick effectively randomly is still an interesting result.
The alternative would be for them to never say that GPT-4 is a human, not 50% of the time.
Participants only said other humans were human 67% of the time.
Which makes the difference between the AIs and humans lower, likely increasing the significance of the result.
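For anyone who wants to check the "no better than flipping a coin" point against the numbers quoted here, this is a rough sketch of an exact two-sided binomial test against 50%. The thread does not give the number of judgments in the study, so `n = 100` is purely a hypothetical sample size, and `binom_two_sided_p` is a helper name made up for this example, not anything from the paper.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed count k under the null success rate p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small tolerance so floating-point ties count as "equally likely".
    threshold = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(q for q in pmf if q <= threshold))


# ASSUMPTION: n is a hypothetical number of judgments, not from the paper.
n = 100
p_gpt4 = binom_two_sided_p(54, n)   # 54% "human" verdicts for GPT-4
p_human = binom_two_sided_p(67, n)  # 67% "human" verdicts for real humans

print(f"GPT-4 vs. coin flip:  p = {p_gpt4:.3f}")
print(f"Humans vs. coin flip: p = {p_human:.5f}")
```

With this assumed sample size, 54% cannot be told apart from chance while 67% can. A much larger real sample could change the first conclusion, so treat it as an illustration of the reasoning rather than a verdict on the study.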
Aye, I’d wager Claude would be closer to 58-60. And with the model probing Anthropic’s publishing, we could get to like ~63% on average in the next couple years? Those last few % will be difficult for an indeterminate amount of time, I imagine. But who knows. We’ve already blown by a ton of “limitations” that I thought I might not live long enough to see.
Oh no!! The AImageddon is getting closer every day… Skynet is coming for us!!