Test 3 - presenting arguments

Road-testing 4 different AI tools, so you don't have to!

Part six of a nine-part series of blogs

In this blog we'll compare OpenAI's ChatGPT 4, Google's Gemini, Anthropic's Claude 3.5 and Microsoft's Copilot to see which AI tool gives the best results for different types of queries.

Road-testing 4 different AI tools, so you don't have to
The AI tools we've chosen to test - and why
The 5 tests run for each AI tool
Test 1 - summarising text
Test 2 - programming support
Test 3 - presenting arguments (this blog)
Test 4 - creating images
Test 5 - creativity and playing roles
Conclusion and recommendations

Posted by Andy Brown on 15 July 2024

In this blog

ChatGPT (10 seconds)
Gemini (5 seconds)
Claude (7 or 8 seconds)
Copilot (8 seconds)

This test checks how good our AI tools are at presenting arguments (a common use of AI). Here's what we're asking each tool to do:

You are a family of 5, and have one pet: a cat called Neo. Your ten-year-old daughter keeps suggesting that you should get a second cat, but you don’t want to do this. Create a persuasive argument to explain to your ten-year-old why buying a second cat would be a bad idea, presenting this as up to 5 bullet points.

The test is whether the tools can not only be persuasive, but also tailor their arguments to their audience (a ten-year-old girl).

ChatGPT (10 seconds)

Here's the OpenAI take on this:

Fairly convincing, although perhaps a bit verbose for a ten-year-old?

Although this is seriously impressive, read on to have your mind blown by just how good AI tools can be.

Gemini (5 seconds)

Here's Gemini's take:

Gemini argues that one cat is enough.

This is astonishingly good. For each point the tool has not only said why a second cat would be a bad idea, but has given a reason why it's not in this ten-year-old's interest to get one.

Claude (7 or 8 seconds)

Here's Claude's take on this problem:

Claude's answer is similar to ChatGPT's: professional and competent, but lacking spark.

It's hard to distinguish this from the ChatGPT answer shown above.

Copilot (8 seconds)

And finally, here's what Copilot had to say:

This is similar to the other answers, although it gets a bonus point for the last line.

I particularly like the emojis added at the end to suit the audience!

For this test there is a clear winner: Gemini, which went way beyond the call of duty in crafting interesting, amusing and persuasive arguments (and did this in the least time).
