Comparing the 5 leading AI tools for image generation from text
Part three of an eight-part series of blogs
Having recently compared the 4 main AI tools for text prompts, I thought I'd do the same for image generation tools. In this blog series we compare Dall-E (via ChatGPT and separately via Copilot), Firefly, Midjourney and Stable Diffusion for 3 pre-defined tests to see which ones score most highly for cost, ease-of-use, speed, editing ability and above all for quality of image.
To make this test as fair as possible I'm pre-defining exactly what I'll ask each of our text-to-image tools to do, so that not only is there a level playing field, but there's also no risk of me tweaking the tests to suit one tool more than the others.
For a recent newsletter I wanted to pose this rebus (visual puzzle) for the book title Pride and Prejudice (it's Prejudging Mice with the given letters removed):
I want to generate an image of someone pre-judging two shifty looking mice, but without the speech bubble.
Here's the prompt I'm going to put to each of our 5 tools:
Create a photo-realistic image using 4k detail. This should be a side view of the following scene. On the left of the image should be a judge sitting in their chair in a UK courtroom. The judge should be wearing a legal wig and have a severe expression. Facing the judge in the dock on the right of the picture should be two mice with guilty-looking expressions.
The main thing I want to check with this test is how good each tool is at generating photograph-quality images. I expect Midjourney and Adobe Firefly to excel at this one, but we shall see!
For another recent newsletter I wanted to have a treasure island map, and drew this:
Picasso this isn't; I could never draw. However, it does show the sort of thing I wanted to achieve.
Here's the prompt I'm going to input to try to generate my treasure island map:
Create a line drawing of a treasure island with a curvy coastline. The background of the island should be white apart from child-like drawings of the following features:
Hell’s Kitchen (a volcano)
Hangman’s lookout (a gibbet with a skeleton hanging from it)
Dead man’s desert
Sulphur Springs
Magwitch marshes
Tombs Town
Lone Palm
Treachery Hills
These features should be roughly equally spaced around the island. Do not show any text on the picture – I will add that later.
The coastline of the island should be clearly delineated, and child-like drawings appear in the sea of the following:
2 or 3 child-like drawings of whales
A line-drawing of a large bird like an albatross
A pirate ship in a bay
Keep everything simple. Avoid having much fill or background colouring. When in doubt, use simple outline drawings of things. Use pastel colours throughout.
I've no idea what I'll get from this, but it should at least test the various tools' ability to follow instructions. I'm expecting ChatGPT and Copilot (both via Dall-E) and Stable Diffusion to perform strongly for this one.
This is Sheapey:
Bought from Build-a-Bear many years ago, Sheapey has suffered indignities that no sheep should have to suffer.
To restore Sheapey's spirits, I want to portray him as a super-hero by using this prompt:
The attached picture shows Sheapey, a slightly faded child’s toy sheep. Create a picture showing Sheapey in the style of a Marvel Comic poster. Sheapey should be depicted as a heroic figure come to save the world.
There are two things this tests: how well our AI tools can do with minimal guidance (how creative they can be), and also whether they have the ability to work with and modify existing images (I have no idea at this stage which of them can do this!).
So those are the tests I want to run; let's get going with the first one!
© Wise Owl Business Solutions Ltd 2024. All Rights Reserved.