Illustrating Grim Tales
20 May 2021
I wrote about the writing process of Grim Tales in a previous post. However, the unique thing about Grim Tales is that it was illustrated by AI too. Here’s how it was done.
The inspiration to create an AI-generated book came from me playing around with some generative art. So from the start I knew I wanted to at least attempt to illustrate Grim Tales using it.
Coincidentally, the text generated by the AI wasn’t that effective at describing scenes or characters. This meant that the illustrations weren’t there just because I wanted them, but because they would actually complement the story.
Each illustration was generated using a modified version of @advadnoun’s ‘Aleph2Image’ notebook which uses OpenAI’s CLIP and DALL-E. Aleph2Image takes a prompt written by the user, and generates an image based on that prompt.
I took the notebook and adapted it so that I could run it locally rather than in the browser. This not only sped up generation, it also let me leave it running in the background without worrying about getting kicked from the notebook runtime.
A few more modifications allowed me to iterate quickly on prompts. Simple things like showing a few samples from a prompt (to get a feeling for whether a prompt was worth exploring more or not), or queuing up prompts made it a lot easier to work with. Much better than babysitting a notebook. Finally, there were a few tweaks to the learning rate and other parts of the system.
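To give a feel for the queuing idea, here is a minimal sketch. It is not the code from my modified notebook: `generate_image` is a hypothetical stand-in that returns a label instead of running CLIP, and `run_queue` and `samples_per_prompt` are names made up for this example.

```python
from collections import deque


def generate_image(prompt, sample_index):
    # Hypothetical stand-in for the real generation step.
    # It returns a label so the queue logic can run anywhere.
    return f"{prompt} [sample {sample_index}]"


def run_queue(prompts, samples_per_prompt=3):
    """Work through queued prompts, producing a few samples for each."""
    queue = deque(prompts)
    results = {}
    while queue:
        prompt = queue.popleft()
        results[prompt] = [
            generate_image(prompt, i) for i in range(samples_per_prompt)
        ]
    return results


previews = run_queue(["Creepy graves", "Ghost in the city hall"], samples_per_prompt=2)
for prompt, images in previews.items():
    print(prompt, "->", len(images), "samples")
```

A handful of samples per prompt is usually enough to tell whether the prompt deserves a longer run.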
With the story written, I started writing prompts for key scenes, characters, and events in the story. Most of them were just a few words, or at most a brief sentence. I queued up a bunch of prompts and came back later to see if anything interesting was generated.
Once I had some images, I could separate the good prompts from the bad ones. From there I could iterate on the prompts and generate again. It almost feels like a machine learning training loop: produce output, check results, adjust parameters, and repeat.
Eventually, I took the promising prompts and generated multiple illustrations from each one, effectively playing a game with a random number generator to get better results.
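The random number generator part can be sketched roughly like this. It assumes each run starts from seeded random noise; `sample_starting_noise` and `candidates_for` are hypothetical helpers for illustration, not functions from the notebook.

```python
import random


def sample_starting_noise(seed, size=4):
    # Hypothetical stand-in for the random starting point of a generation run.
    rng = random.Random(seed)
    return [round(rng.random(), 3) for _ in range(size)]


def candidates_for(prompt, seeds):
    """One candidate per seed; the prompt is the same for every run."""
    return {seed: (prompt, sample_starting_noise(seed)) for seed in seeds}


runs = candidates_for("Creepy graves", seeds=[0, 1, 2])
for seed, (prompt, noise) in runs.items():
    print(seed, prompt, noise)
```

The same seed gives the same starting point, so a good result can be reproduced, while a new seed is another roll of the dice for the same prompt.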
I wasn’t sure how many illustrations would feature in the book, but casting a wide net of prompts for many different parts of the story was a good idea simply because there were some scenarios that I couldn’t find any good prompts for.
In fact, there were multiple illustrations I wanted to do but couldn’t achieve any good output for. One that really surprised me was a black cat. No matter what prompt or parameter change I tried, I couldn’t get a good image of one. Luckily, there were plenty of other opportunities for illustrations throughout the story.
Here are a few of the prompts behind illustrations featured in the book:
The Grim Reaper on a white background
Ghost in the city hall
Creepy graves
Once I had images that I liked, I shortlisted them and put them alongside the story to see which ones worked best. From there I used another machine learning based tool to upsample the images 4x from a resolution of 512×512. The square aspect ratio was a constraint carried over to the paperback itself, simply because I wanted full page illustrations.
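As a quick check of what 4x upsampling buys, here is the arithmetic as a small sketch. The 300 DPI print density is an assumption used purely for illustration, not a figure from the actual print setup, and both function names are made up for this example.

```python
def upsampled_size(width, height, factor=4):
    """Pixel dimensions after upsampling by the given factor."""
    return width * factor, height * factor


def print_size_inches(pixels, dpi=300):
    # dpi=300 is an assumed print density, not a known value for the book.
    return pixels / dpi


width, height = upsampled_size(512, 512)
print(width, height)                          # 2048 2048
print(round(print_size_inches(width), 2))     # 6.83
```

Under that assumption, a 512 pixel image would cover well under two inches, which is why upsampling matters for a full page illustration.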
For the front cover, I had to make a big decision - do I try to generate a cover containing the title ‘Grim Tales’ or do I generate a cover and add the title later?
It felt like the only option was to at least try to generate a cover with a self-contained title - it fits so well with the theme of the book being created by AI.
Hundreds of images must have been generated to get the cover that I eventually used for the book. The text is mostly legible, and the rest of the image conveys a lot of what is in the story - the Graveyard, the Spirit, and the Reaper as a shadowy figure.
Generating all of the art for the project was a fun and simple process. But it also took a decent chunk of time to generate and curate images that I felt were right for the story.
In total I would estimate that thousands of images were generated over a few weeks for the handful that appeared in the book.
A similar process was used for generating the top images for each of these articles. The prompt for this one was ‘The Grim Reaper with an easel’.