Picasso, Da Vinci, and AI now all share a common skill. If you had told these two artists that during their careers, they’d probably have laughed at you… it might be one to ask The Doctor to try on his next visit to the past in the TARDIS. We know how Da Vinci and Picasso created their artwork: pens, pencils, paints, canvas, and many hours or days of hard work. But how does AI do it?
It all starts with digital imagery: thousands, if not millions, of images, collated into a massive library and ready to train AI models. The first step is image recognition, where a model is taught what something looks like by being shown many images of it. Let’s take a squirrel, for example…
From the above image, the model might pick up the shape of the tail, the shape of the head, and the posture, along with colours and various other characteristics. But what we call these things and how the model sees them are completely different.
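To make that difference concrete, here is a minimal Python sketch of how a computer might hold a tiny picture. The 4x4 greyscale grid and its pixel values are made up for illustration (they are not taken from a real squirrel photo), and `to_bits` is a hypothetical helper name. Real images have far more pixels and usually three colour channels, but the idea is the same: numbers, stored as 1s and 0s.

```python
# A made-up 4x4 greyscale "image": 0 is black, 255 is white.
# These values are invented for illustration only.
image = [
    [0, 0, 255, 255],
    [0, 128, 255, 64],
    [32, 200, 200, 32],
    [255, 255, 0, 0],
]


def to_bits(pixels):
    """Flatten the pixel grid and write each value as 8 binary digits."""
    return "".join(format(value, "08b") for row in pixels for value in row)


bits = to_bits(image)

# Show the first three pixels' worth of bits, then the total size.
print(bits[:24], "...")
print(len(bits), "bits for", len(image) * len(image[0]), "pixels")
```

Even this tiny grid needs over a hundred digits, which is why a full-size photo would be far too long to print here.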
An AI model, indeed any computer, will just see this as a series of 1s and 0s, so what we call a tail, the AI might see as 0010001110101010100110101 (a shortened example, as otherwise this whole article would be 10 times longer just to show you an image representation). Then it will be told to look at some more images and identify squirrels and non-squirrels. This reinforces its learning: each time it makes a guess, its training setup tells it whether it was right. Don’t worry, there isn’t a human having to say yes or no to each guess. It does this for nearly everything in the world, using tags that humans have put on some of the images, and it learns over time. It is like being a child and getting told to put a cube through a square hole. Eventually it has a very good idea of what is what.
So then, it can take a human’s request, ‘give me an image of a coffee squirrel’, draw on what it has learnt from all that imagery, and put the pieces together so that the coffee cup sits in the claws of the squirrel.
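The guess-and-feedback loop described above can be sketched in a few lines of Python. This is only a toy, under stated assumptions: it uses a perceptron, a much simpler technique than the large neural networks behind real image models, and instead of raw pixels it learns from two invented measurements per picture (tail bushiness and ear pointiness). The data, the feature names, and the `predict` and `train` functions are all hypothetical.

```python
# Made-up training examples: (tail bushiness, ear pointiness) -> human tag.
# Tag 1 means "squirrel", 0 means "not a squirrel". All values are invented.
examples = [
    ((0.9, 0.8), 1),
    ((0.1, 0.2), 0),
    ((0.8, 0.7), 1),
    ((0.2, 0.1), 0),
    ((0.95, 0.6), 1),
    ((0.3, 0.4), 0),
    ((0.7, 0.9), 1),
    ((0.15, 0.5), 0),
]


def predict(weights, bias, features):
    """Guess 1 (squirrel) if the weighted sum is positive, otherwise 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(data, learning_rate=0.1, max_rounds=1000):
    """Perceptron training: guess, compare with the tag, adjust if wrong."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(max_rounds):
        mistakes = 0
        for features, label in data:
            # error is 0 for a correct guess, +1 or -1 for a wrong one.
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [
                    w + learning_rate * error * x
                    for w, x in zip(weights, features)
                ]
                bias += learning_rate * error
        if mistakes == 0:
            # A full pass with no wrong guesses: stop training.
            break
    return weights, bias


weights, bias = train(examples)

# Ask the trained model about a picture it has not seen before.
print(predict(weights, bias, (0.85, 0.75)))
```

Notice that no human answers yes or no during training; the tags were attached once, up front, and the loop does the checking itself.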
This too will have taken time to learn, and again would have been reinforced. But the great thing is that the results can be adjusted with prompts, much like in ChatGPT, where you can ask it to reformat the response it just gave you. So, if your squirrel is too lifelike, you can ask for it to be more cartoon-like, for example. It will have been trained on various art styles as well, so it will know how to stylise the image. All this is achievable because the AI is effectively using its own version of Photoshop, except that instead of needing a fancy interface and needing to know exactly where each function is, it is just a program built into the AI’s system, and it will have learnt through trial and error how to build great-looking images.
Remember, the computer has no eyes! It can’t appreciate its own work… or can it… well, for now it can’t. What it does have is everything it has learnt, which allows it to take from other images, merge them together, and even take a computational level of inspiration to make something new from them, all to fit your requests. However, despite it being a very cool tool that lets us generate images at will, we do have to give it precise instructions and tweak the results ourselves.
Even though this technology is getting very good, in some cases to the point where it is hard to tell whether an image is AI generated or not, it does have limitations. Some are obvious. For example, if you wanted a picture of your Uncle Fred, I wish you the best of luck with your AI image outcome, because it will have no idea who Uncle Fred is and will most likely bring back some random human figure. It would at least make the figure male-looking, because it would know from its training that uncle means male. Another limitation is copyright and accreditation for the work. The images it learns from have all come from humans, whether it is a human taking a photo with a camera, an artist digitally creating a masterpiece, or film makers producing animated films.
Scarily, the ‘magic’ behind AI generation sounds a lot like a human art class. The teacher says, ‘This is a tree; draw me a different kind of tree.’ Off the students go with a trunk, branches, and leaves. This is what the AI does, except it doesn’t have a brain like a human, so it has to computationally work out the links in the data.