If You’ve Come Across Weird Digital Images Like “Giraffe in Space” or “Darth Vader on Electric Guitar,” They Probably Came From the Dall-E AI That Has Been Making a Splash on Twitter These Days.
Maybe you, like me, have had your Twitter timeline filled for a few days now with weird digital images created by the Dall-E Mini AI: images such as Karl Marx made with TV thaw, Walter White holding a GameCube console, Gordon Ramsay eating a Big Mac, or Kermit the Frog in Edvard Munch’s painting The Scream.
The Dall-E Mini service, hosted on the Hugging Face website, uses an artificial intelligence model that has learned from a huge number of captioned images from across the Internet to create reasonably relevant images for the text the user submits, even if that text, like the examples above, is weird and surreal and nothing like it can be found in the real world. For example, I tried the phrase “iPhone pony in hand” on this platform and obtained the following images.
Dall-E Mini’s “artwork” is so famous, and so widely discussed, because of the AI’s remarkable ability to create images of ideas that no one has come up with before.
For example, a Google search for “Gandalf in a spacecraft” will not get you the desired result, yet the same phrase produces reasonably relevant results on image-generating platforms such as Dall-E Mini.
Every few years, a technology emerges that divides the world into a before and an after. For example, I remember well the first time I “played” a song, the first video call I made with Viber, or the first photo I took with a 2-megapixel camera and then posted on Instagram with a simple filter.
What makes these moments memorable is the sense that unpredictable and surprising things may become achievable with the advent of these technologies. Now, you can make a video call, access any file from any cloud, or connect to thousands of people worldwide via live streaming over Wi-Fi.
It has been several years since a technology of this kind came along, the kind we want to show our friends while saying, “You must see this!” Of course, Dall-E Mini, as its name implies, is just a miniature example of the great, forward-looking technology that can shape the future: the world of algorithms and artificial intelligence.
Dall-E Mini: New entertainment for social network users
Dall-E Mini is a project by Texas programmer Boris Dayma, who developed it in July 2021 for an AI competition sponsored by Google and the machine learning company Hugging Face.
The project, currently hosted on the Hugging Face website, has become so popular thanks to the Twitter hype, and traffic on the site is so high, that it may take several attempts to finally get the image we are looking for. But since the service is entirely free, it is worth a little patience and effort to satisfy our curiosity.
Requested image: A young man opening a portal to ancient Rome with the help of Commodore 64
In Dall-E Mini, anything can be imagined. In fact, Internet users are so fascinated by this service precisely because it accepts the strangest, most unrelated requests and the most ridiculous situations, which leads to funny and sometimes surprising results.
When you enter the phrase you want in the Dall-E Mini text box and press the Run button, you are presented with nine images, each measuring 256 by 256 pixels, that the platform’s algorithm has matched to your phrase.
Of course, when you look closely at these pictures, you will realize how flawed they are, especially if you enter the names of real people instead of imaginary animals or characters. But from a distance and at a glance, the images are often very similar to what we expect.
Dayma acknowledges that the platform produces better results for abstract paintings but has difficulty with more detailed, realistic images.
Undoubtedly, the most challenging part is pictures of people. If you ask Dall-E for a landscape, the result will be great, because if a tree has a flaw, no one will notice it. But if part of a person’s face, such as an eye, has a flaw, we see it immediately.
Although Dall-E Mini can create beautiful, “artistic” images, what is at work is mathematics and algorithms, not artistic taste. The platform’s AI is not expressing any taste of its own.
Unfortunately or fortunately, artificial intelligence has not yet developed enough to be creative. Rather, the Dall-E Mini algorithm looks only at the myriad images on the Internet whose descriptions match the user’s request and finds the patterns repeated in most of them, such as shapes, colors, and descriptions.
Dall-E Mini then uses these patterns to learn how to create an image that fits the user’s text request.
Requested image: Alien movie space creature in courtroom pre-design style
Dayma describes Dall-E Mini as an imitation of OpenAI’s DALL-E project, but on a much smaller scale and with a simpler architecture than the original. Although its quality is much lower than DALL-E’s, it is entirely free for anyone to use. There is no need for special hardware; you can even try Dall-E Mini on your smartphone, though its developer says it works better on the web.
Interestingly, the machine learning models used to convert text to images have reached Dall-E Mini’s level of capability in only a few years. For example, in this article, which was published in 2018, you can see the challenges and weaknesses of older models. Given the caption “giraffe herd on a ship,” that model could only create a few giraffe-like shapes standing on the water.
That model could not even handle a simple “sheep” request. The fact that we can now obtain such near-realistic results from a small personal project designed solely for a competition indicates a significant improvement in the “understanding” of the algorithms.
Dall-E: The great revolutionary spark in the creation of works of art?
Dall-E Mini is nothing more than a toddler compared to the original, DALL-E, or more accurately, DALL-E 2. Unfortunately, the main project is currently in private beta, and fewer than 5,000 people have access to it.
The DALL-E service, whose name combines the surrealist painter Salvador Dalí and Pixar’s animated robot WALL-E, was born in January 2021 at the San Francisco-based company OpenAI. OpenAI was founded in 2015 by Elon Musk, Sam Altman, Ilya Sutskever, and three others, but Musk left the board in 2018. In 2019, Microsoft invested $1 billion in the company.
OpenAI is known for developing GPT-3, a tool for generating long, complex text from simple prompts, and Copilot, a tool that automates parts of the coding process for software engineers. Some of the company’s code is publicly available on GitHub.
With Dall-E technology, you no longer need to have advanced Photoshop skills.
The first version of DALL-E was also based on the GPT-3 model and was limited to creating images at 256 by 256 pixels. But the second version, which entered private beta in April 2022, is a significant leap forward in AI-based image generators.
The images that DALL-E 2 can create are now 1024 by 1024 pixels, and it uses new techniques such as “inpainting,” in which parts of an image chosen by the user are replaced with other content. For example, suppose you take a picture of an orange in a container and then tell DALL-E to replace the orange with an apple; DALL-E does this as cleanly as possible. Hence, you no longer need advanced Photoshop skills!
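The masking idea behind inpainting can be illustrated with a small sketch. This is not how DALL-E 2 works internally (the article does not describe that); it only shows the final compositing step, assuming a hypothetical `mask` that marks which pixels the user selected and a `generated` image supplied by some model. The names `inpaint`, `original`, `generated`, and `mask` are illustrative assumptions.

```python
def inpaint(original, generated, mask):
    """Keep original pixels where mask is 0; take generated pixels where mask is 1.

    All three arguments are 2D lists of the same shape. Real systems also
    condition the generated content on the surrounding pixels so that the
    replaced region blends in; that part is not modeled here.
    """
    return [
        [gen if m else orig for orig, gen, m in zip(orig_row, gen_row, mask_row)]
        for orig_row, gen_row, mask_row in zip(original, generated, mask)
    ]


# A 2x2 stand-in "image" whose pixels are labels instead of colors.
original = [["bowl", "orange"], ["bowl", "bowl"]]
generated = [["?", "apple"], ["?", "?"]]   # hypothetical model output
mask = [[0, 1], [0, 0]]                    # the user selected only the orange

result = inpaint(original, generated, mask)
```

Only the masked pixel changes, so the rest of the photo is left exactly as it was.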
In addition to editing and retouching photos, the second version of DALL-E can turn simple textual descriptions of things that did not previously exist, such as “Elephant Tea Party on the Grass,” into artistic or realistic images that you will be amazed to see.
DALL-E’s magic lies not just in recognizing objects individually but also in its understanding of the relationships between objects, so that when you ask it to create “an astronaut riding a horse,” it knows what you mean. In this tweet, you can see some of the images created with DALL-E.
Requested image: Robotic dinosaur versus monster trucks in the Colosseum
OpenAI describes the DALL-E project as an example of collaboration between creative humans and intelligent systems to visualize new ideas and enhance human creativity.
The company also adds that images created with DALL-E can tell us whether the system understands what we humans are saying or merely repeats what it has learned. In addition, DALL-E shows us how AI systems see and understand our world, which, according to OpenAI, is crucial to developing useful and safe AI.
The critical thing to know about DALL-E is that its developer takes care to prevent its misuse. Users invited to the platform must, after creating an account, agree to the company’s content policy.
DALL-E, for example, does not allow users to generate hate speech, violence, nudity, immoral content, or any political content. The platform also uses methods to prevent the production of realistic images of the faces of real people, including celebrities.
OpenAI has strict policies to prevent the misuse of Dall-E.
Although it is possible to create images based on the faces of celebrities in DALL-E, users may not upload a photo of a person without that person’s permission, and the platform uses methods such as blurring faces to make it clear that the images are manipulated and not real. DALL-E also checks requests against a list of prohibited words, such as “shooting,” to keep pictures from containing sensitive content. Users are also not allowed to use the platform to create images intended to deceive, such as deepfakes.
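A prohibited-word check of the kind mentioned above could look roughly like the following sketch. OpenAI’s actual list and matching rules are not public, so `BLOCKED_WORDS` (which contains only the article’s single example) and the function `is_allowed` are assumptions made for illustration.

```python
import re

# Hypothetical block list; "shooting" is the only example the article gives.
BLOCKED_WORDS = {"shooting"}


def is_allowed(prompt, blocked=BLOCKED_WORDS):
    """Return False if any whole word in the prompt is on the block list."""
    words = re.findall(r"[a-z']+", prompt.lower())
    return not any(word in blocked for word in words)


print(is_allowed("an avocado chair"))          # accepted
print(is_allowed("A Shooting at the market"))  # rejected
```

A plain word list is blunt: this sketch would also reject a harmless request such as “a shooting star,” which is one reason real moderation systems usually combine several methods.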
Working with DALL-E is as simple as typing a phrase in the text box; it is as if Google’s search bar had been turned into Photoshop. Inspired by Google, the platform even has a “surprise me” button that fills the text box with a phrase of its own choosing, based on previously created images. This button is handy when the user is looking for a new idea for a work of art but nothing comes to mind.
It takes about 15 seconds for DALL-E to create ten images related to the typed phrase. Recently, however, the number of images per request was reduced to six so that more people can use the platform.
DALL-E image of “avocado chair.”
One of the key technologies used in this platform is “diffusion,” which Google’s artificial intelligence unit explained last year. In general, diffusion-based models degrade the data used for training the network by adding Gaussian noise, gradually erasing the details of the data until all that remains is pure noise.
Then, a neural network learns to perform this degradation process in the reverse direction, removing the noise until a completely noise-free sample is created.
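The two directions described above can be sketched in a few lines. This is a toy illustration, not DALL-E’s implementation: the linear noise schedule, its default values, and the names `alpha_bar`, `add_noise`, and `remove_noise` are assumptions. The trained neural network is replaced by handing the true noise back to the reverse step, and the reverse step is done in one shot, whereas real models remove noise over many small steps.

```python
import math
import random


def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after t steps of a linear schedule."""
    product = 1.0
    for step in range(t):
        beta = beta_start + (beta_end - beta_start) * step / (num_steps - 1)
        product *= 1.0 - beta
    return product


def add_noise(x0, t, num_steps, rng):
    """Forward direction: blend the clean sample with Gaussian noise."""
    a = alpha_bar(t, num_steps)
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * n for x, n in zip(x0, noise)]
    return xt, noise


def remove_noise(xt, predicted_noise, t, num_steps):
    """Reverse direction: undo the blend, given an estimate of the noise."""
    a = alpha_bar(t, num_steps)
    return [
        (x - math.sqrt(1.0 - a) * n) / math.sqrt(a)
        for x, n in zip(xt, predicted_noise)
    ]


rng = random.Random(0)
pixels = [0.1, 0.5, 0.9, 0.3]  # a tiny stand-in "image"
noisy, true_noise = add_noise(pixels, t=500, num_steps=1000, rng=rng)

# A trained network would predict the noise; here the true noise is passed in.
restored = remove_noise(noisy, true_noise, t=500, num_steps=1000)
```

With this schedule, `alpha_bar(1000, 1000)` is close to zero, which corresponds to the point where “all that remains is pure noise.” The hard part in practice, and the part the network has to learn, is estimating the noise without being told what it was.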
All this explanation aside, what amazes one is the incredible creativity of this technology in image production. For example, consider the following images, which are created from these phrases:
An economist bear versus the stock market downtrend, digital art
An economist bull versus stock market chart, digital art
The power of DALL-E to capture emotions in these two cases is genuinely remarkable: the bear’s fear and helplessness against the bull’s anger. The feeling these images create in the viewer is that we are looking at a work of art and creativity. However, using the word “creative” to describe this process is wrong because what happens is based on estimation and probability, not artistic taste.
Another exciting feature of DALL-E is its ability to solve problems in various ways. For example, when asked to show “a delicious cinnamon candy with moving doll eyes,” it tried different ways to illustrate this kind of eyes, one of which was a funny miniature cinnamon candy.
It is safe to say that DALL-E is the most advanced image-generating tool, but there are many similar examples.
Google has also introduced the Imagen tool, which is not yet available to the public. And, of course, there is Dall-E Mini, which has no connection to the original DALL-E but, unlike the other tools, is widely available and extremely popular.
OpenAI has not yet decided whether to make DALL-E available to the public one day. According to the company, the current project’s goal is to show a few people how to work with the technology and, if necessary, update both the platform and content policies based on the feedback they receive.
Although DALL-E is arguably the most advanced image-generating tool to date, it is not yet widely available. In the meantime, you can entertain yourself with similar examples such as Dall-E Mini and a few others listed below.
Artificial intelligence platforms similar to DALL-E
In addition to Dall-E Mini, which has so captivated Internet users, other examples are available that do more or less the same thing. Platforms like StarryAI and NightCafe work a lot like DALL-E, except that all the images they create are entirely unrealistic and artistic, and no one could confuse them with a photo. NightCafe, for example, generates dreamy images based on typed phrases and offers various styles such as “cyberpunk” or “fantasy.”
AI Art Maker, which according to its site “transforms the imagination into art,” offers various styles, including anime, watercolor, and realistic, and displays only one 256 by 256-pixel image per request, but for free. You have to pay to get larger dimensions. In addition, the platform can convert the created artwork to an NFT.
Images created in AI Art Maker based on the text request “Handy Ice Cream Cat” in four different styles
The Dream by WOMBO app, which also has a web version, offers a variety of art styles, like NightCafe, to make the result look like a work of art. This application is based on two neural networks, VQGAN and CLIP. The first is tasked with producing images that resemble other images; CLIP is trained to judge how well a textual description matches a picture.
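The division of labor between the two networks can be sketched as a loop in which one component proposes images and the other scores them against the text. Everything here is a stand-in: `generate` is a trivial function rather than VQGAN, `score` is a cosine similarity on made-up vectors rather than CLIP, and random hill climbing replaces the gradient-based optimization that such pipelines typically use. The names and the example embedding are assumptions.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def generate(latent):
    """Stand-in for the generator: maps a latent vector to image features."""
    return [math.tanh(v) for v in latent]


def score(image, text_embedding):
    """Stand-in for the scorer: how well the image matches the text."""
    return cosine(image, text_embedding)


def guided_search(text_embedding, steps=200, seed=0):
    """Nudge the latent at random and keep only changes that raise the score."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in text_embedding]
    start = best = score(generate(latent), text_embedding)
    for _ in range(steps):
        candidate = [v + rng.gauss(0.0, 0.1) for v in latent]
        candidate_score = score(generate(candidate), text_embedding)
        if candidate_score > best:
            latent, best = candidate, candidate_score
    return start, best


# Hypothetical embedding of the user's phrase.
start_score, final_score = guided_search([0.2, -0.7, 0.5, 0.1])
```

The point of the sketch is the feedback loop: the generator never sees the text directly; it is steered only by the scorer’s judgment of each candidate.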
Images created by Google Imagen
Nvidia’s GauGAN2 project, which we talked about last year, can, with the help of deep learning, create images from typed words and phrases that sometimes closely match the typed phrases and are sometimes artistic and occasionally terrifying. The GauGAN2 algorithm was trained on 10 million landscape images using Nvidia’s Selene supercomputer, one of the ten most powerful supercomputers in the world. At its best, it can create highly realistic images. Although this project is still in beta, you can try it for free.
Google is also working on a system similar to Dall-E called Imagen, which pairs a large pretrained language model with diffusion models to output higher-quality images. Unfortunately, Google does not offer a demo version of Imagen to work with, as Dall-E Mini does. Still, you can see some examples of images made with the Imagen engine above and on the project’s official website.
DALL-E: Fears and smiles
In the world of technology, the emergence of a phenomenon such as DALL-E, an extraordinary display of the power and advancement of artificial intelligence, could be the starting point for a revolution on the scale of the Internet and smartphones. Although OpenAI has not yet identified the possible applications of this technology, people who have had the opportunity to try it have discovered some exciting ones.
For example, one artist has used DALL-E to design augmented reality filters for social networking applications, and a chef has taken ideas from DALL-E for decorating his food. In an article on the potential capabilities of DALL-E, tech analyst Ben Thompson points to the creation of environments and digital objects in the metaverse at very low cost.
Tools like DALL-E can be helpful for graphic designers. For example, they can ask DALL-E to develop some concept ideas before they start designing themselves. The platform can also benefit people who cannot afford to hire a designer.
You may have wanted to draw your own comic book as a child, but the idea never materialized because your drawing skills were not good enough.
Some AI enthusiasts have also discovered another exciting application for DALL-E: they have taken classical works of art and asked the AI to paint a continuation of these works or to re-imagine them in entirely different styles.
A Reddit user used DALL-E to try to complete Gilbert Stuart’s unfinished 1796 portrait of George Washington, with the following result:
DALL-E does not seem to be a tool most people will want to use daily. Still, it is conceivable that other creative applications for this technology will be discovered in e-commerce, social networks, home, and work in the coming months and years.
It is often the case that with the advent of a new technology, all our attention is focused on its positive aspects and applications, and we ignore its possible abuses in the future. But as excited as we are about the advent of DALL-E, we should be concerned about the misuse of such a tool in the hands of individuals and companies with fewer rules and red lines than OpenAI.
A company like OpenAI may have strict policies against DALL-E abuse. Still, malicious applications can be expected with the advent of new and similar tools, such as the Dall-E Mini, that do not have serious content monitoring.
Even now, some people use technology to harass others; it is not unlikely that people will want to use a platform like DALL-E for malicious purposes.
Use Dall-E to expand classic artwork.
On the other hand, automation has always brought with it the worry of losing jobs. Now that artificial intelligence can paint anything imaginable for us, what is the need for professional illustrators? One such artist wrote on Twitter about this concern:
I have a terrible feeling that art based on artificial intelligence will swallow the professional economic stability of illustration. Not because art is to be replaced entirely by artificial intelligence, but because this kind of art will be much cheaper and good enough for most individuals and organizations.
It is easy to say, “I only go to real artists for art.” But wait until you have to choose between paying $500 and paying $0 for a system that can do up to 95% of the work.
Another problem with AI models whose neural networks are trained on data extracted from the Internet is the issue of discrimination and offensive content. A few years ago, a group of MIT researchers deleted a massive collection of 80 million images used to train their algorithms because it included “offensive terms and pictures.”
In most of these models, too, if you use business-related words, most images show men, indicating discrimination against women.
Beyond its positive aspects, this technology should also be considered on a larger scale. What happens to our understanding of reality when most of the images we encounter on the Internet are produced by artificial intelligence? How can reality be distinguished from what artificial intelligence has made?
DALL-E seems to be a turning point in the world of consumer technology. Will we still see DALL-E as an astonishing revolution in art and creativity a few years from now, or will it mark the start of a more disturbing story?