Creativity and AI when working with images and other objects

In 2022, Jason Allen's picture Théâtre d'Opéra Spatial won first place in an art competition in Colorado. This caused a great deal of outrage and uproar in the world of art and aesthetics. The picture competed in the digital art category, so some involvement of a computer was expected. However, instead of drawing the image in Illustrator or Krita, Allen used Midjourney in combination with Gigapixel AI, a tool that can enlarge an image to up to six times its original resolution.

Is such a picture art? Does it represent something genuinely creative? To answer these questions, we must first look away from the result and focus on the creation process. How did Allen proceed? Media shorthand might suggest that he put a request into Midjourney, got an image, enlarged it, sent it to the competition, and was done in ten minutes. A similar discussion took place at Masaryk University when it introduced a new visual style: it looked like something that could be done in five minutes.

In fact, Allen carefully and repeatedly refined the query (prompt) that he entered into the system and systematically modified the work's final form. The creation of the image thus bore a striking resemblance to sculpture, in which a work emerges by gradually chipping away layers of stone or by small hammer blows on malleable material. Only here, we are working with a digital object, the fundamental advantage of which is that we can go back and gradually modify the inputs.
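The iterative way of working described above can be illustrated with a minimal sketch. It assumes nothing about Midjourney itself: the `PromptHistory` class, its methods, and the example prompts are all hypothetical and are not Allen's actual inputs. The sketch only shows how keeping every version of a prompt makes it possible to return to an earlier input and modify it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptHistory:
    """Hypothetical helper: keeps every prompt version so earlier inputs can be revisited."""
    versions: List[str] = field(default_factory=list)

    def refine(self, prompt: str) -> int:
        """Store a new prompt version and return its index."""
        self.versions.append(prompt)
        return len(self.versions) - 1

    def branch_from(self, index: int, change: str) -> int:
        """Go back to an earlier version and extend it; later versions are kept."""
        return self.refine(f"{self.versions[index]}, {change}")

    @property
    def current(self) -> str:
        """The most recently stored prompt."""
        return self.versions[-1]


# Illustrative prompts only (assumptions, not taken from the source).
history = PromptHistory()
first = history.refine("opera stage in space")
history.refine("opera stage in space, baroque costumes")
history.branch_from(first, "soft golden light")  # return to the first idea and change it
print(history.current)
```

Nothing is overwritten in this sketch: the author can compare branches and decide which direction to continue, which is the advantage of the digital object over stone.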

One could argue that AI creates something new that the author has no control over. On the other hand, when working with wood or in conceptual art, the author's control is also limited. In the theatre, for example, the actors perform differently each time, which does not diminish the quality of the piece or of the director's work. If we ask whether Théâtre d'Opéra Spatial is a work of art, the answer must first take into account the process by which the object was created. In terms of artistic judgement, the process is probably the most critical parameter that comes into play.

In his essay The Age of the World Picture, Martin Heidegger offers a second insight into the difference between art and banality. He argues that the image as art is a tool that allows us to glimpse something in ourselves. With the image, it is not a question of what it depicts in itself (as an object, etymologically something thrown in front of us) but of what it enables the one who looks at it to know. This is usually how we work with images in creative tools. We are not concerned with quality or craftsmanship; we understand the image mainly as a signifier, a tool that links the person working with it to what lies somewhere behind it. The image of an apple and the concept of an apple are two different things.

A third view, somewhat critical, comes from aesthetics specialists and focuses on the bulk of the images produced by tools such as Midjourney or DALL-E 2. The quality of the output, and the design of the tools themselves, is determined by who creates these objects and with what taste. In other words, people with no taste and no aesthetic training (not necessarily formal) will likely prefer images close to neo-romanticism or other historicizing forms, often only in outward imitation, or to sorela (socialist realism) and its more modern variants. Such images will have no artistic quality. An empirical or statistical look at the visual artefacts generated by artificial intelligence confirms this view. Quite possibly (as Václav Maněna points out), images from AI tools will in a few years be perceived like WordArt, which was considered aesthetic and tasteful in the 1990s and which no one appreciates today (much like the flashing websites full of GIFs from the same period).

On the other hand, two things should be kept in mind. First, objects of this kind can be judged in the same complex way as the haiku at the beginning of our reflections: there is no point in thinking about them without context. Second, we do not always need to create art, and much more superficial or banal objects can often be used for creative activities.

Banality, as the opposite of creativity, is not an absolute matter but a contextual one. It shows up where the context does not go into depth, where one sees complete predictability, mediocrity, and dullness, even if the work is honestly crafted.

DALL-E 2 - a tool from OpenAI that creates images based on text input. It was one of the first models that allowed for genuinely robust image generation. Images can be generated entirely from text input, as a variation of an already created object, or by modifying an existing image (regenerating part of it). Its advantages are the (limited) free version and the ease of use.

Midjourney - probably the best-known and best tool for creating graphic objects. The current version handles both realistic images and more "computer-like" graphics in a relatively balanced way. The descriptions can be brief or very detailed, and Midjourney generates the final image based on them. Definite drawbacks are the price and the fact that it has to be controlled via Discord. On the other hand, compared to DALL-E 2, it is well suited to generating comics or creating graphic elements in a unified visual style.

Stable Diffusion - free and available as open source, which makes it unique. Users can either download it and run it on their own computer or use an online version. The results are visually weaker than Midjourney's but, in our experience, of better quality than those of DALL-E 2. The principle of operation is similar to the other tools: the more detailed the input and the more training data, the better the system's results.

Other objects

This section will focus on creating graphical elements and other digital artefacts that can be made through artificial intelligence. Specifically, we may encounter tools for creating presentations or videos. Each medium or type of object brings different aspects of creative collaboration between humans and technology that we would like to outline. Our goal will not be to show some selected tools but rather to think in a structured way about how AI enters the process of creativity.

For video creation, here are two examples of tools that can be used:

Synthesia.io - allows you to create videos by selecting an avatar (actor) and a language; the system then makes a video based on the text input. Additional elements can be added to the video, and individual scenes (cuts) can be worked with, which contributes to the overall interactivity of the output. The whole aim of the application is to replace the actor: the author writes a script and has it spoken to the public by artificial intelligence. To a certain extent, this is a situation in which the AI converts one medium into another but does not add much. In this connection, we may recall that Milan Kundera is reported to have said in a conversation that he does not go to the theatre because plays are most beautiful as literary works in written form.

Pictory.ai - can be used to create short video content from a longer text source, as the application's description states. Creating a video is divided into several steps: the application produces a summary of the submitted document (which can be edited) and then makes the actual video, including images or music. The output can be educational content as well as a social media trailer. It works with the typical TikTok form, in which the actual message is carried by text or subtitles in the video. Compared to Synthesia, there is a significantly higher level of AI invention (creating summaries, finding a suitable media form). On the other hand, the artistic quality of the output is again not entirely satisfactory. Interestingly, this form can serve not only to produce specific content but also as an input to the creative process, in which authors can look at their own message in a different medium, condensed and from a completely different perspective.

The second group of tools are those that allow you to create presentations. Presentations are currently a powerful communication medium in the academic environment, where we are primarily concerned with ideas. As much as Anna Hogenová says that philosophy is not cinema, lectures or corporate presentations without a visual basis (whatever that may be) are challenging for listeners or viewers. Again, we will mention two tools as representative types that use AI to create presentations in a way that can be considered helpful for creativity and thinking in general.

SlidesAI - the user uploads a text input (typically an article to be turned into a presentation), and the system creates individual slides with titles or generated images from the text. As in the previous cases, we expect that the user has the primary material ready, understands the topic, and needs to create a presentation. The use of AI here can bring three significant benefits: first, saving time (which is probably crucial for most people); second, producing a different form of presentation than usual (this is essential for thinking about and presenting the topic); and third, allowing the material to be assembled and interpreted differently from how the text itself presents it. AI can be surprisingly creative in this regard, and the lecture handout can help offer a more profound or different conceptualization of some topics.

Gamma App - allows you to create presentations based on a topic, which typically does not have to be described in full detail. The system creates the first version of the document, and the user then uses the chat to gradually modify the individual parts of the presentation, from templates through the visualization of information to the content itself. This application is interesting because it creates the entire presentation and tries to talk to the human author about the outputs and individual steps. At specific points, it uses a chatbot interface to ask whether more detail should be added, whether the image is acceptable, or whether the message should be transformed into a different form. It is based on the idea that although Gamma App creates the presentation for the user, the user's input and interventions lead to a product that is very different from the initial "automatic" design. It is thus the user's own presentation, just made in a completely different way than we are used to. Besides presentations, it can also create websites or documents.

What can I use it for?

We have already hinted at the use of some of the tools, but we will nevertheless try to offer some points for reflection on what these tools are helpful for in relation to creativity and information work:

  • You don't have to use illustrations from photo banks - one of the critical problems plaguing the Czech visual environment is the reliance on photo banks that offer the same images repeatedly. The reason to differentiate yourself is not just visual but primarily conceptual.
  • Create illustrations of things that cannot be found in photo banks - typically information interactions or bees' immune systems. No suitable images exist for such topics, so they have to be generated.
  • Keep a consistent visual style - working with content generation tools allows you to work in a consistent visual manner that becomes typical for your deliverables. This is an essential element of professionalism.
  • Get a different perspective on a problem - have a picture drawn, a presentation made, or anything else. Changing the medium may benefit not only the consumers of the content but, more importantly, the authors themselves, who can think about specific issues from a broader perspective.
  • Discuss - the tools allow you to change the medium, which makes the results easier to communicate.
  • Be original - tools make it easy to go beyond the limits of your imagination and work with outputs in unexpected ways.
