StyleCLIP

Avyay
Oct 8, 2021

StyleCLIP is a tool that has recently taken the AI world by storm. It allows you to perform several transformations on images using a variety of models and has, over the last few weeks, been used to make images of video game characters look like real people.

So how exactly did this come into play?

The use of GANs, or generative adversarial networks, to create AI-based artwork is all the rage these days.

You've probably seen GANs in applications like thispersondoesnotexist.com, which is powered by NVIDIA's open-source StyleGAN2.

CLIP, a model that can predict which text description best fits a given image, was open-sourced by OpenAI in 2021. Because CLIP effectively bridges text and image data, you can in theory use text to steer StyleGAN. StyleCLIP does this by using CLIP vectors, computed from user-provided text, to guide StyleGAN generations.
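To make the idea of a shared text and image space a little more concrete, here is a minimal sketch in Python. CLIP compares a caption and an image by embedding both as vectors and measuring their cosine similarity. The tiny three-number embeddings and the `best_caption` helper below are made up for illustration; a real CLIP model produces much larger vectors from actual images and text.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_caption(image_embedding, caption_embeddings):
    """Return the caption whose embedding points most nearly the same way as the image's."""
    return max(
        caption_embeddings,
        key=lambda caption: cosine_similarity(image_embedding, caption_embeddings[caption]),
    )

# Hypothetical embeddings, hand-written for this example (not real CLIP output).
image_embedding = [0.9, 0.1, 0.2]
caption_embeddings = {
    "a photo of a person": [1.0, 0.0, 0.1],
    "a cartoon character": [0.1, 1.0, 0.0],
}

print(best_caption(image_embedding, caption_embeddings))
```

The same similarity score is what lets text act as a steering signal: an edit is "good" when it moves the image's embedding closer to the embedding of the prompt.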

In general, GANs work by taking random "noise" as input and generating an image from it. This noise is called a latent vector. The paper Designing an Encoder for StyleGAN Image Manipulation by Tov et al. uses an encoder to invert a given image into latent vectors that StyleGAN can use to rebuild the image. These vectors can then be modified to obtain a desired target image. In essence, the words you use to describe the transformation you want are encoded into vectors, which are then used to shift the image's latent vectors so that the trained models produce the transformed image. Just look at these sample images from their GitHub repository.
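The invert, edit, rebuild loop described above can be sketched in a few lines of Python. Everything here is a toy stand-in: `toy_generator` and `toy_encoder` are hypothetical functions that only mimic the roles of StyleGAN and the inversion encoder, and the `direction` vector is invented for the example.

```python
def toy_generator(latent):
    """Toy stand-in for StyleGAN: turns a latent vector into 'pixel' values."""
    return [2.0 * z + 1.0 for z in latent]

def toy_encoder(image):
    """Toy stand-in for the inversion encoder: recovers the latent behind an image."""
    return [(p - 1.0) / 2.0 for p in image]

def edit_latent(latent, direction, strength):
    """Move the latent vector along an edit direction by the given strength."""
    return [z + strength * d for z, d in zip(latent, direction)]

original_image = [3.0, 5.0, 7.0]

# 1. Invert the image into a latent vector.
latent = toy_encoder(original_image)

# 2. Check that the generator can rebuild the image from that latent.
rebuilt_image = toy_generator(latent)

# 3. Nudge the latent along a (hypothetical) edit direction and regenerate.
direction = [0.0, 1.0, 0.0]
edited_image = toy_generator(edit_latent(latent, direction, strength=0.5))

print(latent, rebuilt_image, edited_image)
```

In the real system the edit direction is not hand-written; it is derived from the text prompt with the help of CLIP.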

So what has been the latest application?

Users have fed fictional characters into StyleCLIP to approximate what they would look like as real people. Think of it as a way to see playable characters, television stars such as the Simpsons, and even characters from novels as real humans.

If you want to try out StyleCLIP, you can do so by following this link: https://replicate.ai/orpatashnik/styleclip

This lets you play around with your own input images and your own text prompts.

You can even choose how the model is run and adjust the manipulation strength and the disentanglement threshold.
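For a rough feel of what those two knobs do, here is a small Python sketch based on my reading of the global-direction method in the StyleCLIP paper. The manipulation strength scales how far the edit goes, while the disentanglement threshold drops any channel of the edit direction whose magnitude is below it, so that fewer unrelated attributes change. The function name and all the numbers are assumptions made up for this example, not the demo's actual code.

```python
def apply_global_direction(style, direction, strength, threshold):
    """Shift a style vector along an edit direction.

    Channels whose direction magnitude is below `threshold` are left untouched;
    the remaining channels are moved by `strength` times the direction.
    """
    edited = []
    for s, d in zip(style, direction):
        step = d if abs(d) >= threshold else 0.0
        edited.append(s + strength * step)
    return edited

# Hypothetical style vector and text-derived edit direction.
style = [0.5, -0.25, 1.0, 0.0]
direction = [0.5, 0.0625, -0.25, 0.125]

# A higher threshold edits fewer channels (a more targeted change).
targeted = apply_global_direction(style, direction, strength=2.0, threshold=0.2)

# A lower threshold lets more channels move (a broader, less disentangled change).
broad = apply_global_direction(style, direction, strength=2.0, threshold=0.1)

print(targeted)
print(broad)
```

Raising the strength pushes the image further towards the prompt, and raising the threshold keeps the change confined to the channels that matter most for it.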

While it remains fun to play around with GANs, it's important to understand the dangerous possibilities they open up. It's no secret how often deepfakes have pervaded social media in attempts to defame people or to incite groups of people against one another. StyleCLIP allows users to change the hair, facial features and even the race of the person in any photograph. Moving forward, it is imperative that access to AI research be governed by the ethical standards the AI community must uphold to ensure that no person is harmed by these products. AI research and testing must be done with the pervasiveness of social media, and the ease with which new technology can be misused, firmly in mind.