What is GenAI?
Generative Artificial Intelligence (GenAI) is capable of generating text, images, and other types of novel content. In other words, it is an ML model trained to create new data rather than make predictions about specific data points, most commonly by taking a short prompt in natural language.
Some of the most prominent GenAI applications include:
- ChatGPT - a chatbot that simulates real conversation, powered by GPT-3.
- DALL-E - generates images in a variety of styles based on human input.
- Midjourney - generates images based on natural language descriptions.
History
GenAI has recently gained attention from the general public; however, this technology has been in development since the 1960s. The first iterations were typewritten chatbots that relied on a knowledge base (mainly gathered from experts in the field) loaded into a computer. Responses were triggered by keywords, but this approach did not scale well. Another early form of GenAI is a simple, well-known model called the Markov chain. In ML, it has long been used for next-word prediction, as in autocomplete: it predicts the next word by looking only at the last word or the last few words.
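To make that concrete, here is a minimal sketch of a first-order Markov chain used for next-word prediction. The tiny corpus, function names, and `order` parameter are illustrative assumptions, not part of this article.

```python
import random
from collections import defaultdict

def build_markov_chain(text, order=1):
    """Map each sequence of `order` words to the words observed right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, seed, length=10):
    """Generate text by repeatedly sampling the next word from the chain."""
    out = list(seed)
    for _ in range(length):
        key = tuple(out[-len(seed):])
        candidates = chain.get(key)
        if not candidates:
            break  # no continuation observed for this context
        out.append(random.choice(candidates))
    return " ".join(out)

corpus = "the cat sat on the mat the cat ate the fish"  # toy corpus for illustration
chain = build_markov_chain(corpus, order=1)
print(generate(chain, seed=("the",), length=8))
```

Because the model only ever looks at the previous few words, it is cheap to build but quickly loses coherence over longer passages.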
In the 1990s, a statistical approach became popular, with new algorithms able to learn patterns from data, an approach now known as machine learning. These methods showed promise in mimicking human language understanding. Technological advancements, hardware capable of handling complex computations, and the availability of large amounts of data have since encouraged the development of neural networks.
Types of GenAI models:
We’ll briefly discuss some of the most popular architectures used to build generative models:
- GAN (Generative Adversarial Networks)
GANs are an unsupervised learning technique for generative modelling, featuring a generator and a discriminator. The generator produces novel content similar to the original input, while the discriminator tries to tell the real data apart from the generated data. The generator tries to fool the discriminator and, in doing so, learns to produce more realistic outputs.
GANs are trained with a minimax objective on the discriminator's classification error. Letting $x$ denote real images and $x' = G(z)$ denote generated images, the objective is
$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{x' \sim p_G}\big[\log\big(1 - D(x')\big)\big]$$
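To illustrate the adversarial training loop, here is a minimal sketch in PyTorch that learns to generate samples from a 1-D Gaussian. The toy data, network sizes, and hyperparameters are illustrative assumptions, not from this article.

```python
import torch
import torch.nn as nn

# Toy target distribution: a 1-D Gaussian with mean 4 and std 1.25 (illustrative choice).
real_data = lambda n: torch.randn(n, 1) * 1.25 + 4.0
noise = lambda n: torch.randn(n, 8)

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())  # discriminator

opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # Discriminator step: push D(x) toward 1 for real samples and D(x') toward 0 for fakes.
    x, x_fake = real_data(64), G(noise(64)).detach()
    loss_D = bce(D(x), torch.ones(64, 1)) + bce(D(x_fake), torch.zeros(64, 1))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator step: try to fool the discriminator into labelling fakes as real.
    x_fake = G(noise(64))
    loss_G = bce(D(x_fake), torch.ones(64, 1))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()

print("generated mean:", G(noise(1000)).mean().item())  # should drift toward 4 under these settings
```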
- VAE (Variational Autoencoders)
A VAE features an encoder and a decoder, two neural networks that work together to produce the best generative model. The encoder learns an efficient encoding of the dataset and passes it into a bottleneck architecture; the decoder then uses that latent space to regenerate data similar to the dataset. The VAE uses a probabilistic model to learn the underlying structure of the input data and then generates new data that are similar to, but different from, the original. Its applications include, but are not limited to, data compression and synthetic data creation.
Note: A VAE differs from a plain autoencoder in that it describes the dataset statistically in latent space: the encoder outputs a probability distribution in the bottleneck layer instead of a single value.
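To show what that probability distribution in the bottleneck looks like in code, here is a minimal VAE sketch in PyTorch using the reparameterization trick. The 784-dimensional input (e.g. flattened 28x28 images), layer sizes, and MSE reconstruction term are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, in_dim=784, hidden=256, latent=16):
        super().__init__()
        self.enc = nn.Linear(in_dim, hidden)
        self.mu = nn.Linear(hidden, latent)      # mean of q(z|x)
        self.logvar = nn.Linear(hidden, latent)  # log-variance of q(z|x)
        self.dec = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(), nn.Linear(hidden, in_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z = mu + sigma * eps, with eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction error plus KL divergence between q(z|x) and the standard normal prior
    recon_err = F.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_err + kl

model = VAE()
x = torch.rand(32, 784)                  # dummy batch standing in for real data
recon, mu, logvar = model(x)
print(vae_loss(recon, x, mu, logvar).item())
```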
- Transformer
Transformers power some of the most impressive generative models, like GPT, under the hood. The Transformer architecture uses self-attention to process an entire sequence at once: it encodes each token in a corpus of text in terms of that token's relationships with the other tokens. This attention map helps the transformer understand context when it generates new text. Its applications include (a minimal self-attention sketch follows the list):
- NLP tasks: machine translation, text summarization, named entity recognition, and sentiment analysis.
- Computer vision tasks: image classification, object detection, and image generation.
- Recommendation systems.
- Music generation and speech recognition.
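Here is a minimal sketch of the scaled dot-product self-attention described above, written in PyTorch. The dimensions and random projection matrices are illustrative assumptions rather than a trained model.

```python
import math
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of token embeddings.

    x: (seq_len, d_model) token embeddings; w_q / w_k / w_v: (d_model, d_k) projections.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Each token's query is compared with every token's key, giving the attention map.
    scores = q @ k.T / math.sqrt(k.shape[-1])
    attn = F.softmax(scores, dim=-1)  # each row sums to 1: how much a token attends to the others
    return attn @ v, attn

d_model, d_k, seq_len = 16, 8, 5
x = torch.randn(seq_len, d_model)                         # stand-in embeddings for 5 tokens
w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
out, attn_map = self_attention(x, w_q, w_k, w_v)
print(out.shape, attn_map.shape)                          # (5, 8) outputs, (5, 5) attention map
```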
For more information, keep an eye out for the upcoming Transformers article.
The pink and red flags
The large-scale adoption of Generative AI (GenAI) presents significant challenges alongside its transformative potential. As organizations increasingly integrate GenAI into workflows, the automation of tasks could displace workers, particularly in roles involving repetitive or predictable tasks.
Additionally, since GenAI models learn from vast datasets that often reflect societal biases, they risk perpetuating and even amplifying these biases, leading to discriminatory outcomes if not carefully managed.
Moreover, these models can inadvertently replicate or generate content closely resembling copyrighted material, raising concerns about intellectual property violations and the ethical use of AI-generated outputs.