A Dummies introduction to Generative AI

Rajesh Rajamani
Published in CloudForDummies
Apr 19, 2023 · 5 min read



We have all heard the terms ChatGPT, OpenAI, and GPT-3, along with the tall claims about this being a disruptor for many industries. My aim is to demystify some of these concepts and topics so that you can explain them to anyone, and still make sense of things if someone tries to confuse you with complex jargon.

To put it in simple terms, Generative AI is nothing but AI that can be used to, well, “generate”. Sounds silly? Extremely dumb? Hang on.

What do I mean by “generate”? Well, for example: generating an email response, a short story, a simple image, maybe even a sonnet. Sounds interesting?

I’m sure you have used the auto-complete feature in Google Search. Think of it as a very basic implementation of Generative AI, where query suggestions are generated by a model based on the terms you type into the search bar. Google simply trained its models on the zillions of queries searched by users across the world and used them to generate suggestions.

Google autocomplete feature
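
As a toy illustration of that idea, here is a minimal Python sketch that suggests completions by matching the typed prefix against a made-up log of past queries. Real autocomplete also ranks suggestions by popularity, freshness and personalisation, but the principle is similar.

    # Toy autocomplete: suggest past queries that start with what the user typed.
    # The query log below is made up for illustration.
    query_log = [
        "weather today",
        "weather tomorrow",
        "weather in london",
        "best pizza near me",
        "best python tutorial",
    ]

    def suggest(prefix, limit=3):
        prefix = prefix.lower()
        return [q for q in query_log if q.startswith(prefix)][:limit]

    print(suggest("weather"))  # ['weather today', 'weather tomorrow', 'weather in london']
    print(suggest("best p"))   # ['best pizza near me', 'best python tutorial']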

GPT, or the Generative Pre-trained Transformer, the technology underpinning today’s Generative AI, takes this “generate” capability several levels further.

GPT models are a family of large language models: artificial neural networks built on the transformer architecture and trained on massive amounts of unlabelled text data to generate human-like text responses.
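
To build some intuition for what “trained on text to generate text” means, here is a drastically simplified toy sketch: a word-level bigram model that learns which word tends to follow which in a made-up training text, then generates new text by sampling. A real GPT is a transformer with billions of parameters and far longer context, but the generate-one-token-at-a-time idea is the same.

    import random
    from collections import defaultdict

    # Toy "language model": record which word follows which in the training text.
    training_text = "the cat sat on the mat and the cat slept on the sofa"
    words = training_text.split()

    next_words = defaultdict(list)
    for current, following in zip(words, words[1:]):
        next_words[current].append(following)

    # Generate text one word at a time by sampling a plausible next word.
    word = "the"
    output = [word]
    for _ in range(8):
        if word not in next_words:
            break  # no known continuation
        word = random.choice(next_words[word])
        output.append(word)

    print(" ".join(output))  # e.g. "the cat slept on the mat and the cat"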

ChatGPT is nothing but a chatbot with a web-chat-like interface that sits on top of the latest GPT model, letting you query the model and get back impressively fluent responses. If you have used a chatbot anywhere else, the concept is very similar, except for the fantastic models under the hood.
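
If you would rather talk to the same family of models from code than through the web interface, OpenAI exposes them via an API. A minimal sketch using the openai Python package (as of early 2023; assumes your API key is set in the OPENAI_API_KEY environment variable):

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # Ask the model behind ChatGPT a question, much like typing it into the web chat.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain Generative AI in one sentence."},
        ],
    )

    print(response["choices"][0]["message"]["content"])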

Now that we have a very basic understanding of the jargon, let us dig a bit deeper.

What are Large Language Models and how are they defined? How large is a Large Language Model?

The term “Large” here refers to the number of parameters the model has. Here’s a summary of the GPT models released so far, along with their publicly known parameter counts:

  • GPT-1 (2018): roughly 117 million parameters
  • GPT-2 (2019): roughly 1.5 billion parameters
  • GPT-3 (2020): roughly 175 billion parameters
  • GPT-3.5 / InstructGPT (2022): GPT-3 refined with RLHF*
  • GPT-4 (2023): parameter count not disclosed by OpenAI

  • *RLHF stands for Reinforcement Learning from Human Feedback (the infamous OpenAI “sweatshop” saga, where low-paid human annotators were used to label text and rate responses to improve the model’s behaviour).

Can these models answer all the questions that humanity has?

What a Generative AI can generate depends on the knowledge it has amassed. Focus on the word “pre-trained”. The model’s wealth of knowledge is limited to the data it was trained on: if some data was never fed to the model, it is unknown to the model, which therefore cannot generate content around it or answer questions about it. This applies to many fields where new approaches keep emerging, such as scientific research, medicine, art, literature, and mathematics, to name a few.

Would it have been possible for GPT to generate a proof of the Pythagorean theorem using trigonometry, like the two high-schoolers recently did? Hard to answer.

Attempt to prove the Pythagorean theorem using trigonometry

Are these models biased?

It’s hard to say that they are not, because it depends on the data they have been trained with. Let’s not get into the ethics angle for now; that’s for another day.

Can I use these models for any language that I wish?

The languages GPT-3 has been trained on include English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, Korean, and many more. However, there’s a fair chance that the language you want to use is still not well supported, simply due to a lack of sufficient training data in that language.

That doesn’t mean the end of the world. You could still use intermediate translation engines as a workaround, but then the outcome is only as good as the translation process.
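
Here is a minimal sketch of that workaround. For simplicity it (somewhat cheekily) uses the GPT model itself as the translator; in practice you would plug in a dedicated translation engine, and the Tamil example question is made up for illustration:

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    def translate(text, target_language):
        # Stand-in translator; swap in a dedicated translation engine here.
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{
                "role": "user",
                "content": f"Translate the following text to {target_language}:\n{text}",
            }],
        )
        return response["choices"][0]["message"]["content"]

    # 1. Translate the user's question into a well-supported language.
    question = "ஜெனரேட்டிவ் AI என்றால் என்ன?"  # Tamil: "What is Generative AI?"
    question_en = translate(question, "English")

    # 2. Query the model in English.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question_en}],
    )
    answer_en = response["choices"][0]["message"]["content"]

    # 3. Translate the answer back. Quality is capped by both translation hops.
    print(translate(answer_en, "Tamil"))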

Can the responses be trusted?

You get what you ask for. It’s that simple. For example, if you simply ask “Who is the president of the United States?”, the model will understand the question as being about the current president and answer “Joe Biden”.

It’s simply the most relevant parameters that decide the outcome. Does that mean the model’s output can be manipulated? Hell yeah, with suitably manipulated training data! Always be cautious.

Another example: ChatGPT itself will tell you that its training data only goes up to 2021. Therefore, it’s obvious that the ongoing Russia-Ukraine conflict is unknown to the model. Yet another instance of the model’s knowledge limitations being highlighted.

How does it generate code?

Quoting OpenAI’s documentation: “The Codex model series is a descendant of our GPT-3 series that’s been trained on both natural language and billions of lines of code. It’s most capable in Python and proficient in over a dozen languages including JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, SQL, and even Shell.”

It’s the collective intelligence of millions of developers, if I may call it that.
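
A minimal sketch of asking Codex for code through the same API. code-davinci-002 was the Codex model name in OpenAI’s documentation at the time of writing (access required joining the Codex beta), and the prompt is a made-up example:

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # Give Codex the start of a function plus a natural-language comment,
    # and let it generate the body.
    prompt = (
        "# Python 3\n"
        "# Return True if the given string is a palindrome\n"
        "def is_palindrome(s):"
    )

    response = openai.Completion.create(
        model="code-davinci-002",  # Codex model, per OpenAI's documentation
        prompt=prompt,
        max_tokens=64,
        temperature=0,
    )

    print(prompt + response["choices"][0]["text"])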

Can you customize the models with your own knowledge base?

Let’s say your organization has a large volume of data that is not publicly available. You can use the commercial OpenAI API to fine-tune a model on your own data, thereby enabling it to answer prompts relevant to your data.
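
As a rough sketch, OpenAI’s fine-tuning flow at the time of writing involves uploading prompt/completion pairs as a JSONL file and starting a fine-tune job against a base model. The file name and its contents below are hypothetical:

    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # training_data.jsonl (hypothetical) contains one example per line, e.g.:
    # {"prompt": "What is our refund policy? ->", "completion": " Refunds within 30 days.\n"}

    # 1. Upload the training file.
    upload = openai.File.create(
        file=open("training_data.jsonl", "rb"),
        purpose="fine-tune",
    )

    # 2. Start a fine-tune job on a base model.
    job = openai.FineTune.create(
        training_file=upload["id"],
        model="davinci",
    )

    # Poll the job; once it finishes, you can query the fine-tuned model by name.
    print(job["id"])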

There are several tutorials strewn around on this concept. Keep an eye out for my upcoming article on it.

Parting thoughts

At this stage we have established a basic understanding of Generative AI. Remember one simple thing: any AI is only as good as the data and the methods it was trained with.
