Decoding AI: From Theory to ChatGPT

- a conversation

10 Oct '23



Artificial intelligence (AI)

Before expanding on the technical aspects of ChatGPT, the concept of artificial intelligence needs to be understood.  Artificial intelligence is the design and development of “intelligence”, predominantly through computer programming.  Early computers simply followed instructions coded into them by software writers or developers.  As the instructions grew more sophisticated, computers could perform more advanced tasks, but they still could not “think” or “predict”.  Hence, they did not have intelligence like humans.

Next came machine learning (ML), which initially involved neural networks modelled loosely on the neurons in the human brain.  These used many processors, computers or nodes that could break a problem into several pieces, solve them, and re-assemble the solutions to be presented as a whole.  Early computer languages such as Lisp enabled programming for artificial intelligence.  The Turing test was one of the barometers used to distinguish between human and machine intelligence.

As the science and technology progressed, machine learning grew more sophisticated, coming to include natural language processing in Google search, for instance, which provided results for queries phrased like a human conversation.  The next step was predictive intelligence, where auto-complete, auto-correct and other related features were developed.  Meanwhile, IBM’s Deep Blue could beat top-ranked chess players by learning the rules of the game and “practicing” against the recorded history of past games, while IBM’s Watson won the quiz show Jeopardy!.  Watson was hailed as the true harbinger of intelligence but proved limited in solving real-world problems and increasing human efficiency.

The next frontier was DeepMind, which was eventually bought by Google.  It introduced the world to “self-learning” computers: the rules and some rudimentary parameters were fed in initially, and the AI then used deep learning to develop intelligence, akin to how a human baby learns.  Some of this pioneering work in reinforcement learning was the basis for ChatGPT.  DeepMind caused quite a stir when its AlphaGo program beat the world’s top-ranked Go player in a game that was thought to be computationally insurmountable due to its complexity.

That brings us to today’s development of ChatGPT by OpenAI, with a substantial ownership stake held by Microsoft.  ChatGPT is now on version 4, introduced in March 2023.  As of this writing in October 2023, ChatGPT has been introduced as an app on smartphones, developed to accept text, picture and audio input.

What is ChatGPT?

GPT stands for Generative Pre-trained Transformer.  “Pre-trained” refers to the model’s initial training on massive bodies of text, later refined through reinforcement learning.  To accomplish the training, it needs the 26 letters of the English alphabet, the rules of grammar to form sentences, exposure to conversations so that its output makes sense, and safeguards to keep out negative, dark ideas.  One of the foundations is the large language model (LLM), which gathers huge amounts of data from large sources like the Internet.  Companies like OpenAI, which developed ChatGPT, trawl the World Wide Web for information using crawlers, spiders and other means.
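
To make the idea of learning language from data concrete, below is a minimal sketch of a bigram model that simply counts which word tends to follow which.  This toy code is illustrative only and is nothing like ChatGPT’s actual implementation, but an LLM is, at heart, also predicting the next word from statistics gathered over its training text:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the billions of pages an LLM is trained on.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# "Training": count how often each word follows each other word.
next_word_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    next_word_counts[current][nxt] += 1

def predict_next(word):
    """Return the most frequently observed next word, if any."""
    counts = next_word_counts[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("sat"))  # -> 'on', the only word ever seen after 'sat'
```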

As can be imagined, these training data sets are huge and require enormous amounts of computational resources.  OpenAI was fairly open until version 3.5 and published various papers with the technical information.  It is estimated that GPT-3.5 was trained on a 45TB database, equivalent to over 292 million pages of documents, or 499 billion words.  It uses 175 billion parameters, defined as points of connection between the input and output layers of its neural networks, and its training may have cost north of $500 million in the year 2022 alone.  With the newer capabilities that include voice, images and more accurate output, the resources required would have increased substantially.  Nvidia appears to be the primary beneficiary of the boom in AI-related hardware, in the form of GPUs that have been designed to support these extremely intensive data demands from the likes of OpenAI’s ChatGPT, Google’s Bard and Meta’s (Facebook’s) open-source Llama 2 LLM.
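
To give a sense of what “parameters” means, the sketch below counts the weights in a tiny fully connected network (the layer sizes are arbitrary, chosen purely for illustration); GPT-3-class models do the same bookkeeping at a scale of billions:

```python
# Count trainable parameters in a tiny fully connected network.
# A layer mapping n_in inputs to n_out outputs has n_in * n_out weights
# plus n_out biases -- these are the "points of connection" being counted.
layer_sizes = [512, 1024, 1024, 512]  # arbitrary toy architecture

total_params = sum(n_in * n_out + n_out
                   for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(f"{total_params:,}")  # 2,099,712 -- about 2 million, versus GPT-3's ~175 billion
```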

A few other technologies were required to reach this stage of AI development: generative adversarial networks (GANs), developed by Ian Goodfellow and his colleagues in 2014, and Transformers, where some pioneering work was done by Google.  In a GAN, two neural networks contest with each other: a generator produces candidate output while a second network, the discriminator, judges its “realism”, and both improve through the contest.  All of this is dynamic and changes at computational speed, which in turn keeps getting quicker per the so-called Moore’s “law”.  From Google’s blog, DeepMind self-learnt by playing “millions of games against itself via a process of trial and error called reinforcement learning.  At first, it played completely randomly, but over time the system learns from wins, losses, and draws to adjust the parameters of the neural network, making it more likely to choose advantageous moves in the future.”  These millions of chess and Go games were learnt in a few hours to a few days, an impossible time frame for any human to master or comprehend.
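
A minimal sketch of the GAN idea, assuming PyTorch (this toy example is illustrative, not Goodfellow’s original code): a generator learns to mimic a simple Gaussian distribution while a discriminator learns to tell real samples from generated ones.

```python
import torch
import torch.nn as nn

# Generator: maps random noise to fake "data".  Discriminator: judges realism.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 2.0  # "real" data drawn from N(2, 0.5)
    fake = G(torch.randn(64, 8))           # the generator's attempt

    # Train the discriminator: label real samples 1, generated samples 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Train the generator: try to fool the discriminator into outputting 1.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()

print(fake.mean().item())  # drifts toward 2.0 as the generator learns
```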

Transformers are a type of machine-learning architecture that enabled training on billions of pages of text while tracking the connections between words across those pages.  The architecture includes the attention mechanism, which allows the model to focus on different parts of the input sequence when making predictions.  This is very important for forming coherent sentences, for text-to-speech conversion and for other, similar applications.
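
The core attention computation can be sketched in a few lines (NumPy used here for illustration; real Transformers add learned projections, multiple attention heads and masking):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: weight each value by query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # how relevant is each word to each other word
    weights = softmax(scores, axis=-1)   # each row sums to 1: a "focus" distribution
    return weights @ V                   # blend the values according to that focus

# 4 "words", each represented by an 8-dimensional vector.
x = np.random.randn(4, 8)
print(attention(x, x, x).shape)  # (4, 8): each word now mixes in relevant context
```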

If Google was at the forefront of all this AI effort, how did OpenAI suddenly burst onto the scene?  The smart data scientists and engineers at OpenAI considered the various advances in natural language processing (NLP) to date, and evaluated and rejected many options as unworkable.  After a few years of trial and error, OpenAI’s brilliant minds settled on reinforcement learning from human feedback (RLHF) as a viable path of development.  With human intervention layered on top of self-learning, AI appears closest to mankind’s intelligence.
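
The heart of RLHF is a reward model trained on human preferences: shown two candidate responses, it learns to score the human-preferred one higher, and that score then guides the fine-tuning of the LLM.  A minimal sketch of the pairwise-preference loss, assuming PyTorch (real systems score actual text through a large model, not random vectors through a linear layer):

```python
import torch
import torch.nn as nn

# Stand-in reward model: maps a response "embedding" to a scalar score.
reward_model = nn.Linear(16, 1)
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-2)

for step in range(200):
    chosen = torch.randn(32, 16) + 1.0    # embeddings of human-preferred responses
    rejected = torch.randn(32, 16) - 1.0  # embeddings of rejected responses

    # Pairwise preference loss: push the chosen score above the rejected score.
    margin = reward_model(chosen) - reward_model(rejected)
    loss = -torch.nn.functional.logsigmoid(margin).mean()

    opt.zero_grad(); loss.backward(); opt.step()

# The learned scores then serve as the reward signal for fine-tuning the LLM.
```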

And what a pathbreaking product!  When ChatGPT was released to the public in November 2022, it became the fastest software product to reach 100 million users, doing so within months.  While versions 3 and 3.5 were based on data through late 2021 and were not real-time, the product was similar to the first iPhone in that it put existing technologies together in a very user-friendly, human-usable package.  The latest version 4 can use real-time Internet data for greater relevance.  The use cases keep growing: it appears to have accelerated white-collar work efficiencies with coding assistance, while also threatening higher-order creative work.  The recent Hollywood writers’ strike and the lawsuits against OpenAI by various authors are a result of AI upending work from the top down, as opposed to the bottom-up progression that was generally prophesied.

The outstanding capabilities are also the Achilles’ heel: how does one ensure the orderly, progressive and positive development of what is essentially the entirety of human knowledge?  The darker side comes to the fore in the form of deepfakes, easily created and ever more realistic frauds, easy access to violent information and other as-yet-unexplored avenues.  As in the world of criminals and police, crime appears quicker to adapt, more agile and always a step or two ahead.

The various AI systems are constantly learning, extremely fast and essentially limitless.  Though top management at OpenAI, Microsoft and Google acknowledge the difficulty, indicating that they themselves have no way to predict an AI’s learning curve or its repertoire of knowledge, the genie is out of the bottle.  Guardrails apparently exist, but human ingenuity often overcomes barriers.  The companies themselves are in a development race and are, of course, capitalistic, and hence won’t slow down.  Sam Altman, the CEO of OpenAI, toured several countries trying to bring attention to regulating AI for human benefit.  Bans by select countries put them at a disadvantage when neighboring regions permit usage and move full steam ahead.  Red teams, composed of a few experts in their chosen fields, act like white-hat hackers trying to help with the safeguards, but they barely manage to scratch the surface.  Perhaps the biggest check and balance comes from opening the technology up to the general public and working on the constant feedback that is generated.

AI in India

In India, work on AI is ongoing in the form of AI4Bharat at IIT Madras, and in various startups bankrolled by Tech Mahindra, by Peak XV (formerly Sequoia India) in partnership with Sam Altman, and by others.  There is an excellent online article at https://analyticsindiamag.com/the-hidden-cost-of-chatgpt-for-indian-languages/ on the challenges of developing LLMs for Indian languages.  The big-picture goal, as articulated by the father of Indian digital infrastructure, Mr. Nandan Nilekani, is to enable real-time speech-to-text and text-to-speech conversion for Indian languages.  This would allow farmers, say, to voice-query even non-smartphones about weather predictions, minimum support prices, recommendations for crop rotation, soil conditions, crop loans, government grants and so on: the local-language query would be processed against English text information available online through government and other technical resources, and a voice response would come back in the local language, as sketched below.  As a sidebar, Mr. Nilekani has empowered more than a billion lives; very few people on earth have a claim to that kind of legacy!
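
That query pipeline might look like the following sketch; every function here is a hypothetical placeholder (no real API is implied), shown only to make the stages concrete:

```python
def speech_to_text(audio, language):
    """Hypothetical ASR stage: transcribe local-language speech to text."""
    ...

def translate(text, source, target):
    """Hypothetical translation stage between an Indian language and English."""
    ...

def lookup_government_data(english_query):
    """Hypothetical knowledge stage: query online government/technical resources."""
    ...

def text_to_speech(text, language):
    """Hypothetical TTS stage: synthesize a local-language voice response."""
    ...

def answer_farmer_query(audio, language="kn"):  # "kn" = Kannada, for example
    query = speech_to_text(audio, language)
    english_query = translate(query, source=language, target="en")
    english_answer = lookup_government_data(english_query)
    local_answer = translate(english_answer, source="en", target=language)
    return text_to_speech(local_answer, language)
```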

Prompt engineering is the craft of querying and conversing with an AI effectively to improve its output.  Back in November 2022, the author tried an example: asking for a 300-word essay on the Quit India Movement for a 14-year-old, and then a 100-word essay on the same topic for a 10-year-old.  There was a visible difference in the output, each version easily understood by kids of the respective age.  Now the voice-enabled phone apps make it more powerful than Apple’s Siri or Google Assistant.  Every technology, when used as an aid and in moderation, is a tool: instead of misusing ChatGPT for plagiarism, it should be used as an effective and powerful aid for essay writing and other school work.  The holy grail appears to be Artificial General Intelligence (AGI), which most closely resembles human intelligence.  In a future with words creating art and avatars in the metaverse, we owe it to our kids to help them adapt while maintaining our human identity.
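
For readers who want to reproduce that essay experiment programmatically, here is a minimal sketch using the OpenAI Python library as it existed around the time of writing (the model name and the pre-v1 library interface are assumptions; an API key is required):

```python
import openai  # pip install openai (pre-v1 interface assumed here)

openai.api_key = "YOUR_API_KEY"  # placeholder: substitute your own key

def essay(topic, words, age):
    """Ask ChatGPT for an essay tailored to a reader's age -- basic prompt engineering."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"Write a {words}-word essay on the {topic} for a {age}-year-old.",
        }],
    )
    return response["choices"][0]["message"]["content"]

print(essay("Quit India Movement", 300, 14))
print(essay("Quit India Movement", 100, 10))
```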

Disclaimer:  There was zero input from ChatGPT for this article.

Category:Technology




Written by Mr. Naveen