Introducing chatbots and large language models (LLMs)


This introduction to chatbots and large language models is excerpted from the book Generative AI Tools for Developers: A Practical Guide, now available on SitePoint Premium.


A chatbot is a software application designed to simulate human conversation through text or voice interactions, typically over the Internet. Chatbots first appeared in 1966, when MIT professor Joseph Weizenbaum created ELIZA, an early natural language processing program built to explore communication between humans and machines.

In 1994, computer scientist Michael Mauldin coined the term "chatterbot" for this type of software, following his creation of Verbot, a chatterbot and artificial intelligence software development kit for Windows and the Web.

Evolution of chatbots

Chatbots continued to evolve after ELIZA, serving purposes ranging from entertainment (Jabberwacky) to healthcare (PARRY). The goal of chatbots created during this period was to simulate human interaction under various circumstances. Then, in 1992, Creative Labs built Dr. Sbaitso, a chatbot with speech synthesis. It was an early attempt to bring AI into a chatbot, although it only recognized a limited set of pre-programmed responses and commands.

The image below shows Dr. Sbaitso's interface.

Another chatbot, ALICE (Artificial Linguistic Internet Computer Entity), was developed in 1995. It engaged users in conversation by applying heuristic pattern-matching rules to their input.

All chatbots released during this period were rule-based chatbots: they operated according to a set of pre-defined rules and patterns created by human developers or conversation designers. This reliance on pre-defined rules made them inflexible; they couldn't learn from a user's message and craft a new response to it. Examples of these rules include the following (a minimal code sketch appears after the list):

  • If the user asks about product pricing, respond with information about pricing plans.
  • If the user mentions a technical issue, provide troubleshooting steps.
  • If the user expresses gratitude, respond with a thank you message.
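
Below is a minimal sketch of how such rule-based logic might look in code. The keywords and canned replies are illustrative assumptions, not taken from any real product:

```python
# A toy rule-based chatbot: every possible response is hand-written in
# advance, mirroring the three example rules above.
def rule_based_reply(message: str) -> str:
    text = message.lower()
    if "price" in text or "pricing" in text:
        return "Our Basic plan is $10/month and our Pro plan is $25/month."
    if "error" in text or "not working" in text:
        return "Try restarting the app; if that fails, clear the cache."
    if "thank" in text:
        return "You're welcome! Happy to help."
    # No rule matched: the bot can't invent a new response.
    return "Sorry, I didn't understand that. Could you rephrase?"

print(rule_based_reply("What's your pricing?"))
print(rule_based_reply("Thanks a lot!"))
```

Anything outside the hard-coded patterns falls through to the fallback line, which is precisely the inflexibility described above.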

In 2001, ActiveBuddy, Inc. publicly released a new chatbot called SmarterChild. It was an intelligent bot distributed across global instant messaging networks (AIM, MSN Messenger, Yahoo! Messenger) that could provide news, weather, sports, and stock information, let users play games, and draw on MIT's START system (created by Boris Katz) to answer questions posed in natural language. It was revolutionary, demonstrating the power of conversational computing, and in many ways it can be seen as a precursor to Siri.

The next set of notable advances in chatbots came in the 2000s, due in part to the growth of the Web and the availability of raw data. During this period, significant progress was made in natural language processing (NLP), as representation learning and deep neural network methods became widespread in the field.

Among the achievements of this period are the following:

  • Deep learning and neural networks. Advances in recurrent neural networks (RNNs) made them capable of capturing complex linguistic patterns, contextual relationships, and semantic meaning, contributing to significant improvements in chatbot performance.

  • Sentiment analysis and emotion understanding. Sentiment analysis joined the NLP toolkit in the 2000s, and chatbots integrated these capabilities, allowing them to recognize a user's feelings and emotions and respond appropriately. This development enhanced chatbots' ability to provide empathetic, personalized interactions.

  • Named entity recognition and entity linking. Named entity recognition (NER) and entity linking also improved, notably when Alan Ritter used a hierarchy based on common Freebase entity types in pioneering experiments on NER over social media text.

  • Understanding context and managing dialogue. Language models became more capable of understanding and maintaining context within a conversation, so chatbots got better at handling dialogue and producing coherent responses. Reinforcement learning techniques also improved the flow and quality of interactions.

  • Voice-activated virtual assistants. From the 1990s through the 2000s, there was tremendous development in natural language processing (NLP), artificial intelligence (AI), and voice recognition technologies. Together, these led to intelligent voice-activated virtual assistants with far better speech than Dr. Sbaitso, the first voice-enabled chatbot. A notable example from this era is Apple's Siri, released in 2011, which played a pivotal role in popularizing voice interaction with chatbots.

  • Integration with messaging platforms and APIs. As AI progressed, chatbots were increasingly adopted on messaging platforms such as Facebook Messenger, Slack, and WhatsApp. By providing APIs and developer tools, these platforms also let users build their own chatbots and integrate them with different capabilities, all of which eventually led to the adoption of chatbots across various industries.

All these developments made it possible to build chatbots capable of much better conversations. They understood topics more deeply and offered a more natural experience than the scripted feel of their predecessors.

Large language models

In the early days of the Internet, search engines weren't as accurate as they are now. Ask.com (originally known as Ask Jeeves) was the first search engine to let users ask everyday questions in natural language. Natural language search relies on natural language processing (NLP), a field that uses massive amounts of data with statistical and machine learning models to infer the meaning of complex sentences. NLP enabled computers to understand and interact with human language, paving the way for a range of applications and, ultimately, for the emergence of large language models.

A large language model (LLM) is a computerized language model that can perform a variety of natural language processing tasks, including generating and classifying text, answering questions in a human-like manner, and translating text from one language to another. It's trained on a vast corpus of articles, Wikipedia entries, books, Internet-based resources, and other inputs, from which it learns to generate responses.

Most LLMs are based on one of two architectures:

  • Bidirectional Encoder Representations from Transformers (BERT)

  • Generative Pre-trained Transformers (GPTs)

Both of these are built on the transformer architecture. Transformers are a type of neural network architecture that has revolutionized the field of natural language processing and enabled the development of large, powerful language models.

A transformer uses self-attention mechanisms to compute weighted sums over the input sequence, dynamically determining which tokens in the sequence are most relevant to one another.

The image below shows the transformer model architecture.

Transformer model architecture
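
To make the idea of a weighted sum over the input sequence concrete, here's a minimal NumPy sketch of scaled dot-product self-attention. The random projection matrices stand in for the learned weights of a real model, so this illustrates the mechanism rather than a working transformer:

```python
import numpy as np

def self_attention(x: np.ndarray) -> np.ndarray:
    """x has shape (sequence_length, d_model)."""
    d_model = x.shape[1]
    rng = np.random.default_rng(0)

    # In a real model these projections are learned; random here for illustration.
    W_q = rng.standard_normal((d_model, d_model))
    W_k = rng.standard_normal((d_model, d_model))
    W_v = rng.standard_normal((d_model, d_model))

    Q, K, V = x @ W_q, x @ W_k, x @ W_v

    # Compare each token's query against every token's key.
    scores = Q @ K.T / np.sqrt(d_model)

    # Softmax turns the scores into weights that sum to 1 for each token.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)

    # Output each token as a weighted sum of all value vectors, so the
    # most related tokens contribute the most.
    return weights @ V

# Example: a sequence of 4 tokens, each an 8-dimensional embedding.
tokens = np.random.default_rng(1).standard_normal((4, 8))
print(self_attention(tokens).shape)  # (4, 8)
```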

How LLMs work

In order to understand how LLMs work, we must first look at how they're trained. Using large amounts of text from books, articles, and various parts of the Internet, they learn patterns and connections between words. This first step is known as pre-training. It relies on distributed computing frameworks and specialized hardware such as graphics processing units (GPUs) or tensor processing units (TPUs), which allow for efficient parallel processing.
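
As a toy illustration of what "learning patterns and connections between words" means, the sketch below builds a bigram model that predicts the next word purely from co-occurrence counts. Real LLMs use deep neural networks trained on billions of tokens, but the objective of predicting the next token is the same in spirit:

```python
from collections import Counter, defaultdict

# A tiny "corpus" standing in for books, articles, and web text.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word (a bigram model).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word: str) -> str:
    # Predict the most frequent follower observed during "training".
    return following[word].most_common(1)[0][0]

print(predict_next("sat"))  # "on" -- learned from raw text alone
```

Once pre-training is done, the model still needs to learn how to perform specific tasks effectively, and that's where fine-tuning comes into play.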

Fine-tuning is the second step in training LLMs. It involves training the pre-trained model on specific tasks or data sets to make it more specialized and useful for particular applications. For example, an LLM can be fine-tuned for tasks such as text completion, translation, sentiment analysis, or question answering.
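
As an example, here's a minimal fine-tuning sketch using the Hugging Face transformers and datasets libraries, a common toolkit, though not one the article prescribes. It adapts a small pre-trained model to binary sentiment analysis on the IMDB dataset; the model choice and hyperparameters are illustrative assumptions:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Start from a small pre-trained transformer and add a 2-class head.
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2
)

# IMDB is a standard labeled sentiment dataset (positive/negative reviews).
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256,
                     padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sentiment-model",
    num_train_epochs=1,
    per_device_train_batch_size=16,
)

# Small subsets keep this sketch quick to run; use the full splits in practice.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized["test"].select(range(500)),
)

trainer.train()
```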

The state of chatbots today

Today, we have chatbots that are more powerful than ever. They can perform more complex tasks and are better at handling conversations. This is due to significant advances in artificial intelligence, NLP, machine learning, and an increase in computing power and Internet speed.

Chatbots have continued to benefit from these developments. Notable aspects of these developments include:

  • Advanced artificial intelligence models. The introduction of advanced AI models has revolutionized the capabilities of chatbots in recent years. Models like OpenAI’s GPT series have greatly helped push the boundaries of natural language processing and machine learning. These models are trained on large-scale datasets and can generate contextually relevant responses, making conversations with chatbots more engaging and human-like.

  • Multi-channel and multimedia capabilities. Chatbots are no longer limited to a single platform or interface: they work seamlessly across channels such as websites, messaging apps, and mobile apps. They've also expanded beyond text-based interactions and now support multimedia input, including images and audio (though often behind a paywall), giving users the freedom to interact across different media.

  • Continuous learning and adaptability. By constantly learning and improving from user interactions, chatbots use reinforcement learning and feedback mechanisms to adapt their responses over time, improving their performance and better meeting user needs.

  • Industry applications. Chatbots have found wide application across industries. For example, Airbnb uses chatbots to answer common questions, resolve booking issues, and help users find accommodation, while Duolingo uses chatbots to simulate foreign-language conversations and give learners feedback. They're also used in fields such as finance, healthcare, and ecommerce. This usually requires equipping the bots with domain-specific knowledge so they can perform well in their particular use cases.

  • Integration with back-end systems. Thanks to this tremendous growth, we now have chatbots that integrate with back-end systems and databases. This allows them to access live information, enhancing their ability to give accurate, up-to-date answers to user queries (see the sketch after this list).
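
As a minimal sketch of this integration pattern, the bot below consults a stand-in order "database" before answering, rather than replying from canned text alone. The data store, intents, and wording are all hypothetical:

```python
import re

# Stand-in for a live back-end database or API.
ORDERS = {"1042": "shipped", "1043": "processing"}

def answer(message: str) -> str:
    match = re.search(r"order\s+#?(\d+)", message.lower())
    if match:
        status = ORDERS.get(match.group(1))
        if status:
            return f"Order {match.group(1)} is currently {status}."
        return "I couldn't find that order number."
    return "I can check an order's status if you give me the number."

print(answer("Where is order #1042?"))  # Order 1042 is currently shipped.
```

In production, the lookup would hit a live API or database, but the shape is the same: parse the request, fetch fresh data, and compose the answer from it.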

As a result of all these developments, we now have smarter chatbots capable of handling many kinds of tasks, from booking a table at your favorite restaurant, to researching a topic in depth with references, to solving technical problems in software development. Among the most popular chatbots today are Google's Bard, Microsoft's Bing Chat, and OpenAI's ChatGPT, all of which are powered by large language models. We'll discuss all of these tools shortly.

Want to learn more about chatbots, LLMs, and other AI tools that can help you in your work as a developer? Check out Generative AI Tools for Developers: A Practical Guide, now available on SitePoint Premium.


