Small language models are an emerging force in GenAI
The high cost of running large language models on cloud providers is driving interest in models that are a fraction of the size for generative AI use in business.
The large language models (LLMs) that support generative AI services on AWS, Google Cloud, and Microsoft Azure are capable of performing many operations, from writing programming code and predicting the 3D structure of proteins to answering questions on almost any topic imaginable.
The breadth of capabilities is impressive, but running these massive AI models, with their hundreds of billions of parameters, is expensive. So companies are asking whether it would be more cost-effective to train a small language model (SLM) to run, for example, a customer service chatbot.
Our customers’ favorite quote is, ‘General intelligence may be great, but I don’t need my point-of-sale system to read French poetry.’
Devvret Rishi, Chief Product Officer, Predibase
“Our customers’ favorite quote is, ‘General intelligence may be great, but I don’t need my point-of-sale system to read French poetry,’” Devvret Rishi, chief product officer of startup Predibase, said during a presentation this week at the Linux Foundation’s AI.dev Summit in San Jose, California. Predibase provides software tools for training SLMs.
Over the past few months, Gartner has observed an increase in the number of enterprise customers evaluating SLMs to reduce the cost of inference, the process by which a trained GenAI model produces responses to natural language prompts.
“We’re starting to see customers coming to us and telling us that they’re running these very large, powerful models, and the cost of inference is too high to try to do something so simple,” said Arun Chandrasekaran, an analyst at Gartner.
As an alternative, companies are exploring models with 500 million to 20 billion parameters, Chandrasekaran said.
“This is kind of a sweet spot,” he said. “These models are starting to gain more interest, primarily due to their price performance.”
SLMs for small jobs
SLMs can’t match the scope of tasks performed by Cohere, Anthropic’s Claude, and OpenAI’s GPT-4 on AWS, Google Cloud, and Azure, respectively. However, SLMs trained on data for specific tasks, such as generating content from a particular knowledge base, show potential as a significantly less expensive alternative.
“Small models have limited modeling ability, but if we focus their ability on a specific target task, the model can achieve decent improved performance,” according to a paper written by researchers at the University of Edinburgh in the UK and the Allen Institute for AI in Seattle.
In January, consulting firm Sourced Group, an Amdocs company, will begin helping a handful of telecom and financial services companies adopt GenAI using open source SLMs, said lead AI consultant Farshad Ghodsian. Initial projects will involve using natural language to retrieve information from private internal documents.
Ghodsian’s experimentation with SLMs includes FLAN-T5, an open source natural language model developed by Google and available on Hugging Face. He tested a version of FLAN-T5 that contains 248 million parameters.
“When you add retrieval of source documents, it gives you much better results than [LLMs], and it’s a lot easier to operate,” he said. “You can even run it on the CPU. That’s a huge benefit.”
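Running a model of that size on a CPU is straightforward with the Hugging Face Transformers library. A minimal sketch, assuming `transformers` and `torch` are installed; `google/flan-t5-base` is the public checkpoint with roughly 248 million parameters:

```python
# Minimal sketch: run a small language model entirely on CPU.
# Assumes `pip install transformers torch`; google/flan-t5-base is the
# ~248 million-parameter FLAN-T5 checkpoint on Hugging Face.
from transformers import pipeline

generator = pipeline(
    "text2text-generation",
    model="google/flan-t5-base",
    device=-1,  # -1 means run on CPU; no GPU required
)

prompt = "Answer the question: What is the capital of France?"
result = generator(prompt, max_new_tokens=20)
print(result[0]["generated_text"])
```

The same three lines of setup work for any of the small text-to-text checkpoints on Hugging Face; only the model name changes.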
To obtain high-quality responses, Ghodsian combined fine-tuning with retrieval augmented generation (RAG), a technique that retrieves information from a knowledge source and incorporates it into the generated text.
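The RAG pattern itself is simple to illustrate. A toy sketch in plain Python, with an invented three-document knowledge base and word-overlap scoring standing in for the embedding-based vector search a production system would use:

```python
# Toy RAG sketch: retrieve the most relevant document, then build an
# augmented prompt so the model answers from that context. Real systems
# use embedding similarity and a vector database; word overlap stands
# in here for illustration.
import re

KNOWLEDGE_BASE = [
    "Refunds are processed within 5 business days of approval.",
    "Support hours are 9am to 5pm Eastern, Monday through Friday.",
    "Premium plans include priority phone support.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the question."""
    q_words = tokenize(question)
    return max(docs, key=lambda d: len(q_words & tokenize(d)))

def build_prompt(question: str) -> str:
    """Prepend the retrieved context so the model answers from it."""
    context = retrieve(question, KNOWLEDGE_BASE)
    return f"Context: {context}\nQuestion: {question}\nAnswer:"

print(build_prompt("When are your support hours?"))
```

The assembled prompt is what gets sent to the language model; because the answer is already in the context, even a small model can respond accurately.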
“You get a really good answer from [FLAN-T5]. Really good,” Ghodsian said.
SLM capabilities have attracted major enterprise vendors such as Microsoft. Last month, the company’s researchers introduced Phi-2, a 2.7 billion-parameter SLM that outperformed the 13 billion-parameter version of Meta’s Llama 2, according to Microsoft. The company released Phi-2 for research use only.
SLM strengths and weaknesses
Providers of open source SLMs tout access to the inner workings of models as an important enterprise advantage.
For example, users can inspect the parameters, or weights, that determine how models formulate their responses. The inaccessible weights of proprietary models worry companies that fear discriminatory biases.
Another critical concern is data governance. Many organizations worry about data leakage when prompting a cloud-based LLM with sensitive information.
Open source technology also has its critics. In June, supply chain security firm Rezilion reported that 50 of the most popular open source GenAI projects on GitHub had an average security score of 4.6 out of 10. Weaknesses in the technology could let attackers bypass access controls and put sensitive information or intellectual property at risk, Rezilion wrote in a blog post.
Promising SLMs named by Chandrasekaran include Meta’s Llama 2, the Technology Innovation Institute’s Falcon, and Mistral AI’s Mistral 7B and Mixtral 8x7B.
Mixtral 8x7B, which is in beta, contains approximately 47 billion parameters but processes inputs and generates outputs at the speed and cost of a 13 billion-parameter model, according to Mistral. The French startup raised $415 million in funding this month, valuing the company at $2 billion.
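That cost profile comes from Mixtral’s sparse mixture-of-experts design: per Mistral’s published description, each layer contains eight expert feed-forward blocks, but a router sends every token through only two of them, so only a fraction of the 47 billion total parameters is active per token. Back-of-the-envelope arithmetic, using Mistral’s stated totals (the active-parameter figure is approximate):

```python
# Rough arithmetic behind Mixtral 8x7B's cost profile, using figures
# from Mistral's announcement; the active-parameter count is approximate.
total_params = 47e9    # all eight experts plus shared layers
active_params = 13e9   # parameters actually exercised per token
experts_total = 8
experts_per_token = 2  # the router selects 2 of the 8 experts per layer

# Fraction of the full model doing work on any single token:
active_fraction = active_params / total_params
print(f"~{active_fraction:.0%} of parameters active per token")
```

Because inference cost scales with active parameters rather than total parameters, the model prices out like a 13 billion-parameter model while retaining the capacity of a much larger one.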
Mistral and Falcon models are commercially available under the Apache 2.0 license. Chandrasekaran said a license that permits commercial use is crucial for enterprises.
“We’re starting to see more and more of these open source models being adopted for commercial use, which is very big for many organizations,” he said.
Open source model providers have an opportunity in the coming year, as organizations move from learning about GenAI to actually deploying it.
“They are still deciding, but they are ready to take the leap once January comes,” Ghodsian said. “They’ve got new budgets, and they want to start implementing or at least do some [proofs of concept].”
Antone Gonsalves is editor at large for TechTarget Editorial, reporting on industry trends important to enterprise technology buyers. He has worked in technology journalism for 25 years and resides in San Francisco.