Highlights:
- Developers can save significant time and money by using pre-trained foundation models instead of training a language model from scratch.
- Among the releases are two new Titan large language models, the first of which is a generative LLM for information extraction, open-ended question answering, classification, text generation, and summarization.
Amazon Web Services Inc. has recently expanded its reach into artificial intelligence software development by releasing several new tools for generative AI training and deployment on its cloud platform.
The company described the new services in a post on the AWS Machine Learning blog, including the ability to build and train foundation models: large, pre-trained language models that lay the groundwork for specific natural language processing tasks.
Foundation models are generally trained with deep learning techniques on enormous volumes of text data, making them adept at understanding the subtleties of human language and producing content nearly indistinguishable from human writing.
When training a language model, developers can save time and money by starting from a pre-trained foundation model instead of starting from scratch. OpenAI LLC's Generative Pre-trained Transformer (GPT), for example, is a foundation model used for text generation, sentiment analysis, and language translation.
LLM Choices
Bedrock, a brand-new service, makes foundation models from various sources accessible through an API. These include the Jurassic-2 multilingual large language models from AI21 Labs Ltd., which produce text in Spanish, French, German, Portuguese, Italian, and Dutch, and Anthropic PBC's Claude LLM, a conversational and text-processing system trained according to principles for responsible AI. The API also provides access to models from Stability AI Ltd. and Amazon's own LLMs.
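As a rough illustration of how such an API can be consumed, the sketch below uses the AWS SDK for Python (boto3) to send a prompt to a Bedrock-hosted model. The "bedrock-runtime" client, the invoke_model call, and the Anthropic request schema reflect the Bedrock API as it later shipped; the region and model ID are illustrative, and each provider on Bedrock defines its own request body format.

```python
import json

import boto3

# Bedrock exposes hosted foundation models through a runtime client.
# Region and model ID below are illustrative.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Anthropic's Claude models expect a Human/Assistant prompt format;
# other providers on Bedrock define their own request schemas.
request_body = {
    "prompt": "\n\nHuman: Summarize the benefits of foundation models.\n\nAssistant:",
    "max_tokens_to_sample": 300,
}

response = client.invoke_model(
    modelId="anthropic.claude-v2",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(request_body),
)

# The response body is a JSON stream; Claude returns its generated
# text under the "completion" key.
result = json.loads(response["body"].read())
print(result["completion"])
```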
According to Swami Sivasubramanian, vice president of database, analytics, and machine learning at AWS, foundation models are pre-trained at internet scale and can therefore be customized with comparatively little additional training. He offered the example of a content marketing manager at a fashion retailer, who could give Bedrock as few as 20 effective taglines from past campaigns paired with the relevant product descriptions. Bedrock would then automatically generate effective social media posts, display ad images, and web copy for the new products.
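In practice, that kind of customization can be as simple as packing the examples into the prompt itself, a technique known as few-shot prompting. The sketch below shows how a handful of description/tagline pairs might be assembled into a single prompt for a Bedrock-hosted model; the product data and helper function are hypothetical.

```python
# Hypothetical few-shot prompt construction: past campaign examples
# are embedded in the prompt so the model can imitate their style.
examples = [
    ("Lightweight rain jacket with packable hood", "Weather any storm, travel light."),
    ("Slim-fit stretch denim in four washes", "Denim that moves when you do."),
    # ...in Sivasubramanian's example, as few as 20 such pairs.
]

def build_tagline_prompt(examples, new_description):
    """Assemble a few-shot prompt from (description, tagline) pairs."""
    lines = ["Write a short marketing tagline for each product."]
    for description, tagline in examples:
        lines.append(f"Product: {description}\nTagline: {tagline}")
    lines.append(f"Product: {new_description}\nTagline:")
    return "\n\n".join(lines)

prompt = build_tagline_prompt(examples, "Recycled-wool crewneck sweater")
# `prompt` can then be sent to a Bedrock model as in the previous sketch.
```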
In addition to the Bedrock announcement, AWS released two new Titan large language models. The first is a generative LLM for information extraction, open-ended question answering, classification, text generation, and summarization. The second converts text prompts into embeddings: numerical representations that capture the meaning of the text and help build contextual responses that go beyond paraphrasing.
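An embeddings model of this kind is the component typically behind semantic search and retrieval. The sketch below requests embeddings for two texts and compares them by cosine similarity; the "amazon.titan-embed-text-v1" model ID and the "inputText"/"embedding" fields reflect the API as later documented, and the sample texts are invented.

```python
import json
import math

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

def titan_embedding(text):
    """Return the Titan embedding vector for a piece of text."""
    response = client.invoke_model(
        modelId="amazon.titan-embed-text-v1",  # as later documented
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Texts with similar meaning should map to nearby vectors, which is
# what lets applications respond contextually rather than paraphrase.
v1 = titan_embedding("How do I return a defective product?")
v2 = titan_embedding("What is your policy on sending back broken items?")
print(cosine_similarity(v1, v2))
```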
The announcement made no mention of OpenAI, in which Microsoft Corp. is a significant investor. Still, given the market's demand for large language models, that shouldn't be a problem for Amazon.
Although AWS is behind Microsoft and Google LLC in bringing its LLMs to market, Kandaswamy argued that this shouldn't be considered a competitive disadvantage. "I don't think anyone is so behind that they have to play catch-up," he said. "It might appear that there is a big race, but the customers we speak with, other than very early adopters, have no idea what to do with it."
Hardware Boost
AWS is also upgrading the hardware behind training and inference on its cloud. New network-optimized EC2 Trn1n instances, powered by the company's custom Trainium chips, offer 1,600 gigabits per second of network bandwidth, roughly a 20% performance increase. Meanwhile, its Inf2 instances, which use Inferentia2 chips for inference on massive multiparameter generative AI applications, are now generally available.
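Provisioning these instances works like any other EC2 launch. The sketch below requests a Trn1n instance with boto3; the instance type comes from the announcement, while the AMI ID and key pair name are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Launch a network-optimized Trainium instance for distributed training.
# The AMI ID and key pair below are placeholders, not real values.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # e.g., an AWS Deep Learning AMI
    InstanceType="trn1n.32xlarge",     # Trn1n: up to 1,600 Gbit/s networking
    KeyName="my-key-pair",
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```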
AWS also announced the availability of CodeWhisperer, an AI coding companion that uses a foundation model to generate code suggestions in real time based on existing code and natural-language comments in an integrated development environment. The tool works with a number of IDEs and supports Python, Java, JavaScript, TypeScript, C#, and ten other languages.
Sivasubramanian wrote, "Developers can simply tell CodeWhisperer to do a task, such as 'parse a CSV string of songs' and ask it to return a structured list based on values such as artist, title and highest chart rank." CodeWhisperer produces "an entire function that parses the string and returns the list as specified." He said that developers who used the preview version reported a 57% improvement in speed and a 27% higher success rate.
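For illustration, a function of the kind Sivasubramanian describes might look like the following. This is a hand-written sketch of plausible output, not actual CodeWhisperer output; the column names come from his example, and the sample data is invented.

```python
import csv
import io

def parse_songs_csv(csv_string):
    """Parse a CSV string of songs into a structured list.

    Expects columns such as artist, title, and highest_chart_rank,
    per the example in Sivasubramanian's post.
    """
    reader = csv.DictReader(io.StringIO(csv_string))
    return [
        {
            "artist": row["artist"],
            "title": row["title"],
            "highest_chart_rank": int(row["highest_chart_rank"]),
        }
        for row in reader
    ]

songs = parse_songs_csv(
    "artist,title,highest_chart_rank\n"
    "Queen,Bohemian Rhapsody,1\n"
    "Toto,Africa,1\n"
)
print(songs)
```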
As many players attempt to capitalize on the success of proofs of concept like ChatGPT, the LLM landscape will likely remain dispersed and chaotic for the foreseeable future. According to Kandaswamy, it's unlikely that any one model will come to dominate the market the way Google's Natural Language API has in speech recognition.
He said, "Just because a model is good at one thing doesn't mean it's going to be good with everything. It's possible that over two or three years everybody will offer everybody else's model. There will be more blending and cross-technology relationships."