AWS announced more updates for Bedrock aimed at spotting hallucinations and building smaller models faster, as enterprises want more customization and accuracy from models.
During re:Invent 2024, AWS announced Amazon Bedrock Model Distillation and Automated Reasoning checks, both in preview, for enterprise customers interested in training smaller models and catching hallucinations.
Amazon Bedrock Model Distillation will let users train a smaller model with a larger AI model, giving enterprises access to a model they feel would work best with their workload.
Larger models, such as Llama 3.1 405B, have more knowledge but are slow and unwieldy. A smaller model responds faster but often has more limited knowledge.
AWS said Bedrock Model Distillation would make it easier to transfer a bigger model’s knowledge to a smaller one without sacrificing response time.
Users select the heavier-weight model they want, find a smaller model within the same family (Llama and Claude, for example, each come in a range of sizes) and write out sample prompts. Bedrock then generates responses from the larger model, fine-tunes the smaller model on them and keeps creating more sample data until it finishes distilling the larger model’s knowledge.
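To make that workflow concrete, here is a minimal Python sketch of the general idea: a larger "teacher" model answers sample prompts, and those prompt-response pairs become training data for a smaller "student" model. This is not Bedrock's API. The names `toy_teacher`, `ToyStudent` and `build_distillation_set` are hypothetical stand-ins, and the "fine-tuning" step is plain memorization so the example runs without any model.

```python
from typing import Callable, Dict, List


def build_distillation_set(
    teacher: Callable[[str], str], prompts: List[str]
) -> List[Dict[str, str]]:
    """Pair each sample prompt with the teacher model's response."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a call to a large model.
    return f"Detailed answer to: {prompt}"


class ToyStudent:
    """Hypothetical stand-in for a smaller model.

    Real fine-tuning updates model weights; this toy version only
    memorizes the teacher's answers to illustrate the data flow.
    """

    def __init__(self) -> None:
        self.memory: Dict[str, str] = {}

    def fine_tune(self, examples: List[Dict[str, str]]) -> None:
        for example in examples:
            self.memory[example["prompt"]] = example["completion"]

    def __call__(self, prompt: str) -> str:
        return self.memory.get(prompt, "I don't know.")


# Sample prompts an enterprise might write for a support use case.
prompts = ["How do I reset my password?", "What is your refund policy?"]

dataset = build_distillation_set(toy_teacher, prompts)
student = ToyStudent()
student.fine_tune(dataset)

print(student(prompts[0]))
```

In a real setup, the student would generalize beyond the prompts it saw, which is why generating additional sample data matters.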
Right now, model distillation works with Anthropic, Amazon and Meta models. Bedrock Model Distillation is currently in preview.
Why enterprises are interested in model distillation
For enterprises that want a faster response model — such as one that can quickly answer customer questions — there must be a balance between knowing a lot and responding quickly.
While they can choose to use a smaller version of a large model, AWS is banking that more enterprises want more customization in the kinds of models, both larger and smaller, that they use.
AWS, which does offer a choice of models in Bedrock’s model garden, hopes enterprises will want to choose any model family and train a smaller model for their needs.
Many organizations, mostly model providers, use model distillation to train smaller models. However, AWS said the process usually entails a lot of machine learning expertise and manual fine-tuning. Model providers such as Meta have used model distillation to bring a broader knowledge base to a smaller model. Nvidia leveraged distillation and pruning techniques to make Llama 3.1-Minitron 4B, a small language model it said performs better than similar-sized models.
Model distillation is not new for Amazon, which has been working on model distillation methods since 2020.
Catching factual errors faster
Hallucinations remain an issue for AI models, even though enterprises have created workarounds like fine-tuning and limiting what models will respond to. However, even the most fine-tuned model that only performs retrieval augmented generation (RAG) tasks with a data set can still make mistakes.
AWS’s solution is Automated Reasoning checks on Bedrock, which uses mathematical validation to prove that a response is correct.
“Automated Reasoning checks is the first and only generative AI safeguard that helps prevent factual errors due to hallucinations using logically accurate and verifiable reasoning,” AWS said. “By increasing the trust that customers can place in model responses, Automated Reasoning checks opens generative AI up to new use cases where accuracy is paramount.”
Customers can access Automated Reasoning checks from Amazon Bedrock Guardrails, the product that brings responsible AI and fine-tuning to models. Researchers and developers often use automated reasoning to produce precise answers to complex mathematical problems.
Users upload their data, and Bedrock develops the rules for the model to follow and guides customers to ensure the model is tuned to them. Once the rules have been checked, Automated Reasoning checks on Bedrock will verify the responses from the model. If the model returns something incorrect, Bedrock will suggest a new answer.
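The verify-then-suggest loop can be illustrated with a toy Python sketch. It is not how Automated Reasoning checks is implemented, since AWS describes a system based on formal logic. The `Rule` class and the `check_claims` and `suggest_fix` helpers are hypothetical, and the sketch assumes the factual claims in a model's response have already been extracted into a dictionary.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Rule:
    """A hypothetical rule derived from company data: a field and its true value."""
    field: str
    expected: Any


def check_claims(claims: Dict[str, Any], rules: List[Rule]) -> List[Rule]:
    """Return the rules that the model's claims contradict."""
    return [
        rule
        for rule in rules
        if rule.field in claims and claims[rule.field] != rule.expected
    ]


def suggest_fix(claims: Dict[str, Any], rules: List[Rule]) -> Dict[str, Any]:
    """Return a copy of the claims with contradicted values corrected."""
    fixed = dict(claims)
    for rule in check_claims(claims, rules):
        fixed[rule.field] = rule.expected
    return fixed


# Assumed example: rules built from an uploaded refund policy document.
rules = [Rule("refund_window_days", 30), Rule("restocking_fee", False)]

# Assumed example: claims extracted from a model response that got one fact wrong.
claims = {"refund_window_days": 60, "restocking_fee": False}

violations = check_claims(claims, rules)
print([rule.field for rule in violations])
print(suggest_fix(claims, rules))
```

Keeping the check separate from the correction means a response that passes every rule is returned unchanged.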
AWS CEO Matt Garman said during his keynote that automated checks ensure an enterprise’s data remains its differentiator, with their AI models reflecting that accurately.