Friday, September 20, 2024

Building Scalable RAG-Based Generative AI Applications in AWS with Amazon FSx for NetApp ONTAP and Amazon Bedrock


Introduction

This article walks through a solution for using Retrieval Augmented Generation (RAG) in generative artificial intelligence (AI) applications on Amazon Web Services (AWS). By pairing Amazon FSx for NetApp ONTAP with Amazon Bedrock, you can bring company-specific, unstructured user file data to Amazon Bedrock and take advantage of its generation capabilities while keeping the RAG experience secure and scalable.

The Solution

Generative AI applications are commonly built using Retrieval Augmented Generation (RAG), a technique that gives foundation models (FMs) access to additional data they didn't have during training. This data is used to enrich the generative AI prompt so the model delivers more context-specific and accurate responses without continuous retraining, while also improving transparency and minimizing hallucinations.
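The RAG pattern described above can be sketched in a few lines of Python. This is a minimal, illustrative stand-in: the keyword "retriever" and the two hardcoded documents are assumptions used only to show how retrieved context enriches the prompt; a real deployment would use vector similarity search over company data.

```python
import re

# Illustrative stand-in for company documents indexed for retrieval.
DOCUMENTS = [
    "FSx for ONTAP is a fully managed shared storage service on AWS.",
    "Amazon Bedrock provides access to foundation models via a single API.",
]

def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by naive keyword overlap with the question."""
    q_words = set(re.findall(r"\w+", question.lower()))
    scored = sorted(
        documents,
        key=lambda d: -len(q_words & set(re.findall(r"\w+", d.lower()))),
    )
    return scored[:top_k]

def build_prompt(question: str, context: list[str]) -> str:
    """Enrich the generative AI prompt with the retrieved context."""
    joined = "\n".join(context)
    return (
        "Use the following context to answer.\n"
        f"Context:\n{joined}\n"
        f"Question: {question}"
    )

question = "What is Amazon Bedrock?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
```

The enriched prompt, not the bare question, is what gets sent to the FM, which is why the model can answer from data it never saw during training.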

In this post, we demonstrate a solution using Amazon FSx for NetApp ONTAP with Amazon Bedrock to provide a RAG experience for your generative AI applications on AWS by bringing company-specific, unstructured user file data to Amazon Bedrock in a straightforward, fast, and secure way.

Solution Overview

The solution provisions an FSx for ONTAP Multi-AZ file system with a storage virtual machine (SVM) joined to an AWS Managed Microsoft AD domain. An OpenSearch Serverless vector search collection provides a scalable and high-performance similarity search capability.

The embeddings container component of the solution is deployed on an Amazon EC2 Linux server that mounts the FSx for ONTAP volume as an NFS client. It periodically scans existing files and folders, along with their security ACL configurations, and populates an index in the OpenSearch Serverless vector search collection with the company-specific data (plus associated metadata and ACLs) from the NFS share on the FSx for ONTAP file system.
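To make the indexing step concrete, here is a hedged sketch of the kind of record such an embeddings component might write for each file. The field names (`vector_field`, `acl_sids`, and so on) are illustrative assumptions, not the solution's actual schema; the SID format matches the Windows security identifiers used elsewhere in this post.

```python
def build_index_document(path, text, embedding, owner_sid, allowed_sids):
    """Bundle file content, its embedding, and NTFS-style ACL metadata
    into one record for the vector search index."""
    return {
        "file_path": path,          # source location on the FSx for ONTAP NFS share
        "text": text,               # extracted file content used for generation
        "vector_field": embedding,  # embedding used for similarity search
        "owner_sid": owner_sid,     # security identifier of the file owner
        "acl_sids": allowed_sids,   # SIDs permitted to read the file
    }

doc = build_index_document(
    path="/fsx/shared/bedrock-user-guide.pdf",
    text="Amazon Bedrock user guide ...",
    embedding=[0.12, -0.03, 0.88],
    owner_sid="S-1-5-21-4037439088-1296877785-2872080499-1112",
    allowed_sids=["S-1-5-21-4037439088-1296877785-2872080499-1112"],
)
```

Storing the ACL SIDs alongside the embedding is what later allows retrieval results to be filtered by the requesting user's permissions.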

How to Use the Solution

Complete the following prerequisite steps:

Make sure you have model access in Amazon Bedrock. In this solution, we use the Anthropic Claude 3 Sonnet model on Amazon Bedrock.

Create an FSx for ONTAP file system using the AWS Management Console, AWS CLI, or FSx API.

Now, let's ask a question about the Amazon Bedrock user guide, which is available to admin users only. In our scenario, we asked "How do I use foundation models with Amazon Bedrock?" and the model replied that it doesn't have enough information to provide a detailed answer to the question.

Test Permissions using API Gateway

You can also query the model directly through API Gateway. Obtain the api-invoke-url value from the output of your Terraform template and prepend it to the /bedrock_rag_retreival path in the calls below. The first example passes "NA" as metadata (no user SID); the second passes a user's SID so that results are filtered by that user's permissions.

curl -v '<api-invoke-url>/bedrock_rag_retreival' -X POST -H 'content-type: application/json' -d '{"session_id": "1","prompt": "What is an FSxN ONTAP filesystem?", "bedrock_model_id": "anthropic.claude-3-sonnet-20240229-v1:0", "model_kwargs": {"temperature": 1.0, "top_p": 1.0, "top_k": 500}, "metadata": "NA", "memory_window": 10}'

curl -v '<api-invoke-url>/bedrock_rag_retreival' -X POST -H 'content-type: application/json' -d '{"session_id": "1","prompt": "what is bedrock?", "bedrock_model_id": "anthropic.claude-3-sonnet-20240229-v1:0", "model_kwargs": {"temperature": 1.0, "top_p": 1.0, "top_k": 500}, "metadata": "S-1-5-21-4037439088-1296877785-2872080499-1112", "memory_window": 10}'
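The same request can be issued from Python. This is a hedged sketch: the payload fields mirror the curl commands shown above, but the endpoint URL must come from your own Terraform output, so it is left as a placeholder in the commented-out send step.

```python
import json

def build_payload(prompt, metadata, session_id="1", memory_window=10):
    """Assemble the request body expected by the /bedrock_rag_retreival route."""
    return {
        "session_id": session_id,
        "prompt": prompt,
        "bedrock_model_id": "anthropic.claude-3-sonnet-20240229-v1:0",
        "model_kwargs": {"temperature": 1.0, "top_p": 1.0, "top_k": 500},
        "metadata": metadata,  # "NA" for no user context, or a user SID
        "memory_window": memory_window,
    }

body = json.dumps(build_payload("what is bedrock?", metadata="NA"))

# To actually send the request, substitute your real api-invoke-url:
# import urllib.request
# req = urllib.request.Request(
#     "<api-invoke-url>/bedrock_rag_retreival",
#     data=body.encode(),
#     headers={"content-type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```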

Cleanup

To avoid recurring charges, clean up your account after trying the solution. From the terraform folder, destroy the resources provisioned by the Terraform template:

terraform destroy

Conclusion

In this post, we demonstrated a solution that combines FSx for ONTAP with Amazon Bedrock, using FSx for ONTAP's support for file ownership and ACLs to provide permissions-based access in a RAG scenario for generative AI applications. The solution lets you enrich generative AI prompts in Amazon Bedrock with your company-specific, unstructured user file data from an FSx for ONTAP file system, delivering more relevant, context-specific, and accurate responses while making sure only authorized users have access to that data.

Frequently Asked Questions

Q: How do I use Retrieval Augmented Generation (RAG) with Amazon FSx for NetApp ONTAP and Amazon Bedrock?

To use RAG with Amazon FSx for NetApp ONTAP and Amazon Bedrock, you need to set up an FSx for ONTAP file system, deploy an OpenSearch Serverless vector search collection, and create an embeddings container component. You then populate an index in the OpenSearch Serverless vector search collection with company-specific data (and associated metadata and ACLs) from the NFS share on the FSx for ONTAP file system.

Q: How do I provide permissions-based access in a RAG scenario for generative AI applications?

To provide permissions-based access in a RAG scenario for generative AI applications, you can use FSx for ONTAP support for file ownership and ACLs to restrict access to unauthorized users. You can also use API Gateway to query the model directly and retrieve results based on the permissions assigned to the user.
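The filtering step described above can be sketched as follows. This is a hedged illustration of the general technique, not the solution's actual code: each indexed document carries the SIDs allowed to read it, and retrieval hits are dropped if they do not grant access to the calling user's SID before anything reaches the FM. The `acl_sids` field name is an assumption.

```python
def filter_by_acl(hits: list[dict], user_sid: str) -> list[dict]:
    """Keep only retrieved documents whose ACL grants the requesting user access."""
    return [h for h in hits if user_sid in h.get("acl_sids", [])]

hits = [
    {"text": "admin-only guide",
     "acl_sids": ["S-1-5-21-4037439088-1296877785-2872080499-1112"]},
    {"text": "public FAQ",
     "acl_sids": ["S-1-1-0"]},  # S-1-1-0 is the well-known "Everyone" SID
]

admin_view = filter_by_acl(hits, "S-1-5-21-4037439088-1296877785-2872080499-1112")
other_view = filter_by_acl(hits, "S-1-1-0")
```

An unauthorized user simply gets an empty (or reduced) context, which is why the model answers that it doesn't have enough information rather than leaking restricted content.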

Q: How do I obtain the api-invoke-url parameter from the output of my Terraform template?

To obtain the api-invoke-url parameter, apply your Terraform template and review its outputs for the value (for example, by running `terraform output api-invoke-url` from the terraform folder). You can then use this value to invoke the API Gateway endpoint and query the model directly.

Q: Can I use this solution to integrate with other AI and machine learning services?

Yes. Because the solution exposes its retrieval endpoint through API Gateway, other AI and machine learning services (or any HTTP client) can query the model and consume its responses, and you can process or analyze the model's output with downstream services as part of a larger pipeline.

Q: Can I use this solution for large-scale applications?

Yes. The solution can serve large-scale applications: FSx for ONTAP, the OpenSearch Serverless vector search collection, and the embeddings container component can each be sized and configured to match your workload as it grows.
