Build a FedRAMP compliant generative AI-powered chatbot using Amazon Aurora Machine Learning and Amazon … – AWS Blog

In this post, we explore how to use Amazon Aurora PostgreSQL-Compatible Edition and Amazon Bedrock to build Federal Risk and Authorization Management Program (FedRAMP) compliant generative artificial intelligence (AI) applications using Retrieval Augmented Generation (RAG). FedRAMP is a US government-wide program that delivers a standard approach to security assessment, authorization, and monitoring for cloud products and services. Cloud service providers must demonstrate FedRAMP compliance in order to offer services to the U.S. Government. RAG is often used in generative AI to enhance user queries and responses by augmenting the training of a large language model (LLM) with data from a companys internal business systems.

The solution we demonstrate uses Amazon Aurora Machine Learning which enables builders to create machine learning (ML) or generative AI applications using familiar SQL programming. Aurora ML provides access to foundation models (FMs) in Amazon Bedrock for creating embeddings (vector representations of text, images, and audio) and generating natural language text responses for generative AI applications directly with SQL. With Aurora ML, you can create text embeddings, perform similarity search using the pgvector extension, and generate text responses within the same Aurora SQL function. This reduces latency for text generation because document embeddings may be stored in the same table as the text, minimizing the need to return a search response to applications.

Amazon Bedrock is a fully managed service that offers a choice of high-performing FMs from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API, along with a broad set of capabilities to help you build generative AI applications. Amazon Aurora PostgreSQL is a fully managed, PostgreSQLcompatible, and ACIDcompliant relational database engine that combines the speed, reliability, and manageability of Amazon Aurora with the simplicity and cost-effectiveness of open-source databases.

In this post, we demonstrate how to build an AI-powered chatbot with Aurora ML and Amazon Bedrock, both of which are FedRAMP compliant. The solution includes the following AWS services:

We also use Streamlit to construct the interactive chatbot application running on AWS Cloud9.

The following diagram is a common solution architecture pattern you can use to integrate your Aurora PostgreSQL database with Amazon Bedrock using Aurora ML.

This architecture implements a RAG workflow. The first part involves ingesting the knowledge dataset, which contains unstructured data like PDFs and documents. This data is broken down into smaller chunks and embedded into vector representations using an embedding model such as Amazon Titan. The embeddings are stored alongside the original text chunks in Aurora, which serves as our vector database (Steps A and B).

The second part involves generating a response. First, the users question is converted into an embedding vector using the same embedding model . This question embedding is then used to perform semantic search over the database embeddings to determine relevant text chunks called the context. The context along with a prompt are formatted into a model input, which is fed to the text generation model to produce a natural language response to the question (Steps 14).

This implementation uses Amazon Aurora PostgreSQL with aws_ml (version 2) and the pgvector extension to store embeddings, Amazon Bedrock FMs (amazon.titan-embed-g1-text-02 for embeddings and anthropic.claude-instant-v1 for text generation), and Streamlit for the chatbot frontend. This can be deployed in three main steps:

For this walkthrough, complete the following prerequisites:

We selected PostgreSQL 15.5 for this example; we recommend using PostgreSQL 15.5 or higher so you can use the latest version of the open source pgvector extension

We selectedAurora I/O-Optimized, which provides improved performance with predictable pricing for I/O-intensive applications.

We opted to useAmazon Aurora Serverless v2, which automatically scales your compute based on your application workload, so you only pay based on the capacity used.

This step ingests your documents from an S3 bucket using the Boto3 library. Next, the function splits the documents into chunks using LangChains RecursiveCharacterTextSplitter. Lastly, the function uploads the chunks into an Aurora PostgreSQL table that you specify. The clean_chunk() function escapes special characters in the data to properly clean it before loading into the Aurora PostgreSQL table, which is a best practice because SQL functions can struggle with certain special characters.

Use the following Python code:

A SQL procedure is created to select the text contents, generate embeddings from the text using the Amazon Titan Embeddings G1 Text model (amazon.titan-embed-g1-text-02), and insert the embeddings into the table containing the original text. This model creates 1536-dimensional vector representations (embeddings) of unstructured text like documents, paragraphs, and sentences, taking up to 8,000 tokens as input.

The following code creates a PostgreSQL procedure that generates embeddings using the aws_bedrock.invoke_model_get_embeddings function. The model input is a JSON document. As a best practice, we recommend using the SQL format function to format the model input as given in the code.

The SQL function for generating responses to user questions performs the following tasks:

The similarity search and response generation are combined into a single function. This removes the need to call the similarity search separately from the application, reducing the overall latency of producing responses.

Use the following code:

This AI-powered chatbot application offers multiple ways for users to ask questions. One option is for SQL developers to directly use SQL functions. We also developed a Python chatbot application using the Streamlit frontend framework. The code for this chatbot application is available on GitHub.

We download the dataset for our knowledge base and upload it to an S3 bucket. This dataset will feed and power the chatbot. Complete the following steps:

The solution presented in this post is available in the following GitHub repo. You need to clone the GitHub repository to your local machine. Complete the following steps to configure the environment:

The following command allows you to connect to Amazon Aurora PostgreSQL using the psql client to ask a question and receive a response:

The following command launches a Streamlit-powered, web-based interactive application. The application allows you to ask questions and receive answers using a user-friendly interface.

The configuration to open port 8080 is for demonstration only. For your production environment, refer to Protecting data in transit for best practices to securely expose your application.

The UI will look like the following screenshot. You can ask a question by entering the text in the Enter your question here field.

To clean up your resources, complete the following steps:

In this post, we demonstrated how you can use Aurora, Amazon Bedrock, and other AWS services to build an end-to-end generative AI chatbot application using SQL functions and Python. By storing text embeddings directly in Aurora, you can reduce latency and complexity compared to traditional architectures. The combination of Aurora ML, Amazon Bedrock FMs, and AWS compute like Amazon SageMaker provides a powerful environment for rapidly developing next-generation AI applications.

For more information, see Using Amazon Aurora machine learning with Aurora PostgreSQL.

Naresh Dhiman is a Sr. Solutions Architect at Amazon Web Services supporting US federal customers. He has over 25 years of experience as a technology leader and is a recognized inventor with six patents. He specializes in containers, machine learning, and generative AI on AWS.

Karan Lakhwani is a Customer Solutions Manager at Amazon Web Services supporting US federal customers. He specializes in generative AI technologies and is an AWS Golden Jacket recipient. Outside of work, Karan enjoys finding new restaurants and skiing.

More:
Build a FedRAMP compliant generative AI-powered chatbot using Amazon Aurora Machine Learning and Amazon ... - AWS Blog

Related Posts

Comments are closed.