In this quickstart you can use your own data with Azure OpenAI models. Using Azure OpenAI's models on your data can provide you with a powerful conversational AI platform that enables faster and more accurate communication.
Azure OpenAI requires registration and is currently only available to approved enterprise customers and partners. See Limited access to Azure OpenAI Service for more information. You can apply for access to Azure OpenAI by completing the form at Open an issue on this repo to contact us if you have an issue.
Navigate to Azure OpenAI Studio and sign-in with credentials that have access to your Azure OpenAI resource. During or after the sign-in workflow, select the appropriate directory, Azure subscription, and Azure OpenAI resource.
In the pane that appears, select Upload files (preview) under Select data source. Azure OpenAI needs both a storage resource and a search resource to access and index your data.
For Azure OpenAI to access your storage account, you will need to turn on Cross-origin resource sharing (CORS). If CORS isn't already turned on for the Azure Blob Storage resource, select Turn on CORS.
Start exploring Azure OpenAI capabilities with a no-code approach through the chat playground. It's simply a text box where you can submit a prompt to generate a completion. From this page, you can quickly iterate and experiment with the capabilities.
The playground gives you options to tailor your chat experience. On the right, you can select Deployment to determine which model generates a response using the search results from your index. You choose the number of past messages to include as conversation history for future generated responses. Conversation history gives context to generate related responses but also consumes token usage. The input token progress indicator keeps track of the token count of the question you submit.
The Advanced settings on the left are runtime parameters, which give you control over retrieval and search relevant information from your data. A good use case is when you want to make sure responses are generated only based on your data or you find the model cannot generate a response based on existed information on your data.
Strictness determines the system's aggressiveness in filtering search documents based on their similarity scores. Setting strictness to 5 indicates that the system will aggressively filter out documents, applying a very high similarity threshold. Semantic search can be helpful in this scenario because the ranking models do a better job of inferring the intent of the query. Lower levels of strictness produce more verbose answers, but might also include information that isn't in your index. This is set to 3 by default.
Retrieved documents is an integer that can be set to 3, 5, 10, or 20, and controls the number of document chunks provided to the large language model for formulating the final response. By default, this is set to 5.
Queries that require data analysis would probably fail, such as "Which health plan is most popular?". Queries that require information about all of your data will also likely fail, such as "How many documents have I uploaded?". Remember that the search engine looks for chunks having exact or similar terms, phrases, or construction to the query. And while the model might understand the question, if search results are chunks from the data set, it's not the right information to answer that kind of question.
Chats are constrained by the number of documents (chunks) returned in the response (limited to 3-20 in Azure OpenAI Studio playground). As you can imagine, posing a question about "all of the titles" requires a full scan of the entire vector store.
Select your subscription, resource group, location, and pricing plan for the published app. Toupdate an existing app, select Publish to an existing web app and choose the name of your previousapp from the dropdown menu.
To successfully make a call against Azure OpenAI, you need the following variables. This quickstart assumes you've uploaded your data to an Azure blob storage account and have an Azure AI Search index created. See Add your data using Azure AI studio
In a console window (such as cmd, PowerShell, or Bash), use the dotnet new command to create a new console app with the name azure-openai-quickstart. This command creates a simple "Hello World" project with a single C# source file: Program.cs.
This will wait until the model has generated its entire response before printing the results. Alternatively, if you want to asynchronously stream the response and print the results, you can replace the contents of Program.cs with the code in the next example.
To successfully make a call against Azure OpenAI, you need the following variables. This quickstart assumes you've uploaded your data to an Azure blob storage account and have an Azure AI Search index created. For more information, see Add your data using Azure AI studio.
Spring AI doesn't currently support the AzureCognitiveSearchChatExtensionConfiguration options that allow an Azure AI query to encapsulate the Retrieval Augmented Generation (RAG) method and hide the details from the user. As an alternative, you can still invoke the RAG method directly in your application to query data in your Azure AI Search index and use retrieved documents to augment your query.
Spring AI supports a VectorStore abstraction, and you can wrap Azure AI Search can be wrapped in a Spring AI VectorStore implementation for querying your custom data. The following project implements a custom VectorStore backed by Azure AI Search and directly executes RAG operations.
Run the spring init command from your working directory. This command creates a standard directory structure for your Spring project including the main Java class source file and the pom.xml file used for managing Maven based projects.
The Azure OpenAI chat models are optimized to work with inputs formatted as a conversation. The messages variable passes an array of dictionaries with different roles in the conversation delineated by system, user, tool, and assistant. The dataSources variable connects to your Azure Cognitive Search index, and enables Azure OpenAI models to respond using your data.
For production, use a secure way of storing and accessing your credentials like The PowerShell Secret Management with Azure Key Vault. For more information about credential security, see the Azure AI services security article.
To start chatting with the Azure OpenAI model that uses your data, you can deploy a web app using Azure OpenAI studio or example code we provide on GitHub. This app deploys using Azure app service, and provides a user interface for sending queries. This app can be used Azure OpenAI models that use your data, or models that don't use your data. See the readme file in the repo for instructions on requirements, setup, and deployment. You can optionally customize the frontend and backend logic of the web app by making changes to the source code.
The Azure OpenAI chat models are optimized to work with inputs formatted as a conversation. The messages variable passes an array of dictionaries with different roles in the conversation delineated by system, user, tool, and assistant. The dataSources variable connects to your Azure AI Search index, and enables Azure OpenAI models to respond using your data.
If you want to clean up and remove an Azure OpenAI or Azure AI Search resource, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.
The maxTokens parameter limit the length of the answer that you will get. For example, in order to get an answer of around one or two sentences, set this as 250 to be safe. You may get a short or imcomplete answer if you set this too low. All free users can spend 150K tokens every minute.
Do you want an informative or a creative answer? The temperature parameter defines this. This should be a decimal between 0 and 1, with values closer to 1 returning more informative answers, and values closer to 0 returning more creative answers.
d3342ee215