Friday, December 19, 2025

Spring AI tutorial: Get started with Spring AI

Use the following context to answer the user's question.
If the question cannot be answered from the context, state that clearly.

Context:
{context}

Question:
{question}

Then I created a new SpringAIRagService:

package com.infoworld.springaidemo.service;

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

@Service
public class SpringAIRagService {
    @Value("classpath:/templates/rag-template.st")
    private Resource promptTemplate;
    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public SpringAIRagService(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }

    public String query(String question) {
        SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .topK(2)
                .build();
        List<Document> similarDocuments = vectorStore.similaritySearch(searchRequest);
        String context = similarDocuments.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n"));

        Prompt prompt = new PromptTemplate(promptTemplate)
                .create(Map.of("context", context, "question", question));

        return chatClient.prompt(prompt)
                .call()
                .content();
    }
}

The SpringAIRagService wires in a ChatClient.Builder, which we use to build a ChatClient, along with our VectorStore. The query() method accepts a question and uses the VectorStore to build the context. First, we need to build a SearchRequest, which we do by:

  • Invoking its static builder() method.
  • Passing the question as the query.
  • Using the topK() method to specify how many documents we want to retrieve from the vector store.
  • Calling its build() method.

In this case, we want to retrieve the top two documents that are most similar to the question. In practice, you'll use something larger, such as the top three or top five, but since we only have three documents, I limited it to two.
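Conceptually, topK just keeps the k highest-scoring matches. Here is a minimal plain-Java sketch of that selection (the Scored record and the scores are hypothetical stand-ins, not part of Spring AI):

```java
import java.util.Comparator;
import java.util.List;

public class TopKDemo {
    // Hypothetical pairing of a document's text with its similarity score.
    record Scored(String text, double score) {}

    // Keep the k entries with the highest similarity scores.
    static List<Scored> topK(List<Scored> candidates, int k) {
        return candidates.stream()
                .sorted(Comparator.comparingDouble(Scored::score).reversed())
                .limit(k)
                .toList();
    }

    public static void main(String[] args) {
        List<Scored> docs = List.of(
                new Scored("doc-a", 0.91),
                new Scored("doc-b", 0.42),
                new Scored("doc-c", 0.78));
        // Prints the two most similar documents: doc-a, then doc-c.
        topK(docs, 2).forEach(d -> System.out.println(d.text()));
    }
}
```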

Next, we invoke the vector store's similaritySearch() method, passing it our SearchRequest. The similaritySearch() method will use the vector store's embedding model to create a multidimensional vector of the question. It will then compare that vector to each document and return the documents that are most similar to the question. We stream over all similar documents, get their text, and build a context String.
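Under the hood, that comparison is typically cosine similarity between embedding vectors. Here is a self-contained sketch of the calculation with toy three-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and the exact metric depends on your vector store):

```java
public class CosineDemo {
    // Cosine similarity: dot product of the vectors divided by the
    // product of their magnitudes; values near 1.0 mean very similar.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] question = {0.9, 0.1, 0.3};
        double[] docA = {0.8, 0.2, 0.4};  // points in a similar direction
        double[] docB = {0.1, 0.9, 0.2};  // points in a different direction
        // docA scores much higher than docB, so it would rank first.
        System.out.printf("docA: %.3f%n", cosine(question, docA));
        System.out.printf("docB: %.3f%n", cosine(question, docB));
    }
}
```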

Next, we create our prompt, which tells the LLM to answer the question using the context. Note that it is important to tell the LLM to use the context to answer the question and, if it cannot, to state that it cannot answer the question from the context. If we don't provide these instructions, the LLM will use the data it was trained on to answer the question, which means it will use information not in the context we've provided.
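The PromptTemplate fills the {context} and {question} placeholders from the map we pass to create(). The idea behind that substitution can be sketched in a few lines of plain Java (this is an illustration, not Spring AI's actual implementation):

```java
import java.util.Map;

public class TemplateDemo {
    // Replace each {key} placeholder in the template with its map value.
    static String render(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("{" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "Context:\n{context}\n\nQuestion:\n{question}";
        String prompt = render(template, Map.of(
                "context", "Spring AI supports RAG.",
                "question", "Does Spring AI support RAG?"));
        System.out.println(prompt);
    }
}
```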

Finally, we build the prompt, setting its context and question, and invoke the ChatClient. I added a SpringAIRagController to handle POST requests and pass them to the SpringAIRagService:

package com.infoworld.springaidemo.web;

import com.infoworld.springaidemo.model.SpringAIQuestionRequest;
import com.infoworld.springaidemo.model.SpringAIQuestionResponse;
import com.infoworld.springaidemo.service.SpringAIRagService;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SpringAIRagController {
    private final SpringAIRagService springAIRagService;

    public SpringAIRagController(SpringAIRagService springAIRagService) {
        this.springAIRagService = springAIRagService;
    }

    @PostMapping("/springAIQuestion")
    public ResponseEntity<SpringAIQuestionResponse> askAIQuestion(@RequestBody SpringAIQuestionRequest questionRequest) {
        String answer = springAIRagService.query(questionRequest.question());
        return ResponseEntity.ok(new SpringAIQuestionResponse(answer));
    }
}

The askAIQuestion() method accepts a SpringAIQuestionRequest, which is a Java record:

package com.infoworld.springaidemo.model;

public record SpringAIQuestionRequest(String question) {
}

The askAIQuestion() method returns a SpringAIQuestionResponse, also a record:

package com.infoworld.springaidemo.model;

public record SpringAIQuestionResponse(String answer) {
}

Now restart your application and execute a POST to /springAIQuestion. In my case, I sent the following request body:

{
    "question": "Does Spring AI support RAG?"
}

And received the following response:

{
    "answer": "Yes. Spring AI explicitly supports Retrieval Augmented Generation (RAG), including chat memory, integrations with major vector stores, a portable vector store API with metadata filtering, and a document injection ETL framework to build RAG pipelines."
}

As you can see, the LLM used the context of the documents we loaded into the vector store to answer the question. We can further test whether it's following our instructions by asking a question that's not in our context:

{
    "question": "Who created Java?"
}

Here is the LLM's response:

{
    "answer": "The provided context does not include information about who created Java."
}

This is an important validation that the LLM is only using the provided context to answer the question, and not falling back on its training data or, worse, attempting to make up an answer.

Conclusion

This article introduced you to using Spring AI to incorporate large language model capabilities into Spring-based applications. You can configure LLMs and other AI technologies using Spring's standard application.yaml file, then wire them into Spring components. Spring AI provides an abstraction for interacting with LLMs, so you don't need to use LLM-specific SDKs. For experienced Spring developers, this entire process is similar to how Spring Data abstracts database interactions using Spring Data interfaces.

In this example, you saw how to configure and use a large language model in a Spring MVC application. We configured OpenAI to answer simple questions, introduced prompt templates to externalize LLM prompts, and concluded by using a vector store to implement a simple RAG service in our example application.

Spring AI has a powerful set of capabilities, and we've only scratched the surface of what you can do with it. I hope the examples in this article provide enough foundational knowledge to help you start building AI applications with Spring. Once you're comfortable configuring and accessing large language models in your applications, you can dive into more advanced AI programming, such as building AI agents to improve your business processes.

Read next: The hidden skills behind the AI engineer.
