import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OllamaNativeClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Non-streaming request: the full answer arrives as a single JSON document
        String jsonPayload = """
                {
                  "model": "llama3",
                  "prompt": "Explain garbage collection in Java in one sentence.",
                  "stream": false
                }
                """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonPayload))
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println("Response Status: " + response.statusCode());
        System.out.println("Response Body: " + response.body());
    }
}

Option 2: LangChain4j (Recommended for Production)
Download a model (e.g., llama3 or mistral) via the Ollamac app interface.
Your Java application communicates directly with this local endpoint. This architecture ensures that your data never leaves your local environment, making it ideal for processing sensitive information.

Step 1: Setting Up the Local Environment
String answer = model.generate("What is the capital of France?");
System.out.println(answer);
import io.github.ollama4j.OllamaAPI;
import io.github.ollama4j.models.response.OllamaResult;

public class OllamaJavaTest {
    public static void main(String[] args) {
        String url = "http://localhost:11434";
        OllamaAPI ollamaAPI = new OllamaAPI(url);

        try {
            // Set the model you pulled earlier
            String model = "llama3";
            String prompt = "What is the best feature of Java?";

            System.out.println("Asking Ollama: " + prompt);
            OllamaResult result = ollamaAPI.generate(model, prompt, false, null);
            System.out.println("\nResponse:\n" + result.getResponse());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

2. Using LangChain4j
Spring AI: The easiest way to integrate with Spring Boot. It uses the OllamaChatModel API to handle chat completions and embeddings locally.
What are you planning to use (Spring Boot, Quarkus, or plain Java)?
Visit the official Ollama website and download the installer for your operating system (macOS, Linux, or Windows).
Integrating Large Language Models (LLMs) directly into enterprise applications has become a standard requirement for modern software development. While cloud-based APIs like OpenAI and Anthropic are popular, they introduce challenges regarding data privacy, recurring costs, and internet dependencies.
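Add the Spring AI Ollama starter to your Maven or Gradle project via Spring Initializr. The artifact is named spring-ai-starter-model-ollama in Spring AI 1.0 and later, and spring-ai-ollama-spring-boot-starter in earlier milestone releases, so check the Spring AI documentation for the version you use.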
Instead of hardcoding client configurations, Spring AI externalizes setup parameters into the application's properties file.
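To make the idea of externalized settings concrete, here is a plain-Java sketch that reads Ollama settings from properties text instead of hardcoding them. It is not Spring AI itself: the class name, the OllamaSettings record, and the fallback defaults are illustrative assumptions. The three keys follow the naming used in the Spring AI Ollama documentation as far as I know, so verify them against the version you depend on.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical illustration class, not part of Spring AI.
final class ExternalizedOllamaConfig {

    // Inline stand-in for an application.properties file.
    // Key names are assumed from the Spring AI Ollama documentation.
    static final String SAMPLE = """
            spring.ai.ollama.base-url=http://localhost:11434
            spring.ai.ollama.chat.options.model=llama3
            spring.ai.ollama.chat.options.temperature=0.7
            """;

    // Simple value holder for the settings a client would need.
    record OllamaSettings(String baseUrl, String model, double temperature) {}

    // Parses properties text and falls back to illustrative defaults
    // for any key that is missing.
    static OllamaSettings load(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new OllamaSettings(
                props.getProperty("spring.ai.ollama.base-url", "http://localhost:11434"),
                props.getProperty("spring.ai.ollama.chat.options.model", "llama3"),
                Double.parseDouble(
                        props.getProperty("spring.ai.ollama.chat.options.temperature", "0.7")));
    }

    public static void main(String[] args) {
        System.out.println(load(SAMPLE));
    }
}
```

In a real Spring Boot application the framework performs this binding for you; the point of the sketch is only that switching models or endpoints becomes a configuration change rather than a code change.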
public record SentimentAnalysis(String sentiment, double confidenceScore, boolean requiresHumanIntervention) {}
// LangChain4j AiServices can automatically map Ollama responses directly into this record.

Performance Optimization and Production Readiness
Optimizing performance involves tuning both the model and the client that calls it.
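As a sketch of tuning on both sides, the example below sets model options in the request body (temperature, num_ctx, and keep_alive, which the Ollama generate endpoint accepts to my knowledge) and client-side timeouts on java.net.http. The class name, the specific values, and the minimal JSON escaping helper are assumptions chosen for illustration, not recommended settings.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper for illustration; values are examples, not recommendations.
final class TunedOllamaRequest {

    // Minimal escaping for common cases only (quotes, backslashes, newlines, tabs).
    // Use a real JSON library for arbitrary input.
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t");
    }

    // Model-side tuning: sampling temperature, context window size, and how long
    // the model stays loaded in memory after the call.
    static String buildPayload(String model, String prompt, double temperature, int numCtx) {
        return String.format(Locale.ROOT,
                "{\"model\":\"%s\",\"prompt\":\"%s\",\"stream\":false,"
                        + "\"keep_alive\":\"10m\","
                        + "\"options\":{\"temperature\":%.2f,\"num_ctx\":%d}}",
                escapeJson(model), escapeJson(prompt), temperature, numCtx);
    }

    // Client-side tuning: a per-request timeout so a slow generation cannot hang forever.
    static HttpRequest buildRequest(String baseUrl, String payload) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/generate"))
                .timeout(Duration.ofSeconds(120))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
    }

    public static void main(String[] args) {
        // A connect timeout fails fast when the local server is not running.
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();

        String payload = buildPayload("llama3", "Summarize the JVM in one sentence.", 0.2, 4096);
        HttpRequest request = buildRequest("http://localhost:11434", payload);

        // Only builds and prints the request; send it with
        // client.send(request, HttpResponse.BodyHandlers.ofString()) when a server is running.
        System.out.println("Connect timeout: " + client.connectTimeout().orElseThrow());
        System.out.println("POST " + request.uri());
        System.out.println(payload);
    }
}
```

Reusing one HttpClient instance across calls and keeping the model resident with keep_alive both avoid repeated startup cost, which tends to dominate latency for short prompts.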
Alternatively, you can deploy Ollama using Docker. Run the following command to start the Ollama container and expose it on port 11434:

docker run -d -p 11434:11434 --name ollama ollama/ollama

Then, to pull the model inside the container, execute:

docker exec -it ollama ollama pull qwen2.5:7b
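Once the container is running, it can help to confirm from Java that the published port actually answers before wiring up a full client. The sketch below sends a GET to the /api/tags endpoint, which lists locally available models in the Ollama REST API as far as I know. The class and method names are hypothetical, and the timeouts are arbitrary example values.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper, not part of any Ollama library.
final class OllamaHealthCheck {

    // Returns true only if GET {baseUrl}/api/tags answers with HTTP 200.
    static boolean isReachable(String baseUrl) {
        try {
            HttpClient client = HttpClient.newBuilder()
                    .connectTimeout(Duration.ofSeconds(2))
                    .build();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(baseUrl + "/api/tags"))
                    .timeout(Duration.ofSeconds(5))
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for callers.
            Thread.currentThread().interrupt();
            return false;
        } catch (Exception e) {
            // Connection refused, timeout, malformed URL, and similar failures.
            return false;
        }
    }

    public static void main(String[] args) {
        // Port published by the docker run command above.
        String baseUrl = "http://localhost:11434";
        System.out.println(isReachable(baseUrl)
                ? "Ollama is reachable at " + baseUrl
                : "Ollama is not reachable at " + baseUrl);
    }
}
```

A check like this fits naturally at application startup, so a missing container produces a clear message instead of a stack trace on the first prompt.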
I can provide tailored configuration scripts, production-ready Spring Boot starters, or optimized system prompts based on your needs.