# Chapter 4: LLM-Powered Agent

**Learning objectives:** Connect an agent to a real LLM, use the `LLMAgentBuilder`, handle streaming responses, and manage multi-turn conversations.
## LLM Providers in MoFA
MoFA supports four LLM providers out of the box:
| Provider | Crate | Helper function | Requires |
|---|---|---|---|
| OpenAI | `async-openai` | `OpenAIProvider::from_env()` | `OPENAI_API_KEY` |
| Anthropic | Custom | `AnthropicProvider::from_env()` | `ANTHROPIC_API_KEY` |
| Google Gemini | Custom | `GeminiProvider::from_env()` | `GOOGLE_API_KEY` |
| Ollama | Custom | `OllamaProvider::default()` | Ollama running locally |
All providers implement the `LLMProvider` trait from `mofa-kernel`:

```rust
#[async_trait]
pub trait LLMProvider: Send + Sync {
    fn name(&self) -> &str;
    fn default_model(&self) -> &str;

    async fn chat(&self, request: ChatCompletionRequest) -> LLMResult<ChatCompletionResponse>;
}
```
> **Architecture note:** The `LLMProvider` trait is defined in `mofa-kernel` (the contract), while `OpenAIProvider`, `OllamaProvider`, etc. live in `mofa-foundation` (the implementations). This is the microkernel pattern at work: you can create your own provider by implementing this trait.
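For example, here is a minimal sketch of a custom provider. The trait methods come from the definition above, but the import path and the `GatewayProvider` type are assumptions for illustration, and the `chat` body is left as a stub:

```rust
use async_trait::async_trait;
// Assumed import path; the trait and request/response types may be
// re-exported elsewhere (e.g. via mofa_sdk).
use mofa_kernel::llm::{ChatCompletionRequest, ChatCompletionResponse, LLMProvider, LLMResult};

/// Hypothetical provider that forwards requests to an internal gateway.
pub struct GatewayProvider {
    endpoint: String,
}

#[async_trait]
impl LLMProvider for GatewayProvider {
    fn name(&self) -> &str {
        "gateway"
    }

    fn default_model(&self) -> &str {
        "gateway-default"
    }

    async fn chat(&self, request: ChatCompletionRequest) -> LLMResult<ChatCompletionResponse> {
        // Serialize `request`, send it to `self.endpoint`, and map the reply
        // into a ChatCompletionResponse. Left as a stub in this sketch.
        let _ = request;
        todo!("forward the request to {}", self.endpoint)
    }
}
```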
## The `LLMAgentBuilder`

Instead of implementing `MoFAAgent` manually (as in Chapter 3), MoFA provides `LLMAgentBuilder`, a fluent builder that creates a fully featured LLM agent in a few lines:
```rust
use mofa_sdk::llm::{LLMAgentBuilder, OpenAIProvider};
use std::sync::Arc;

let agent = LLMAgentBuilder::new()
    .with_id("my-agent")
    .with_name("My Assistant")
    .with_provider(Arc::new(OpenAIProvider::from_env()))
    .with_system_prompt("You are a helpful AI assistant.")
    .with_temperature(0.7)
    .with_max_tokens(2048)
    .build();
```
The builder supports many options:
| Method | Purpose |
|---|---|
| `.with_id(id)` | Set agent ID |
| `.with_name(name)` | Set display name |
| `.with_provider(provider)` | Set LLM provider (required) |
| `.with_system_prompt(prompt)` | Set the system prompt |
| `.with_temperature(t)` | Set sampling temperature (0.0-2.0) |
| `.with_max_tokens(n)` | Set max response tokens |
| `.with_model(model)` | Override the default model name |
| `.with_session_id(id)` | Set initial session ID |
| `.with_sliding_window(n)` | Limit the conversation context window |
| `.from_env()` | Auto-detect provider from env vars |
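As a quick sketch of the less common options (the model name and session ID below are placeholder values, not defaults from the framework):

```rust
use mofa_sdk::llm::LLMAgentBuilder;

// Auto-detect the provider from environment variables, then override the
// model and keep only the 20 most recent messages in the context window.
let agent = LLMAgentBuilder::from_env()?
    .with_model("gpt-4o-mini")       // placeholder model name
    .with_session_id("demo-session") // placeholder session ID
    .with_sliding_window(20)
    .with_system_prompt("You are a helpful assistant.")
    .build();
```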
> **Rust tip: `Arc<dyn Trait>`.** `Arc::new(OpenAIProvider::from_env())` wraps the provider in an `Arc` (atomic reference-counted pointer). This is needed because the agent and its internal components share the same provider. `dyn LLMProvider` means "any type that implements `LLMProvider`"; this is Rust's dynamic dispatch, similar to a virtual method call in C++ or an interface reference in Java.
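To see the dynamic dispatch in action, here is a small sketch (it assumes `LLMProvider` is re-exported from `mofa_sdk::llm` alongside the providers):

```rust
use std::sync::Arc;
use mofa_sdk::llm::{LLMProvider, OllamaProvider, OpenAIProvider};

// This function only knows about the trait, not the concrete type, so any
// provider works here: OpenAI, Ollama, or one you wrote yourself.
fn describe(provider: Arc<dyn LLMProvider>) {
    println!("{} (default model: {})", provider.name(), provider.default_model());
}

describe(Arc::new(OpenAIProvider::from_env()));
describe(Arc::new(OllamaProvider::default()));
```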
## Build: A Streaming Chatbot
Let’s build a chatbot that streams responses and maintains conversation context.
Create a new project:

```bash
cargo new llm_chatbot
cd llm_chatbot
```
Edit `Cargo.toml`:

```toml
[package]
name = "llm_chatbot"
version = "0.1.0"
edition = "2024"

[dependencies]
mofa-sdk = { path = "../../crates/mofa-sdk" }
tokio = { version = "1", features = ["full"] }
tokio-stream = "0.1"
anyhow = "1"
```
Write `src/main.rs`:

```rust
use mofa_sdk::llm::{LLMAgentBuilder, OpenAIProvider};
use std::sync::Arc;
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // --- Step 1: Create the provider ---
    let provider = Arc::new(OpenAIProvider::from_env());

    // --- Step 2: Build the agent ---
    let agent = LLMAgentBuilder::new()
        .with_id("chatbot-001")
        .with_name("Tutorial Chatbot")
        .with_provider(provider)
        .with_system_prompt(
            "You are a friendly AI tutor helping students learn about \
             the MoFA agent framework. Keep answers concise.",
        )
        .with_temperature(0.7)
        .build();

    // --- Step 3: Simple Q&A (non-streaming) ---
    println!("=== Simple Q&A ===");
    let response = agent.ask("What is a microkernel architecture?").await?;
    println!("A: {}\n", response);

    // --- Step 4: Streaming response ---
    println!("=== Streaming ===");
    let mut stream = agent.ask_stream("Explain traits in Rust in 3 sentences.").await?;
    print!("A: ");
    while let Some(chunk) = stream.next().await {
        match chunk {
            Ok(text) => print!("{}", text),
            Err(e) => eprintln!("\nStream error: {}", e),
        }
    }
    println!("\n");

    // --- Step 5: Multi-turn conversation ---
    println!("=== Multi-turn Chat ===");
    let r1 = agent.chat("My name is Alice and I'm learning Rust.").await?;
    println!("A: {}\n", r1);

    let r2 = agent.chat("What's my name and what am I learning?").await?;
    println!("A: {}\n", r2);
    // The agent remembers context from the previous message!

    Ok(())
}
```
Run it:

```bash
cargo run
```
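You should see output shaped roughly like this. The section headers come from the `println!` calls above; the answers themselves will vary from run to run and model to model:

```text
=== Simple Q&A ===
A: A microkernel architecture keeps a small core and moves features into ...

=== Streaming ===
A: Traits define shared behavior ...

=== Multi-turn Chat ===
A: Nice to meet you, Alice! ...

A: Your name is Alice, and you're learning Rust.
```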
## Using Ollama Instead
To use a local Ollama model, just swap the provider:

```rust
use mofa_sdk::llm::{LLMAgentBuilder, OllamaProvider};
use std::sync::Arc;

let provider = Arc::new(OllamaProvider::default());
// Ollama listens on http://localhost:11434 by default

let agent = LLMAgentBuilder::new()
    .with_provider(provider)
    .with_model("llama3.2") // specify which Ollama model to use
    .with_system_prompt("You are a helpful assistant.")
    .build();
```
Or use the `from_env()` convenience method, which auto-detects the provider:

```rust
// Checks OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_API_KEY,
// and falls back to Ollama if none are set
let builder = LLMAgentBuilder::from_env()?;
let agent = builder
    .with_system_prompt("You are a helpful assistant.")
    .build();
```
## What Just Happened?

Let's trace what happens when you call `agent.ask("question")`:

1. The `LLMAgent` wraps your question in a `ChatMessage` with role `"user"`
2. It prepends the system prompt as a `ChatMessage` with role `"system"`
3. It builds a `ChatCompletionRequest` with temperature, max_tokens, etc.
4. It calls `provider.chat(request)`, which sends the request to the LLM API
5. The `ChatCompletionResponse` is unwrapped and the text content is returned
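Conceptually, the agent is doing something like the following. This is an illustrative sketch only: the constructors and field names here are assumptions, and the real types live in `mofa-kernel`:

```rust
// Illustrative sketch: ChatMessage::system/user and the request fields
// are assumed names, not the framework's actual API.
let messages = vec![
    ChatMessage::system("You are a helpful AI assistant."),
    ChatMessage::user("What is a microkernel architecture?"),
];

let request = ChatCompletionRequest {
    model: provider.default_model().to_string(),
    messages,
    temperature: Some(0.7),
    max_tokens: Some(2048),
    ..Default::default()
};

let response = provider.chat(request).await?;
```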
For `agent.chat()` (multi-turn), the agent also:

- Stores the user message in the current `ChatSession`
- Stores the assistant's response
- Includes all previous messages in the next request (conversation context)
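So by the second `chat()` call in the chatbot above, the request conceptually carries the whole history:

```text
system:    "You are a friendly AI tutor helping students learn about ..."
user:      "My name is Alice and I'm learning Rust."
assistant: "<first reply>"
user:      "What's my name and what am I learning?"
```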
For `agent.ask_stream()` and `agent.chat_stream()`:

- The provider returns a `TextStream` (a stream of string chunks)
- You consume it with `StreamExt::next()` in a loop
- Each chunk contains a piece of the response as it's generated
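If you want the full text rather than printing chunks as they arrive, you can accumulate the stream into a `String`. A sketch, using the same APIs as the chatbot example:

```rust
use tokio_stream::StreamExt;

let mut stream = agent.ask_stream("Summarize this chapter.").await?;
let mut full_text = String::new();
while let Some(chunk) = stream.next().await {
    full_text.push_str(&chunk?); // each chunk is a piece of the response
}
println!("{full_text}");
```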
> **Architecture note:** The `LLMAgent` struct lives in `mofa-foundation` (`crates/mofa-foundation/src/llm/agent.rs`). It implements the `MoFAAgent` trait internally, so it has the same lifecycle (initialize → execute → shutdown). The builder pattern is a convenience; under the hood, it constructs an `LLMAgentConfig` and passes it to `LLMAgent::new()`.
## Session Management

Each `LLMAgent` manages multiple chat sessions. This is useful for serving multiple users or maintaining separate conversation threads:

```rust
// Create a new session (returns the session ID)
let session_id = agent.create_session().await;

// Chat within a specific session
let r1 = agent.chat_with_session(&session_id, "Hello!").await?;

// Switch the active session
agent.switch_session(&session_id).await?;

// List all sessions
let sessions = agent.list_sessions().await;

// Get or create a session with a specific ID
let sid = agent.get_or_create_session("user-123-session").await;
```
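For example, here is a sketch that serves two users with isolated histories, using only the calls above (the session IDs are placeholders):

```rust
// Each user gets a dedicated session, so their histories never mix.
let alice = agent.get_or_create_session("user-alice").await;
let bob = agent.get_or_create_session("user-bob").await;

agent.chat_with_session(&alice, "My favorite language is Rust.").await?;
agent.chat_with_session(&bob, "My favorite language is Go.").await?;

// Each answer is based only on that user's own session history.
let a = agent.chat_with_session(&alice, "What's my favorite language?").await?;
let b = agent.chat_with_session(&bob, "What's my favorite language?").await?;
```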
## Loading from a Config File

For production use, you can define agent configuration in YAML:

```yaml
# agent.yml
agent:
  id: "my-agent-001"
  name: "My LLM Agent"
  description: "A helpful assistant"

llm:
  provider: openai
  model: gpt-4o
  api_key: ${OPENAI_API_KEY}
  temperature: 0.7
  max_tokens: 4096
  system_prompt: |
    You are a helpful AI assistant.
```
Load it in code:

```rust
use mofa_sdk::llm::agent_from_config;

let agent = agent_from_config("agent.yml")?;
let response = agent.ask("Hello!").await?;
```
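Assuming the `${OPENAI_API_KEY}` placeholder in the YAML is expanded from the environment, make sure the variable is set before running:

```bash
export OPENAI_API_KEY=sk-...   # your real key
cargo run
```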
## Key Takeaways

- `LLMAgentBuilder` is the recommended way to create LLM-powered agents
- Four providers are supported: OpenAI, Anthropic, Gemini, Ollama
- `agent.ask()` for one-off questions, `agent.chat()` for multi-turn conversations
- `agent.ask_stream()` / `agent.chat_stream()` for streaming responses
- Session management enables multi-user and multi-thread conversations
- `from_env()` auto-detects the provider from environment variables
- Config files (`agent.yml`) are supported for production deployments
Next: **Chapter 5: Tools and Function Calling**, where you give your agent the ability to call functions.