C6 — Streaming Replies + a Tool-Calling Agent

Summary (what this page covers): In C5 you stood up a synchronous chat endpoint on the Anthropic C# SDK. Now you'll add two things on top of it: SSE streaming (text arrives in small chunks as the model generates it) and a tool-calling agent that acts on real BookTracker data. The agent's write tool calls the C4 IReadingProgressService, so Day 1's feature becomes the agent's write surface, business rules and all.

Time: ~50 minutes · Format: hands-on, with Claude Code · You start from: checkpoint/c5-sdk-chat · You end at: checkpoint/c6-streaming-agent

C5 proved you could call Claude from C#. A single request/response is fine for short answers, but it has two limits: the user stares at a spinner until the whole reply lands, and the model can only talk — it can't do anything to your data. C6 fixes both. Streaming surfaces tokens as they're generated, and a tool-calling agent loop lets Claude search the catalog and update reading progress by calling your existing Core services. No new packages — you're still on the Anthropic SDK from C5, and your Anthropic API key is already in user-secrets.

1. Get to the C6 starting line

git switch -c my-c6 checkpoint/c5-sdk-chat
cd src/BookTracker
dotnet build BookTracker.sln

Expect Build succeeded. The C5 solution already has the Anthropic package, a Singleton AnthropicClient, and your API key in user-secrets — confirm with dotnet user-secrets list from BookTracker.Api. If the key is missing, re-add it as you did in C5.

The answer key for this lab is the tag checkpoint/c6-streaming-agent. Reach for it only after you've tried — git show checkpoint/c6-streaming-agent:<path> shows any file as-built.

2. Add the Core contracts (SDK-free)

Core never references the SDK, so the new abstractions live there as plain interfaces + DTOs. Add to BookTracker.Core:

Services/IStreamingService.cs — IAsyncEnumerable<string> StreamAsync(string prompt, CancellationToken ct = default).
Services/IAgentService.cs — Task<AgentResult> RunAsync(string message, CancellationToken ct = default).
Dtos/AgentDtos.cs — the request/result shapes:

namespace BookTracker.Core.Dtos;

public record AgentRequest(string Message);
public record AgentToolCall(string Name, string Input, string Result, bool IsError);
public record AgentResult(string Reply, IReadOnlyList<AgentToolCall> ToolCalls);

AgentToolCall is what makes the multi-step loop visible — the endpoint returns the list of tools the agent fired, not just the final text.
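Because these contracts are plain interfaces, anything that yields strings can stand in for the SDK. As a minimal sketch, the local function below mimics the shape of IStreamingService.StreamAsync with canned chunks. FakeStreamAsync and its output are hypothetical and not part of BookTracker:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in with the same signature shape as StreamAsync.
async IAsyncEnumerable<string> FakeStreamAsync(
    string prompt, [EnumeratorCancellation] CancellationToken ct = default)
{
    foreach (var piece in new[] { "You ", "said: ", prompt })
    {
        ct.ThrowIfCancellationRequested();   // stop promptly if the caller cancels
        await Task.Yield();                  // simulate asynchronous arrival
        yield return piece;
    }
}

// A consumer sees chunks one at a time, exactly as it would with the real service.
var reply = new StringBuilder();
await foreach (var chunk in FakeStreamAsync("hello"))
    reply.Append(chunk);

Console.WriteLine(reply);   // You said: hello
```

A fake like this lets you exercise an endpoint or a test without an API key or a network call.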

3. Stream tokens with SSE

The SDK's Messages.CreateStreaming returns an IAsyncEnumerable of stream events. You want the text deltas only. Add BookTracker.Api/Services/StreamingService.cs:

// [EnumeratorCancellation] requires: using System.Runtime.CompilerServices;
public async IAsyncEnumerable<string> StreamAsync(
    string prompt, [EnumeratorCancellation] CancellationToken ct = default)
{
    var parameters = new MessageCreateParams
    {
        Model = _options.Model,
        MaxTokens = _options.MaxTokens,
        Messages = [new() { Role = Role.User, Content = prompt }],
    };

    // Yield only the text deltas; every other stream event is skipped.
    await foreach (var ev in _client.Messages.CreateStreaming(parameters, cancellationToken: ct))
        if (ev.TryPickContentBlockDelta(out var delta) && delta.Delta.TryPickText(out var text))
            yield return text.Text;
}

Now the endpoint. Add a GET /api/chat/stream that writes one SSE frame per chunk and flushes after each one — without the flush, ASP.NET buffers the response and the client sees nothing until the whole reply is done:

app.MapGet("/api/chat/stream", async (
    string q, IStreamingService streaming, HttpResponse response, CancellationToken ct) =>
{
    response.Headers.ContentType = "text/event-stream";
    await foreach (var chunk in streaming.StreamAsync(q, ct))
    {
        // A chunk can contain newlines, and SSE needs a "data:" prefix on every line of an event.
        var payload = chunk.Replace("\n", "\ndata: ");
        await response.WriteAsync($"data: {payload}\n\n", ct);
        await response.Body.FlushAsync(ct);   // without this, the client waits for the end
    }
});
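Each SSE event is one or more data: lines followed by a blank line, and a client joins multiple data: lines with a newline. The sketch below shows that framing as pure string handling, which is why a chunk containing a line break needs one data: line per line of text. ToSseFrame and ParseSseFrame are hypothetical helper names, and the parser handles only data: fields (no event:, id:, or retry:):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Format one text chunk as a single SSE event:
// one "data:" line per line of text, then a blank line to end the event.
string ToSseFrame(string chunk) =>
    string.Concat(chunk.ReplaceLineEndings("\n").Split('\n').Select(line => $"data: {line}\n")) + "\n";

// Minimal client-side view of one event: collect the data lines,
// drop the single optional space after the colon, and join with newlines.
string ParseSseFrame(string frame)
{
    var lines = new List<string>();
    foreach (var line in frame.Split('\n'))
    {
        if (!line.StartsWith("data:", StringComparison.Ordinal)) continue;
        var value = line["data:".Length..];
        lines.Add(value.StartsWith(' ') ? value[1..] : value);
    }
    return string.Join("\n", lines);
}

var frame = ToSseFrame("Chapter 1\nArrakis");
Console.Write(frame);
// data: Chapter 1
// data: Arrakis
// (blank line)

Console.WriteLine(ParseSseFrame(frame) == "Chapter 1\nArrakis");   // True
```

Only one space after the colon is stripped on the client side, so a chunk that begins with a space (common between words in model output) survives the round trip.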

4. Define the agent's tools

Tools are the agent's hands. Add BookTracker.Api/Tools/BookTrackerTools.cs with three tools — each a Tool with a name, a clear Description (this is the contract the model reads to decide when to call it), and a JSON input schema. Wire each to a real Core service, never to the data layer directly:

Tool	Input	Calls (Core service)
`find_book`	`{ query }`	`IBookService.SearchAsync`
`get_reading_progress`	`{ bookId }`	`IReadingProgressService.GetForBookAsync`
`update_reading_progress`	`{ bookId, currentPage, status }`	`IReadingProgressService.UpdateAsync` (the C4 service)

Keep the dispatch (name → service) as a separate instance method so it's unit-testable without the SDK later (this matters at C9):

public async Task<string> ExecuteAsync(
    string name, IReadOnlyDictionary<string, JsonElement> input, CancellationToken ct = default)
    => name switch
    {
        "find_book"               => Serialize(await _books.SearchAsync(input["query"].GetString() ?? "", ct)),
        "get_reading_progress"    => Serialize(await _progress.GetForBookAsync(input["bookId"].GetInt32(), ct)),
        "update_reading_progress" => Serialize(await _progress.UpdateAsync(
                                         input["bookId"].GetInt32(),
                                         new UpdateReadingProgressRequest(
                                             input.TryGetValue("currentPage", out var cp) ? cp.GetInt32() : 0,
                                             input["status"].GetString() ?? ""),
                                         ct)),
        _ => $"Unknown tool: {name}",
    };

find_book calls the service method IBookService.SearchAsync — not the Data-layer repository's SearchByTitleAsync. The agent acts through your code, so it inherits your conventions.
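Because dispatch takes a tool name and a dictionary of JsonElement values, it can be exercised with hand-written JSON and no model at all. A minimal sketch, assuming a fake search function in place of IBookService; FakeSearchAsync, ParseInput, and the returned title are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical stand-in for IBookService.SearchAsync.
Task<string[]> FakeSearchAsync(string query) =>
    Task.FromResult(query.Contains("dune", StringComparison.OrdinalIgnoreCase)
        ? new[] { "Dune" }
        : Array.Empty<string>());

// Same dispatch shape as ExecuteAsync above, reduced to one tool.
async Task<string> ExecuteAsync(string name, IReadOnlyDictionary<string, JsonElement> input) =>
    name switch
    {
        "find_book" => JsonSerializer.Serialize(await FakeSearchAsync(input["query"].GetString() ?? "")),
        _ => $"Unknown tool: {name}",
    };

// Turn a raw JSON object into the dictionary the dispatcher expects.
IReadOnlyDictionary<string, JsonElement> ParseInput(string json) =>
    JsonSerializer.Deserialize<Dictionary<string, JsonElement>>(json)!;

Console.WriteLine(await ExecuteAsync("find_book", ParseInput("{\"query\":\"dune\"}")));   // ["Dune"]
Console.WriteLine(await ExecuteAsync("delete_book", ParseInput("{}")));                   // Unknown tool: delete_book

// A missing required field surfaces as an exception, which the agent loop's
// per-tool try/catch is expected to turn into an is_error result.
try { await ExecuteAsync("find_book", ParseInput("{}")); }
catch (KeyNotFoundException) { Console.WriteLine("missing 'query' throws KeyNotFoundException"); }
```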

5. Write the agent loop

Add BookTracker.Api/Services/AgentService.cs. The teaching point of this section: implement it as a loop, not a single call. The model calls a tool, you run it, you feed the result back, and it may call another. Spec the manual loop as the reference:

client.Messages.Create with Tools = BookTrackerTools.Definitions and the running Messages.
If response.StopReason != "tool_use" → done; extract and return the final text.
Otherwise: rebuild the assistant turn from response.Content (TextBlockParam + ToolUseBlockParam), ExecuteAsync each ToolUseBlock, collect one ToolResultBlockParam per call (set is_error on failure — never drop one, or the next request is malformed), append the assistant echo plus a user turn of tool results, and loop.
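For orientation, the assistant echo and the tool-result turn look roughly like this on the wire. The sketch builds the two messages from anonymous objects and prints them as JSON with the Messages API's snake_case field names. The id toolu_01 and the text payloads are placeholders, and in the lab you build the same shape with the SDK's block types rather than raw JSON:

```csharp
using System;
using System.Text.Json;

// Assistant turn echoed back to the API: any text plus the tool_use block the model emitted.
var assistantEcho = new
{
    role = "assistant",
    content = new object[]
    {
        new { type = "text", text = "Let me look that up." },
        new { type = "tool_use", id = "toolu_01", name = "find_book", input = new { query = "Dune" } },
    },
};

// User turn carrying one tool_result per tool_use, matched by tool_use_id.
var toolResults = new
{
    role = "user",
    content = new object[]
    {
        new { type = "tool_result", tool_use_id = "toolu_01", content = "Found 1 book: Dune (id 1)", is_error = false },
    },
};

var wireJson = JsonSerializer.Serialize(
    new object[] { assistantEcho, toolResults },
    new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine(wireJson);
```

Each tool_result carries the tool_use_id of the call it answers, which is why a dropped result leaves the next request malformed.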

Two guards keep a misbehaving conversation from running away or crashing the request:

A max-iteration cap (e.g. 5) — if the loop hasn't converged, stop and return a clear message.
A per-tool try/catch — turn any exception into an is_error tool result the model can read and explain, instead of a 500. (Let OperationCanceledException propagate — cancellation isn't a tool error.)

private const int MaxIterations = 5;
// ...
for (var iteration = 0; iteration < MaxIterations; iteration++)
{
    // Call the model at the top of each pass so every response is inspected,
    // including the one that follows the last allowed round of tool calls.
    var response = /* await client.Messages.Create with Tools + the running Messages */;

    if (response.StopReason != "tool_use")
        return new AgentResult(finalText, toolCalls);   // finalText = the text blocks of response.Content

    foreach (var toolUse in /* ToolUseBlocks in response.Content */)
    {
        string result; var isError = false;
        try { result = await _tools.ExecuteAsync(toolUse.Name, toolUse.Input, ct); }
        catch (OperationCanceledException) { throw; }   // cancellation is not a tool error
        catch (Exception ex) { result = $"Error: {ex.Message}"; isError = true; }
        toolCalls.Add(new AgentToolCall(toolUse.Name, /* input json */, result, isError));
        // append a ToolResultBlockParam (with is_error = isError) to the next user turn
    }
    // append assistant echo + user tool-result turn; the next pass sends them
}
return new AgentResult("Stopped after reaching the maximum number of tool iterations.", toolCalls);

Auto alternative: the SDK ships BetaToolRunner (client.Beta.Messages.ToolRunner(...)) that drives this loop for you. The manual loop is the portable teaching version — and keeping ExecuteAsync separate from the loop pays off at C9, where the LLM call is refactored behind an IAgentLlm seam so the loop becomes unit-testable.
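To watch both guards work without spending API calls, the loop can be driven by a scripted stand-in for the model. This is a simplified sketch, not the lab's AgentService: the model is a hypothetical function from turn index to (stop reason, text, tool name), each turn requests at most one tool, the message history is omitted, and the error text is invented. It calls the model at the top of each pass.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

const int MaxIterations = 5;

async Task<(string Reply, List<(string Name, string Result, bool IsError)> ToolCalls)> RunAsync(
    Func<int, (string StopReason, string Text, string ToolName)> model,   // hypothetical model stand-in
    Func<string, Task<string>> executeTool,                               // hypothetical tool dispatcher
    CancellationToken ct = default)
{
    var toolCalls = new List<(string Name, string Result, bool IsError)>();
    for (var iteration = 0; iteration < MaxIterations; iteration++)
    {
        ct.ThrowIfCancellationRequested();
        var response = model(iteration);
        if (response.StopReason != "tool_use")
            return (response.Text, toolCalls);          // converged: final text

        string result; var isError = false;
        try { result = await executeTool(response.ToolName); }
        catch (OperationCanceledException) { throw; }   // cancellation is not a tool error
        catch (Exception ex) { result = $"Error: {ex.Message}"; isError = true; }
        toolCalls.Add((response.ToolName, result, isError));
    }
    return ("Stopped after reaching the maximum number of tool iterations.", toolCalls);
}

// Scenario 1: two tool calls (the second one throws), then a final answer.
var script = new (string StopReason, string Text, string ToolName)[]
{
    ("tool_use", "", "find_book"),
    ("tool_use", "", "update_reading_progress"),
    ("end_turn", "Dune is already completed, so I could not move it back.", ""),
};
var converged = await RunAsync(
    turn => script[turn],
    tool => tool == "update_reading_progress"
        ? throw new InvalidOperationException("Completed -> Reading is not allowed")
        : Task.FromResult("Found 1 book: Dune (id 1)"));
Console.WriteLine($"{converged.ToolCalls.Count} tool calls; second IsError={converged.ToolCalls[1].IsError}");

// Scenario 2: a model that never stops asking for tools hits the cap.
var runaway = await RunAsync(
    turn => ("tool_use", "", "find_book"),
    tool => Task.FromResult("[]"));
Console.WriteLine($"{runaway.ToolCalls.Count} tool calls; reply: {runaway.Reply}");
```

In the first scenario the exception becomes an error result the model can read, and the request still returns normally. In the second, the cap ends the conversation after five rounds.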

6. Endpoints + DI

Add POST /api/agent alongside the stream endpoint in Api/Endpoints/AgentEndpoints.cs. It takes an AgentRequest, runs the loop, and returns the final text plus the tool calls made:

app.MapPost("/api/agent", async (AgentRequest request, IAgentService agent, CancellationToken ct) =>
{
    var result = await agent.RunAsync(request.Message, ct);
    return TypedResults.Ok(result);
});

In Program.cs, register the new services as scoped and map the endpoints — and keep the C5 Singleton AnthropicClient:

builder.Services.AddScoped<BookTrackerTools>();
builder.Services.AddScoped<IStreamingService, StreamingService>();
builder.Services.AddScoped<IAgentService, AgentService>();
// ...
app.MapAgentEndpoints();

dotnet build BookTracker.sln
dotnet test BookTracker.sln

Both should be green before you verify behavior.

7. Verify it

Run the API (dotnet run --project BookTracker.Api) and exercise both endpoints from a second terminal.

Streaming — curl -N keeps the connection open so you can watch tokens land incrementally:

curl -N "http://localhost:5255/api/chat/stream?q=hello"

You should see the reply build up piece by piece, not appear all at once at the end.

The agent, two-step ask — this should fire 2+ tool calls (a find_book, then an update_reading_progress) before the final reply:

curl -X POST http://localhost:5255/api/agent \
  -H "Content-Type: application/json" \
  -d '{"message":"Find Dune and mark it completed"}'

The toolCalls array in the response (and the server logs) shows the loop's steps. Re-query the book to confirm the update hit the database — the agent wrote through your real C4 service.

The C4 through-line — because update_reading_progress runs the C4 IReadingProgressService, the agent enforces the C4 business rules for free. Ask it to make an illegal transition:

curl -X POST http://localhost:5255/api/agent \
  -H "Content-Type: application/json" \
  -d '{"message":"Set Dune back to Reading"}'

The service rejects the Completed → Reading move, your try/catch turns it into an is_error tool result, and the model explains the failure instead of the request crashing. Day 1's debugging-protected feature is now an agent capability — with its guardrails intact.

✅ Checkpoint — you're done when:

dotnet build and dotnet test are both green.
curl -N .../api/chat/stream?q=hello shows tokens arriving incrementally, not all at once.
POST /api/agent with a 2-step ask fires 2+ tool calls (visible in toolCalls / logs) before the final reply.
Tools read and write real data — an update_reading_progress call changes the DB.
A rule-violating ask (Completed → Reading) returns a tool error the agent surfaces — the C4 rules hold.
The loop is guarded: a max-iteration cap, and per-tool errors don't crash the request.
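Tag the state: commit your work, then tag it under a name of your own, for example git tag my-c6-done (the workshop's checkpoint/c6-streaming-agent tag already exists, and git tag refuses to reuse an existing name unless forced), or compare against the workshop's tag.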

What's next

Lab 3 (C6 → C7): RAG-grounded recommendations. You'll give Claude a retrieval step so its book suggestions are grounded in your catalog's actual content — not just the model's training data.