Lab 1 — Your First AI-Native ASP.NET Core Feature

Summary: The first build. Attendees wire the SDK into BookTracker and ship a working /api/chat endpoint with multi-turn history and prompt caching, committed to their fork. Keep the steps copy-paste exact.

Duration: 45 min · Deliverable: working /api/chat with history and prompt caching — committed

Part A — Install & register (≈10 min)

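dotnet add BookTracker.Api package Anthropic
# One-time setup: gives the project a UserSecretsId if it does not have one yet.
dotnet user-secrets init --project BookTracker.Api
dotnet user-secrets set "Anthropic:ApiKey" "sk-ant-..." --project BookTracker.Api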

Register the client as a Singleton and your service as Scoped in Program.cs:

using Anthropic;

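// Fail fast at startup if the secret is missing, rather than on the first request.
var apiKey = builder.Configuration["Anthropic:ApiKey"]
    ?? throw new InvalidOperationException("Anthropic:ApiKey is not configured.");

builder.Services.AddSingleton(new AnthropicClient {
    ApiKey = apiKey
});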
builder.Services.AddScoped<IClaudeService, ClaudeService>();

Singleton because AnthropicClient holds an HttpClient: a Scoped or Transient registration would create a new client per request or per resolution, which can exhaust sockets under load. (See Section 1.)
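To see the two lifetimes side by side, here is a minimal sketch that runs without the SDK. FakeClient and FakeService are hypothetical stand-ins for AnthropicClient and ClaudeService, and each CreateScope() call simulates one HTTP request.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical stand-ins for AnthropicClient and ClaudeService.
public sealed class FakeClient { }

public sealed class FakeService
{
    public FakeService(FakeClient client) => Client = client;
    public FakeClient Client { get; }
}

public static class LifetimeDemo
{
    // Resolves the service in two separate scopes (two simulated requests)
    // and reports whether the client and the service instances are shared.
    public static (bool SameClient, bool SameService) Run()
    {
        var services = new ServiceCollection();
        services.AddSingleton(new FakeClient()); // one instance for the whole app
        services.AddScoped<FakeService>();       // one instance per scope

        using var provider = services.BuildServiceProvider();
        using var request1 = provider.CreateScope();
        using var request2 = provider.CreateScope();

        var s1 = request1.ServiceProvider.GetRequiredService<FakeService>();
        var s2 = request2.ServiceProvider.GetRequiredService<FakeService>();

        return (ReferenceEquals(s1.Client, s2.Client), ReferenceEquals(s1, s2));
    }
}
```

LifetimeDemo.Run() returns (true, false): both requests get their own service instance, but they share the single client and therefore its underlying connections.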

Part B — ClaudeService + chat endpoint (≈20 min)

Implement IClaudeService.ChatAsync(userMessage, history, ct):

  • Map your stored history to a MessageCreateParams.Messages list and append the new user turn.
  • Add a system prompt that scopes the assistant to BookTracker (what it is, what it knows).
  • Return the text from response.Content (filter to TextBlock).

Wire POST /api/chat and persist history with IDistributedCache keyed by session id, so the conversation survives across HTTP requests (the API itself is stateless).

app.MapPost("/api/chat", async (ChatRequest req, IClaudeService claude, CancellationToken ct) =>
{
    var history = await LoadHistory(req.SessionId);
    var reply = await claude.ChatAsync(req.Message, history, ct);
    await SaveHistory(req.SessionId, history, req.Message, reply);
    return Results.Ok(new { reply });
});
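The endpoint above leaves LoadHistory and SaveHistory undefined. One possible shape for them is sketched below, assuming history is stored as JSON under a per-session key. ChatHistoryStore, ChatTurn, the ChatRequest shape, the "chat:" key prefix, and the 30-minute sliding expiration are all illustrative assumptions, not part of the lab's starter code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;

// Hypothetical shapes: the lab does not specify these types.
public record ChatRequest(string SessionId, string Message);
public record ChatTurn(string Role, string Text);

public class ChatHistoryStore
{
    // Assumed session lifetime; adjust to taste.
    private static readonly DistributedCacheEntryOptions EntryOptions =
        new() { SlidingExpiration = TimeSpan.FromMinutes(30) };

    private readonly IDistributedCache _cache;

    public ChatHistoryStore(IDistributedCache cache) => _cache = cache;

    private static string Key(string sessionId) => $"chat:{sessionId}";

    // Returns an empty list for a session that has no stored history yet.
    public async Task<List<ChatTurn>> LoadAsync(string sessionId, CancellationToken ct = default)
    {
        var json = await _cache.GetStringAsync(Key(sessionId), ct);
        return json is null
            ? new List<ChatTurn>()
            : JsonSerializer.Deserialize<List<ChatTurn>>(json) ?? new List<ChatTurn>();
    }

    // Appends the new user turn and the assistant reply, then writes the whole list back.
    public async Task SaveAsync(string sessionId, List<ChatTurn> history,
        string userMessage, string reply, CancellationToken ct = default)
    {
        var updated = new List<ChatTurn>(history)
        {
            new("user", userMessage),
            new("assistant", reply),
        };
        await _cache.SetStringAsync(Key(sessionId), JsonSerializer.Serialize(updated), EntryOptions, ct);
    }
}
```

For local development, builder.Services.AddDistributedMemoryCache() provides an in-process IDistributedCache, so a store like this works without Redis or SQL Server. Swapping in a real distributed cache later only changes the registration.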

Part C — Prompt caching (≈15 min)

Mark the system prompt with CacheControlEphemeral, then make two calls and compare usage:

System = new List<TextBlockParam> {
    new() { Text = bookTrackerSystemPrompt, CacheControl = new CacheControlEphemeral() },
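},

  • First call: response.Usage.CacheCreationInputTokens is non-zero (you wrote the cache).
  • Second call, made within the cache's default 5-minute lifetime: response.Usage.CacheReadInputTokens is non-zero (you read it), and those cached tokens bill at roughly 10% of the normal input rate. Note the cost delta.
  • If both counters stay at zero, the system prompt is probably shorter than the model's minimum cacheable length (1,024 tokens or more, depending on the model).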
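To make the cost delta concrete, the sketch below expresses input cost in "base input token equivalents", so no real price is hard-coded. CacheCost is a hypothetical helper. The 0.10 read multiplier is the roughly-10% figure above; the 1.25 write multiplier assumes the default 5-minute ephemeral cache. Pass in the three counters from response.Usage (uncached input tokens, cache-creation tokens, cache-read tokens).

```csharp
using System;

public static class CacheCost
{
    // Multipliers relative to the base input-token price.
    public const decimal WriteMultiplier = 1.25m; // assumed: 5-minute ephemeral cache write
    public const decimal ReadMultiplier = 0.10m;  // cached reads at ~10% of normal

    // Input cost measured in base-input-token equivalents.
    public static decimal InputCost(long uncached, long cacheWrite, long cacheRead) =>
        uncached + cacheWrite * WriteMultiplier + cacheRead * ReadMultiplier;
}
```

Take an illustrative 2,000-token system prompt and a 50-token user message. Without caching, each call costs 2,050 token equivalents. With caching, the first call is CacheCost.InputCost(50, 2000, 0), which evaluates to 2,550 (a 25% premium on the cached part), and the second is CacheCost.InputCost(50, 0, 2000), which evaluates to 250. The write premium is recovered on the first cache hit.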

Checkpoint

  • POST /api/chat returns a Claude response
  • Multi-turn history persists across requests
  • Second call shows cache read tokens (caching is working)
  • API key is in user-secrets, NOT committed
  • Committed to your fork

Bonus — The Semantic Search Endpoint

No time pressure. Add a simple semantic-search endpoint — embed a query, compare it to a small set of pre-embedded book descriptions, and return the closest matches. It's a warm-up for the full RAG pipeline you build in Section 3.
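The comparison step can be sketched independently of whichever embedding provider you choose. The version below assumes embeddings are already available as float arrays and ranks books by cosine similarity. SemanticSearch, TopMatches, and the tuple shapes are illustrative names, not part of BookTracker.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SemanticSearch
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Ranges from -1 to 1.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // A zero vector has no direction; treat it as "no similarity".
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Scores every book against the query and returns the k best, highest score first.
    public static IReadOnlyList<(string Title, double Score)> TopMatches(
        float[] query, IEnumerable<(string Title, float[] Embedding)> books, int k = 3) =>
        books.Select(b => (b.Title, Score: CosineSimilarity(query, b.Embedding)))
             .OrderByDescending(m => m.Score)
             .Take(k)
             .ToList();
}
```

A linear scan like this is fine for a few dozen pre-embedded descriptions. The query and the book descriptions must come from the same embedding model, otherwise the scores are not comparable.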