Troubleshooting

Summary: what this page covers

This is the "I hit a wall, what now?" page. Every entry is something that has actually broken in a workshop: install, auth, hooks not firing, MCP not appearing, RAG failing, CI quirks. Each entry gives the symptom, the cause in one line, and the fix. Before flagging an issue to a TA, try the matching entry here.

How to use this page

Search the page (Ctrl/Cmd-F) for the error message or symptom. Each entry is Symptom → Cause → Fix. If your issue isn't here, capture the exact error and ask a TA — and we'll add it.


Day 1 — Install & authentication

claude: command not found

  • Cause: the global npm bin isn't on your PATH, or the install hit a permissions wall.
  • Fix: confirm install (npm list -g @anthropic-ai/claude-code); if missing, reinstall without sudo — use a Node version manager (nvm) or fix npm's prefix (npm config set prefix ~/.npm-global and add ~/.npm-global/bin to PATH).
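
The PATH half of the fix above can be sketched as follows. This is a minimal example, assuming you already ran npm config set prefix ~/.npm-global; path_has_dir and NPM_BIN are illustrative names, not part of npm.

```shell
# Return 0 if directory $1 is an entry in PATH, 1 otherwise.
path_has_dir() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Append npm's global bin directory to PATH for this shell if it is missing.
# NPM_BIN is an illustrative variable; the directory matches the prefix
# suggested in the fix (~/.npm-global).
NPM_BIN="$HOME/.npm-global/bin"
if ! path_has_dir "$NPM_BIN"; then
  PATH="$PATH:$NPM_BIN"
  export PATH
fi
```

To make the change permanent, the same lines would go in your shell profile (for example ~/.bashrc or ~/.zshrc), followed by opening a new terminal.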

Windows: claude works in PowerShell but hooks/subagents misbehave

  • Cause: Claude Code's full feature set requires WSL2.
  • Fix: install WSL2 (wsl --install) and run every Claude Code command inside the WSL2 shell. Don't mix PowerShell and WSL2 sessions.

Auth keeps prompting / 401 Unauthorized on Day 1

  • Cause: logged out, or the wrong auth method.
  • Fix: for the Pro/Max OAuth path, run claude and complete the browser flow once. For API auth, set ANTHROPIC_API_KEY in your shell profile and start a new terminal.

"Claude Code requires a paid plan"

  • Cause: you're on the Free tier. Day 1 needs Pro at minimum.
  • Fix: upgrade in Claude.ai → Settings → Plan. See Prerequisites.

Day 1 — Section 2 (Steering)

My skill never fires automatically

  • Cause: the skill description doesn't include the trigger words you'd actually say.
  • Fix: rewrite the description: field for the model, not the human. Include verbs and nouns from the user request ("add endpoint", "new route"). Invoke the skill explicitly once with /name to confirm it's wired, then tighten the description and re-test auto-matching.

Skills aren't being recognized at all

  • Cause: treating skills as single .md files (older docs showed this — it's wrong now).
  • Fix: skills are folders. .claude/skills/<name>/ must contain a SKILL.md. Verify with ls .claude/skills/<name>/SKILL.md.
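
As a rough illustration of the folder layout, the sketch below scaffolds one skill and then runs the check from the fix. The skill name add-endpoint, its description text, and the body line are hypothetical; the description reuses trigger phrases of the kind a user would actually type.

```shell
# Scaffold a skill as a folder: .claude/skills/<name>/SKILL.md
# "add-endpoint" and the description text are hypothetical examples.
skill_dir=".claude/skills/add-endpoint"
mkdir -p "$skill_dir"
cat > "$skill_dir/SKILL.md" <<'EOF'
---
name: add-endpoint
description: Add endpoint or new route to the API. Use when the user asks to add an endpoint, create a new route, or expose a new API operation.
---

Steps to follow when adding an endpoint go here.
EOF

# Verify the layout with the same check the fix suggests.
ls "$skill_dir/SKILL.md"
```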

My subagent fails to load with a YAML error

  • Cause: the file is named <name>.yaml instead of <name>.md with YAML frontmatter.
  • Fix: rename to .claude/agents/<name>.md; the YAML lives between --- fences at the top, and the body below it becomes the subagent's system prompt.
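
A minimal sketch of the expected file shape, assuming a hypothetical subagent called code-reviewer. The frontmatter sits between the --- fences, and everything below it is the system prompt.

```shell
# Create a subagent definition as Markdown with YAML frontmatter.
# "code-reviewer" and its description are hypothetical examples.
mkdir -p .claude/agents
cat > .claude/agents/code-reviewer.md <<'EOF'
---
name: code-reviewer
description: Reviews changed C# files for bugs and style issues. Use after edits to API code.
---

You are a careful code reviewer. Report concrete problems with file and line references.
EOF

# The cause above is a .yaml file in this folder; list any that remain.
ls .claude/agents/*.yaml 2>/dev/null || echo "no .yaml files in .claude/agents"
```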

My path-scoped rule doesn't load

  • Cause: either you haven't read a matching file yet, or you're using a user-level rule with paths: (currently unreliable).
  • Fix: confirm the rule's paths: glob matches a file you actually opened in the session (run /memory to see what's loaded). If it's a user-level rule (~/.claude/rules/), move it to project-level (.claude/rules/). Note: rules load on read, not on create. For an "every new file must…" rule, drop paths: so the rule is unscoped.

My PreToolUse hook doesn't block anything

  • Cause: the session was started before the hook was registered, the script isn't executable, or it isn't returning exit code 2.
  • Fix: chmod +x .claude/hooks/<script>.sh, confirm registration in .claude/settings.json, then start a fresh session (hooks load at session start). Test the script standalone: echo '{"tool_input":{"command":"rm -rf /tmp"}}' | .claude/hooks/<script>.sh; echo $? — should print 2.
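
For reference, a blocking hook can be as small as the sketch below. It assumes the payload arrives as JSON on stdin, as in the standalone test above. check_tool_call is an illustrative name, and matching "rm -rf" anywhere in the payload is a deliberately crude policy; a stricter script could extract .tool_input.command with jq first.

```shell
# Decide whether a PreToolUse payload (JSON on stdin) should be blocked.
# Returns 2 (block) when the payload contains "rm -rf" anywhere, 0 otherwise.
# The "rm -rf" pattern is only an illustrative policy.
check_tool_call() {
  payload=$(cat)
  case "$payload" in
    *"rm -rf"*)
      # The reason goes to stderr so it accompanies the block.
      echo "Blocked: rm -rf is not allowed in this project." >&2
      return 2
      ;;
  esac
  return 0
}
```

In a hook script, calling check_tool_call as the last line makes the script's exit code equal to the function's return value, which is what the echo $? test checks.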

A PostToolUse build/format hook fires but nothing seems to happen

  • Cause: the hook command is probably succeeding silently, or its output is going to stdout; Claude only sees stderr by default.
  • Fix: ensure your command writes meaningful output (errors and success markers) to stderr, for example by appending >&2. For dotnet build, pipe through grep -E '(error|warning|succeeded|FAILED)' so Claude has something to react to.
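
One way to wire this up is a small wrapper, sketched below. run_build_hook is a hypothetical helper name. Returning 2 on a failed build assumes the exit-code-2 convention from the PreToolUse entry also applies to PostToolUse hooks in your Claude Code version, so verify that before relying on it.

```shell
# Run a build command, forward only the informative lines to stderr,
# and return 2 if the build failed (0 otherwise).
# Usage: run_build_hook dotnet build
run_build_hook() {
  status=0
  # Capture stdout and stderr together; remember the build's exit code.
  log=$("$@" 2>&1) || status=$?
  # Keep the lines worth reacting to; "|| true" tolerates zero matches.
  printf '%s\n' "$log" | grep -E '(error|warning|succeeded|FAILED)' >&2 || true
  if [ "$status" -ne 0 ]; then
    return 2
  fi
  return 0
}
```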

Day 1 — Section 3 (IDE + MCP)

VS Code extension installed but no Claude UI

  • Cause: authentication didn't complete inside the IDE, or the extension didn't reload.
  • Fix: reload window (Ctrl/Cmd-Shift-P → Developer: Reload Window), then run the extension's sign-in command from the command palette.

Rider/IntelliJ: plugin installed but no Claude tool window

  • Cause: restart needed; the plugin uses a tool window that only appears after a full IDE restart.
  • Fix: fully restart the IDE (not just "Invalidate Caches" — restart).

claude mcp list doesn't show the GitHub server

  • Cause: misregistered, or the GitHub PAT scope is wrong.
  • Fix: check the registration command's exit code; confirm the PAT has at least repo (and read:org if you're targeting org-owned repos). Recreate the PAT if unsure.

GitHub MCP demo times out / network blocked

  • Cause: conference Wi-Fi or corporate proxy blocking GitHub API.
  • Fix: tether to a phone; if even that fails, the recorded backup demo in _instructor/ is the fallback. The lab can be completed against the recorded session.

Day 2 — SDK & Auth

Anthropic package not found / wrong package installed

  • Cause: picked a community package by name. The official one is just Anthropic (v12.x).
  • Fix: dotnet add package Anthropic — confirm it resolves to v12.x. The old tryAGI.Anthropic v3.x and Anthropic.SDK are separate lineages.

401 Unauthorized from the API on Day 2

  • Cause: missing/invalid API key, or expecting Pro to grant API access.
  • Fix: Pro/Max does not include API access. Create an API key in console.anthropic.com and set it via user-secrets in the BookTracker.Api project. See Prerequisites.

429 Too Many Requests / 529 Overloaded

  • Cause: rate limit (429) or transient capacity (529).
  • Fix: the Polly policy from Section 1 handles both — confirm it's registered. For 529, retries with backoff resolve almost all cases.

Socket exhaustion under load / SocketException

  • Cause: AnthropicClient registered as Scoped or Transient.
  • Fix: register as Singleton — it holds an HttpClient internally (same reason you'd use IHttpClientFactory).

Day 2 — Streaming & Tools

SSE endpoint returns the whole response at once (not streamed)

  • Cause: response isn't being flushed per chunk, or buffering middleware is in the pipeline.
  • Fix: set Content-Type: text/event-stream, write data: ...\n\n frames, and flush after each write (await Response.Body.FlushAsync()). Check that compression/output-cache middleware isn't buffering the response.

Agent loop runs forever

  • Cause: no max-iteration guard.
  • Fix: add a cap (e.g. 10) on the loop; log iteration count; surface the tool call sequence for debugging.

Tool isn't being called when it obviously should be

  • Cause: the tool's description is too vague — Claude doesn't know when to use it.
  • Fix: rewrite the description for the model (verbs and nouns from realistic prompts). Same discipline as a skill description.

Day 2 — RAG (Section 3)

Qdrant connection refused

  • Cause: Docker isn't running, or the container didn't start.
  • Fix: docker ps — confirm qdrant/qdrant is up. If not: docker run -p 6333:6333 qdrant/qdrant.

Vector dimension mismatch when upserting

  • Cause: the Qdrant collection's vector size doesn't match the embedding model's dimensions.
  • Fix: recreate the collection with the right size. OpenAI text-embedding-3-small is 1536; Ollama nomic-embed-text is 768. Set the size to whichever model you're using.
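
The sketch below shows one way to keep the collection size tied to the model. The two sizes come from the fix above. The helper names, the collection name books, and the Cosine distance are assumptions for illustration, and the curl calls are left as comments because they need a running Qdrant instance.

```shell
# Map an embedding model name to its vector size (values from the fix above).
embedding_dim() {
  case "$1" in
    text-embedding-3-small) echo 1536 ;;
    nomic-embed-text)       echo 768 ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Build the JSON body for creating a Qdrant collection sized for that model.
# Cosine distance is an assumption; use the metric your ingest code expects.
collection_body() {
  dim=$(embedding_dim "$1") || return 1
  printf '{"vectors":{"size":%s,"distance":"Cosine"}}\n' "$dim"
}

# Recreate the collection (hypothetical name "books", REST port 6333):
#   curl -X DELETE http://localhost:6333/collections/books
#   curl -X PUT http://localhost:6333/collections/books \
#        -H 'Content-Type: application/json' \
#        -d "$(collection_body nomic-embed-text)"
```

Deleting the collection drops its vectors, so re-run ingestion afterwards.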

RAG answers are vague / off-topic

  • Cause: chunking, not the LLM. Chunks too big (retrieval is fuzzy) or too small (no context).
  • Fix: tune chunk size and overlap, re-ingest, and re-evaluate against the same query. This is the recurring theme of Section 3.

Day 2 — MCP server (Section 4)

Claude Code doesn't see my MCP server

  • Cause: server not registered, wrong transport, or wrong URL.
  • Fix: check claude mcp list. Confirm you registered the correct URL and that the server is configured for Streamable HTTP (not the older HTTP+SSE transport).

Tool calls fail with a serialization error

  • Cause: the C# tool's input/output types don't round-trip cleanly to JSON (e.g. a property that can't be deserialized, or a cyclic graph from an EF entity).
  • Fix: use DTOs for tool inputs/outputs, the same rule as for your endpoints. Never expose an EF entity directly through a tool.


Day 2 — CI/CD (Section 5)

GitHub Actions can't find claude in the workflow

  • Cause: the runner doesn't have Claude Code installed yet.
  • Fix: add an npm install -g @anthropic-ai/claude-code step before invoking it; set ANTHROPIC_API_KEY from a repo secret.

The AI review costs more than expected

  • Cause: running the default model on every PR.
  • Fix: use --model claude-haiku-4-5 for CI reviews. Limit the diff size sent (skip large generated files, lock files, etc.).
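
Limiting what gets sent can be as simple as filtering the changed-file list before building the review prompt. The sketch below is one possible filter; reviewable_files is a hypothetical helper, and the patterns (lock files, generated C# files) are examples to adjust for your repo.

```shell
# Read file paths on stdin and drop the ones not worth sending for review.
# The patterns are illustrative: npm/NuGet lock files, other *.lock files,
# and generated C# sources.
reviewable_files() {
  grep -v -E '(package-lock\.json$|packages\.lock\.json$|\.lock$|\.g\.cs$|\.Designer\.cs$)' || true
}

# Possible use in CI (the base branch name is an assumption):
#   git diff --name-only origin/main...HEAD | reviewable_files
```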

Release-notes job runs on every push instead of tag push

  • Cause: the workflow's on: push trigger has no tags: filter, so it runs on every push.
  • Fix: scope the workflow to on: push: tags: ['v*'] (or use release: types: [created]).

Still stuck?

Capture: the exact command you ran, the full error output, and your OS / .NET / Node versions. Bring it to a TA — and we'll add the entry here so the next attendee doesn't hit the same wall.
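
A small script along these lines can collect the version details in one go. It is a sketch, and tool_version and report_env are illustrative names. A tool that is missing is reported as such rather than causing an error.

```shell
# Print "<tool>: <first line of its version output>", or
# "<tool>: not installed" if the tool is not on PATH.
tool_version() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: %s\n' "$1" "$("$@" 2>&1 | head -n 1)"
  else
    printf '%s: not installed\n' "$1"
  fi
}

# Collect the details a TA will ask for.
report_env() {
  echo "os: $(uname -srm)"
  tool_version dotnet --version
  tool_version node --version
  tool_version claude --version
}

report_env
```

Paste its output alongside the exact command you ran and the full error text.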