Engineering Deep-Dive

Under the hood.
How every tool actually works.

A technical reference for every AI feature on JustHigherEd — from RAG pipelines and vector search to SQL agents and streaming. Written so future-you can pick right back up.

Architecture

The 30,000-foot view

A Next.js frontend, a FastAPI backend, and two AI models. Nginx ties it together in production.

Browser (Next.js app) → Nginx (reverse proxy) → Next.js (port 3000) and FastAPI (port 8000)

FastAPI talks to these services:

ChromaDB: vector store, 557 programs
Claude Haiku 4.5: Pathfinder chat
Claude Sonnet 4.6: Enrollment Analyst
SQLite: enrollment database
programs.json: 557 program records
BLS data: salaries and job titles

Nginx reverse proxy

In production, a single Nginx instance listens on ports 80 and 443, routing /api/* traffic to FastAPI on :8000 and everything else to Next.js on :3000. It also terminates HTTPS, using certificates obtained and renewed with Certbot.

Next.js (App Router)

Client-rendered pages in TypeScript with Tailwind. Uses Clerk for authentication. Talks to the backend exclusively via /api routes, sending a Clerk-issued JWT for protected endpoints.

FastAPI backend

Async Python with five routers: chat, programs, tuition, enrollment, students. Handles streaming via Server-Sent Events (SSE). Verifies Clerk JWTs on protected routes.

Tool 1 · Pathfinder

AI Program Advisor — RAG pipeline

Pathfinder blends Retrieval-Augmented Generation with a hybrid search engine to give every student a personalised conversation grounded in real program data.

Request → Response pipeline

💬

Step 1

User sends a message

The frontend opens a streaming SSE connection to POST /api/chat, sending the message, session ID, and an optional student ID.

🎓

Step 2

Degree level detection

detect_degree_level() scans the message for keywords like 'bachelor', 'masters', 'doctoral', or 'certificate' to filter searches downstream.

🔍

Step 3

Hybrid search on ChromaDB

get_relevant_programs_hybrid() combines semantic embedding similarity (60%) with keyword overlap scoring (40%) to retrieve the 3 to 5 most relevant programs.

📋

Step 4

Context assembly

build_context_message() packages the retrieved programs, BLS career data, student profile (if logged in), and conversation history into a single Claude context message.

🤖

Step 5

Claude Haiku generates a response

Claude Haiku 4.5 (claude-haiku-4-5-20251001) runs with the Pathfinder system prompt. Output is capped at 1 024 tokens per turn to keep responses fast.

⚡

Step 6

Streaming to the browser

chat_stream() yields each token as it arrives from the Claude API, wrapped in an SSE 'message' event. The frontend renders tokens in real time.
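The keyword scan in Step 2 can be sketched as below. This is a minimal illustration, not the project's actual detect_degree_level(): the pattern table and the returned labels (other than "masters", which appears in the RAG flow later on) are assumptions. Word boundaries keep "mastering" from matching, and the optional apostrophe lets "master's" count as "masters".

```python
import re
from typing import Optional

# Hypothetical keyword table; the real function may use different patterns.
DEGREE_PATTERNS = {
    "bachelors": r"\b(bachelor'?s?|undergraduate|undergrad)\b",
    "masters": r"\b(master'?s?|mba)\b",
    "doctoral": r"\b(doctoral|doctorate|phd)\b",
    "certificate": r"\bcertificates?\b",
}


def detect_degree_level(message: str) -> Optional[str]:
    """Return the first degree level whose keywords appear in the message."""
    # Normalise case and curly apostrophes before matching.
    text = message.lower().replace("\u2019", "'")
    for level, pattern in DEGREE_PATTERNS.items():
        if re.search(pattern, text):
            return level
    return None  # unknown: the system prompt tells Pathfinder to ask
```

Returning None rather than guessing fits the system-prompt rule that Pathfinder should ask about degree level early when it is unknown.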

Hybrid search score formula

score = (0.6 × semantic_score)
     + (0.4 × keyword_score)

semantic_score — cosine similarity of query embedding vs stored program embedding (ChromaDB).
keyword_score — fraction of query words that appear in the program name or description.
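As a rough sketch of the formula above: the weights and the two score definitions come from this page, while the tokenisation rule (lowercase alphanumeric words) and the function names are assumptions. The semantic score is taken as an input here, since in Pathfinder it comes from ChromaDB.

```python
import re

SEMANTIC_WEIGHT = 0.6
KEYWORD_WEIGHT = 0.4


def _words(text: str) -> set:
    """Lowercase alphanumeric tokens (an assumed tokenisation rule)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_score(query: str, program_text: str) -> float:
    """Fraction of query words found in the program name or description."""
    query_words = _words(query)
    if not query_words:
        return 0.0
    return len(query_words & _words(program_text)) / len(query_words)


def hybrid_score(semantic_score: float, query: str, program_text: str) -> float:
    """Blend the embedding similarity with keyword overlap, 60/40."""
    return (SEMANTIC_WEIGHT * semantic_score
            + KEYWORD_WEIGHT * keyword_score(query, program_text))
```

With this blend, a query whose words all appear in a program's text gains a flat 0.4 on top of whatever the embedding similarity contributes, which is what lets exact matches such as course codes surface.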

System prompt — key rules

Persona: "Pathfinder", a friendly, knowledgeable guide
Always ask about degree level early if unknown
Never overwhelm — show max 3 programs at once
Cite BLS for every salary figure
Call the school "Carvard University", never "University of Utah"
Trigger handoff for: financial aid, visas, disabilities, emotional distress

Current-student personalisation

When a student logs in with their ID, format_student_context() injects their name, GPA, credits completed, enrollment status, interests, and up to 10 recent courses directly into the context window — so Pathfinder can give advice that accounts for where they actually are in their degree.
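A minimal sketch of what such a formatter might look like. The dictionary keys are hypothetical (the real students.json field names may differ), and the sketch assumes completed_courses is ordered oldest first so that the tail of the list is the most recent coursework.

```python
def format_student_context(student: dict, max_courses: int = 10) -> str:
    """Render a student record as plain text for the context window."""
    # Assumption: courses are stored oldest-first, so the tail is most recent.
    recent = student.get("completed_courses", [])[-max_courses:]
    lines = [
        "CURRENT STUDENT PROFILE",
        f"Name: {student['name']}",
        f"GPA: {student['gpa']:.2f}",
        f"Credits completed: {student['credits']}",
        f"Enrollment status: {student['enrollment_status']}",
        f"Interests: {', '.join(student.get('interests', []))}",
        f"Recent courses: {', '.join(recent)}",
    ]
    return "\n".join(lines)
```

Capping the course list keeps the profile small and predictable in size regardless of how long the student has been enrolled.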

Session management

The backend stores one PathfinderChatbot instance per session in an in-memory Python dict (chatbot_sessions). Each instance holds conversation history, the inferred degree level, and the handoff flag. In production this should migrate to Redis for horizontal scaling.
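The lookup pattern described here can be sketched as follows. SessionState is a hypothetical stand-in for PathfinderChatbot that carries only the three pieces of state named above; the function signature mirrors the get_or_create_session() call in the stream handler, but its body is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SessionState:
    """Stand-in for PathfinderChatbot: just the per-session state."""
    student_id: Optional[str] = None
    history: List[dict] = field(default_factory=list)
    degree_level: Optional[str] = None
    handoff: bool = False


# One entry per session ID; lives only as long as the server process.
chatbot_sessions: Dict[str, SessionState] = {}


def get_or_create_session(session_id: str,
                          student_id: Optional[str] = None) -> SessionState:
    """Return the existing session, or create and register a new one."""
    session = chatbot_sessions.get(session_id)
    if session is None:
        session = SessionState(student_id=student_id)
        chatbot_sessions[session_id] = session
    return session
```

Because every caller goes through this one function, swapping the dict for an external store later only touches this module.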

backend/routers/chat.py — simplified stream handler

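import json
from fastapi import APIRouter
from fastapi.responses import StreamingResponse

router = APIRouter()

# ChatRequest and get_or_create_session are defined elsewhere in the backend.
@router.post("/api/chat")
async def stream_chat(req: ChatRequest):
    chatbot = get_or_create_session(req.session_id, req.student_id)

    async def event_generator():
        async for token in chatbot.chat_stream(req.message):
            # A token may contain newlines; SSE needs one "data:" line per payload line.
            data = "\n".join(f"data: {line}" for line in token.split("\n"))
            yield f"event: message\n{data}\n\n"
        yield f"event: done\ndata: {json.dumps({'handoff': chatbot.handoff})}\n\n"

    return StreamingResponse(event_generator(), media_type="text/event-stream")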
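To make the wire format concrete, here is a small sketch of how a consumer could split the stream produced by the handler above back into events. It is a simplified parser written for illustration (it assumes the full text is available and uses "\n" line endings), not the frontend's actual code, which is TypeScript.

```python
from typing import List, Tuple


def parse_sse(stream_text: str) -> List[Tuple[str, str]]:
    """Split raw SSE text into (event name, data) pairs."""
    events = []
    for frame in stream_text.split("\n\n"):  # a blank line ends each event
        event_name = "message"               # default event type in SSE
        data_lines = []
        for line in frame.split("\n"):
            if line.startswith("event:"):
                event_name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                value = line[len("data:"):]
                # Only a single leading space is stripped, so tokens that
                # begin with a space (common in LLM output) survive intact.
                data_lines.append(value[1:] if value.startswith(" ") else value)
        if data_lines:
            # Multiple data lines in one event are joined with newlines.
            events.append((event_name, "\n".join(data_lines)))
    return events
```

A client would append the data of each 'message' event to the visible reply and treat the 'done' event as the signal to read the handoff flag and close the stream.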

Tool 2 · Compare Programs

Side-by-side program comparison

A direct API call — no AI involved. Raw program data is fetched, enriched with BLS salary figures, and laid out in a structured table.

How a comparison is built

🖱️

Step 1

User selects up to 3 programs

A search-as-you-type input queries GET /api/programs/search. Results are filtered by degree type if the user has picked one.

📡

Step 2

POST /api/programs/compare

The frontend sends the list of selected program names. The backend fetches each from programs.json in a single pass.

💰

Step 3

Tuition estimation

For each program the backend calls the tuition router internally to compute resident vs. non-resident totals based on credits and the current per-credit rate.

📊

Step 4

BLS salary enrichment

program_occupation_map.json maps each program to a BLS Standard Occupational Classification (SOC) code. The backend uses it to look up the median annual wage and projected job openings.

🗂️

Step 5

Required courses list

Each program record in programs.json includes a parsed list of required courses from the university catalog PDF. These are returned as-is and displayed per column.

Data source: programs.json

Parsed from the official University of Utah 2025–2026 Academic Catalog PDF (18 MB) using a custom parser.py script.

557 programs indexed
Fields: name, degree type, credits, description, required courses, admissions requirements, contact info
Re-run parser.py whenever the catalog updates

What the compare response returns

Program name + degree type + total credits
Tuition estimate (resident / non-resident)
Required courses list
Admission requirements summary
BLS job title(s) + median annual salary
Program website URL

No AI involved — intentionally

Compare is deterministic by design. Students need reliable, verifiable facts when making a multi-year financial decision. There is no LLM inference in the compare flow — just structured data fetched and formatted.

backend/routers/programs.py — compare endpoint (simplified)

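from fastapi import APIRouter, HTTPException

router = APIRouter()

# programs_db and occupation_map are dicts loaded from programs.json and
# program_occupation_map.json; CompareRequest and estimate_tuition live elsewhere.
@router.post("/api/programs/compare")
async def compare_programs(body: CompareRequest):
    results = []
    for name in body.programs:                      # up to 3 names
        prog = programs_db.get(name)                # O(1) dict lookup by program name
        if prog is None:                            # unknown name would crash {**prog}
            raise HTTPException(status_code=404, detail=f"Unknown program: {name}")
        tuition = estimate_tuition(prog)            # per-credit rate × total credits
        careers = occupation_map.get(name, [])      # BLS job titles + salaries
        results.append({**prog, "tuition": tuition, "careers": careers})
    return {"programs": results}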

Tool 3 · ROI Calculator

Financial modelling — cost vs. salary

Pure math, real data. The calculator models total program cost and projects career earnings from BLS to compute a break-even timeline.

Calculation pipeline

📝

Step 1

User enters inputs

Program name, residency status (in-state / out-of-state), housing type (on-campus dorm, off-campus apartment, or commuting from home), and any scholarship amount.

🏫

Step 2

Tuition calculation

POST /api/tuition/calculate — total credits × per-credit rate (resident vs. non-resident). Books and fees are added as a flat annual amount.

🏠

Step 3

Living cost modelling

Housing type maps to an annual living expense figure (derived from university cost-of-attendance data). Multiplied by program duration in years.

📉

Step 4

Scholarship deduction

The scholarship amount is subtracted from total cost. It is assumed to apply evenly across years.

📈

Step 5

BLS salary projection

The occupation map returns the BLS median annual wage for the program's career path. Entry-level is estimated at 80% of median; mid-career at 100%.

🔢

Step 6

Break-even timeline

payback_years = net_cost ÷ median_salary. The UI then renders a year-by-year cumulative earnings chart showing when the investment is recovered.

Cost model variables

Resident tuition: ~$6 000/yr (per-credit)
Non-resident tuition: ~$21 000/yr (per-credit)
Books + fees: ~$1 200/yr (flat)
On-campus housing: ~$12 000/yr
Off-campus housing: ~$15 000/yr
Commuter living cost: ~$6 000/yr
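Putting the pipeline and the cost variables together, a simplified version of the calculation might look like the sketch below. It works from the approximate annual figures above rather than the real per-credit computation, and the function name and return shape are assumptions. The 80% entry-level factor and the payback formula are the ones stated in Steps 5 and 6.

```python
def roi_summary(tuition_per_year: float,
                housing_per_year: float,
                years: float,
                scholarship_total: float,
                median_salary: float,
                books_fees_per_year: float = 1200.0) -> dict:
    """Net program cost and simple payback period (annual-figure sketch)."""
    if median_salary <= 0:
        raise ValueError("median_salary must be positive")
    gross_cost = (tuition_per_year + books_fees_per_year + housing_per_year) * years
    # A scholarship larger than the total cost cannot make the cost negative.
    net_cost = max(gross_cost - scholarship_total, 0.0)
    return {
        "net_cost": net_cost,
        "entry_salary": 0.8 * median_salary,   # entry level at 80% of median
        "mid_career_salary": median_salary,    # mid-career at 100% of median
        "payback_years": net_cost / median_salary,
    }


# Worked example: resident, on-campus, 4 years, $10 000 scholarship.
# The $70 000 median salary is a placeholder, not a BLS figure.
example = roi_summary(6000, 12000, 4, 10000, 70000)
```

For these inputs the gross cost is (6 000 + 1 200 + 12 000) × 4 = 76 800, the net cost is 66 800, and the payback period is 66 800 ÷ 70 000, a little under one year of median salary.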

BLS salary data source

Salaries come from program_occupation_map.json — a hand-curated mapping from program name to BLS Standard Occupational Classification (SOC) codes. Each SOC entry includes the latest BLS median annual wage and employment outlook (projected 10-year job growth %). Data was pulled from the BLS Occupational Outlook Handbook.

No AI in the hot path

Like Compare, the ROI Calculator is deterministic. All calculations happen in the FastAPI tuition router — no LLM call. This keeps latency under 200 ms and results fully reproducible.

Tool 4 · Enrollment Analyst

Text-to-SQL — asking questions of enrollment data

University leadership types a plain-English question. Claude Sonnet 4.6 interprets it, writes a SQL query, executes it against a SQLite database, and returns a formatted answer.

Text-to-SQL pipeline

❓

Step 1

User types a plain-English question

Example: 'Which programs had the biggest enrollment drop between 2022 and 2024?'

📚

Step 2

Schema context injected

schema_context.py builds a text description of every table and column in enrollment.db. This is prepended to the Claude prompt so the model knows what it can query.

🤖

Step 3

Claude Sonnet 4.6 generates SQL

The agent prompt instructs Claude to return valid SQLite SQL wrapped in a code block. Claude Sonnet 4.6 is used here (not Haiku) because enrollment queries can be complex multi-join statements.

🔎

Step 4

Answerability check

Before executing, the agent checks whether the question is within scope. Out-of-scope questions (e.g. 'what is the meaning of life') return an answerable: false flag instead of running SQL.

🗄️

Step 5

SQL executed on SQLite

database.py runs the extracted SQL against enrollment.db using the Python sqlite3 module. Results are returned as a list of dicts.

📝

Step 6

Claude formats the final answer

A second Claude call takes the raw query results and renders them as a clear, human-readable answer with context — not just a table dump.

Enrollment database schema

SQLite database at backend/enrollment/enrollment.db. Key tables:

enrollment — headcount per program per semester/year
retention — cohort retention rates by year and program
yield — admitted → enrolled conversion rates
demographics — breakdown by gender, ethnicity, residency
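A schema description like the one injected in Step 2 can be generated straight from the database. The sketch below is one possible approach using SQLite's own catalog. schema_context.py may equally be hand-written, and the output format here is an assumption.

```python
import sqlite3


def build_schema_context(conn: sqlite3.Connection) -> str:
    """Describe every user table and its columns as prompt-ready text."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    lines = ["Database schema (SQLite):"]
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        cols = ", ".join(f"{col[1]} {col[2]}" for col in columns)
        lines.append(f"- {table}({cols})")
    return "\n".join(lines)
```

Generating the text from the live database means the prompt cannot drift out of sync with the tables, at the cost of losing any hand-written column explanations.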

Why Sonnet and not Haiku here?

Enrollment questions often require multi-table JOINs, GROUP BY aggregations, window functions for year-over-year deltas, and careful column aliasing. Claude Haiku (optimised for conversational speed) occasionally produces subtly incorrect SQL for complex analytical queries. Sonnet 4.6 is reliably accurate for these — and since administrators ask far fewer questions per minute than students chat with Pathfinder, the higher per-token cost is acceptable.

Safety guardrails

Claude is instructed to generate only SELECT statements
The database connection is opened read-only
Out-of-scope questions are flagged before any SQL runs
Error responses from SQLite are caught and surfaced cleanly
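The first two guardrails can be sketched with the standard library alone. The read-only connection uses SQLite's URI syntax (mode=ro), which is a documented sqlite3 feature; the is_select_only() check is a hypothetical extra layer, deliberately conservative, and not something this page says the project implements.

```python
import sqlite3


def open_read_only(path: str) -> sqlite3.Connection:
    """Open an existing SQLite file so that any write attempt fails."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def is_select_only(sql: str) -> bool:
    """Cheap pre-check: exactly one statement, starting with SELECT or WITH.

    Conservative by design: a semicolon inside a string literal is rejected
    too. A WITH clause can still precede a write in SQLite, so the read-only
    connection remains the real backstop.
    """
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False
    first_word = statement.split(None, 1)[0].lower()
    return first_word in ("select", "with")
```

Layering the two means a prompt-injection attempt has to get past both the text check and the database itself before anything can be modified.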

backend/enrollment/agent.py — simplified agent loop

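import sqlite3
import anthropic

claude = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# SCHEMA_CONTEXT, db, extract_sql_block and format_answer are defined
# elsewhere in backend/enrollment.
def answer_enrollment_question(question: str) -> dict:
    # 1. Inject schema context into the prompt
    prompt = SCHEMA_CONTEXT + "\n\nQuestion: " + question

    # 2. Claude Sonnet generates SQL
    response = claude.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,  # required by the Messages API; the value is an assumed cap
        messages=[{"role": "user", "content": prompt}]
    )
    sql = extract_sql_block(response.content[0].text)

    # 3. No SQL block means the question was out of scope
    if not sql:
        return {"answerable": False, "answer": "I couldn't find that in the enrollment data."}

    # 4. Execute (read-only SQLite connection); error response shape is assumed
    try:
        rows = db.execute(sql)
    except sqlite3.Error as exc:
        return {"answerable": False, "answer": f"The query failed: {exc}"}

    # 5. Claude formats the final answer
    return format_answer(question, rows)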
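The execution step in the loop above returns rows as a list of dicts. With the sqlite3 module, one straightforward way to get that shape is sqlite3.Row. This is a sketch of what database.py might do, with run_query as a hypothetical name.

```python
import sqlite3
from typing import Dict, List


def run_query(conn: sqlite3.Connection, sql: str) -> List[Dict]:
    """Execute a query and return each row as a column-name -> value dict."""
    # sqlite3.Row lets rows be addressed by column name; it applies to
    # cursors created after this assignment.
    conn.row_factory = sqlite3.Row
    cursor = conn.execute(sql)
    return [dict(row) for row in cursor.fetchall()]
```

Dicts keyed by column name (including any aliases the generated SQL uses) give the formatting call enough context to describe the results without seeing the query.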

Data Layer

Where the data comes from

No hallucinations. Every fact shown to students or administrators traces back to one of these real data sources.

Academic Catalog PDF

program_catalog.pdf

18 MB

The official Carvard/University of Utah 2025–2026 undergraduate and graduate catalog. Source-of-truth for every program.

Used by: Pathfinder, Compare Programs
Note: Parsed into programs.json + chroma_db/ by parser.py and db.py

Program Catalog JSON

data/programs.json

557 programs

Structured extraction of the PDF. Fields: name, degree type, total credits, description, required courses, admission requirements, contact info.

Used by: All four tools
Note: Generated by parser.py from the PDF

ChromaDB Vector Store

chroma_db/

557 embeddings

Each program's name and description is encoded as a dense vector embedding using ChromaDB's default sentence-transformers model. Enables semantic similarity search.

Used by: Pathfinder (hybrid search)
Note: Built by db.py — re-run to refresh after programs.json changes

BLS Occupation Map

data/program_occupation_map.json

400+ mappings

Hand-curated mapping from program name to Bureau of Labor Statistics SOC codes, job titles, median annual wages, and 10-year job growth projections.

Used by: Pathfinder, Compare Programs, ROI Calculator
Note: Manually maintained — update from bls.gov when new Occupational Outlook data releases

Enrollment Database

backend/enrollment/enrollment.db

SQLite

Structured enrollment data: headcount by program and semester, retention rates, yield rates, demographic breakdowns. Used exclusively by the Enrollment Analyst.

Used by: Enrollment Analyst
Note: Schema defined in schema_context.py; seeded with synthetic data for demo

Student Profiles

students/students.json

Synthetic

Mock student records for demonstration. Each record has: student_id, name, GPA, credits, enrollment status, current program, completed courses, interests.

Used by: Pathfinder (current-student mode)
Note: Test login: student ID u1000015

RAG Deep Dive

Retrieval-Augmented Generation explained

RAG means the AI model doesn't answer from memory — it first fetches relevant facts, then generates a response grounded in those facts.

Pathfinder RAG flow — one turn of conversation

💬

User message

"I want a master's in data science"

↓

🎓

Degree detection

detect_degree_level() → "masters"

↓

🔍

Hybrid search

ChromaDB: cosine similarity (60%) + keyword overlap (40%)

↓

📋

Top-3 programs retrieved

e.g. MS Data Science, MS Statistics, MS Computer Science

↓

🧩

Context assembly

Programs + BLS salaries + student profile + conversation history

↓

🤖

Claude Haiku 4.5

System prompt + context + user message → generates response

↓

⚡

SSE stream to browser

Tokens yielded one by one — frontend renders in real time

Why not just use a general LLM?

A general LLM has no knowledge of the specific programs offered at Carvard, their credit requirements, deadlines, or costs. RAG injects that proprietary context at query time.
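Injecting context at query time comes down to turning retrieved records into text the model reads before answering. The sketch below shows the general shape of a build_context_message()-style function; the field names, section headings, and signature are all assumptions rather than the project's actual code.

```python
from typing import Dict, List, Optional, Tuple


def build_context_message(programs: List[dict],
                          careers: Dict[str, List[dict]],
                          student_context: Optional[str] = None,
                          history: Optional[List[Tuple[str, str]]] = None) -> str:
    """Assemble retrieved facts into a single block of context text."""
    lines = ["RELEVANT PROGRAMS:"]
    for p in programs:
        lines.append(
            f"- {p['name']} ({p['degree_type']}, {p['credits']} credits): "
            f"{p['description']}"
        )
        # Attach BLS career data so salary claims can be cited.
        for c in careers.get(p["name"], []):
            lines.append(
                f"  Career: {c['title']}, median wage ${c['median_wage']:,} (source: BLS)"
            )
    sections = ["\n".join(lines)]
    if student_context:
        sections.append("STUDENT PROFILE:\n" + student_context)
    if history:
        turns = [f"{role}: {text}" for role, text in history]
        sections.append("CONVERSATION SO FAR:\n" + "\n".join(turns))
    return "\n\n".join(sections)
```

Keeping each source in its own labelled section makes it easier for the system prompt to refer to them, for example when requiring that salary figures cite BLS.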

Why use embeddings at all?

Keyword search fails when students use different words than the catalog (e.g. "machine learning" vs "data science"). Embeddings capture semantic meaning, so the right programs surface even with varied phrasing.

Why keep retrieval to 3–5 programs?

Context window costs grow linearly with retrieved text. More importantly, users get overwhelmed. Pathfinder's UX principle is to present a curated shortlist, not dump everything that matched.

Stack

Every technology used

A deliberate stack — each choice made for a specific reason.

Next.js 14+

Frontend framework

App Router, server components, and built-in routing. Deployed as a standalone Docker container.

TypeScript

Frontend language

Type-safe API calls and props prevent entire classes of runtime bugs.

Tailwind / inline styles

Styling

Inline styles used for component-level isolation; no CSS modules needed for a single-app portfolio.

Clerk

Authentication

Managed auth with pre-built UI. JWTs are verified on the backend via the Clerk SDK.

FastAPI

Backend framework

Async Python, automatic OpenAPI docs, and SSE streaming via StreamingResponse with the text/event-stream media type.

Anthropic SDK (Python)

AI inference

Official SDK with streaming support. claude-haiku-4-5 for Pathfinder; claude-sonnet-4-6 for Enrollment Analyst.

ChromaDB

Vector database

Lightweight, file-based vector store. Perfect for a single-node demo without running a separate database server.

SQLite

Enrollment database

Zero-config relational DB for structured enrollment analytics. Accessed through Python's built-in sqlite3 module, so there is no ORM overhead.

Nginx

Reverse proxy

Routes /api/* to FastAPI, everything else to Next.js. Handles TLS termination in production, with certificates obtained and renewed by Certbot.

Docker Compose

Orchestration

Three services (nginx, backend, frontend) defined in docker-compose.yml. Single command to spin up the full stack.

Server-Sent Events

Streaming transport

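One-way text stream from server to browser. Simpler than WebSockets for a unidirectional token stream. Browsers support it natively through EventSource, but EventSource only issues GET requests, so a POST endpoint such as /api/chat has to be read another way, typically fetch with a streamed response body.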

BLS OOH data

Salary source

US Bureau of Labor Statistics Occupational Outlook Handbook. Authoritative, free, and updated every 2 years. Always cited in responses.

Design Decisions

Trade-offs worth remembering

Things that were deliberate — not accidental — so future-you knows why.

In-memory session storage for Pathfinder

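chatbot_sessions is a plain Python dict in memory. It's fast and zero-dependency, but sessions are lost on server restart and can't scale horizontally. Production upgrade path: replace it with Redis. Session access already goes through get_or_create_session(), so the swap stays local to that function. One caveat: Redis stores serialised values rather than live Python objects, so the conversation history, degree level, and handoff flag would be persisted instead of the PathfinderChatbot instance itself.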

Simplicity vs. durability

Haiku for Pathfinder, Sonnet for Enrollment

Students send dozens of messages per session; administrators send a handful of analytical questions per day. Haiku (fast, cheap) is right for conversational turns. Sonnet (powerful, expensive) is right for complex SQL generation where accuracy matters more than speed.

Cost vs. capability

Hybrid search (60/40) over pure semantic search

Pure semantic search misses exact-match cases — if a student types "CS 6350", a keyword approach finds it instantly while embeddings might not. The 60/40 blend was tuned empirically. The weights live as constants in chatbot.py — easy to adjust.

Recall vs. precision

No AI in Compare or ROI Calculator

Deterministic data retrieval was intentional here. Students making a multi-year, multi-thousand-dollar decision need reproducible, verifiable numbers — not a generated summary that might vary between runs. Trust is built through transparency.

Intelligence vs. trust

ChromaDB over Pinecone / Weaviate

ChromaDB persists to a local directory (chroma_db/). No external API keys, no network round-trips, no cost per query. For a single-server deployment with 557 vectors, this is the right call. At 100 000+ vectors, revisit Pinecone.

Managed service vs. simplicity

Synthetic student data

Real student PII cannot be used in a demo. The student profiles in students.json are fully synthetic but realistic — names, GPAs, courses, interests — generated to test the personalisation flow. In a real deployment, this would connect to the university's SIS via an authenticated API.

Demo safety vs. real integration

See it in action

Reading about it is one thing. Try the tools yourself — Pathfinder's RAG, the Compare table, the ROI model, and the Enrollment Analyst are all live.