Pre-launch · v0.5.0.1 · First applications shipping soon

I built my own job-search agent.

Build started April 13, 2026 · Active development · Chicago

After 20+ years in operations, I'm in Act 2, looking for what comes next. I could've used a generic resume builder and a job board. Instead I built the system I would've wanted at every transition: compounding company research, HITL at every gate, three deliverables per submission, and an agent that runs from my Mac because cloud scrapers kept getting blocked by Cloudflare.

The problem

Job search is broken in a specific way: every application starts from zero. You research the company on Tuesday, write a tailored resume on Wednesday, submit it on Thursday. Two weeks later you spot a different role at the same company, and your Tuesday research is gone. Or worse, you have it but it's buried in a Notion page you can't find.

RAG doesn't fix this. Vector search retrieves chunks but synthesizes nothing permanent. Every query starts fresh; nothing accumulates. That works for chatbot Q&A. It doesn't work for you, applying to 200 jobs over six months. By application 50 you should be smarter than at application 1, and the system should make that automatic.

And then there's the application package itself. Most people send a tailored resume. Done. That's a 1-out-of-3 shot. A real submission needs three things ready before you press send: a tailored resume, a cover letter that explains the transition, and an outreach sequence that makes sure a hiring manager knows you exist. Mismatch any of the three and the whole submission gets weaker.

The design constraints

Compounding knowledge, not RAG. Every pipeline run produces a permanent markdown artifact in wiki/. The 10th application is generated with dramatically better training data than the 1st, without any extra effort from me.
Human-in-the-Loop at every gate. No autonomous sending, submitting, or scheduling. The Klarna case study sits over my shoulder: AI that optimized for the measurable metric (speed) while destroying the unmeasurable one (trust). I stay in the loop until the agent has absorbed enough of my taste to be trusted semi-autonomously.
Three deliverables per submission. No job moves to outreach_sent until tailored resume, cover letter, AND outreach sequence are all generated and approved. The three either land together or the submission is incomplete.
Atelier, not SaaS. The UI is a workbench for craft work: newspaper-front-page references, engineering notebooks, gold accents on warm paper. Not a dashboard. The aesthetic reinforces the register.

The stack

Database

Supabase

Postgres for structured operational data: jobs, career_profile, application status. RLS for multi-tenant readiness.

Qualitative memory

Open Brain (OB1)

Supabase + pgvector + MCP. Stores stories, captures, self-assessments. Any AI tool can query it.

Knowledge layer

LLM Wiki

Markdown codebase compiled by the LLM. Companies, contacts, roles, bullets, job-title profiles. Compounds across runs.

Claude

Sonnet for generation + scoring. Haiku for fast classification. Same models powering every Desk in the ecosystem.

Discovery

Local Puppeteer agent

Headless Chromium + stealth plugin running on my residential IP. WTTJ + RemoteOK processors. Mac-on, scraper-on.

React + Tailwind + Vite

The Desk: masthead, Wire ticker, Pipeline kanban, Agent Monitor, Coach Me, Brain panel. Atelier register.

Hosting

Vercel

Web app at desk.alfonsoherrada.com. Public demo at operators-desk.

Dev workflow

Claude Code + gstack

Every line of code written through Claude Code. /plan-eng-review, /cso, /investigate skills as standard practice.

How a job becomes an application

1. Discovery (autonomous)

The local agent runs on a 22-hour throttle, fetching listings from Welcome to the Jungle and RemoteOK. Each scrape produces structured rows in jobs_raw: title, company, URL, snippet, source. No Apify, no cloud workers, no Cloudflare 403s. Just a real browser on a residential IP doing what a person would do.
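
A minimal sketch of how the throttle check and the jobs_raw row shape could look. The function name, the stored last-run timestamp, and the source identifiers are assumptions for illustration, not the actual agent code; only the 22-hour window and the column list come from the description above.

```typescript
// From the text: at most one scrape per 22-hour window.
const THROTTLE_HOURS = 22;
const MS_PER_HOUR = 60 * 60 * 1000;

// Hypothetical shape of a jobs_raw row, using the fields listed above.
interface RawJob {
  title: string;
  company: string;
  url: string;
  snippet: string;
  source: "wttj" | "remoteok"; // assumed identifiers for the two processors
}

// Returns true when the source is due for another scrape.
// `lastRunAt` is null when the source has never been scraped.
function shouldScrape(lastRunAt: Date | null, now: Date): boolean {
  if (lastRunAt === null) return true;
  const elapsedMs = now.getTime() - lastRunAt.getTime();
  return elapsedMs >= THROTTLE_HOURS * MS_PER_HOUR;
}
```

Keeping the check a pure function of two timestamps means it can be tested without launching a browser.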

2. Qualification (LLM-driven)

The qualifyJd processor pulls each new job, hits Claude with the job description and a fit-scoring prompt, and produces a numerical score plus rationale. Scores ≥ 70 enter the pipeline; below that, archived with reasons. Crucially: the rationale is permanent, written to wiki/companies/[company].md. If I see a different role at the same company two weeks later, the company context is already there.
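
The routing step can be sketched as below. This is not the qualifyJd source: the type names, the slug rule for the company file name, and the markdown layout of the entry are all assumptions. Only the ≥ 70 threshold and the wiki/companies/ location come from the text.

```typescript
// From the text: scores >= 70 enter the pipeline, everything else is archived.
const QUALIFY_THRESHOLD = 70;

interface FitResult {
  score: number;     // numeric fit score returned by the model
  rationale: string; // the model's explanation, kept permanently
}

type Disposition = "pipeline" | "archived";

function routeByScore(result: FitResult): Disposition {
  return result.score >= QUALIFY_THRESHOLD ? "pipeline" : "archived";
}

// Hypothetical slug rule: lowercase, non-alphanumeric runs become single hyphens.
function companyWikiPath(company: string): string {
  const slug = company
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `wiki/companies/${slug}.md`;
}

// Hypothetical markdown section appended to the company page for each scored role.
function rationaleEntry(roleTitle: string, result: FitResult, scoredOn: string): string {
  return [
    `## ${roleTitle} (${scoredOn})`,
    `Score: ${result.score} (${routeByScore(result)})`,
    "",
    result.rationale,
    "",
  ].join("\n");
}
```

Because archived jobs get an entry too, a later role at the same company starts from the earlier reasoning whichever way the first one was routed.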

3. Resume generation (HITL)

This is the hardest part. The naïve approach is "compare Alfonso's history to the JD, generate bullets." That produces resumes constrained by how I originally described my work. The better approach: start from the job-title family profile (what does a strong Operations Analyst resume look like?), backward-engineer my real story into that template, then run Coach Me only to fill the gaps the profile leaves open.

Resume v1 is the HITL trigger point. Coach Me is the extraction mechanism, not the starting point. The system asks targeted questions ("what was the team size?", "what was the metric?") instead of open-ended ones. Every approved bullet writes back to wiki/bullets/[achievement-type].md with metadata. Status ladder: baseline → approved → exemplar.
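
One way to model the status ladder, as a sketch. The `Bullet` shape and the rule that each approval moves a bullet exactly one rung are assumptions; the text only names the three statuses and their order.

```typescript
type BulletStatus = "baseline" | "approved" | "exemplar";

// Each status maps to the next rung; exemplar is the top and maps to itself.
const NEXT_STATUS: Record<BulletStatus, BulletStatus> = {
  baseline: "approved",
  approved: "exemplar",
  exemplar: "exemplar",
};

// Hypothetical bullet record; achievementType selects wiki/bullets/[achievement-type].md.
interface Bullet {
  text: string;
  achievementType: string;
  status: BulletStatus;
}

// Returns a new bullet one rung higher, leaving the input untouched.
function promote(bullet: Bullet): Bullet {
  return { ...bullet, status: NEXT_STATUS[bullet.status] };
}
```

Encoding the ladder as a lookup table keeps promotion monotonic: there is no path from exemplar back down unless one is added deliberately.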

4. Three-deliverable package

For each job that progresses past resume approval, the pipeline produces a cover letter (anchored to the same JD context) and an outreach sequence (3 to 5 messages targeting a specific hiring manager's LinkedIn profile). All three are reviewed in the same surface (JobDocsEditor) before anything moves to "ready to send."
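
The all-or-nothing rule can be expressed as a small gate. This is a sketch under assumptions: the deliverable keys and review states are hypothetical names, not the real schema.

```typescript
type DeliverableKind = "resume" | "cover_letter" | "outreach";
type ReviewState = "missing" | "generated" | "approved";

// One review state per deliverable, all tied to the same job.
type ApplicationPackage = Record<DeliverableKind, ReviewState>;

const REQUIRED: DeliverableKind[] = ["resume", "cover_letter", "outreach"];

// Lists the deliverables still blocking the package; an empty list means ready.
function blockingDeliverables(pkg: ApplicationPackage): DeliverableKind[] {
  return REQUIRED.filter((kind) => pkg[kind] !== "approved");
}

// A package is ready to send only when all three deliverables are approved.
function isReadyToSend(pkg: ApplicationPackage): boolean {
  return blockingDeliverables(pkg).length === 0;
}
```

Returning the blocking list rather than a bare boolean lets a review surface show exactly which deliverable is holding the package back.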

5. The Wire (live activity)

Across the top of the app, a ticker shows the agent's heartbeat: jobs scraped, scores assigned, drafts ready, my own approvals. It runs at the masthead level so I can see the system breathing without opening any specific surface. The masthead and ticker timestamps recently moved from UTC to Central time. When my watch says 19:30, the Desk says 19:30, and I don't have to do timezone math in my head.
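
A display-only conversion along these lines keeps stored timestamps in UTC and converts at render time. The helper name and the "--:--" fallback are assumptions; `Intl.DateTimeFormat` with the `America/Chicago` zone handles the CST/CDT switch without manual offsets.

```typescript
// Formatter is created once and reused; it only affects display, not storage.
const CHICAGO_FORMAT = new Intl.DateTimeFormat("en-US", {
  timeZone: "America/Chicago",
  hour: "2-digit",
  minute: "2-digit",
  hour12: false,
});

// Converts a UTC ISO timestamp into a 24-hour HH:MM string in Chicago time.
function toDeskTime(utcIso: string): string {
  const d = new Date(utcIso);
  if (Number.isNaN(d.getTime())) return "--:--"; // invalid-date guard (assumed fallback)
  const parts = CHICAGO_FORMAT.formatToParts(d);
  const hour = parts.find((p) => p.type === "hour")?.value ?? "00";
  const minute = parts.find((p) => p.type === "minute")?.value ?? "00";
  // Some engines render midnight as "24" when hour12 is false; normalize it.
  return `${hour === "24" ? "00" : hour}:${minute}`;
}
```

With this helper, `2026-05-04T00:30:00Z` (daylight time) and `2026-01-15T01:30:00Z` (standard time) both render as 19:30.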

The architectural decisions that mattered

Career Desk has 10 ADRs (architectural decision records). Five of them carry most of the weight:

ADR-001 · 2026-04-13

Three-Layer LLM Wiki Paradigm (not RAG)

RAG is amnesiac. Every query re-synthesizes from raw documents, no knowledge accumulates. Career Desk treats the LLM as a continuous compiler managing a markdown codebase, producing permanent artifacts instead of transient chat histories. Week-1 research is enriched by week-6 research; nothing re-derives from scratch.

ADR-002 · 2026-04-13

Open Brain as Layer 1 Qualitative Memory

Two memory systems would create silos. Open Brain holds the qualitative layer (stories, insights, self-assessments). Supabase holds the structured layer (jobs, career_profile). Resume bullet generation queries Open Brain first; Coach Me fills genuine gaps only.

ADR-003 · 2026-04-13

Human-in-the-Loop at every approval gate

Klarna's customer-service AI optimized for resolution speed and destroyed customer trust. The pipeline learns my taste through the approval record. Autonomy increases as the record builds. Never granted upfront.

ADR-004 · 2026-04-14

Application Package = 3 Deliverables Per Job

Resume + cover letter + outreach sequence, all anchored to the same job record. None of the three move forward until all are generated and approved. Submission incompleteness is treated as a state, not an oversight.

ADR-007 · 2026-04-14

Job title family profiles as resume template engine

Generating resumes from "Alfonso's history → mapped to JD" produces drafts constrained by my original phrasing. Generating from "what does a strong Ops Analyst resume look like?" → backward-fill my real story produces stronger first drafts. Coach Me fills placeholders, not blank pages.
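
To make "placeholders, not blank pages" concrete, here is a sketch of template filling. The `{placeholder}` syntax, the function name, and the sample values in the usage note are all made up for illustration; the source does not describe how profiles encode their gaps.

```typescript
// Hypothetical placeholder syntax: lowercase names in braces, e.g. {team_size}.
const PLACEHOLDER = /\{([a-z_]+)\}/g;

interface FillResult {
  text: string;      // template with known facts substituted
  missing: string[]; // placeholder names that still need a targeted question
}

// Substitutes known facts into a profile bullet and reports what is still unknown.
function fillTemplate(template: string, facts: Record<string, string>): FillResult {
  const missing: string[] = [];
  const text = template.replace(PLACEHOLDER, (whole: string, key: string) => {
    const value: string | undefined = facts[key];
    if (value === undefined) {
      if (missing.indexOf(key) === -1) missing.push(key);
      return whole; // leave the placeholder visible for review
    }
    return value;
  });
  return { text, missing };
}
```

For example, filling `"Led a team of {team_size} to cut {metric} by {delta}"` with facts for `team_size` and `metric` leaves `{delta}` in the text and returns `["delta"]` as the only question left to ask.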

The build, week by week

Apr 13

Architecture lock. ADR-001 through ADR-003. Wiki Paradigm chosen over RAG. Open Brain integrated as Layer 1. HITL non-negotiable.

Apr 14

Three-deliverable model + Teal standard. ADR-004 through ADR-009. Resume generation locked to Teal XYZ methodology. Job-title family profiles become the template engine.

Apr 15-20

Schema + first scrapers + Coach Me v0. Supabase migrations, Apify-based scrapers for WTTJ + Wellfound + RemoteOK, first Coach Me extraction loop. Bullet library wiki structured.

Apr 21-25

The Desk UI. Masthead + Wire ticker + Pipeline kanban + Agent Monitor + Coach Me surface. React + Tailwind. Editorial register: warm paper, Playfair, gold accents.

Apr 26 · v0.5.0.0

Apify exit, local agent in. After two ships of Apify-based scrapers hit Cloudflare 403s, selector drift, and version-build cache hell, ripped out Apify entirely. Migrated all discovery to a local Puppeteer-extra worker on my residential IP. WTTJ live; Wellfound dropped (DataDome too aggressive even for stealth plugins).

Apr 26 · v0.5.0.1

Central time everywhere. Wire ticker, Agent Monitor, masthead, and agent log all read in Chicago time instead of UTC. Database storage stays UTC; only the display layer converts. 22 new tests for the time helpers (CDT/CST transitions, invalid-date guards, branch boundaries).

May 3

Pre-launch state. 87 tests passing. 10 ADRs. Discovery + qualification + resume + cover letter + outreach pipelines all running. First HITL-approved application package targeted for the next 7 days.

What's built

10

architectural decisions

87

tests passing

agent processors

3

deliverables per job

v0.5

current version

~3wk

build duration so far

What success looks like

Pre-launch: I've not yet sent the first HITL-approved package. What I'm watching for:

Coverage: 30+ qualified jobs in the pipeline within the first 14 days of running. The throttle is conservative; if real-world drift drops me below 30, the discovery layer is the bottleneck.
Velocity: resume + cover letter + outreach package generated, reviewed, and approved in about 30 minutes per job. Much faster than that and I'm not reading carefully; slower and the system isn't pulling its weight.
Match score: 85-90% match on Teal's 15-check gate after Coach Me fills placeholders. The job-title family profile architecture is supposed to start drafts at 90-100% match. If first drafts land at 70%, the profile is the problem; if at 95%, the model's working.
Compounding: the 10th application meaningfully better than the 1st without extra effort. Bullet library status ladder (baseline → approved → exemplar) should fill out organically.
Outcome: first response to outreach within 14 days of first send. First interview within 30. First offer when it lands. The outcome metric is downstream of all four above.

↻ I'll add a "30 days in" section once first applications go out.

What I'd do differently

Skip Apify entirely from day 1. Two full releases shipped on Apify before it was clear the approach failed too often to be reliable: Cloudflare 403s on data-center IPs, selector drift in cloud Puppeteer, version-build env-var caching surprises. The local agent on a residential IP just works. The cost of "Mac off = no scrape" is acceptable for a one-person system; the cost of fighting cloud bot mitigation isn't.
Lock the time-display layer earlier. Nearly two weeks of UTC timestamps in the UI made me do mental timezone math on every job. Should have pinned to America/Chicago in week 1, not week 2.
Don't ship duplicate utility code. Two separate relTime implementations existed in two components for weeks before consolidation. Single source of truth from the start would have caught the Math.round → Math.floor bug earlier.
Build the bullet library wiki structure on day 1. ADR-005 was day 2; the first ten bullets weren't structured to the status ladder. Retrofitting metadata onto existing bullets was lower-priority work that piled up.
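
On the duplicated relTime helper: the text does not say exactly how the Math.round bug showed up, so the following is only a plausible reconstruction. With rounding, 90 seconds reads as "2m ago" and 59.5 minutes reads as "60m ago" instead of staying under the hour boundary. A single shared helper that floors at every step avoids overstating elapsed time. The label format is an assumption.

```typescript
// Hypothetical shared relative-time helper; flooring never overstates elapsed time.
function relTime(from: Date, now: Date): string {
  // Clamp to zero so small clock skew never produces negative durations.
  const seconds = Math.max(0, Math.floor((now.getTime() - from.getTime()) / 1000));
  if (seconds < 60) return "just now";
  const minutes = Math.floor(seconds / 60);
  if (minutes < 60) return `${minutes}m ago`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  return `${Math.floor(hours / 24)}d ago`;
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, is what makes the branch boundaries testable.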

What unlocked the speed

The Wiki Paradigm itself. Once it clicked that the LLM is a compiler managing markdown (not a chat assistant retrieving context), every architectural decision after became simpler. Every output goes somewhere permanent.
ADRs from day 1. Decision Log up to 10 entries by week 2. When a question comes up ("why did I pick this?"), the answer is one grep away. Decision drift would have eaten weeks otherwise.
Claude Code's plan/review skills. /plan-eng-review, /cso, /investigate on every meaningful change. The Apify-to-local migration was a 6-hour refactor with full review coverage instead of a 3-day debug spiral.
Open Brain in place before Career Desk. The qualitative memory layer existed before the structured layer. Bullets had a source to pull from before the wiki was rebuilt around them.
Atelier-not-SaaS aesthetic locked early. The design system constraints (warm paper, Playfair Display, no gradients, gold reserved for accent) meant I never spent time picking a button style. The reference is "engineering notebook"; everything else falls out.

LLM Wiki Paradigm · Open Brain · HITL · Job-search agent · Puppeteer stealth · Supabase · Claude API · React · AI-assisted dev

Next build →

Content Desk

A marketing operating system implemented as Claude Code slash commands. Five skills, voice gate, dual-output principle: every session produces both content and system training.
