A Talent Discovery Platform for Hawaii's Professional Network

AEP Hawaii needed to find qualified professionals across Hawaii’s professional network. Fast. They had candidate data from multiple sources, but no way to search it intelligently. Traditional keyword search missed qualified people because resumes and profiles don’t use consistent language. And the team needed results in weeks, not months.

We built an MVP of their talent search platform with two interfaces serving different workflows: a web dashboard for recruiters who think in filters and lists, and an MCP server for AI agents that think in natural-language queries. Same data, same search engine, two ways in.

Seventeen days from start to deployment.

The data problem

The raw data came from two sources: a resume parsing service and a profile enrichment platform. Both produced different formats, different fields, different levels of completeness. Some candidates had GitHub profiles. Some had LinkedIn. Some had both. Some had neither.

We consolidated 498 candidates into a normalized 27-field schema: name, location, current organization, title, skills, education, and contact information across multiple channels. The import pipeline handled deduplication, skill categorization, and entity normalization, turning messy multi-source data into a clean, queryable dataset.
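The merge step can be sketched as follows. This is an illustrative reduction of the pipeline, assuming records are keyed by a normalized email with a name-plus-location fallback; the field names here are hypothetical, not the actual 27-field schema.

```python
# Hypothetical sketch: deduplicate multi-source records by normalized key,
# keeping the first non-empty value seen for each field.
def normalize_key(record):
    email = (record.get("email") or "").strip().lower()
    # Fall back to (name, location) when no email is present.
    return email or (record.get("name", "").strip().lower(),
                     record.get("location", "").strip().lower())

def merge_candidates(sources):
    merged = {}
    for record in sources:
        existing = merged.setdefault(normalize_key(record), {})
        for field, value in record.items():
            if value and not existing.get(field):  # prefer first non-empty value
                existing[field] = value
    return list(merged.values())

rows = merge_candidates([
    {"email": "A@x.com", "name": "Ana", "skills": ""},
    {"email": "a@x.com ", "name": "", "skills": "python"},
])
print(rows)  # → [{'email': 'A@x.com', 'name': 'Ana', 'skills': 'python'}]
```

The two source records collapse into one candidate, with gaps in either source filled by the other.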

The normalized data landed in two stores: SQLite for structured queries (full-text search via FTS5 virtual tables) and ChromaDB for semantic vector search.
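The structured side looks roughly like this. A minimal sketch assuming an FTS5 virtual table named `candidates_fts`; the columns shown are illustrative, not the real schema.

```python
import sqlite3

# Hypothetical sketch of the structured-search store using an FTS5 virtual table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE VIRTUAL TABLE candidates_fts USING fts5(
        name, title, skills, location
    )
""")
conn.executemany(
    "INSERT INTO candidates_fts VALUES (?, ?, ?, ?)",
    [
        ("Ana K.", "Senior Engineer", "python react sql", "Honolulu, HI"),
        ("Ben L.", "Data Scientist", "python pandas ml", "Seattle, WA"),
    ],
)

# FTS5 MATCH supports boolean operators; bm25() ranks results by relevance.
rows = conn.execute(
    "SELECT name FROM candidates_fts "
    "WHERE candidates_fts MATCH 'python AND honolulu' "
    "ORDER BY bm25(candidates_fts)"
).fetchall()
print(rows)  # → [('Ana K.',)]
```

The query matches tokens across all indexed columns, so a skill term and a location term can be combined in one search.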

The search problem

Simple keyword search fails for talent discovery. A recruiter searching for “React developer” misses the candidate whose profile says “built single-page applications with JavaScript frameworks.” And a pure vector search returns semantically similar but irrelevant results when the query is too broad.

We built a hybrid scoring system that combines three signals:

Vector similarity (60%): Sentence transformer embeddings (all-MiniLM-L6-v2) indexed in ChromaDB. This handles the semantic gap: “machine learning engineer” matches candidates with “data scientist” experience because the concepts are close in embedding space.

Pattern matching (30%): Rule-based weighted scoring with domain-specific boosts. Hawaii location gets a 5x weight. Tech skills like React, Python, and JavaScript get 4x. Startup and founding experience gets 5x. Senior roles get 3x. These weights encode domain knowledge that embeddings alone don’t capture.
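The rule-based layer can be sketched as a boost table. The weights mirror the ones described above, but the exact term lists, rule structure, and normalization are assumptions, not the production implementation.

```python
# Hypothetical boost table; terms and grouping are illustrative.
BOOSTS = [
    (("hawaii", "honolulu", "maui", "oahu"), 5.0),  # Hawaii location
    (("react", "python", "javascript"), 4.0),       # core tech skills
    (("founder", "founding", "startup"), 5.0),      # startup experience
    (("senior", "lead", "principal", "cto"), 3.0),  # seniority
]

def pattern_score(profile_text: str) -> float:
    text = profile_text.lower()
    score = sum(weight for terms, weight in BOOSTS if any(t in text for t in terms))
    max_score = sum(weight for _, weight in BOOSTS)
    return score / max_score  # normalize to 0..1 so it can be blended

print(pattern_score("Senior Python engineer and startup founder in Honolulu"))  # → 1.0
```

Normalizing against the maximum possible boost keeps this signal on the same 0-to-1 scale as the other two, so the weighted blend stays meaningful.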

GitHub enrichment (10%): Real-time GitHub API lookups with caching. Repository count, commit activity, language distribution, and stars provide a signal for technical depth that resumes often understate.
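A sketch of that enrichment step, assuming the public GitHub REST API and a simple in-process cache; the blend thresholds below are illustrative guesses, not the production values.

```python
import functools
import json
import urllib.request

@functools.lru_cache(maxsize=512)  # simple in-process cache; a real service would persist
def fetch_repos(username: str):
    # GitHub REST API: GET /users/{username}/repos (unauthenticated, rate-limited)
    url = f"https://api.github.com/users/{username}/repos?per_page=100"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def github_score(repos) -> float:
    # Blend repo count, total stars, and language spread into one 0..1 signal.
    count = min(len(repos) / 20, 1.0)
    stars = min(sum(r.get("stargazers_count", 0) for r in repos) / 100, 1.0)
    langs = min(len({r.get("language") for r in repos if r.get("language")}) / 5, 1.0)
    return 0.4 * count + 0.4 * stars + 0.2 * langs

# Example: github_score(fetch_repos("octocat"))
```

Capping each component keeps one prolific account from dominating, and the cache avoids burning API rate limits on repeat lookups.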

The final score: (vector × 0.6) + (patterns × 0.3) + (github × 0.1).

The hybrid scoring surfaces candidates that pure keyword search misses while keeping results grounded in the specific requirements. A search for “startup CTO in Hawaii with Python experience” uses all three signals: semantic similarity finds broadly relevant profiles, pattern matching boosts Hawaii-based founders with Python skills, and GitHub data validates technical depth.

Two interfaces, one engine

The web dashboard. An Express backend serving a search UI with natural language input and quick-search categories. Recruiters type queries, get ranked candidate cards with match scores, and drill into full profiles. The interface masks email addresses by default as a privacy measure and provides category filters for common searches.

The MCP server. A Python FastMCP service exposing two tools: search_people for hybrid search and get_person_details for full profile retrieval. AI agents can query the candidate database directly. A recruiter using Claude can ask “find me senior engineers in Honolulu with startup experience” and get structured results without switching tools.

Both interfaces draw on the same candidate index. The MCP server runs the full hybrid pipeline (vector + patterns + GitHub), while the web dashboard queries SQLite FTS5 with structured query support. Different access patterns, consistent data.

Deployment

The web application runs on Fly.io on shared compute in the LAX region. The MCP server runs on Kubernetes via a custom MCPService CRD with auto-scaling (0-3 replicas, target concurrency of 5). Both services expose health-check endpoints and enforce HTTPS.

The MCP server connects to the NimbleBrain MCP gateway, making it available to any agent workspace with the right credentials.

Timeline

Seventeen days. September 5 through September 22, 2025.

  • Days 1-3: Data pipeline. Consolidating, normalizing, and importing candidate data from multiple sources into the 27-field schema.
  • Days 4-8: Search engine. ChromaDB vector indexing, pattern matching rules, GitHub enrichment integration, hybrid scoring calibration.
  • Days 9-13: Interfaces. Express web dashboard, FastMCP server, API design, search UX.
  • Days 14-17: Deployment, testing, handoff.

498 candidates indexed across four search categories, available through the web dashboard and MCP server, all running on the same search engine.

Have a similar problem? Let's talk.

Or email directly: hello@nimblebrain.ai