ResuGenie: Deploying an AI Resume Parser on AWS
Technical
January 15, 2026 · 10 min read · By Rugved Chandekar


AWS · RAG · OpenSearch · Amazon Bedrock · ECS Fargate


ResuGenie started as a 24-hour hackathon project. It won first place against 50+ teams. Then came the real challenge: turning a hackathon prototype into a production AI system on AWS. The gap between those two things is wider than most people realize.

What ResuGenie Does

ResuGenie is an AI-powered resume intelligence system. It parses resumes, extracts structured information, and performs semantic matching between resumes and job descriptions. Not keyword matching — semantic matching. A resume that says "built distributed systems" should match a JD that says "experience with scalable infrastructure" even without keyword overlap.

The hackathon version was a Flask app running locally with a SQLite database and OpenAI API calls. It worked beautifully in the demo. It would have collapsed under any real load.

The Production Architecture

Moving to production required a complete architectural redesign:

┌─────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   AWS ECS   │────▶│  Amazon Bedrock  │────▶│  AWS OpenSearch │
│   Fargate   │     │  (Claude model)  │     │  (Vector Store) │
└─────────────┘     └──────────────────┘     └─────────────────┘
       │                                              │
       ▼                                              ▼
┌─────────────┐                           ┌─────────────────┐
│   AWS S3    │                           │  kNN Vector     │
│  (Storage)  │                           │  Similarity     │
└─────────────┘                           └─────────────────┘

AWS ECS Fargate: The Deployment Layer

The hackathon app was a single Python process. Production needed horizontal scaling, health checks, rolling deployments, and zero-downtime updates.

I chose ECS Fargate over EC2 for a specific reason: Fargate is serverless containers — you define the task, AWS manages the underlying infrastructure. No server provisioning, no AMI management, no capacity planning for the container host.

Key Fargate configuration, as a simplified task definition (1 vCPU, 2 GB of memory):

{
    "family": "resugenie-task",
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "1024",
    "memory": "2048",
    "containerDefinitions": [{
        "name": "resugenie",
        "image": "ACCOUNT.dkr.ecr.REGION.amazonaws.com/resugenie:latest",
        "portMappings": [{"containerPort": 8000}],
        "environment": [
            {"name": "OPENSEARCH_HOST", "value": "..."},
            {"name": "AWS_REGION", "value": "us-east-1"}
        ]
    }]
}
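
The task definition alone doesn't give you rolling deployments; that comes from the ECS service wrapping it. A minimal sketch of creating the service with boto3; the cluster name, subnets, and security group IDs are placeholders, and the deployment percentages are the part doing the zero-downtime work:

import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

ecs.create_service(
    cluster="resugenie-cluster",        # placeholder cluster name
    serviceName="resugenie-service",
    taskDefinition="resugenie-task",
    desiredCount=2,                     # always keep at least two tasks running
    launchType="FARGATE",
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-aaaa", "subnet-bbbb"],  # placeholder IDs
            "securityGroups": ["sg-cccc"],
            "assignPublicIp": "DISABLED"
        }
    },
    deploymentConfiguration={
        "maximumPercent": 200,        # spin up replacement tasks first
        "minimumHealthyPercent": 100  # never dip below desiredCount mid-deploy
    }
)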

Amazon Bedrock: The LLM Layer

The hackathon version used OpenAI's API directly. Production moved to Amazon Bedrock for two reasons: data privacy (resumes are sensitive documents and we wanted them to stay within our AWS environment) and cost predictability.

Bedrock provides managed access to foundation models — Claude, Titan, Mistral — with IAM authentication instead of API keys, which integrated cleanly with our existing AWS security posture.
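
The call pattern is plain boto3 against the bedrock-runtime endpoint. A sketch of what an extraction call can look like, assuming Claude 3 Sonnet; the model ID and prompt here are illustrative, not our exact production values:

import boto3
import json

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def extract_fields(resume_text):
    # Claude models on Bedrock use the Anthropic messages request format
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # assumed model ID
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{
                "role": "user",
                "content": f"Extract name, skills, and work history as JSON:\n\n{resume_text}"
            }]
        })
    )
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]

Because the Fargate task's IAM role grants bedrock:InvokeModel, there are no API keys to rotate or leak.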

OpenSearch: The Vector Store

The core intelligence of ResuGenie is semantic matching — comparing embeddings of resume sections to embeddings of job description requirements. Amazon OpenSearch Service (AWS's managed fork of Elasticsearch) supports k-NN vector search natively.

Every resume section is embedded and stored. When a JD comes in, we embed each requirement and query OpenSearch for the closest matching resume sections. The results are scored and assembled into a match report.
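
Before any of that can run, the index needs a k-NN vector mapping created up front. A sketch using opensearch-py; the host and the embedding dimension are assumptions (the dimension must match the embedding model; Titan text embeddings, for instance, are 1536-dimensional):

from opensearchpy import OpenSearch

# Auth details omitted; the host comes from the task environment
client = OpenSearch(hosts=[{"host": "OPENSEARCH_HOST_VALUE", "port": 443}],
                    use_ssl=True)

client.indices.create(
    index="resume-embeddings",
    body={
        "settings": {"index": {"knn": True}},  # enable k-NN on this index
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 1536  # must match the embedding model's output
                }
            }
        }
    }
)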

def match_resume_to_jd(resume_sections, jd_requirements):
    # resume_sections are assumed to have been embedded and indexed
    # already; `opensearch` is an opensearch-py client and `embed_text`
    # wraps the embeddings API.
    scores = []
    for requirement in jd_requirements:
        req_embedding = embed_text(requirement)

        # k-NN query: find the closest resume section for this requirement
        results = opensearch.search(
            index="resume-embeddings",
            body={
                "size": 1,  # only the top hit is needed
                "query": {
                    "knn": {
                        "embedding": {
                            "vector": req_embedding,
                            "k": 1
                        }
                    }
                }
            }
        )

        hits = results["hits"]["hits"]
        if not hits:  # nothing indexed yet; skip rather than crash
            continue

        best_match = hits[0]
        scores.append({
            "requirement": requirement,
            "matched_section": best_match["_source"]["text"],
            "score": best_match["_score"]
        })

    return scores
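
A hypothetical invocation, assuming the resume's sections were indexed at upload time (the sample strings are purely illustrative):

# Illustrative inputs; real sections come from the parsing stage
parsed_sections = ["Built distributed systems handling peak traffic"]
jd_requirements = [
    "Experience with scalable infrastructure",
    "Familiarity with vector search"
]

report = match_resume_to_jd(parsed_sections, jd_requirements)
for entry in report:
    print(f'{entry["requirement"]!r} -> {entry["score"]:.3f}')

One design note: k=1 keeps the report one-to-one, each requirement mapping to its single best section; raising k would surface runner-up matches for a reviewer.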

The Challenges: Prototype to Production

Three specific problems the hackathon version didn't face:

  • Cold start latency: Fargate tasks take 30-60 seconds to start. For a demo this is fine. For production, you need minimum running tasks and auto-scaling policies that pre-warm capacity.
  • Embedding costs: Embedding every resume and JD has real API cost. We implemented caching — same text hash gets the cached embedding, not a new API call (sketched after this list).
  • Resume format diversity: Hackathon resumes were clean PDFs. Production resumes arrive as .docx, .pdf with images, scanned documents, two-column layouts. Parsing became its own engineering problem.
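
A minimal sketch of that hash-keyed cache, with an in-memory dict standing in for the persistent store (a production deployment would want DynamoDB, Redis, or similar) and embed_text standing in for the real embeddings call:

import hashlib

_embedding_cache = {}  # stand-in; use a persistent store in production

def embed_text_cached(text):
    # Identical text hashes identically, so repeat resumes and JDs
    # reuse the stored vector instead of a new embeddings API call.
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _embedding_cache:
        _embedding_cache[key] = embed_text(text)  # the uncached call
    return _embedding_cache[key]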

What the Hackathon Didn't Teach

The hackathon taught me that the idea worked. Production taught me everything else: infrastructure, scaling, observability, cost optimization, error recovery, and the endless variety of real-world inputs.

If I had to summarize the lesson in one sentence: a prototype proves the concept; production proves the engineering. They require completely different skills.

Want to discuss AWS architecture, RAG pipelines, or AI system design? I'm always up for a technical conversation.

Get In Touch
Rugved Chandekar, AI Systems Engineer @ Idyllic Services — AWS & RAG Specialist — IEEE Author