Rugved Chandekar
I design and deploy production-grade AI systems — RAG pipelines, agentic orchestration engines, and ML infrastructure — that automate complex workflows at scale. Shipped at Idyllic Services with ~99% token reduction and ~10× throughput improvement.
const engineer = {
  focus: "AI Systems",
  shipped: ["RAG", "Agents", "AWS"],
  tokenReduction: "~99%",
  throughput: "~10×",
  research: "IEEE 2026"
};
// systems that scale ↗
Chhatrapati Sambhajinagar, India
Rugved Chandekar: I design systems that think.
I'm an AI Systems Engineer specializing in production RAG pipelines, agentic orchestration, and ML infrastructure on AWS. At Idyllic Services Pvt. Ltd, I architected a supervisor-led agentic AI system for JD-driven candidate sourcing — reducing LLM token costs by ~99% and improving throughput ~10× through async orchestration and targeted retrieval.
My work sits at the intersection of backend engineering and applied ML. I choose tools based on constraints, not familiarity — OpenSearch over Pinecone for operational control at AWS scale; XGBoost over deep learning when interpretability outweighs marginal accuracy gains; Bedrock over self-hosted LLMs when operational overhead matters more than per-token cost.
I think in pipelines: input → orchestration → retrieval → synthesis → output. I design for failure cases — rate limits, embedding drift, latency spikes — and I optimize for the constraints that matter in production.
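As an illustration of designing for one of those failure cases, here is a minimal sketch of retrying a rate-limited call with exponential backoff and jitter. `RateLimitError` and `call_with_backoff` are hypothetical names chosen for this example, not part of any specific client library, and the delays are placeholder values.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised when a provider throttles a request."""


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on RateLimitError wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            delay = base_delay * (2 ** attempt)
            # Jitter spreads retries out so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter is injectable so the retry schedule can be tested without actually waiting.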
~99% token reduction on live agentic AI pipeline
~10× throughput via async orchestration
90% manual effort eliminated on live production system
300+ students mentored in algorithms & problem-solving
Systems I Design & Deploy
End-to-end engineering — from architecture decisions to production deployment. I own the system, not just the code.
RAG & LLM Pipeline Engineering
Design and deploy production RAG systems — vector indexing, retrieval orchestration, LLM routing, and response synthesis. Built on AWS with OpenSearch and Bedrock.
- Chunking strategy & embedding pipeline
- OpenSearch / vector store integration
- LLM orchestration & prompt engineering
- AWS ECS Fargate deployment
Agentic AI & Workflow Automation
Architect supervisor-worker agentic systems that handle multi-step reasoning tasks end-to-end — with token-efficient orchestration and async execution. Proven: 90% effort automation on live systems.
- Supervisor-led agent architecture
- Token-efficient LLM orchestration
- n8n / custom pipeline automation
- 24/7 production uptime
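The supervisor-worker idea can be sketched in a few lines. This is a simplified illustration, not the production system: the worker functions, the `WORKERS` registry, and the plan format are all hypothetical, and real workers would call an LLM rather than format strings. The point it shows is that the supervisor hands each worker only the payload for its own step instead of the full conversation history.

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical workers: each handles one kind of step.
def parse_jd(payload: str) -> str:
    return f"skills extracted from: {payload}"


def search_candidates(payload: str) -> str:
    return f"candidates matching: {payload}"


WORKERS: Dict[str, Callable[[str], str]] = {
    "parse_jd": parse_jd,
    "search": search_candidates,
}


def supervise(plan: List[Tuple[str, str]]) -> List[str]:
    """Run each (worker_name, payload) step in order and collect the results."""
    results = []
    for worker_name, payload in plan:
        worker = WORKERS.get(worker_name)
        if worker is None:
            # Unknown steps are recorded rather than crashing the whole run.
            results.append(f"skipped unknown step: {worker_name}")
            continue
        results.append(worker(payload))
    return results
```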
Full-Stack AI Application
From ML model training to REST API design to deployed web application — I own the entire stack. Designed for real throughput requirements, not toy demos.
- ML model training & evaluation
- Flask REST API with auth & rate limiting
- Database design & query optimization
- Containerized cloud deployment
ML Model Deployment & API
Take an ML model from notebook to production API with monitoring, error handling, and scalable inference. Chose XGBoost over deep learning when interpretability and latency matter more than marginal accuracy.
- Model serialization & versioning
- REST inference API with latency SLAs
- SHAP explainability layer
- Docker & AWS deployment
AI & Systems Infrastructure
The production toolkit behind every system I ship.
Experience & Impact
Production systems built, scaled, and shipped.
Associate Developer — AI Systems
Idyllic Services Pvt. Ltd
- Architected a supervisor-led agentic AI system for JD-driven candidate sourcing — multi-step LLM orchestration with structured output parsing and retry logic
- Reduced LLM token consumption by ~99% through pipeline redesign: replaced brute-force calls with targeted retrieval + focused, context-bounded prompts
- Achieved ~10× throughput improvement via async orchestration — replaced sequential API calls with parallel execution and intelligent caching
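A minimal sketch of the parallel-execution-plus-caching pattern described above, using `asyncio`. `fake_llm_call` is a hypothetical stand-in for a network-bound LLM request, and the concurrency limit is an assumed value; the actual system's client, cache, and limits are not shown in this document.

```python
import asyncio


async def fake_llm_call(prompt: str) -> str:
    """Hypothetical stand-in for a network-bound LLM request."""
    await asyncio.sleep(0.01)
    return prompt.upper()


async def run_all(prompts, call=fake_llm_call, max_concurrency=5):
    """Run calls concurrently, bounded by a semaphore, sending each distinct prompt once."""
    cache = {}  # prompt -> task, so duplicate prompts share one request
    limit = asyncio.Semaphore(max_concurrency)

    async def bounded(prompt):
        async with limit:  # never more than max_concurrency calls in flight
            return await call(prompt)

    def task_for(prompt):
        if prompt not in cache:
            cache[prompt] = asyncio.create_task(bounded(prompt))
        return cache[prompt]

    # gather preserves input order, so results line up with prompts.
    return await asyncio.gather(*(task_for(p) for p in prompts))
```

Usage: `asyncio.run(run_all(["a", "b", "a"]))` issues two requests rather than three and returns the results in input order.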
CP & DSA Lead
Hackslash Community
- Conducted 5+ workshops, coding competitions & hackathons
- Mentored 300+ students in DSA & problem-solving
- Built structured curriculum for competitive programming
B.E. Information Technology
Govt. College of Engineering, Chhatrapati Sambhajinagar
- CGPA: 7.27
- Active in competitive programming & hackathons
Production Systems That Solve Real Problems
Each system is live, documented with architecture decisions, and built around real constraints — not portfolio demos.
Raghavendra Swami Mutt — Automated Booking System
Problem: A religious institution managing hundreds of daily seva bookings manually — prone to double-bookings, staff overhead, and processing delays.
System: Flask backend with MySQL (ACID transactions to prevent booking race conditions), REST API layer, and an admin dashboard for real-time management. Chose relational DB over NoSQL for transactional integrity on concurrent booking writes.
Impact: 90% manual effort eliminated, 95% reduction in per-booking processing time. Real users, 24/7 production uptime.
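One common way to rule out double-bookings in a relational database is a uniqueness constraint enforced inside a transaction. The sketch below uses the standard-library `sqlite3` module in place of MySQL so it runs anywhere; the table layout, column names, and the one-booking-per-slot rule are assumptions for illustration, not the deployed schema.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bookings ("
        " seva_id INTEGER NOT NULL,"
        " slot TEXT NOT NULL,"
        " devotee TEXT NOT NULL,"
        " UNIQUE (seva_id, slot))"  # the database itself rejects a second booking
    )
    return conn


def book(conn, seva_id, slot, devotee):
    """Return True if the slot was booked, False if it is already taken."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO bookings (seva_id, slot, devotee) VALUES (?, ?, ?)",
                (seva_id, slot, devotee),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the constraint lives in the database, two concurrent requests for the same slot cannot both succeed, regardless of how the application code interleaves.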
ResuGenie — Production AI Resume Intelligence Platform
Problem: HR teams manually scan resumes — a semantic matching problem that keyword search fails to solve.
Architecture: Flask API on AWS ECS Fargate → resumes chunked at the sentence level with overlap → embedded and indexed in OpenSearch → JD query triggers vector retrieval → Amazon Bedrock synthesizes match explanation. Chose OpenSearch over Pinecone for AWS-native operational control and lower egress cost at scale.
Pipeline: Upload → chunk → embed → index → query → retrieve → synthesize → score.
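The chunk step can be sketched as a sliding window over sentences. This is a simplified illustration: the regex sentence splitter and the `size` and `overlap` defaults are assumptions, and a production pipeline would likely use a more robust sentence tokenizer.

```python
import re


def chunk_sentences(text, size=3, overlap=1):
    """Split text into sentences, then group them into windows of `size` sharing `overlap` sentences."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    # Naive split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    step = size - overlap
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + size]))
        if start + size >= len(sentences):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence near a chunk boundary appears in both neighbors, so a query that matches it can still retrieve surrounding context from either side.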
Explainable House Price Predictor
Problem: Black-box price predictions are useless in real estate — users need to know WHY a property is valued as it is.
System: XGBoost regression served via Flask REST API with SHAP value computation per prediction. Chose XGBoost over neural networks for strong performance on tabular data, native SHAP compatibility, and sub-10ms inference latency at this scale. Where the two conflicted, interpretability took priority over marginal accuracy gains.
Impact: R² = 0.88 on held-out test data. Users see exact feature contributions per prediction.
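To show what "feature contributions per prediction" means, here is a brute-force Shapley value computation on a toy model. It is not the project's code: the real system uses the SHAP library with XGBoost, while this sketch enumerates every feature ordering against a single baseline input, which is only feasible for a handful of features. `toy_price` is a hypothetical pricing function invented for the example.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values by averaging marginal contributions over all feature orderings."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = instance[i]       # reveal feature i
            value = model(current)
            totals[i] += value - previous  # marginal contribution of feature i
            previous = value
    return [t / len(orderings) for t in totals]


# Hypothetical toy pricing model with an interaction term.
def toy_price(features):
    area, rooms = features
    return 100 + 50 * area + 20 * rooms + 5 * area * rooms
```

For `instance=[2, 3]` and `baseline=[0, 0]`, the contributions sum to the difference between the prediction and the baseline prediction, which is the property that lets a user read each value as that feature's share of the price.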
Collaborative Code Editor
Problem: Shared coding sessions demand sub-100ms state sync across concurrent users without edit conflicts or data loss.
Architecture: Flask-SocketIO backend orchestrates room-based sessions. Persistent WebSocket connections eliminate polling overhead (about 100ms sync latency versus 3–5s with polling). Socket.IO chosen over raw WebSockets for cross-browser reliability and automatic reconnection on network failure.
Challenge: Handling concurrent edit conflicts and cursor position sync — operational transformation ensures all clients converge to the same state.
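The convergence property of operational transformation can be illustrated with the simplest case, two concurrent inserts. This is a minimal sketch under stated assumptions: it handles inserts only (no deletes), and the `Insert` type and the `site` tie-breaker are hypothetical names for this example rather than the editor's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # character offset in the document
    text: str   # text to insert
    site: int   # client id, used only to break ties at equal positions


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Adjust `op` so it can be applied after `against` has already been applied."""
    goes_first = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if goes_first:
        # `against` added text before op's position, so shift op to the right.
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op
```

Each client applies its own operation first and then the transformed remote one. With inserts at the same position, the lower `site` id wins the left-hand spot on both clients, so both end up with the same document.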
Proof That Backs It Up
Research, competition wins, and published tools — signals that compound.
Production Infrastructure I Build On
Chosen for constraints, not trends. Each tool justified by the systems it enables.
Building something that needs serious engineering?
I work on production AI systems, RAG pipelines, agentic automation, and ML infrastructure. If you have a real problem that needs a real system — let's scope it together. Fast response, clear communication, no noise.