Rahul Reddy Talatala

I automate the complex.
I make AI work at scale.

GenAI Engineer building production-grade LLM systems, multi-agent pipelines, and AI infrastructure that ships with real, measurable impact.

About

Building AI that ships, not just demos.

I'm a GenAI Engineer who turns cutting-edge AI research into systems that actually ship. I've automated millions of dollars in operational waste, fine-tuned LLMs for production inference, and built agentic pipelines that work at enterprise scale, not just in notebooks. I care about real impact over cool demos.

Whether it's orchestrating multi-agent pipelines, fine-tuning LLMs for inference efficiency, or building telecom-scale RAG systems, I live at the intersection of research and production. Currently at Verizon, previously at Apple.

🧠

Agentic Systems

LangGraph, MCP, multi-agent orchestration at enterprise scale

📊

RAG & Knowledge Graphs

Hybrid Graph RAG with Neo4j, Elastic, and pgvector

⚡

MLOps & LLMOps

NVIDIA Triton, LoRA fine-tuning, LangFuse observability

☁️

Cloud Infrastructure

AWS, GCP, Azure: Kubernetes, Terraform, CI/CD at scale

Experience

Where I've shipped.

GenAI Engineer

Current

Infinite Computer Solutions · Client: Verizon

Aug 2025 – Present · Tampa, FL

LangGraph, Gemini, DSPy

Built a five-stage document-audit pipeline with Gemini 2.5 Flash and a 70-rule engine that matches invoice line items to purchase orders via hybrid search, surfacing $114–140M in annual overcharge exposure across 500–2,000 capital projects.
Fine-tuned Qwen2.5-3B with LoRA to mask sensitive financial fields on-prem before any data reaches a frontier LLM, then used DSPy to optimize extraction prompts, lifting structured-field accuracy from 74% to 91%.
Solved the cold-start problem for a 346-table text-to-SQL agent with an offline pipeline that builds a 1,536-dim ChromaDB vector store and a 17,772-edge NetworkX join graph, enabling accurate retrieval from day one across 11.8M+ rows.
Designed a multi-agent FiOS planning system that clusters copper-circuit addresses with DBSCAN and coordinates six specialist agents into prioritized build plans, cutting manual fiber build planning effort by 65%.
Built a LangGraph routing agent that triages 278,350 tickets at 90% accuracy using vendor-specific, glossary-grounded LLM prompts, eliminating ~47,000 hours of annual L1/L2 work for a projected $2.46M in cost avoidance.
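
To illustrate the join-graph idea from the text-to-SQL work above, here is a minimal sketch of finding a join path between two tables. It is not the production pipeline: the table names and join conditions are hypothetical, and a plain breadth-first search from the standard library stands in for the NetworkX graph so the example runs without dependencies.

```python
from collections import deque

# Hypothetical foreign-key edges for illustration only; the real graph
# described above has 17,772 edges across 346 tables.
JOIN_EDGES = [
    ("invoices", "purchase_orders", "invoices.po_id = purchase_orders.id"),
    ("purchase_orders", "projects", "purchase_orders.project_id = projects.id"),
    ("projects", "regions", "projects.region_id = regions.id"),
    ("invoices", "vendors", "invoices.vendor_id = vendors.id"),
]


def build_graph(edges):
    """Build an undirected adjacency map: table -> {neighbor: join condition}."""
    graph = {}
    for left, right, condition in edges:
        graph.setdefault(left, {})[right] = condition
        graph.setdefault(right, {})[left] = condition
    return graph


def join_path(graph, start, goal):
    """Return the shortest list of join conditions linking start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, conditions = queue.popleft()
        if table == goal:
            return conditions
        for neighbor, condition in graph[table].items():
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, conditions + [condition]))
    return None


GRAPH = build_graph(JOIN_EDGES)
```

Calling `join_path(GRAPH, "invoices", "regions")` returns the three join conditions in order, which an agent could splice into a generated query instead of guessing how the tables relate.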

MLOps Engineer (Contract)

Apple Inc. · Data Platform Efficiency

Apr 2025 – Aug 2025 · Dallas, TX

LangGraph, NVIDIA Triton, Apache Spark

Software & AI Engineer

Eminent Services Corporation

May 2024 – Apr 2025 · Frederick, MD

React, Node.js, Azure DevOps

ML Research Assistant

University at Buffalo

Jan 2024 – May 2024 · Buffalo, NY

PyTorch, Python, Transformers

Solutions Engineer

Swym Corporation

Aug 2021 – Aug 2023 · Bangalore, India

REST APIs, JavaScript, E-commerce

Products

Things I've shipped.

prism-mem

Live · Solo build

Post-session memory for AI coding agents.

An open-source CLI tool I built solo. After every git commit, it reads Claude Code session transcripts and git diffs, extracts structured knowledge as (subject, predicate, object) triples, links them by cosine similarity, auto-detects stale facts, and rewrites CLAUDE.md, .cursorrules, and AGENTS.md. In a 3-session demo, it compressed 411,463 bytes of raw transcripts into 11,707 characters of structured knowledge, roughly a 35x reduction.

kg-gen, sentence-transformers, numpy, SQLite, FastMCP, FastAPI, LiteLLM, Python

PyPI GitHub Learn more Read story

What it does

Extracts (subject, predicate, object) triples from Claude Code session transcripts and git diffs via kg-gen and an LLM
Links related triples by cosine similarity and marks contradicted facts stale automatically, with no manual input
Rewrites CLAUDE.md, .cursorrules, and AGENTS.md after every git commit via a post-commit hook
MCP server exposes get_context, query_knowledge, and crystallize tools to Claude Code, Cursor, and Codex
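
The linking step in the list above can be sketched as follows. This is a simplified illustration, not prism-mem's code: a toy bag-of-words vector stands in for sentence-transformers embeddings, and the sample triples and the 0.5 threshold are assumptions chosen for the example.

```python
import math
from collections import Counter
from itertools import combinations


def embed(text):
    """Toy embedding (assumption): token counts instead of a learned sentence vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_triples(triples, threshold=0.5):
    """Return (i, j, score) for each pair of triples whose similarity meets the threshold."""
    vectors = [embed(" ".join(triple)) for triple in triples]
    links = []
    for i, j in combinations(range(len(triples)), 2):
        score = cosine(vectors[i], vectors[j])
        if score >= threshold:
            links.append((i, j, round(score, 3)))
    return links


# Hypothetical (subject, predicate, object) triples for illustration.
TRIPLES = [
    ("auth service", "uses", "jwt tokens"),
    ("auth service", "uses", "session cookies"),
    ("build script", "runs", "pytest"),
]
```

With these inputs, only the two `auth service` triples are linked. A pair like that, sharing a subject and predicate but differing in object, is the kind of candidate a staleness check might inspect next.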

ApplyAI

Live · Solo build

Apply to jobs 10× faster.

A full-stack job application assistant I built solo: a Chrome extension, a web dashboard, and an AI backend. It autofills application forms using a LangGraph agent, tracks every application in one place, and scores your resume against each job posting.

LangGraph, Gemini, FastAPI, Next.js, Chrome Extension MV3, Supabase

Live App Chrome Store GitHub Read story

What it does

LangGraph DAG agent autofills entire forms, including dropdowns, React Select, and file uploads
Resume-to-job match scoring with matched and missing keyword breakdown
Application tracking dashboard with status pipeline, KPI cards, and charts
Job board discovery across Lever, Ashby, and Greenhouse via SERP search
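
As a rough illustration of the match-scoring feature listed above, here is a minimal sketch that reports matched and missing keywords. It assumes plain keyword overlap with a small hypothetical stopword list; ApplyAI's actual scoring may work differently.

```python
import re

# Hypothetical stopword list for the example; a real scorer would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "with", "in", "of", "for", "to", "on"}


def keywords(text):
    """Lowercase the text and return its distinct non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9+#.]+", text.lower())
    cleaned = (token.strip(".") for token in tokens)
    return {token for token in cleaned if token and token not in STOPWORDS}


def match_score(resume, posting):
    """Score a resume against a posting as the percentage of posting keywords it covers."""
    wanted = keywords(posting)
    have = keywords(resume)
    matched = sorted(wanted & have)
    missing = sorted(wanted - have)
    score = round(100 * len(matched) / len(wanted)) if wanted else 0
    return {"score": score, "matched": matched, "missing": missing}


result = match_score(
    "Python engineer with FastAPI and Kubernetes experience.",
    "Python, FastAPI, Terraform, Kubernetes",
)
```

Here `result` reports a score of 75 with `terraform` as the only missing keyword, which is the kind of breakdown a dashboard can show next to each posting.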

Projects

Things I've built.

View all projects

apply-ai-extension

Full-stack job application assistant with a Chrome extension, LangGraph autofill agent, resume match scoring, and application tracking dashboard. Built solo and live on the Chrome Web Store.

LangGraph, Gemini, FastAPI, Next.js, +2

GitHub Live

springcommerce

Production-grade microservices e-commerce platform built with Spring Boot, secured with Keycloak, orchestrated on Kubernetes, and monitored via Grafana + Prometheus.

Spring Boot, Kubernetes, Grafana, Keycloak, +1

GitHub

secure-stack

End-to-end DevSecOps pipeline implementing a three-tier architecture on AWS EKS with ArgoCD, GitOps workflows, and automated security scanning.

AWS EKS, ArgoCD, GitOps, DevSecOps, +1

GitHub

atmo-flow

Real-time weather and air quality data pipeline on GCP using Cloud Composer (Airflow), PySpark transformations, and interactive visualizations.

GCP, Airflow, PySpark, Data Pipeline

GitHub

Writing

Things I've written.

All posts on dev.to

May 31, 2026

prism-mem: Automatic Knowledge Extraction for AI Coding Agents

Built prism-mem, an open-source CLI that reads Claude Code session transcripts and git diffs after every commit, extracts (subject, predicate, object) triples via kg-gen, links them by cosine similarity, auto-detects stale facts, and rewrites CLAUDE.md automatically, with no manual context updates needed.

AI Agents, Knowledge Graph, Python

Read on dev.to

Mar 6, 2025

I Got Tired of Filling Out the Same Form 50 Times, So I Built an AI to Do It

Built ApplyAI, a Chrome extension that automates job application form filling using a LangGraph agent and Gemini AI, cutting the process from 10 minutes to under 10 seconds.

AI Agents, Automation, Python

Read on dev.to

Jan 20, 2025

AtmoFlow: Real-Time Weather and Air Quality Insights

End-to-end data pipeline on GCP using Cloud Functions, Pub/Sub, Dataproc, and BigQuery to ingest, process, and visualize environmental data with Looker dashboards.

Data Engineering, Google Cloud, PySpark

Read on dev.to

Mar 12, 2025

Spring Commerce: Building a Resilient E-commerce System with Spring Boot Microservices

A complete microservices e-commerce system with four core services communicating via REST and Kafka, secured with Keycloak, and monitored through the Grafana observability stack.

Spring Boot, Microservices, Kubernetes

Read on dev.to

Mar 12, 2025

Spring Core Fundamentals: A Beginner Guide

A practical guide to Spring's core concepts: dependency injection, inversion of control, bean management, and aspect-oriented programming for building maintainable Java apps.

Java, Spring Boot, Backend

Read on dev.to

Skills

The toolkit.

Years of building at the frontier, from LLM fine-tuning to distributed data pipelines to cloud-native deployments.

Agentic & ML

LangGraph, LangChain, CrewAI, Autogen, PyTorch, TensorFlow, Google ADK, MCP, A2A

RAG & Knowledge

Hybrid RAG, LlamaIndex, Neo4j, Elastic Vector DB, pgvector, Cohere Rerank, Semantic Search

MLOps & LLMOps

Ray, ONNX, MLflow, NVIDIA Triton, LangSmith, LangFuse, Kubeflow, W&B, Axolotl

Data Engineering

Spark, Airflow, Kafka, dbt, Snowflake, TimescaleDB, ETL/ELT

Cloud & DevOps

AWS, GCP, Azure, Kubernetes, Terraform, Docker, ArgoCD, CI/CD

Programming

Python, TypeScript, Java, SQL, FastAPI, Spring Boot, Node.js, React, Next.js

Education

Where it started.

MS, Computer Science

University at Buffalo

Buffalo, NY

GPA 3.8 / 4.0

BTech, Computer Science

Vellore Institute of Technology

Chennai, India

GPA 3.9 / 4.0

Contact

Let's work together.

I'm open to full-time roles and interesting collaborations. If you're building something ambitious in the AI space, I want to hear about it.

Or just chat with my AI. It knows everything about me.

rahul.talatala@gmail.com

linkedin.com/in/rahul-reddy-t

GitHub

github.com/rahult18

United States
