Hello, I'm

Mohana Moganti

Software Engineer · San Jose, CA

6+ yrs exp

6+ companies

10+ shipped

AI Engineer and builder of LLM-driven systems that work for real users.

Based in San Jose, CA. I've spent the last 6+ years building production systems across fintech, EdTech, and AI — from compliance platforms at Deloitte to founding-engineer work at Gembizz and Gen AI systems at NextPhase AI.

I care deeply about systems that are observable and maintainable and that solve real problems, not just impressive demos. My stack is Python-first, cloud-native, and increasingly agent-driven.

Published research on Hindi NLP. Outside engineering: basketball, sketching, and building things with my hands.

education

M.S. Software Engineering

San Jose State University

2023 – 2025 · GPA 3.6

B.E. Engineering

Osmania University, Hyderabad

2016 – 2020 · GPA 3.5

Featured
Dec 2024 – May 2025

ScorePAL: Agentic Evaluation Platform

Multi-agent evaluation system using CrewAI where agents autonomously decompose rubrics, retrieve context via Weaviate vector search, and score submissions using Gemini on GCP. Multimodal RAG over text and image inputs reduced manual grading effort by 60%.

Python · CrewAI · Weaviate · Gemini · GCP · PostgreSQL · Multimodal RAG

Languages

Python · TypeScript · JavaScript · Java · Scala · SQL · Golang · C++

AI / ML

LangChain · CrewAI · RAG · GraphRAG · RAPTOR · Gemini · OpenAI · Stable Diffusion · Vector Databases · Agentic Workflows

Frameworks

FastAPI · Django · Flask · Spring Boot · ReactJS · GraphQL · Apollo Router · Svelte · Tailwind

Databases

PostgreSQL · MongoDB · Neo4j · Weaviate · DynamoDB · Pinecone · AstraDB · ChromaDB · MySQL

Cloud & DevOps

AWS · GCP · Azure · Docker · Kubernetes · GitHub Actions · ArgoCD · Prometheus · Grafana

Data

Spark · Databricks · Snowflake · Hadoop · Flink · Feature Engineering · Data Pipelines
NextPhase AI LLC
  • Design and ship production generative AI systems — LLM orchestration, agentic workflows, and retrieval-augmented pipelines that turn unstructured data into reliable, user-facing intelligence.
  • Build multimodal RAG stacks with embedding-based retrieval, hybrid search, and context assembly to ground model outputs and cut hallucinations in real workflows.
  • Evaluate and integrate foundation models (OpenAI, Gemini, open-weight) with offline benchmarks for relevance, latency, and cost — selecting configurations that scale in production.
  • Develop Python/FastAPI services on AWS and GCP with vector stores, observability, and CI/CD so Gen AI features deploy safely and iterate quickly.
Gembizz LLC
  • Built LLM-driven workflows that convert raw user input into structured business profiles and narrative-style stories, powering personalized discovery and community engagement for 1,000+ active users.
  • Developed recommendation pipelines leveraging user interaction signals and embedding-based similarity to customize content and profile ranking.
  • Designed service APIs in Python (FastAPI) and TypeScript, integrated with MongoDB Atlas and AWS for inference, context assembly, and response orchestration.
  • Operated containerized services with production observability, telemetry, and cost-aware scaling through CI/CD supporting frequent deployments.
San José State University
  • Supported courses in Machine Learning, Networking, and Information Security. Mentored students on distributed systems, consistency models, and high-availability design.
  • Led lab sessions and debugging walkthroughs covering model evaluation, network protocols, and Linux system internals, helping students apply theoretical concepts to practical implementations.
Astranetix Corporation
  • Developed and productionized multimodal Retrieval-Augmented Generation solutions combining text and document embeddings to ground large language model outputs.
  • Established chunking, embedding, and vector retrieval workflows using Weaviate and GraphQL, tuning HNSW parameters to improve semantic recall and reduce inference latency.
  • Evaluated models from GCP Model Garden to inform production model selection; operated scalable inference services on AWS Lambda and S3 with a focus on throughput control, fault tolerance, and cost efficiency.
Flatirons AI LLC
  • Evaluated advanced RAG techniques including GraphRAG and RAPTOR to enhance multi-hop reasoning and contextual grounding over large document corpora.
  • Adapted embedding models and task-specific language models using domain datasets; constructed offline evaluation workflows to assess relevance, precision, recall, and latency.
  • Executed controlled A/B experiments across chunking strategies, embedding choices, and retrieval parameters to inform production-ready configuration decisions.
Deloitte Touche Tohmatsu India LLP
  • Contributed to data-intensive, compliance-focused fintech platforms by developing Python-based backend services integrated with Angular frontends.
  • Implemented REST and GraphQL interfaces in Python to enable analytics, reporting, and integration with downstream data-driven and intelligent systems.
  • Prototyped AI-oriented pipelines for text classification, information extraction, and similarity search on enterprise datasets to assess automation feasibility.
  • Strengthened automated testing, release validation, and CI/CD processes, improving deployment reliability and reducing post-release defects by ~25%.
  • Recognized with a Spot Award for delivering high-impact solutions and consistently exceeding performance and quality expectations.
Turito / YuppTV
  • Implemented JVM-based backend services in Scala supporting data-centric workflows for EdTech and media streaming applications.
  • Created RESTful interfaces enabling analytics ingestion, personalization logic, and multi-client content delivery.
  • Improved AWS DynamoDB performance through optimized key design and query patterns, reducing API response times by roughly 30%.
CDAC (Centre for Development of Advanced Computing)
  • Completed a summer internship focused on cybersecurity fundamentals and Linux system administration at India’s national R&D institution for advanced computing.
  • Gained hands-on experience with Kali Linux, core Linux commands, and security best practices in lab environments.

International Journal of Innovative Engineering Research and Technology (IJIERT) · 2020

Proposed a machine learning approach for sentiment classification of Hindi-language text using a Naive Bayes classifier, addressing the unique morphological and syntactic challenges of Hindi NLP.

NLP · Sentiment Analysis · Naive Bayes · Hindi NLP · Machine Learning

get in touch

mohanmoganti2023@gmail.com

Based in San Jose, open to remote. Reach out if you're hiring for AI engineering, full-stack, or founding roles, or if you just want to talk shop on agents, RAG, and what you're building.

Full-time · Contract · Remote · Founding eng

Expect replies within 48 hours

© 2023 Mohana Moganti · Built with code & coffee ☕