Generative AI & NLP
- Amazon Bedrock
- SageMaker AI
- AgentCore
- LLMs (Claude, ChatGPT, Mistral)
- RAG
- AI Agents
- Prompt Engineering
- Multimodal (TTS, T2V, I2V)
- Amazon Comprehend
AI/ML Architect · Engineering Leader
12+ years delivering production-grade AI for enterprise — from RAG pipelines and AI agents to LLMOps and cloud-native ML platforms. Currently Machine Learning Architect at Caylent.
About
I'm an AI/ML Architect at Caylent, focused on Generative AI, LLMOps, and cloud-native ML on AWS. My work spans architecture and delivery for Fintech, Automotive, MarTech, MedTech, and EdTech clients — translating fuzzy business asks into systems that ship.
I've led AI pipelines that processed 15M+ records, deployed fine-tuned open-source LLMs to production for a platform used by 25,000+ organizations, and built a GenAI document pipeline that cut financial-statement processing from 30–60 minutes to ~90 seconds. I co-authored Caylent's MLOps Solution Offering — a reusable architecture blueprint now used company-wide.
Outside delivery, I mentor engineers, contribute to open-source experiments, and write on responsible-AI topics — a thread that goes back to my graduate work at SFU, where I published on model fairness & transparency.
Experience
Promoted to ML Architect; lead end-to-end architecture and delivery of scalable GenAI and ML solutions on AWS, from discovery through production rollout, with a focus on cost-performance trade-offs, stakeholder alignment, and maintainable systems.
Fintech
GenAI document pipeline: 30–60 min → ~90 s; >95% extraction accuracy; 10× throughput (50→500 docs); $0.36–0.47 per document, enabling same-day reviews.
MarTech / CRM
DistilBERT email classification (99.3% F1, 0.064 s inference, $0.23 per 10k records) and Mistral-7B name extraction (97%+ accuracy), deployed across beta, prod-us, and prod-eu; among the first production deployments of fine-tuned Mistral-7B for enterprise NLP.
Automotive
Benchmarked 5 video-generation models (OmniAvatar, HunyuanVideo, Wan2.2, Wan2.2-S2V-14B, Amazon Nova Reel) on quality, generation time (2–35 min per 18 s video), and cost ($1.38–$2.98 per video) on AWS p5.4xlarge (H100 80GB). Customer verdict: "A great success. Work of the highest quality."
Grew from individual contributor to UI lead across 6+ projects serving 2M+ users (Chivas TV, Euskaltel TV, TBC Taiwan, FOXTEL Australia).
Skills
Certifications
Projects
Live data from github.com/manju-malateshappa, refreshed on each page load.
Achievements & Writing
Hackathon · Winner
Automated PR generation and code reviews using Jira, Slack, Git diff, and LLMs via GitHub Actions.
Hackathon · Winner
XGBoost speedster-detection solution deployed on Amazon SageMaker and integrated into the Alida platform.
Publication
SFU CS course project: detected and mitigated racial bias in ML classification (COMPAS dataset) with IBM AIF360, the Google What-If Tool, and SHAP. Improved Disparate Impact from 0.7 to 0.91 with XGBoost + Reweighing.
Publication · IEEE
Co-author on an IEEE paper from earlier in my career.
Education
M.Sc. Computer Science — Big Data & Machine Learning
B.E. Computer Science
Contact
If you're working on Generative AI, LLMOps, or scaling AI on AWS — or just want to swap notes — I'd love to hear from you.