Open to Work

Turning data into models, systems, and decisions.

I'm Neehanth Reddy, a Data Scientist and AI/ML builder with 2+ years of applied experience across predictive modeling, forecasting, machine learning pipelines, biomedical AI research, and production-style AI applications.

Python · SQL · scikit-learn · PySpark · Power BI · FastAPI · AWS · GCP · LangGraph · Developer Advocacy

About

Data science first, engineering discipline underneath.

I position my work around business-facing data science: understanding the problem, building measurable models, validating results, and packaging outputs so stakeholders or systems can actually use them.

Data Science

Predictive modeling, forecasting, segmentation, churn analysis, evaluation, dashboards, and insight communication.

Machine Learning

Reproducible training workflows, model APIs, Dockerized delivery, CI/CD, DVC, and AWS deployment patterns.

AI Engineering

RAG pipelines, retrieval benchmarking, structured LLM workflows, agentic systems, and tool-aware application design.

Capabilities

Practical stack for data.

Data Science & Analytics

Machine Learning

Data Engineering

AI Engineering

Cloud

Selected Projects

Proof of work, ordered for Data Scientist roles.

Forecast Accuracy: SARIMA vs. baseline · 18% RMSE ↓

Data Science · GitHub

E-Commerce Analytics Pipeline

Built segmentation, association-rule mining, and SARIMA forecasting on 10K+ transactions to support marketing and supply-chain decisions.

Reduced forecasting RMSE by 18% versus baseline.
Delivered stakeholder-ready Power BI reporting.
Connected customer segments to product-bundling opportunities.

Python · pandas · scikit-learn · SARIMA · Power BI
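
For readers who want to see how a figure like "18% RMSE reduction versus baseline" is typically derived, here is a minimal sketch. The numbers are illustrative placeholders, not the project's data, and the helper names `rmse` and `relative_reduction` are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_rmse, model_rmse):
    """Fractional RMSE reduction of a model relative to a baseline."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Illustrative values only (assumed for this sketch).
actual = [100.0, 120.0, 90.0, 110.0]
baseline_forecast = [110.0, 110.0, 100.0, 100.0]  # every error is 10
model_forecast = [108.0, 112.0, 98.0, 102.0]      # every error is 8

baseline_rmse = rmse(actual, baseline_forecast)
model_rmse = rmse(actual, model_forecast)
reduction = relative_reduction(baseline_rmse, model_rmse)
print(f"RMSE reduction: {reduction:.0%}")
```

Both forecasts are scored on the same held-out periods, which keeps the comparison fair.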

Churn Model · 91% validation accuracy · 0.81 F1

Data Science · GitHub

Telecom Churn Prediction

Created a churn prediction pipeline for 7,000+ customer records with feature engineering, class balancing, and model comparison.

Improved minority-class detection over logistic regression baseline.
Used resampling techniques to address class imbalance.
Produced ranked predictors for retention strategy.

Python · scikit-learn · SMOTE · ANN · F1 Score
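
The reason F1 is reported alongside accuracy can be shown with a small sketch. With imbalanced classes, a model that never predicts churn can still look accurate. The confusion counts below are invented for illustration and are not the project's results.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 for the positive (churn) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical case: 1,000 customers, 270 churners, model predicts "no churn" for all.
always_no_acc = accuracy(tp=0, fp=0, fn=270, tn=730)
_, _, always_no_f1 = precision_recall_f1(tp=0, fp=0, fn=270)

# Hypothetical balanced-training case with the same customers.
balanced_acc = accuracy(tp=216, fp=54, fn=54, tn=676)
_, _, balanced_f1 = precision_recall_f1(tp=216, fp=54, fn=54)

print(always_no_acc, always_no_f1)
print(balanced_acc, balanced_f1)
```

The first model scores 73% accuracy while catching no churners, which is why minority-class metrics matter for retention work.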

27% Churn · $456K Monthly Rev. · $139K Churn-related loss

Analytics · GitHub

Customer Churn Analysis Dashboard

Built an interactive Power BI dashboard on 7,043 telecom customers to quantify churn rate, revenue exposure, and high-risk segments.

Connected churn behavior to revenue impact.
Combined Python preprocessing with SQL KPI extraction.
Prioritized retention actions for business users.

Power BI · SQL · Python · KPI Reporting

106-subject baseline

ML Research · GitHub

EEG Signal Processing & Classification

Built reproducible EEG workflows for sleep spindle detection and motor imagery classification with leakage-aware evaluation.

Audited 109 subjects and established a clean 106-subject baseline.
Used LOSO validation with CSP + LDA and CSP + linear SVM.
Generated PSD visuals and confusion matrices for reporting.

Python · MNE · SciPy · scikit-learn · YASA
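
Leave-one-subject-out (LOSO) validation can be sketched in a few lines. Each fold holds out every trial from one subject, so no subject appears in both training and test data. This is a simplified stand-in, not the project's code. The function name `loso_splits` and the subject labels are assumptions made for illustration.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per unique subject.

    subject_ids[i] is the subject that trial i belongs to.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical trial-to-subject mapping: five trials from three subjects.
subjects = ["S01", "S01", "S02", "S02", "S03"]
folds = list(loso_splits(subjects))

for held_out, train_idx, test_idx in folds:
    # Leakage check: the held-out subject never appears in the training set.
    assert all(subjects[i] != held_out for i in train_idx)
    print(held_out, train_idx, test_idx)
```

In a scikit-learn workflow, `LeaveOneGroupOut` from `sklearn.model_selection` provides the same kind of split when subject IDs are passed as `groups`.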

94% Validation

ML Engineering · GitHub

Poultry Disease Classification System

Developed a VGG16 transfer-learning classifier for four poultry disease classes and packaged it as a deployment-ready inference API.

Versioned data and experiments with DVC.
Served predictions through FastAPI and Docker.
Automated deployment through GitHub Actions, AWS ECR, and EC2.

TensorFlow · VGG16 · DVC · FastAPI · AWS

≥0.998 recall · 7,641 chunks · 76 queries

AI Engineering · GitHub

GenAI Research Assistant

Built a RAG-based research assistant over arXiv papers with retrieval-backend benchmarking and end-to-end quality evaluation.

Compared FAISS-HNSW, Annoy, and Flat Index retrieval.
Evaluated latency, recall, memory, and RAGAS metrics.
Built a Streamlit semantic Q&A interface.

LangChain · FAISS · Annoy · ChromaDB · RAGAS
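
One common way to score an approximate index is recall@k against exact search. It measures the fraction of the true top-k neighbors that the approximate index also returns. The sketch below assumes that definition, which may differ from the one used in the project. The chunk IDs are made up, and `recall_at_k` and `mean_recall_at_k` are hypothetical helper names.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the exact top-k IDs that also appear in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    return len(exact_top & set(approx_ids[:k])) / len(exact_top)


def mean_recall_at_k(exact_results, approx_results, k):
    """Average recall@k over a list of queries."""
    scores = [recall_at_k(e, a, k) for e, a in zip(exact_results, approx_results)]
    return sum(scores) / len(scores)


# Illustrative results for two queries (assumed IDs, not real benchmark output).
exact_results = [[1, 2, 3], [4, 5, 6]]   # e.g. from an exhaustive flat index
approx_results = [[1, 3, 2], [4, 5, 9]]  # e.g. from an approximate index

mean_recall = mean_recall_at_k(exact_results, approx_results, k=3)
print(round(mean_recall, 3))
```

Ranking order inside the top-k is ignored here. Only membership counts.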

Resume JD Fit Analysis

Structured LLM Evaluation

LangGraph · Pydantic · LangSmith

AI Engineering · GitHub

Agentic LLM Evaluation Workflow for Resume Screening

Built a LangGraph-based workflow that evaluates resume-job fit through structured extraction, gap analysis, deterministic score calibration, and recommendation generation.

Extracted structured resume and job-description data in parallel using typed LLM outputs.
Used Pydantic validation to make scoring inputs more consistent and reusable.
Instrumented workflow runs with LangSmith for trace-level debugging and prompt iteration.

Python · LangGraph · Pydantic · LangSmith · LLM Evaluation

$ query inventory · 12 guarded tools · 4 warehouses

AI Engineering · GitHub

AI-Powered Inventory Management

Built a FastAPI inventory system with PostgreSQL, MongoDB audit logs, JWT auth, and a conversational agent for natural-language operations.

Managed 22 products across 4 warehouses.
Designed 12 permission-aware agent tools.
Separated transactional data from flexible audit logging.

FastAPI · PostgreSQL · MongoDB · LangChain · Ollama

2.3x multi-node throughput gain

Data Engineering · GitHub

Distributed Data Processing & Scalability Analysis

Built Dockerized single-node and multi-node Spark environments and benchmarked PySpark ETL performance on NYC Taxi data.

Measured runtime, CPU, memory, and throughput trade-offs.
Built reproducible Spark experiments in Docker.
Showed when cluster parallelism improves processing performance.

PySpark · Apache Spark · Docker · ETL · Benchmarking

Experience

Applied research, teaching, and ML delivery.

Mar 2026 — Present

AI R&D Intern · Biomedical Sensors & Systems Lab, University of Memphis

Building reproducible biomedical ML workflows for EEG signal processing, sleep spindle detection, and motor imagery classification.

Audited 109 PhysioNet EEG subjects and established a clean 106-subject baseline.
Applied LOSO validation with CSP + LDA and CSP + linear SVM for realistic multi-session benchmarking.
Generated PSD visualizations and confusion matrices to support signal-quality analysis and technical reporting.

Python · MNE · Signal Processing · scikit-learn · Research Documentation

Feb 2026 — Present

Co-Organizer · Google Developer Groups at Florida Atlantic University

Grow a remote-friendly learning community for aspiring AI engineers and data scientists, improving accessibility to certification pathways and AI education through virtual programming.

Plan, organize, and support virtual events, including study jams, information sessions, and hands-on workshops.
Coordinate speakers, structure technical content, and ensure smooth virtual event execution.
Expand access to learning resources around Google Cloud, certifications, and AI topics.

Community Building · Technical Events · Google Cloud · AI Education · Workshop Coordination · Developer Advocacy

Jan 2024 — Dec 2025

Data Science Fellow / Graduate Assistant · University of Memphis

Supported graduate-level coursework in applied statistics, machine learning, and biostatistical methods through hands-on coding templates and mentoring.

Developed R starter code and assignment templates for Advanced Statistical Learning II.
Mentored 30+ students in SAS-based statistical analysis and ML implementation.
Translated modeling concepts into reusable, reproducible learning workflows.

R · SAS · Statistics · Machine Learning · Mentoring

Jan 2023 — Dec 2023

Machine Learning Intern · Microline Information Systems

Worked on supervised ML modeling, data preprocessing, workflow automation, and early-stage REST integration for internal classification use cases.

Built and evaluated supervised models with Python and scikit-learn.
Automated data-cleaning and model-evaluation scripts for internal reporting workflows.
Explored API-based delivery of model predictions for prototype usage.

Python · scikit-learn · Feature Engineering · REST APIs

Credentials

Education and certifications.

Education

M.S. Data Science

University of Memphis · Jan 2024 — Dec 2025

GPA: 3.93 / 4.00

Machine Learning · Advanced Statistical Learning · Database Systems · Data Mining · Artificial Intelligence

Education

B.Tech Electronics & Communication Engineering

JNTU Hyderabad · Jul 2020 — May 2023

Foundation in signals, DSP, AI, DBMS, and embedded systems.

Signals & Systems · Digital Signal Processing · Python · AI · DBMS

Writing

Notes on data, ML, and AI systems.

Medium

Practical stories on data science, intelligent search, AI assistants, and bringing ML models into real use.

Read articles →

AI Deck on Substack

Data, machine learning, and AI — from raw ideas to working systems.

Read newsletter →

Contact

Let's build something useful from data.

Open to full-time Data Scientist, ML Engineer, and AI Engineer opportunities across the United States.

LinkedIn · GitHub

Memphis, TN · Open to relocate · neehanthreddym@gmail.com

Certifications

AI Agents Fundamentals

Introduction to Data Analytics on Google Cloud

AWS Cloud Quest: Cloud Practitioner

Applied Machine Learning: Foundations

Extract, Transform and Load Data in Power BI

Quantium Data Analytics Job Simulation
