Open to Work

Turning data into models, systems, and decisions.

I'm Neehanth Reddy, a Data Scientist and AI/ML builder with 2+ years of applied experience across predictive modeling, forecasting, machine learning pipelines, biomedical AI research, and production-style AI applications.

Python · SQL · scikit-learn · PySpark · Power BI · FastAPI · AWS · LangGraph

Data science first, engineering discipline underneath.

I position my work around business-facing data science: understanding the problem, building measurable models, validating results, and packaging outputs so stakeholders or systems can actually use them.

02

Machine Learning

Reproducible training workflows, model APIs, Dockerized delivery, CI/CD, DVC, and AWS deployment patterns.

03

AI Engineering

RAG pipelines, retrieval benchmarking, structured LLM workflows, agentic systems, and tool-aware application design.

Practical stack for data.

Data Science & Analytics

Python · SQL · pandas · NumPy · Statistical Modeling · Hypothesis Testing · Forecasting · Clustering · Power BI · Tableau

Machine Learning

scikit-learn · PyTorch · TensorFlow · Feature Engineering · Model Evaluation · Cross-Validation · SMOTE · CSP · LDA/SVM

Data Engineering

ETL · PySpark · Apache Spark · PostgreSQL · MongoDB · Data Validation · Batch Processing · Benchmarking

AI Engineering

FastAPI · Docker · AWS · GitHub Actions · DVC · LangChain · LangGraph · FAISS · ChromaDB · RAGAS

Proof of work, ordered for Data Scientist roles.

Forecast accuracy: SARIMA vs. baseline · 18% RMSE ↓
Data Science · GitHub

E-Commerce Analytics Pipeline

Built segmentation, association-rule mining, and SARIMA forecasting on 10K+ transactions to support marketing and supply-chain decisions.

  • Reduced forecasting RMSE by 18% versus baseline.
  • Delivered stakeholder-ready Power BI reporting.
  • Connected customer segments to product-bundling opportunities.
Python · pandas · scikit-learn · SARIMA · Power BI
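
As a minimal sketch of how a figure like "18% RMSE reduction versus baseline" is computed: the function names and sample values below are illustrative assumptions, not the project's code or data.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline_rmse, model_rmse):
    """Fractional RMSE improvement of a model over a baseline."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Hypothetical demand values, for illustration only (not project data).
actual = [100.0, 120.0, 90.0, 110.0]
baseline_forecast = [110.0, 110.0, 100.0, 100.0]
model_forecast = [105.0, 115.0, 95.0, 105.0]

baseline_error = rmse(actual, baseline_forecast)
model_error = rmse(actual, model_forecast)
improvement = relative_reduction(baseline_error, model_error)
print(f"RMSE reduction: {improvement:.0%}")
```

The reduction is relative to the baseline error, so it only means something when both forecasts are scored on the same held-out period.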
Churn model: 91% validation accuracy · 0.81 F1
Data Science · GitHub

Telecom Churn Prediction

Created a churn prediction pipeline for 7,000+ customer records with feature engineering, class balancing, and model comparison.

  • Improved minority-class detection over logistic regression baseline.
  • Used resampling techniques to address class imbalance.
  • Produced ranked predictors for retention strategy.
Python · scikit-learn · SMOTE · ANN · F1 Score
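
Accuracy and F1 can diverge on imbalanced churn data, which is why both are reported. A small sketch of the arithmetic, using hypothetical confusion-matrix counts (assumed for illustration, not the project's results):

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: churners are the minority (100 of 1,000 customers).
metrics = classification_metrics(tp=80, fp=20, fn=20, tn=880)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because true negatives dominate the total, accuracy stays high even when a meaningful share of churners is missed; F1 ignores true negatives and reflects minority-class performance more directly.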
27% churn · $456K monthly revenue · $139K churn-related loss
Analytics · GitHub

Customer Churn Analysis Dashboard

Built an interactive Power BI dashboard on 7,043 telecom customers to quantify churn rate, revenue exposure, and high-risk segments.

  • Connected churn behavior to revenue impact.
  • Combined Python preprocessing with SQL KPI extraction.
  • Prioritized retention actions for business users.
Power BI · SQL · Python · KPI Reporting
106-subject baseline
ML Research · GitHub

EEG Signal Processing & Classification

Built reproducible EEG workflows for sleep spindle detection and motor imagery classification with leakage-aware evaluation.

  • Audited 109 subjects and established a clean 106-subject baseline.
  • Used leave-one-subject-out (LOSO) validation with CSP + LDA and CSP + linear SVM.
  • Generated PSD visuals and confusion matrices for reporting.
Python · MNE · SciPy · scikit-learn · YASA
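
Leave-one-subject-out (LOSO) evaluation keeps every trial from a given subject in either the training or the test fold, never both, which is what makes it leakage-aware. A dependency-free sketch of the split logic follows; `loso_splits` and the subject labels are hypothetical names for illustration, and scikit-learn's `LeaveOneGroupOut` offers the same behavior in practice.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical mapping of trials to subjects (not real recordings).
subjects = ["S01", "S01", "S02", "S02", "S03"]
folds = list(loso_splits(subjects))

for held_out, train_idx, test_idx in folds:
    # No subject may appear on both sides of the split.
    train_subjects = {subjects[i] for i in train_idx}
    assert held_out not in train_subjects
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Any fitted preprocessing (for example, spatial filters) would need to be fit on the training indices of each fold only, so that nothing learned from the held-out subject reaches the model.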
94% validation
ML Engineering · GitHub

Poultry Disease Classification System

Developed a VGG16 transfer-learning classifier for four poultry disease classes and packaged it as a deployment-ready inference API.

  • Versioned data and experiments with DVC.
  • Served predictions through FastAPI and Docker.
  • Automated deployment through GitHub Actions, AWS ECR, and EC2.
TensorFlow · VGG16 · DVC · FastAPI · AWS
≥0.998 recall · 7,641 chunks · 76 queries
AI Engineering · GitHub

GenAI Research Assistant

Built a RAG-based research assistant over arXiv papers with retrieval-backend benchmarking and end-to-end quality evaluation.

  • Compared FAISS-HNSW, Annoy, and Flat Index retrieval.
  • Evaluated latency, recall, memory, and RAGAS metrics.
  • Built a Streamlit semantic Q&A interface.
LangChain · FAISS · Annoy · ChromaDB · RAGAS
Resume JD Fit Analysis
Structured LLM Evaluation

LangGraph · Pydantic · LangSmith

AI Engineering · GitHub

Agentic LLM Evaluation Workflow for Resume Screening

Built a LangGraph-based workflow that evaluates resume-job fit through structured extraction, gap analysis, deterministic score calibration, and recommendation generation.

  • Extracted structured resume and job-description data in parallel using typed LLM outputs.
  • Used Pydantic validation to make scoring inputs more consistent and reusable.
  • Instrumented workflow runs with LangSmith for trace-level debugging and prompt iteration.
Python · LangGraph · Pydantic · LangSmith · LLM Evaluation
$ query inventory · 12 guarded tools · 4 warehouses
AI Application · GitHub

AI-Powered Inventory Management

Built a FastAPI inventory system with PostgreSQL, MongoDB audit logs, JWT auth, and a conversational agent for natural-language operations.

  • Managed 22 products across 4 warehouses.
  • Designed 12 permission-aware agent tools.
  • Separated transactional data from flexible audit logging.
FastAPI · PostgreSQL · MongoDB · LangChain · Ollama
2.3x multi-node throughput gain
Data Engineering · GitHub

Distributed Data Processing & Scalability Analysis

Built Dockerized single-node and multi-node Spark environments and benchmarked PySpark ETL performance on NYC Taxi data.

  • Measured runtime, CPU, memory, and throughput trade-offs.
  • Built reproducible Spark experiments in Docker.
  • Showed when cluster parallelism improves processing performance.
PySpark · Apache Spark · Docker · ETL · Benchmarking

Applied research, teaching, and ML delivery.

Mar 2026 — Present

AI R&D Intern · Biomedical Sensors & Systems Lab, University of Memphis

Building reproducible biomedical ML workflows for EEG signal processing, sleep spindle detection, and motor imagery classification.

  • Audited 109 PhysioNet EEG subjects and established a clean 106-subject baseline.
  • Applied LOSO validation with CSP + LDA and CSP + linear SVM for realistic multi-session benchmarking.
  • Generated PSD visualizations and confusion matrices to support signal-quality analysis and technical reporting.
Python · MNE · Signal Processing · scikit-learn · Research Documentation
Jan 2024 — Dec 2025

Data Science Fellow / Graduate Assistant · University of Memphis

Supported graduate-level coursework in applied statistics, machine learning, and biostatistical methods through hands-on coding templates and mentoring.

  • Developed R starter code and assignment templates for Advanced Statistical Learning II.
  • Mentored 30+ students in SAS-based statistical analysis and ML implementation.
  • Translated modeling concepts into reusable, reproducible learning workflows.
R · SAS · Statistics · Machine Learning · Mentoring
Jan 2023 — Dec 2023

Machine Learning Intern · Microline Information Systems

Worked on supervised ML modeling, data preprocessing, workflow automation, and early-stage REST integration for internal classification use cases.

  • Built and evaluated supervised models with Python and scikit-learn.
  • Automated data-cleaning and model-evaluation scripts for internal reporting workflows.
  • Explored API-based delivery of model predictions for prototype usage.
Python · scikit-learn · Feature Engineering · REST APIs

Education and certifications.

Education

M.S. Data Science

University of Memphis · Jan 2024 — Dec 2025

GPA: 3.93 / 4.00
Machine Learning · Advanced Statistical Learning · Database Systems · Data Mining · Artificial Intelligence

B.Tech Electronics & Communication Engineering

JNTU Hyderabad · Jul 2020 — May 2023

Foundation in signals, DSP, AI, DBMS, and embedded systems.
Signals & Systems · Digital Signal Processing · Python · AI · DBMS

Let's build something useful from data.

Open to full-time Data Scientist, ML Engineer, and AI Engineer opportunities across the United States.

Memphis, TN · Open to relocate · neehanthreddym@gmail.com