Hi, I'm

Abhishek Rithik Origanti

Aspiring Data Engineer & Analyst

Master's student in Data Science at the University of Maryland, College Park. Passionate about turning raw data into real-world impact, from real-time pipelines to deep learning models and generative AI.

UMD · College Park

About Me

I'm a data science professional with a background in Computer Science Engineering (AI & ML specialization) from SRM University, now completing my Master's at the University of Maryland. With hands-on experience in data engineering, pipeline development, and analytics, I'm passionate about building efficient data systems and translating complex datasets into clear, actionable insights.

My current research focus is Scalable Data Engineering & Pipeline Optimization: exploring how modern orchestration frameworks and cloud-native architectures can maximize throughput, reliability, and observability in large-scale data workflows. I've won a Best Paper Award at an international conference and ranked in the top 10 out of 2,500 participants in a national Computer Vision hackathon.

University of Maryland
Master's in Data Science

University of Maryland, College Park

2024 – 2026
SRM University
B.Tech — CS (AI & ML Specialization)

SRM University

2020 – 2024
Best Paper Award · IRCCTSD 2024

Efficient Deepfake Image Detection Using Dense CNN Architecture

Location

College Park, MD, USA

Projects

A selection of data science and engineering work. View all on GitHub

Real-Time Bitcoin Data Pipeline & Power BI Dashboard

Built a real-time Bitcoin monitoring system using Python and the CoinGecko API, with automated data ingestion via Power Query. Delivered an interactive Power BI dashboard featuring 7/30/90-day moving averages, volatility metrics, and time series forecasting with scheduled auto-refresh.

Python · Power BI · Pandas · Time Series
Predicting EV Trends & Charging Infrastructure

Analyzed EV adoption patterns using GeoPandas and Scikit-learn to map infrastructure gaps across Washington state. Built Random Forest and Logistic Regression models achieving 98.42% accuracy on Clean Alternative Fuel Vehicle (CAFV) eligibility classification and R² = 0.918 for electric range prediction.

Scikit-learn · GeoPandas · Seaborn · ML
Retail Data Analysis with SQL

Performed end-to-end retail analytics on real-world e-commerce data using MySQL, covering complex multi-table joins, window functions, aggregations, and customer segmentation to surface actionable sales and product performance insights.

SQL · MySQL · Data Analysis · ETL
End-to-End Data Pipeline with Snowflake

Architected a production-grade ETL pipeline on Snowflake following the Medallion architecture (Bronze → Silver → Gold). Implemented data quality checks, schema evolution, and query optimization using SQL and Snowflake-native features for scalable analytics.

Snowflake · SQL · Data Engineering · ETL
Scalable Pipelines with Databricks

Designed a scalable data lake pipeline on Databricks using Apache Spark and Delta Lake, following the Medallion architecture. Performed large-scale transformations via Spark SQL, enabling reliable, versioned, and query-optimized data for downstream analytics workloads.

Databricks · Apache Spark · Spark SQL · Delta Lake

Skills

Programming Languages
Python · SQL · R · Bash
Machine Learning & AI
Scikit-learn · TensorFlow · PyTorch · Keras · XGBoost · spaCy / NLP · A/B Testing · Hugging Face
Data Visualization
Matplotlib · Seaborn · Tableau · Power BI · Excel
Big Data & Cloud
Amazon Web Services (AWS) · Apache Spark · ETL/ELT Pipelines · Snowflake · Hadoop · Docker · Apache Airflow · Databricks
Databases
MySQL · PostgreSQL · MongoDB · SQLite · NoSQL
Other Skills
Git / GitHub · Flask / FastAPI · Time Series Analysis · MLOps · Geospatial Analysis

Professional Experience

UMD Counseling Center

Graduate Student Data Analyst · College Park, MD

Sep 2024 – Present
  • Processed and standardized 5,000+ student records using Python and SQL under FERPA compliance guidelines, translating raw institutional data into concise executive reports that directly informed leadership decisions.
  • Restructured preprocessing workflows to reduce manual effort by 30%, and refined SQL queries iteratively until data accuracy reached 96.5%, a benchmark the team could confidently cite across all reporting cycles.
  • Oversaw a team of 4 student analysts while building Tableau dashboards for near-real-time outcome monitoring, cutting reporting errors by 20% and giving student support programs a sharper foundation for decision-making.
Python · MySQL · Tableau · Pandas
Interlinked Corp

Data Engineer Intern · Remote

May 2025 – Sep 2025
  • Architected ELT pipelines to handle more than 2 million daily events using Airflow, Docker, and SQL, incorporating time-series features for forecasting and anomaly detection that pushed real-time throughput up by 40%.
  • Consolidated real-time weather, satellite, and sensor feeds into Python ETL workflows, then translated those outputs into geospatial risk maps via GeoPandas, Folium, and QGIS for wildfire analysis teams.
  • Introduced automated data quality checks and an offline evaluation harness for model validation, raising pipeline throughput by 50% while holding offline accuracy steady at 95% across production runs.
Python · Apache Airflow · PostgreSQL · GeoPandas · Docker
The Coding School

Teaching Assistant · Data Science Research Program

Jun 2025 – Aug 2025
  • Supported 20+ high school students through end-to-end data science research in Python, Scikit-learn, and TensorFlow, accompanying them from initial EDA through model evaluation and final presentations.
  • Wove Git/GitHub into the core curriculum and walked students through regression, classification, and clustering workflows, contributing to a 15% improvement in overall project quality by the end of the program.
  • Facilitated workshops on experiment design and error analysis, and ran mock client reviews modelled on consulting practice, helping students articulate their methodology with clarity and professional confidence.
Python · Scikit-learn · TensorFlow · Jupyter · Git
Open Weaver

Generative AI Intern · Chennai, India

Jul 2023 – Sep 2023
  • Conceived and delivered a voice-to-image generator using GANs and CLIP in TensorFlow, achieving 95% accuracy in mapping voice inputs to visual outputs across large and varied test datasets.
  • Reduced model training time by 30% through targeted hyperparameter tuning, regularization strategies, and GPU acceleration, preserving 98% accuracy and cross-dataset scalability throughout.
  • Assembled large-scale training datasets via web scraping with Beautiful Soup and maintained SQL/MySQL databases to keep AI pipelines reproducible, well-documented, and accessible to the broader team.
GANs · CLIP · TensorFlow · Hugging Face · Power BI
GANfinity.AI

Data Analyst · Remote, India

Dec 2022 – Jun 2023
  • Developed and deployed an AI-enabled Fintech B2B cloud application with ML-powered financial risk prediction, embedding anomaly detection models that elevated fraud detection capabilities by 90%.
  • Evaluated Gradient Boosting, XGBoost, and Random Forest classifiers against one another, then applied A/B testing on transaction flows to identify which approaches meaningfully improved conversion outcomes.
  • Revamped SQL-based ETL processes for centralized data management, tightening ingestion and transformation logic so downstream analytics teams consistently had clean, reliable data at their disposal.
XGBoost · Random Forest · SQL · Tableau · Python

Publications

Best Paper Award

Efficient Deep Fake Image Detection Using Dense CNN Architecture

International Research Conference on Computing Technologies for Sustainable Development (IRCCTSD 2024)
Communications in Computer and Information Science, vol 2361. Springer, Cham.

Deepfakes, innovative manipulations of digital visual content using deep learning methods, have emerged as a significant threat, raising concerns about misinformation and privacy violations. Their influence spans social media, political discourse, and beyond, highlighting the urgent need for robust detection tools. This research addresses the deepfake threat through a thorough examination of binary classification techniques. Focused on distinguishing genuine from manipulated images, the study leverages diverse datasets to train and evaluate methods utilizing Convolutional Neural Networks (CNNs) with emphasis on spatial feature extraction. Experimental results demonstrate the model's effectiveness in detecting manipulated images across various scenarios, achieving 97% accuracy on unseen data.

Certifications & Awards

Awards & Recognition

Best Paper Award · IRCCTSD 2024

"Efficient Deepfake Image Detection Using Dense CNN Architecture"

Top 10 · Proglint CV Hackathon 2023

Ranked in the top 10 out of 2,500 participants in Proglint's Alliance University Computer Vision Hackathon.

Want to Know More?

Download my full resume to see my complete experience, skills, and qualifications.

Download Resume

Get In Touch

I'm always open to interesting conversations, collaborations, or new opportunities.