Abhishek Rithik Origanti

           

Master's in Data Science

University of Maryland, College Park

Department of Computer Science and Engineering

Student

Institute of Data Science




Current Project
Fine-Grained Attribute Grounding

Projects

Visit my GitHub page to view my work and collaborate.

Real-Time Bitcoin Data Ingestion and Time Series Analysis using Power BI


Engineered a real-time data ingestion pipeline using Python (requests, pandas) to extract and preprocess Bitcoin price data from the CoinGecko API. Transformed and exported the data into CSV format, enabling seamless integration with Microsoft Power BI. Designed and deployed a dynamic Power BI dashboard featuring live updates, historical price trends, moving averages, and volatility metrics. Leveraged Power BI’s Python integration to conduct time series analysis, uncovering seasonal patterns and forecasting future price movements. Enabled 15-minute automatic refresh and implemented streaming datasets for continuous market insights.
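
A minimal sketch of the ingestion step described above, assuming the public CoinGecko market_chart endpoint (which returns roughly hourly points for a 30-day window); the column names, rolling windows, and output file name are illustrative:

    import requests
    import pandas as pd

    # Fetch 30 days of Bitcoin prices from CoinGecko's market_chart endpoint.
    url = "https://api.coingecko.com/api/v3/coins/bitcoin/market_chart"
    resp = requests.get(url, params={"vs_currency": "usd", "days": 30}, timeout=30)
    resp.raise_for_status()

    # The payload holds [timestamp_ms, price] pairs under the "prices" key.
    df = pd.DataFrame(resp.json()["prices"], columns=["timestamp_ms", "price_usd"])
    df["timestamp"] = pd.to_datetime(df["timestamp_ms"], unit="ms")

    # Derive a moving average and rolling volatility for the dashboard visuals.
    df["ma_24h"] = df["price_usd"].rolling(24).mean()
    df["volatility_24h"] = df["price_usd"].pct_change().rolling(24).std()

    # Export to CSV so Power BI can pick it up on its scheduled refresh.
    df.to_csv("bitcoin_prices.csv", index=False)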


Predicting EV Trends and Charging Needs


Utilized Python with Pandas, Matplotlib, and Seaborn for geospatial data analysis, mapping EV density trends and clustering over 1,000 EVs to identify infrastructure gaps and optimize charging station placement. Developed Random Forest and Logistic Regression models using Scikit-learn, achieving 98.42% accuracy in predicting CAFV eligibility and effectively classifying BEVs and PHEVs based on electric range and model year.
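
A hedged sketch of the CAFV-eligibility classifier; the file name and columns (electric_range, model_year, cafv_eligible) are hypothetical stand-ins for the actual EV registration data:

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Hypothetical schema; the real dataset's column names may differ.
    df = pd.read_csv("ev_population.csv")
    X = df[["electric_range", "model_year"]]
    y = df["cafv_eligible"]  # binary label: eligible vs. not eligible

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y)

    model = RandomForestClassifier(n_estimators=200, random_state=42)
    model.fit(X_train, y_train)
    print(f"Accuracy: {accuracy_score(y_test, model.predict(X_test)):.4f}")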


Efficient Deepfake Image Detection Using Dense CNN Architecture


Designed and optimized Dense CNN architectures with data augmentation using Keras and OpenCV, implementing regularization techniques to improve model accuracy by 12% while reducing computational costs by 30%. Conducted hypothesis testing and validated models across five diverse datasets, enhancing generalizability by 15% and ensuring high reliability in detecting AI-generated synthetic images.
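
A compact Keras sketch in the spirit of this project: on-the-fly augmentation plus L2 and dropout regularization on a small convolutional binary classifier. This is a plain baseline rather than the paper's exact architecture, and the input size and hyperparameters are illustrative:

    from tensorflow.keras import layers, models, regularizers

    # Augmentation layers, active during training only.
    augment = models.Sequential([
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.05),
        layers.RandomZoom(0.1),
    ])

    model = models.Sequential([
        layers.Input(shape=(128, 128, 3)),
        augment,
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4)),
        layers.Dropout(0.5),  # regularization against overfitting
        layers.Dense(1, activation="sigmoid"),  # real vs. AI-generated
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])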


Retail Data Analysis with SQL: Insights into Sales, Customers, and Product Trends


Developed an in-depth SQL-based retail data analysis project on real-world e-commerce datasets covering product, sales, and customer information. Using SQL queries, data updates, and advanced analytical techniques, the project surfaced insights into sales performance, customer behavior, and product trends while exercising data manipulation, aggregation, and evaluation strategies.
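
An illustrative slice of the kind of analysis involved, run here through Python's built-in sqlite3; the sales/products schema is a hypothetical stand-in for the project's e-commerce tables:

    import sqlite3

    conn = sqlite3.connect("retail.db")

    # Revenue and distinct buyers per product category (illustrative schema).
    query = """
        SELECT p.category,
               SUM(s.quantity * s.unit_price) AS revenue,
               COUNT(DISTINCT s.customer_id)  AS buyers
        FROM sales s
        JOIN products p ON p.product_id = s.product_id
        GROUP BY p.category
        ORDER BY revenue DESC;
    """
    for category, revenue, buyers in conn.execute(query):
        print(category, revenue, buyers)
    conn.close()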


End-to-End Data Pipeline Development with Snowflake


Designed and implemented a robust data pipeline using Snowflake, transforming a data lake into an optimized, analytics-ready solution. The project followed the complete data lifecycle from ingestion to structured datasets, leveraging the three-layer Medallion architecture (Bronze, Silver, and Gold), and built hands-on experience in data engineering best practices including data transformation, storage optimization, and performance tuning.
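
A minimal sketch of the Silver and Gold steps using the snowflake-connector-python cursor API; the connection parameters and every schema, table, and column name below are placeholders:

    import snowflake.connector

    conn = snowflake.connector.connect(
        account="your_account", user="your_user", password="***",
        warehouse="TRANSFORM_WH", database="DEMO_DB")
    cur = conn.cursor()
    cur.execute("CREATE SCHEMA IF NOT EXISTS SILVER")
    cur.execute("CREATE SCHEMA IF NOT EXISTS GOLD")

    # Silver: cleaned, typed, deduplicated rows from the raw Bronze landing table.
    cur.execute("""
        CREATE OR REPLACE TABLE SILVER.ORDERS AS
        SELECT DISTINCT order_id,
               TRY_TO_DATE(order_date)      AS order_date,
               TRY_TO_NUMBER(amount, 10, 2) AS amount
        FROM BRONZE.ORDERS_RAW
        WHERE order_id IS NOT NULL
    """)

    # Gold: aggregated, analytics-ready table for BI consumption.
    cur.execute("""
        CREATE OR REPLACE TABLE GOLD.DAILY_REVENUE AS
        SELECT order_date, SUM(amount) AS revenue, COUNT(*) AS orders
        FROM SILVER.ORDERS
        GROUP BY order_date
    """)
    conn.close()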


Building Scalable Data Pipelines with Databricks


Developed a modern data pipeline using Databricks, transforming a data lake into an optimized, analytics-ready solution. Implemented the three-layer Medallion architecture (Bronze, Silver, and Gold) for efficient data processing and management, and used Spark SQL for real-world data transformations, gaining hands-on experience in scalable data engineering workflows and big data processing.
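
A PySpark sketch of the Bronze-to-Silver-to-Gold flow as it might look in a Databricks notebook (where spark is predefined); the paths, table names, and columns are illustrative:

    from pyspark.sql import functions as F

    # Bronze: land raw JSON from the data lake as-is.
    bronze = spark.read.json("/mnt/lake/raw/events/")
    bronze.write.format("delta").mode("overwrite").saveAsTable("bronze_events")

    # Silver: enforce types, drop duplicates and malformed rows.
    silver = (spark.table("bronze_events")
              .dropDuplicates(["event_id"])
              .withColumn("event_ts", F.to_timestamp("event_time"))
              .where(F.col("event_id").isNotNull()))
    silver.write.format("delta").mode("overwrite").saveAsTable("silver_events")

    # Gold: business-level aggregate ready for dashboards.
    spark.sql("""
        CREATE OR REPLACE TABLE gold_daily_events AS
        SELECT DATE(event_ts) AS event_date, COUNT(*) AS events
        FROM silver_events
        GROUP BY DATE(event_ts)
    """)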


End-to-End Data Engineering with the Modern Data Stack


Designed and implemented a complete data engineering pipeline using open-source tools from the modern data stack. Focused on efficiently extracting, loading, and transforming scattered, complex data into a unified, analytics-ready format. Applied best practices such as data modeling, testing, documentation, and version control to ensure scalability and maintainability. This hands-on project provided practical experience in building a robust analytics platform for a fictional e-commerce company, enhancing expertise in data engineering workflows.
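
A toy end-to-end slice of the extract-load-transform pattern, with SQLite standing in for the warehouse and an assertion playing the role of a dbt-style uniqueness test; the API URL and schema are hypothetical:

    import sqlite3
    import pandas as pd
    import requests

    # Extract: pull raw orders from a source API (placeholder URL).
    raw = requests.get("https://example.com/api/orders", timeout=30).json()

    # Load: land the raw data untouched, then build a cleaned staging model.
    conn = sqlite3.connect("warehouse.db")
    pd.DataFrame(raw).to_sql("raw_orders", conn, if_exists="replace", index=False)
    conn.execute("DROP TABLE IF EXISTS stg_orders")
    conn.execute("""
        CREATE TABLE stg_orders AS
        SELECT DISTINCT id AS order_id, CAST(amount AS REAL) AS amount
        FROM raw_orders
        WHERE id IS NOT NULL
    """)

    # Test: fail loudly if a basic quality contract is violated.
    dupes = conn.execute(
        "SELECT COUNT(*) - COUNT(DISTINCT order_id) FROM stg_orders").fetchone()[0]
    assert dupes == 0, "stg_orders: order_id must be unique"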




Skills

  • Programming Languages: Python, SQL, R, Bash
  • Data Science & Machine Learning: Pandas, NumPy, Scikit-learn, TensorFlow, PyTorch, Keras, XGBoost, NLP (SpaCy, Hugging Face), Model Deployment (Flask, FastAPI)
  • Data Visualization & Analysis: Matplotlib, Seaborn, Tableau, Power BI, Microsoft Excel, Statistical Analysis
  • Big Data & Cloud Tools: Hadoop, Apache Spark, Databricks, Docker, ETL/ELT Pipelines
  • Databases: MySQL, PostgreSQL, MongoDB, NoSQL
  • Other Skills: Geospatial Analysis, Data Wrangling, Business Analysis, Collaboration, Version Control (Git), A/B Testing, Time Series Analysis, MLOps







  Professional Experience

    Education

    • 2024-26: Master's in Data Science, University of Maryland, College Park - GPA: 3.5/4.0
    • 2020-24: B.Tech in Computer Science Engineering with specialization in AI and ML, SRM University - CGPA: 8.84/10

    Internships

    Interlinked Corp

    (Data Engineer Intern - Remote)

    • Analyzed environmental datasets related to wildfire risk and emergency response, uncovering critical spatial and temporal patterns to improve emergency preparedness and situational awareness.
    • Developed and maintained Python-based ETL pipelines to ingest, clean, and transform real-time weather, satellite, and sensor data into analysis-ready formats for rapid deployment.
    • Enhanced Interlinked's AI-powered emergency response platform by optimizing data workflows, reducing latency and improving real-time throughput by 40% using asynchronous processing techniques.
    • Collaborated with geospatial analysts to produce interactive maps using GeoPandas, Folium, and QGIS, enabling visual exploration of high-risk zones and fire spread trajectories.
    • Automated data fetching and transformation from government APIs using cron jobs and custom Python scripts, supporting continuous updates to fire risk prediction systems.
    • Validated and tuned predictive models using historical wildfire data, helping improve the accuracy of risk forecasting algorithms and supporting more effective emergency response.
    • Constructed scalable data pipelines using Apache Airflow and SQL for batch and stream processing, ensuring robust scheduling, logging, and error handling in production (a minimal DAG sketch follows this list).
    • Authored detailed technical reports and visual summaries that supported decision-making for government agencies on resource allocation and risk mitigation strategies.
    • Designed and managed structured databases using PostgreSQL and SQLite, normalizing data schemas and writing efficient queries for fast geospatial and temporal analysis.
    • Leveraged Docker and Git for version-controlled deployment of pipeline modules, improving reproducibility and simplifying onboarding across engineering and research teams.
    • Participated in model validation cycles and anomaly detection using Pandas, Scikit-learn, and PyCaret, improving confidence in automated alerts and operational analytics outputs.
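
    A minimal Airflow 2 DAG sketch of the scheduling pattern referenced above; the DAG id, cadence, and task bodies are placeholders:

        from datetime import datetime, timedelta
        from airflow import DAG
        from airflow.operators.python import PythonOperator

        def fetch_weather():
            ...  # call the upstream weather API and land raw records

        def transform():
            ...  # clean and reshape the landed data for analysis

        with DAG(
            dag_id="wildfire_ingest",  # illustrative name
            start_date=datetime(2024, 1, 1),
            schedule_interval="@hourly",
            catchup=False,
            default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
        ) as dag:
            ingest = PythonOperator(task_id="fetch_weather",
                                    python_callable=fetch_weather)
            clean = PythonOperator(task_id="transform",
                                   python_callable=transform)
            ingest >> clean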

    The Coding School

    (Teaching Assistant for the Data Science Research Program)

    • Guided high school students in real-world research projects by retrieving, evaluating, and integrating datasets from open data portals and online APIs for analytical use.
    • Cleaned, transformed, and structured raw datasets using Python (Pandas, NumPy), ensuring high-quality data preparation for machine learning and statistical analysis workflows.
    • Troubleshot and optimized student Python code, debugging logic and syntax errors to maintain functional pipelines and improve project execution speed and reliability.
    • Conducted Exploratory Data Analysis (EDA) using matplotlib, seaborn, and pandas to reveal trends, patterns, and outliers for data storytelling and model preparation.
    • Mentored students in developing machine learning models using Scikit-learn and TensorFlow, covering regression, classification, and clustering for their research case studies.
    • Designed and iterated complete project pipelines, assisting students in setting research goals, cleaning data, modeling, and producing final insights under tight deadlines.
    • Introduced Git and GitHub workflows to students, teaching version control, branch management, and collaborative coding for reproducible and traceable data science work.
    • Supported the use of Jupyter Notebooks and interactive Python environments, helping students document experiments and debug code in real-time during class sessions.
    • Created personalized learning materials and code snippets to simplify complex topics, enhancing student understanding of machine learning, EDA, and visualization principles.
    • Collaborated with instructors and mentors to track student progress, recommend improvements, and troubleshoot challenges encountered in datasets or modeling steps.
    • Reviewed, evaluated, and gave feedback on student-generated insights, charts, and final presentations, ensuring analytical rigor and effective communication of research outcomes.

    Counseling Center - University of Maryland

    (Graduate Student Data Analyst) - College Park, MD

    • Analyzed and transformed 5,000+ records using Python (Pandas, NumPy) to ensure FERPA compliance across counseling center datasets.
    • Developed Python scripts to automate data preprocessing workflows, reducing manual effort by 30% and enhancing the efficiency of data pipelines for faster analysis and reporting.
    • Optimized SQL queries in MySQL to validate data integrity, identify trends, and perform audits, achieving 96.5% accuracy through systematic testing and error-checking processes.
    • Collaborated with cross-functional teams to design and implement data privacy protocols, ensuring adherence to FERPA and other regulatory standards for sensitive student information.
    • Led a team of 4 student analysts to streamline data collection and analysis processes, fostering collaboration and ensuring timely delivery of actionable insights.
    • Conducted anomaly detection in datasets using advanced Python libraries, ensuring data accuracy and reliability for decision-making by counseling center staff.
    • Created interactive dashboards and visualizations using Tableau to present trends and insights, enabling counselors to make data-driven decisions for student support programs.
    • Spearheaded data audits to identify discrepancies and improve data quality, resulting in a 20% reduction in reporting errors and enhanced trust in analytical outputs.
    • Mentored junior analysts on Python, SQL, and data visualization tools, enhancing team productivity and fostering a culture of continuous learning and skill development.
    • Presented findings and recommendations to senior leadership, effectively communicating complex data insights to support strategic planning and resource allocation.
    • Demonstrated strong problem-solving and leadership skills by coordinating group projects, resolving conflicts, and ensuring alignment with organizational goals and deadlines.

    Open Weaver (Generative AI Intern) - Chennai, India

    • Developed a voice-to-image generator using Generative Adversarial Networks (GANs) and CLIP, achieving 95% accuracy in mapping voice inputs to visual outputs, enhancing real-time AI applications.
    • Implemented deep learning models with Keras, TensorFlow, and Scikit-learn, analyzing 10,000+ voice samples to convert spoken data into high-precision visual outputs, improving model reliability.
    • Utilized Hugging Face transformers and Natural Language Processing (NLP) for voice data preprocessing, refining text embeddings and reducing noise in training data for better generative results.
    • Applied supervised learning techniques, including neural networks and predictive modeling, to optimize voice-to-image synthesis, ensuring structured learning and improved classification accuracy.
    • Reduced model training time by a third (from 6 to 4 hours per epoch) by optimizing hyperparameters, implementing regularization, and leveraging GPU acceleration.
    • Performed Exploratory Data Analysis (EDA) using Pandas, NumPy, Matplotlib, and Seaborn, identifying key voice features in spectrograms to refine dataset structure and enhance model training (see the spectrogram sketch after this list).
    • Designed and managed databases with SQL and MySQL, efficiently storing and retrieving large-scale voice datasets, enabling seamless integration with AI pipelines and data preprocessing workflows.
    • Deployed models using Docker and Git, ensuring smooth version control, containerized execution, and easy scalability, improving reproducibility across various cloud and local environments.
    • Created interactive dashboards in Power BI, Tableau, and MS Excel, visualizing AI model performance metrics, dataset distributions, and trend patterns for data-driven decision-making.
    • Scraped large datasets using Beautiful Soup and web scraping techniques, extracting relevant speech and image data to improve training efficiency and diversify learning inputs.
    • Conducted hypothesis testing and statistical analysis with MATLAB and C++, validating AI-generated images, improving model robustness, and ensuring consistent accuracy in different datasets.
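
    A minimal sketch of the spectrogram inspection step using librosa; the audio path is a placeholder:

        import librosa
        import librosa.display
        import matplotlib.pyplot as plt
        import numpy as np

        # Load one voice sample and compute a log-power spectrogram.
        y, sr = librosa.load("voice_sample.wav", sr=None)
        S_db = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)

        fig, ax = plt.subplots(figsize=(8, 3))
        img = librosa.display.specshow(S_db, sr=sr, x_axis="time", y_axis="hz", ax=ax)
        fig.colorbar(img, ax=ax, format="%+2.0f dB")
        ax.set_title("Log-power spectrogram")
        plt.show()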

    GANfinity.AI (Data Analyst) - Remote, India

    • Built and deployed a comprehensive AI-enabled Fintech B2B cloud application, focusing on creating a scalable, full-stack web-based product.
    • Leveraged Python libraries like Pandas, NumPy, and Scikit-learn for in-depth data analysis, as well as JavaScript for interactive data visualization.
    • Conducted extensive data preprocessing and feature engineering, including data cleaning, wrangling, normalization, and scaling, to prepare datasets for predictive modeling.
    • Employed text vectorization techniques to classify user segments for targeted financial services.
    • Led the data analysis and machine learning aspects, implementing classification models such as Gradient Boosting, XGBoost, and Random Forest with extensive parameter tuning.
    • Evaluated model performance using metrics like accuracy, precision, recall, and F1-score to select the most effective model for financial risk prediction.
    • Utilized advanced anomaly detection techniques to identify payment discrepancies, increasing fraud detection capabilities by up to 90% (a minimal sketch follows this list).
    • Conducted A/B testing to analyze transaction patterns and collaborated with marketing teams, leading to targeted campaigns that improved transaction rates.
    • Executed SQL operations, including complex joins and data aggregation, to streamline the ETL process, enhancing data transformation and storage in a centralized database.
    • Set up new database schemas and optimized queries to improve data management and retrieval speed.
    • Developed automated data transformation workflows, converting raw data into formats suitable for analysis. Created an experimental framework for automated data collection and built real-time approval systems, reducing manual intervention and increasing productivity by 30%.
    • Collaborated with cross-functional teams, including UI/UX designers and backend developers, integrating machine learning insights into the application to enhance user experience.
    • Visualized data insights using tools like Tableau and R Markdown, providing detailed exploratory analysis and uncovering key business trends.
    • Identified key areas for procedural improvement through customer data analysis, providing actionable insights that enhanced decision-making and profitability.
    • Applied various clustering techniques to detect underperforming segments, leading to strategic adjustments that boosted overall system efficiency.
    • Maintained a high standard of attention to detail in handling data and building models, ensuring the accuracy and reliability of predictions. Effectively communicated insights through written reports and presentations to stakeholders, facilitating data-driven business decisions.
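
    A hedged sketch of one way to flag payment discrepancies with scikit-learn's IsolationForest; the feature names are hypothetical stand-ins for the real transaction schema:

        import pandas as pd
        from sklearn.ensemble import IsolationForest

        # Hypothetical transaction features for unsupervised anomaly detection.
        txns = pd.read_csv("transactions.csv")
        features = txns[["amount", "hour_of_day", "merchant_risk_score"]]

        iso = IsolationForest(contamination=0.01, random_state=42)
        txns["anomaly"] = iso.fit_predict(features)  # -1 marks suspected outliers
        flagged = txns[txns["anomaly"] == -1]
        print(f"Flagged {len(flagged)} of {len(txns)} transactions for review")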

    Publications and Awards
    • 2024: Won the Best Paper Award at the International Conference on Computing Technologies for Sustainable Development 2024 for "Efficient Deepfake Image Detection Using Dense CNN Architecture" (Certificate, Publication Link).
    • 2023: Ranked in the top 10 out of 2,500 participants in Proglint's Alliance University Computer Vision Hackathon 2023.






    Certifications

    My Profile on LinkedIn

  • Oracle Database Foundations (certificate)
  • MongoDB: MongoDB Basics (verify)
  • DeepLearning.AI: Neural Networks and Deep Learning (certificate)
  • IBM: Machine Learning Introduction for Everyone (certificate)
  • DeepLearning.AI: AI For Everyone (verify)
  • Introduction to Machine Learning by Debjani Chakraborty (NPTEL, IIT KGP) (verify)



  Contact


    Mailing address: 4313 Knox Rd, Apt 516, College Park, MD 20740

    E-mail: origanti@umd.edu or abhishekapj182@gmail.com
    LinkedIn: Reach me on LinkedIn