Shivani Raut
Development
Indiana, United States
Skills
Data Engineering
About
Shivani Sampat Raut's skills align with Database Specialists (Information and Communication Technology). Shivani also has skills associated with System Developers and Analysts (Information and Communication Technology). Shivani Sampat Raut has 5 years of work experience.
Work Experience
Data Engineer / Data Science Intern
Geodis Logistics LLC
May 2023 - August 2023
- (Full-time internship) Led an optimization project for an existing SQL database, covering new database modeling and data extraction from Oracle Database to MongoDB through a Hadoop-Spark architecture.
- Migrated the SQL database to the NoSQL database MongoDB using Python with the Spark-Mongo and SQL connectors, resulting in faster querying and an optimized schema architecture.
- Conducted exploratory data analysis (EDA) and visualization on transaction data to understand trends and seasonality, which informed time series machine learning models built with ARIMA, SARIMA, and Facebook Prophet in Python to forecast future transaction volumes.
- Deployed the data pipelines and models using Apache Airflow and Kubernetes through Docker.
- Presented dashboards and visualizations using Power BI.
- Overall, created Big Data Hadoop and Spark data pipelines for implementing machine learning models and deployed them using Apache Airflow and Kubernetes.
Data Engineer Consultant
FedEx, Remote
July 2018 - December 2021
- (Atos Syntel payroll) Designed and developed complex data pipelines in Dataiku through various code recipes, using Python libraries such as NumPy, Pandas, PySpark, NLTK, and spaCy.
- Maintained data quality by analyzing the accuracy scores of various data sources, and used several QA test cases to support the rapidly growing business needs of the client, FedEx.
- Reduced execution time by up to 2 hours and scaled up the data flows and machine learning pipelines through partitioning, parallelism, and date-based incremental data ingestion, using Spark and Hive.
- Productionized and automated machine learning model flows using Dataiku, Apache Airflow, Ab Initio, and the Azure MLOps cloud platform.
- Used NLP and various data flows to predict HS (Harmonized System) codes as part of the HS-Search application.
- Using classification and clustering algorithms, predicted caging of shipments at customs and true/false hits of the Restricted Party Screening (RPS) system.
- Used the Hadoop and Spark frameworks to extract and store data through Dataiku, and SQL (Teradata, Oracle), Hive, and Python to transform it.
- Loaded the data into the scaled-up ML model flows after feature engineering in Python, and validated the predictions using QA test cases.
ML Engineer
Indiana University, Indianapolis
August 2022 - Present
- (Bioinformatics research) Created a question-answering large language model for clinical guidelines from data collected from AHA sources, using neural networks, deep learning, machine learning, and natural language processing (NLP).
- Developed a hinge loss function in a generative adversarial network (GAN) model for Genomic Regional Assessment of Peaks from DNA sequence data. The synthetic data was generated with TensorFlow deep learning code run on NVIDIA GPUs.
- Researched ways to improve knowledge graph quality using models such as BERT, GPT-3, ALBERT, and RoBERTa, working in Visual Studio integrated with my GitHub account.
- Retrieved and transformed the data using Visual Studio and GitHub.