Aditi Gupta
Development
UP, India
Skills
Data Engineering
About
Aditi Gupta's skills align with System Developers and Analysts (Information and Communication Technology). Aditi also has skills associated with Database Specialists (Information and Communication Technology). Aditi Gupta has 7 years of work experience.
Work Experience
DevOps Engineer
Project #2 HBO
February 2022 - February 2023
- (February, 2022-February, 2023) About Project: Home Box Office's AWS cloud-based data analytics platform and set of core services that give internal stakeholders better business insights and enable faster decision making.
- Worked as a DevOps Engineer with Kubernetes, Terraform, Databricks, Nexus, Jenkins, Snowflake and other AWS services.
- Built and supported a framework that helps data engineers and data scientists complete their tasks with minimal inputs and in less time, without going into the details of the underlying AWS services.
- Automated Airflow job notifications to Slack channels so that the right teams are notified on time (a sketch follows below).
- Involved in various ad hoc development activities to fulfill user and business requirements.
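The Slack alerting described above is commonly wired through an Airflow failure callback. The sketch below is an assumption about the approach, not the project's actual code: the webhook URL, DAG name and task are all placeholders.

```python
import requests
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Placeholder webhook; real deployments usually keep this in an Airflow
# connection or secrets backend rather than in code.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"

def notify_slack_on_failure(context):
    """Failure callback: post the failing DAG/task and a log link to Slack."""
    ti = context["task_instance"]
    text = (
        f":red_circle: Airflow task failed\n"
        f"DAG: {ti.dag_id} | Task: {ti.task_id}\n"
        f"Run date: {context['ds']}\nLogs: {ti.log_url}"
    )
    requests.post(SLACK_WEBHOOK_URL, json={"text": text}, timeout=10)

with DAG(
    dag_id="example_pipeline",                     # hypothetical DAG name
    start_date=datetime(2022, 2, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={"on_failure_callback": notify_slack_on_failure},
) as dag:
    BashOperator(task_id="extract", bash_command="echo extracting")
```

Attaching the callback via default_args applies it to every task in the DAG, so one function covers the whole pipeline.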
Project #3 MARS
January 2019 - January 2022
- The MARS solution aims to build a consolidated data repository (data lake) and a coherent data warehousing platform on the cloud for the Medibank community, development partners and other stakeholders. It is a cloud-based automated framework for end-to-end data processing that utilizes AWS services such as EMR, S3, Redshift and Glue, along with the Spark engine, for efficient processing and execution of ETLs.
- Worked on data movement across multiple tiers using the existing framework.
- Created Glue and Redshift tables and implemented business logic for base features and data marts using Glue (see the sketch after this list).
- Performed unit testing, data verification and validation, and prepared Unit Test Case documents.
- Developed and executed Airflow DAGs.
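A data-mart load of this kind is often an Airflow task that starts a Glue job run through boto3. This is a minimal sketch under that assumption; the DAG id, job name and task id are invented for illustration.

```python
import boto3
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def run_glue_job(job_name, **_context):
    """Start an AWS Glue job run and return its run id for tracking."""
    glue = boto3.client("glue")  # region/credentials come from the environment
    response = glue.start_job_run(JobName=job_name)
    return response["JobRunId"]

with DAG(
    dag_id="mars_datamart_load",                   # hypothetical DAG name
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(
        task_id="load_base_features",
        python_callable=run_glue_job,
        op_kwargs={"job_name": "mars-base-features"},  # hypothetical job name
    )
```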
Project #4 Amgen
October 2017 - December 2018
- Enterprise Data Warehouse (EDW) data from multiple sources was migrated to the Big Data landscape to build an Enterprise Data Lake (EDL) for the supply-chain space. The application comprises Scorecard reports to manage and track metric performance and the progress of supply-chain interactions. Further advancements to the HTB (High Throughput Biologics) framework were built for advanced management reporting and robust, automated data handling that reduces manual work.
- Worked on various modules, from requirement analysis and system study through design, coding, testing, debugging, documentation and implementation.
- Migrated data from relational databases such as Teradata to HDFS using Sqoop.
- Created Hive managed tables, implemented business logic using Hive scripts, optimized with partitioning and bucketing, and analyzed datasets.
- Exposed processed Hive data to Impala, AWS S3 or Redshift.
- Developed wrapper scripts to automate Hadoop jobs, scheduled via crontab.
- Built generic, reusable code bases.
- Performed unit testing and data verification, and prepared Unit Test Case documents.
- Built scheduler scripts using Unix shell scripting or Airflow DAGs.
- Applied business transformations using PySpark scripts (a sketch follows below).
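A PySpark transformation writing a partitioned Hive table, of the kind these bullets describe, might look like the following. The table, columns and metric are invented for the example; only the techniques (Hive support, partitioned writes) come from the resume.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("supply_chain_scorecard")     # hypothetical app name
    .enableHiveSupport()                   # needed to write Hive managed tables
    .getOrCreate()
)

# Assumed source table of supply-chain orders with delivery timestamps.
orders = spark.table("edl_raw.supply_orders")

# Example metric: on-time delivery rate per region and month.
scorecard = (
    orders
    .withColumn("on_time",
                (F.col("delivered_ts") <= F.col("promised_ts")).cast("int"))
    .groupBy("region", "order_month")
    .agg(F.avg("on_time").alias("on_time_rate"),
         F.count("*").alias("order_cnt"))
)

# Partitioning by month prunes scans for period-based Scorecard reporting.
(
    scorecard.write
    .mode("overwrite")
    .partitionBy("order_month")
    .saveAsTable("edl_curated.supply_scorecard")
)
```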
IT Analyst
Tata Consultancy Services
July 2017 - January 2022
- Project Details: Project #1 Thermofisher (February, 2023-Present) About Project: Thermofisher's AWS cloud-based Enterprise Data Lakehouse (EDL) provides end-to-end data processing and business insights so that better decisions can be made to help the business grow. The project uses the AWS cloud platform along with Databricks, Airflow and other reporting tools to visualize metrics for different business segments (orchestration sketch below).
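Orchestrating Databricks work from Airflow, as this project description implies, is commonly done with the Databricks provider's DatabricksSubmitRunOperator. Everything below (connection id, cluster spec, notebook path, DAG name) is a placeholder sketch, not the project's configuration.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksSubmitRunOperator,
)

with DAG(
    dag_id="edl_daily_refresh",            # hypothetical DAG name
    start_date=datetime(2023, 2, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    DatabricksSubmitRunOperator(
        task_id="refresh_segment_metrics",
        databricks_conn_id="databricks_default",   # assumed connection id
        new_cluster={                              # placeholder cluster spec
            "spark_version": "11.3.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 2,
        },
        notebook_task={"notebook_path": "/EDL/refresh_metrics"},  # placeholder
    )
```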