Aravind Reddy Pathoori
Development
Kansas, United States
Skills
Data Engineering
About
Aravind Reddy Pathoori's skills align with System Developers and Analysts (Information and Communication Technology), and he also has skills associated with Programmers (Information and Communication Technology). He appears to be a low-to-mid-level candidate with 4 years of experience.
Work Experience
Big Data Engineer
Nike, Inc
February 2022 - February 2024
- Established and maintained ETL processes using AWS Glue for data extraction, transformation, and loading.
- Created ETL parsing and analytics data pipelines using Python and Spark to build a structured data model.
- Developed Spark SQL scripts using PySpark to perform transformations and actions on DataFrames and Datasets in Spark for faster data processing.
- Devised a Spark Streaming job to consume data from Kafka topics of different source systems and push the data into HDFS.
- Created Hive tables as per requirements, defining internal or external tables with appropriate static and dynamic partitions, and worked extensively with Hive DDLs and Hive Query Language (HQL).
- Translated Hive/SQL queries into Spark transformations using Spark RDDs and PySpark.
- Developed Spark scripts using Python on AWS EMR for data aggregation and validation, and implemented Athena for ad-hoc querying.
- Performed performance tuning and troubleshooting of MapReduce jobs by analyzing Hadoop log files.
- Extracted data from servers into AWS S3 and bulk-loaded cleaned data into Cassandra using Spark.
- Processed large sets of structured, semi-structured, and unstructured data by implementing and leveraging open-source technologies.
- Tuned and optimized SQL queries on AWS Redshift to enhance query performance and improve overall data processing efficiency.
- Applied a strong understanding of data modeling concepts and database design principles, including star and snowflake schemas, to optimize data structures for efficient data loading and querying in Snowflake.
- Demonstrated expertise in creating, debugging, scheduling, and monitoring jobs using the Airflow orchestration tool.
- Worked in Agile methodology, actively participating in Scrum planning, review calls, retrospectives, and daily Scrum meetings to address problems and issues.
Hadoop Developer
Holiday Bazaar
August 2019 - July 2021
- Designed, developed, and implemented SSIS packages to extract, transform, and load data.
- Utilized the Cloudera Hadoop framework to enhance data processing and storage throughput across a cluster of up to seventeen nodes.
- Wrote Sqoop jobs to load data from RDBMS to HDFS and vice versa.
- Managed data coming from different RDBMS source systems such as Oracle and Teradata, and maintained the structured data within HDFS in file formats such as Avro and Parquet for optimized storage patterns.
- Tuned the performance of Pig queries and developed Pig scripts for further data processing.
- Transformed data into a tabular format and processed results using Hive Query Language.
- Utilized Apache Flume to load real-time unstructured data, such as XML data and log files, into HDFS.
- Processed large amounts of structured and unstructured data using the MapReduce framework.
- Tuned and troubleshot MapReduce jobs by analyzing Hadoop log files.
- Developed Spark code using Scala and Spark Streaming for efficient testing and processing.
- Worked with the Oozie workflow engine to schedule and orchestrate time-based jobs performing multiple actions.
- Implemented Jira for ticketing and issue tracking, and Jenkins for CI/CD pipelines.
- Managed version control using Git and GitHub.
- Developed SSRS reports, including designing report layouts and data visualizations.