
Harish Chava
Development
Texas, United States
Skills
Data Engineering
About
Harish Chava's skills align with System Developers and Analysts (Information and Communication Technology). Harish also has skills associated with Database Specialists (Information and Communication Technology). Harish Chava has 9 years of work experience.
Work Experience
Sr Data Engineer
Target
April 2019 - Present
- Developed a robust SFTP ingestion framework using Python to efficiently ingest data from multiple vendors into the Hadoop Distributed File System (HDFS).
- Led the design and implementation of a Spark ETL pipeline project that improved data processing speed by 40%.
- Led a team of engineers using Agile methodologies.
- Provided data subject-matter expertise to business and development groups, working closely with cross-functional leaders across the organization.
- Orchestrated ETL pipelines using Azure Data Factory (ADF) to ingest data into Azure Data Lake Storage (ADLS).
- Used Auto Loader in Databricks to incrementally stream cloud files from ADLS, laying the groundwork for bronze, silver, and gold tables in the Delta Lakehouse.
- Built batch and streaming pipelines with Delta Live Tables (DLT) and implemented data quality SQL constraints within DLT.
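The SFTP ingestion framework above could route vendor files to dated HDFS landing paths along these lines. This is a minimal sketch, not the actual framework; the base path, partition layout, and vendor names are assumptions for illustration.

```python
from datetime import date
from pathlib import PurePosixPath

def hdfs_landing_path(vendor: str, filename: str, run_date: date,
                      base: str = "/data/raw") -> str:
    """Build a date-partitioned HDFS landing path for one vendor file."""
    return str(PurePosixPath(base) / vendor /
               f"ingest_date={run_date.isoformat()}" / filename)

def route_files(files_by_vendor: dict[str, list[str]],
                run_date: date) -> dict[str, str]:
    """Map each source filename to its target HDFS landing path."""
    return {
        name: hdfs_landing_path(vendor, name, run_date)
        for vendor, names in files_by_vendor.items()
        for name in names
    }
```

Partitioning the landing zone by ingest date keeps per-vendor loads idempotent and easy to reprocess.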
Data Engineer
Expedia Group
August 2017 - March 2019
- Optimized long-running Spark jobs by rewriting ETL pipelines, restructuring Hive tables, and changing the underlying file format.
- Assumed end-to-end responsibility for the Link Direct product, from design to final code deployment, on the analytics/reporting side.
- Transitioned existing data pipelines to cloud-based infrastructure, selecting optimal AWS services to enhance scalability and reliability.
- Sourced thousands of S3 JSON objects using AWS Glue and Spark pipelines, and created Hive external tables over them.
- Designed and operationalized large-scale data and analytics solutions on the Snowflake data warehouse.
- Loaded marketing data into Snowflake using its native capabilities, including COPY INTO, Snowpipe, Snowflake Shares, and Python scripting.
- Developed shell and Python scripts to automate routine pipeline tasks, streamlining operational processes.
Data Engineer
Blue Cross Blue Shield
April 2017 - August 2017
- Developed Talend workflow execution steps with ETL transformations.
- Developed MapReduce programs and wrote custom Hive UDFs to implement complex business logic.
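Custom row-level logic like the Hive UDFs above is typically written in Java, but Hive's TRANSFORM clause can also stream tab-separated rows through a script. The sketch below shows that pattern; the column layout and the age-bucketing rule are assumptions for illustration.

```python
import sys

def bucket_age(raw_age: str) -> str:
    """Map a raw age value to a coarse business bucket."""
    try:
        age = int(raw_age)
    except ValueError:
        return "unknown"
    if age < 18:
        return "minor"
    return "adult" if age < 65 else "senior"

def transform(line: str) -> str:
    """Rewrite one tab-separated row: (id, age) -> (id, age_bucket)."""
    user_id, age = line.rstrip("\n").split("\t")
    return f"{user_id}\t{bucket_age(age)}"

if __name__ == "__main__":
    # Hive's TRANSFORM streams rows over stdin/stdout
    for row in sys.stdin:
        print(transform(row))
```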
Data Engineer
T-Mobile
August 2015 - March 2017
- Implemented SCD Type 1 ETL jobs using Hive and Pig scripts; used Pig scripts for data cleansing.
- Migrated SLA-sensitive jobs from Hive to Spark.
- Imported data from RDBMS to HDFS, performed ETL transformations per the given business logic, and exported the final data from Hive tables to Teradata using Sqoop, making analytical datasets available to the reporting layer.
- Scheduled all workflows using Oozie coordinators.
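The core of an SCD Type 1 load is an overwrite-or-insert by business key, with no history kept. A minimal sketch of that merge logic in plain Python (the jobs above expressed it in Hive/Pig; the key and column names here are assumptions):

```python
def scd_type1_merge(current: dict[str, dict], incoming: list[dict],
                    key: str = "customer_id") -> dict[str, dict]:
    """Apply SCD Type 1: the latest values for a key win, no history."""
    merged = dict(current)
    for row in incoming:
        merged[row[key]] = row  # overwrite existing row or insert new one
    return merged
```

In Hive this is typically a full-outer-join rewrite of the dimension table (or a MERGE on ACID tables), but the per-key semantics are the same.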
Software Engineer
January 2013 - November 2013
- Developed a Java application for digital analytics on campaign clicks and impressions.
- Performed SQL tuning using SQL Trace and analyzed explain plans to optimize query performance.
- Helped develop MapReduce programs and define job flows.
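The MapReduce job flows above follow a map, shuffle, reduce structure. A word-count sketch of those three phases in plain Python (real jobs would run on Hadoop, typically in Java):

```python
from collections import defaultdict
from itertools import chain

def map_phase(line: str) -> list[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs) -> dict[str, list[int]]:
    """Group emitted counts by key, as the framework's shuffle would."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped

def reduce_phase(grouped: dict[str, list[int]]) -> dict[str, int]:
    """Sum the grouped counts for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}

def word_count(lines: list[str]) -> dict[str, int]:
    """Run the three phases end to end over a list of input lines."""
    return reduce_phase(shuffle(chain.from_iterable(map(map_phase, lines))))
```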