Karnati Sri Anjaneya Reddy
Development
Texas, United States
Skills
Data Engineering
About
Sri Anjaneya Reddy Karnati's skills align with Programmers (Information and Communication Technology), and he also has skills associated with System Developers and Analysts (Information and Communication Technology). He has 10 years of work experience.
Work Experience
AWS Data Engineer
VISA
July 2022 - January 2024
Roles and Responsibilities:
• Imported data from various sources into the AWS raw landing zone using Sqoop for extraction.
• Executed a PySpark/Beeline Hive framework through shell scripts to run the ETL process that transforms raw data and loads it into catalog tables (see the sketch after this list).
• Set up and built AWS infrastructure resources such as VPC, ECS, EC2, S3, IAM, EBS, Security Groups, Auto Scaling, and RDS using CloudFormation JSON templates.
• Developed scripts that load data into Spark DataFrames and perform in-memory computation to generate the output response.
• Performed job functions using Spark APIs in Scala/Python/PySpark for real-time analysis and fast querying; experienced with Agile methodology and the delivery tool VersionOne.
• Experienced with the AWS cloud platform and its features, including EC2, S3, Route 53, VPC, EBS, AMI, SNS, RDS, and CloudWatch.
• Proficient in designing, optimizing, and managing Redshift clusters and data models.
• Used the AWS CLI to suspend AWS Lambda functions and to automate backups of ephemeral data stores to S3 buckets and EBS.
• Gathered semi-structured data from S3 and relational structured data from RDS, kept the data sets in a centralized metadata catalog using AWS Glue, and extracted the datasets and loaded them into Kinesis streams.
• Proficient with Bash and Ansible shell scripts to run jobs through the scheduler.
• Worked as part of an Agile/Scrum development team and was exposed to the TDD approach in developing applications.
• Proficient in designing and implementing ETL pipelines to extract, transform, and load data from various sources into destination systems.
• Skilled in working with large datasets, optimizing ETL workflows, and automating data processes.
• Day-to-day responsibilities included developing ETL pipelines into and out of the data warehouse and building major regulatory and financial reports using advanced SQL queries in Snowflake.
• Staged API and Kafka data (in JSON format) into Snowflake by flattening it for the different functional services.
• Extensive experience automating the build and deployment of scalable projects through GitLab CI/CD, Jenkins, etc., and worked on Docker and Ansible. Used JavaScript for data validations and designed validation modules.
• Created methods (GET, POST, PUT, DELETE) to make requests to the API server and tested RESTful APIs using Postman. Also loaded CloudWatch logs to S3 and then into Kinesis streams for data processing.
• Proficient in using AWS Athena for querying and analyzing data stored in Amazon S3.
• Worked on data transformation and cleansing processes using AWS Glue for better data quality.
• Developed complex SQL queries and data transformation scripts to extract insights from large datasets stored in Amazon Redshift.
• Handled operations and maintenance support for AWS cloud resources, including launching, maintaining, and troubleshooting EC2 instances, S3 buckets, Virtual Private Clouds (VPC), Elastic Load Balancers (ELB), and Relational Database Service (RDS).
• Worked on Power BI to generate and send weekly and monthly reports to the BI team based on client requirements.
Tech Stack: Hive, Scala, PySpark, Control-M, Jenkins, Maven, Git, AWS, and Snowflake
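A minimal PySpark sketch of the kind of raw-to-catalog ETL step described above: read JSON landed in an S3 raw location, apply a light transformation, and write the result as a partitioned Parquet catalog table. The bucket path, table name, and column names are hypothetical placeholders, not details of the actual project.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Hypothetical locations for illustration only.
    RAW_PATH = "s3://example-raw-bucket/transactions/"   # raw landing zone (assumed)
    CURATED_TABLE = "curated_db.transactions"            # catalog table (assumed)

    spark = (
        SparkSession.builder
        .appName("raw-to-catalog-etl")
        .enableHiveSupport()   # register the output table in the metastore/catalog
        .getOrCreate()
    )

    # Extract: read semi-structured JSON from the raw S3 location.
    raw_df = spark.read.json(RAW_PATH)

    # Transform: basic cleanup plus a partition column derived from the event timestamp.
    curated_df = (
        raw_df
        .withColumn("event_ts", F.to_timestamp("event_ts"))
        .withColumn("dt", F.to_date("event_ts"))
        .dropDuplicates(["transaction_id"])
    )

    # Load: write partitioned Parquet and register it as a catalog table.
    (
        curated_df.write
        .mode("overwrite")
        .partitionBy("dt")
        .format("parquet")
        .saveAsTable(CURATED_TABLE)
    )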
Data Engineer
Synchrony
January 2021 - July 2022
Roles and Responsibilities:
• Worked on Google Cloud Platform (GCP) services such as Compute Engine, Cloud Load Balancing, Cloud Storage, Cloud SQL, Stackdriver monitoring, and Cloud Deployment Manager.
• Set up alerting and monitoring using Stackdriver in GCP.
• Hands-on experience in GCP with BigQuery, Cloud Composer (Airflow), GCS buckets, Cloud Functions, Cloud Dataflow, Google Cloud Storage, Pub/Sub, Cloud Shell, the gsutil and bq command-line utilities, Dataproc, and Stackdriver.
• Monitored BigQuery, Dataproc, and Cloud Dataflow jobs via Stackdriver across all environments.
• Analyzed various types of raw files such as JSON, CSV, and XML with Python/Scala using pandas, NumPy, etc.
• Used Cloud Functions with Python/Scala to load data into BigQuery for CSV files on arrival in a GCS bucket (see the sketch after this list).
• Used Spark and Spark SQL to read the Parquet data and create tables in Hive using the Scala API.
• Responsible for gathering requirements from clients and assigning tasks based on priority.
• Hands-on with Ansible shell scripts developed to schedule D-Series and Control-M jobs.
• Created a framework to identify and copy the mainframe source files required for data processing from HDFS to a GCP node; decommissioned the existing process and scheduled jobs on the HDFS node.
• Hands-on developer for Unix/Linux batch applications.
• Analyzed data from different sources using the Hadoop big data stack by implementing Hive and Sqoop.
• Experience transporting and processing real-time streaming data using Kafka.
• Responsible for customer services covering VAU (Visa Account Update), CLKG (linked card information), and digital/electronic wallets.
• Processed and loaded bounded and unbounded data from Kafka/Pub/Sub topics to SQL Server using a Spring Boot application.
• Built jobs and workflows to automate the execution of code scripts using the Jenkins deployment tool.
• Added dependent jobs to the Airflow/Control-M scheduler and validated those changes.
• Worked on Power BI to generate and send weekly and monthly reports to the BI team based on client requirements.
Tech Stack: Hive, Scala, PySpark, Control-M, Jenkins, Maven, Git, GCP, and Snowflake
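A minimal sketch of the on-arrival CSV load into BigQuery mentioned above, assuming a first-generation, GCS-triggered Cloud Function and the google-cloud-bigquery client library; the project, dataset, and table names are hypothetical placeholders.

    from google.cloud import bigquery

    # Hypothetical destination table for illustration only.
    TABLE_ID = "example-project.staging.daily_feed"

    def load_csv_to_bq(event, context):
        """Background Cloud Function triggered by a new object in a GCS bucket."""
        if not event["name"].endswith(".csv"):
            return  # ignore non-CSV objects
        uri = f"gs://{event['bucket']}/{event['name']}"

        client = bigquery.Client()
        job_config = bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.CSV,
            skip_leading_rows=1,   # header row
            autodetect=True,       # infer the schema from the file
            write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        )

        load_job = client.load_table_from_uri(uri, TABLE_ID, job_config=job_config)
        load_job.result()  # wait for the load job to finish
        print(f"Loaded {uri} into {TABLE_ID}")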
Data Engineer
Wells Fargo
August 2019 - December 2020
Roles and Responsibilities:
• Worked with the WLDS, 1CDH, and EDQE applications.
• Performed ETL data translation while transferring files over SFTP, following source-to-target data mapping documents, to support large datasets (big data) out to the GCP cloud.
• Used Sqoop to import data into HDFS and Hive from other data systems.
• Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
• Followed the Agile methodology to develop the code.
• Migrated ETL processes from RDBMS to Hive to enable easier data manipulation.
• Developed Hive queries to process the data for visualization.
• Used Cloud Functions with Scala to load data into BigQuery for CSV files on arrival in a GCS bucket.
• Performed fact/dimension modeling and proposed a solution to load it.
• Processed data with Scala, Spark, and Spark SQL and loaded it into Hive partitioned tables in Parquet file format.
• Developed a near-real-time data pipeline using Kafka and Spark Streaming to ingest client data from their API and apply transformations (see the sketch after this list).
• Created queries using Scala, Hive, SAS (PROC SQL), and PL/SQL to load large amounts of data from SQL Server into HDFS to spot data trends.
• Combined all of the above steps in an Oozie workflow to run the end-to-end ETL process.
• Developed under the Scrum methodology and in a CI/CD environment using Jenkins.
• Installed and scheduled the Control-M jobs and added previous-day dependencies.
• Used Jenkins and Git for versioning, build pipelines, and deployment into production.
• Built jobs and workflows to automate the execution of code scripts using an automation tool.
Tech Stack: Airflow, Service Desk, JIRA, PuTTY, ITRS dashboard, and Hue
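A minimal PySpark Structured Streaming sketch of the near-real-time Kafka ingestion described above (the original pipeline used Scala and Spark Streaming; this Python equivalent uses a hypothetical broker, topic, schema, and output path).

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

    spark = SparkSession.builder.appName("kafka-near-real-time").getOrCreate()

    # Hypothetical schema of the JSON messages arriving from the client API.
    schema = StructType([
        StructField("account_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("event_time", TimestampType()),
    ])

    # Read the stream from Kafka (broker address and topic are placeholders).
    raw = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "client-events")
        .option("startingOffsets", "latest")
        .load()
    )

    # Kafka delivers the payload as bytes in the `value` column; parse the JSON and transform.
    events = (
        raw.selectExpr("CAST(value AS STRING) AS json")
        .select(F.from_json("json", schema).alias("e"))
        .select("e.*")
        .withColumn("dt", F.to_date("event_time"))
    )

    # Write as partitioned Parquet with checkpointing (paths are placeholders).
    query = (
        events.writeStream
        .format("parquet")
        .option("path", "hdfs:///data/curated/client_events")
        .option("checkpointLocation", "hdfs:///checkpoints/client_events")
        .partitionBy("dt")
        .outputMode("append")
        .start()
    )

    query.awaitTermination()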
Azure Cloud Support Engineer
Microsoft
May 2018 - August 2019
Roles and Responsibilities:
• As an Azure cloud technical support engineer, worked on technically challenging areas of networking, compute, subscriptions, VMs, and storage to resolve customer problems in a timely manner.
• Worked on escalated tasks related to interconnectivity issues, complex cloud-based identity management, service interruptions with virtual machines, and associated virtual storage issues.
• Managed and created storage accounts in the Azure Portal, worked on creating images of virtual machines and attaching disk storage to them, and managed virtual network endpoints for these hosts in the Azure Portal.
• Experienced in deploying VMs, storage, and networking and in backup/restore activities, including via PowerShell scripts.
• Worked on increasing subscription limits and upgrading customer accounts.
• Successfully led a new chat support initiative and built a team of 9 engineers around it.
• Worked with issues of various severities and appropriately engaged specialized teams without impacting the customer experience.
• Developed and maintained knowledge base articles (Confluence pages) to help the rest of the team handle recurring issues effectively.
• Worked closely with the program management/product group on onboarding newly released features.
• Worked on the Microsoft Azure cloud to provide IaaS support to clients; created virtual machines through PowerShell scripts and the Azure Portal.
• Captured images of virtual machines and attached disks to virtual machines; managed and created virtual networks and endpoints in the Azure Portal.
• Deployed VMs, storage, and networking through PowerShell scripts.
• Responsible for troubleshooting customer issues such as node performance, portal login, SSH, billing, and job failure issues.
• Hands-on with Azure HDInsight, Azure Data Factory, Azure Cosmos DB, Azure Synapse, and Azure Databricks.
• Coordinated with Microsoft on increasing subscription limits, such as core limits and cloud services; handled and resolved client issues remotely.
• Hands-on generating JIT access and able to reboot servers from the backend without data loss (see the sketch after this list).
• Designed and configured Azure Virtual Networks (VNets), subnets, and Azure network settings.
• Hands-on experience in Azure development; worked on Azure Storage, Azure SQL Database, and Virtual Machines.
• Experience migrating the existing V1 (classic) Azure infrastructure to V2 (ARM).
• Familiar with Azure Kubernetes Service and Docker.
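A minimal sketch of the kind of backend VM check-and-reboot described above, using the Azure SDK for Python rather than PowerShell; the azure-identity and azure-mgmt-compute packages are assumed, and the subscription, resource group, and VM names are placeholders.

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.compute import ComputeManagementClient

    # Hypothetical identifiers for illustration only.
    SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"
    RESOURCE_GROUP = "example-rg"
    VM_NAME = "example-vm"

    credential = DefaultAzureCredential()
    compute = ComputeManagementClient(credential, SUBSCRIPTION_ID)

    # Check the current power state from the VM's instance view.
    view = compute.virtual_machines.instance_view(RESOURCE_GROUP, VM_NAME)
    power_states = [s.display_status for s in view.statuses if s.code.startswith("PowerState/")]
    print(f"{VM_NAME} power state: {power_states}")

    # Restart the VM (a soft reboot; data on attached disks is preserved).
    poller = compute.virtual_machines.begin_restart(RESOURCE_GROUP, VM_NAME)
    poller.result()  # block until the operation completes
    print(f"{VM_NAME} restarted")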
Data Engineer
Kohl's
February 2014 - May 2018
Roles and Responsibilities:
• Installed and configured Apache Hadoop to test the maintenance of log files in the Hadoop cluster.
• Installed and configured Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
• Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
• Created the backup files in SSMS and copied them into the GCP bucket.
• Responsible for writing the script to copy the files from MSSQL to the GCP cloud.
• Ran the GCP script on the newly created database on the GCP SQL Server instance.
• Developed Pig Latin scripts for the analysis of semi-structured data.
• Developed and was involved in industry-specific UDFs (user-defined functions) (see the sketch after this list).
• Used Hive, created Hive tables, and was involved in data loading and writing Hive UDFs.
• Experienced in loading and transforming large sets of structured and semi-structured data.
• Performed data migration from an RDBMS to a NoSQL database and was also involved in data deployment across various data systems.
• Utilized Falcon and Oozie to schedule workflows for Hive/Sqoop/HDFS/Spark actions.
• Evaluated big data technologies to improve the data processing architecture; performed data modeling, development, and administration of relational and non-relational NoSQL databases (BigQuery, Elasticsearch).
• Used ETL to implement slowly changing dimension transformations to maintain historical data in the data warehouse.
• Performed ETL testing activities such as running the jobs, extracting the data from the database using the necessary queries, transforming it, and uploading it to the data warehouse servers.
• Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
• Generated and sent reports to the BI team based on client requirements.
Tech Stack: Hive, Sqoop, Spark with Scala, HDFS, and Power BI
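A minimal PySpark sketch of the kind of user-defined function mentioned above (the original work involved Hive/Pig UDFs; this Python equivalent uses a hypothetical table, column, and masking rule purely for illustration).

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf, col
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("udf-example").enableHiveSupport().getOrCreate()

    def mask_account(account_number):
        # Hypothetical business rule: keep only the last four characters.
        if account_number is None:
            return None
        return "*" * max(len(account_number) - 4, 0) + account_number[-4:]

    # Register the UDF for both the DataFrame API and Spark SQL.
    mask_account_udf = udf(mask_account, StringType())
    spark.udf.register("mask_account", mask_account, StringType())

    # DataFrame usage (assumes a Hive table with an account_number column).
    orders = spark.table("retail_db.orders")
    masked = orders.withColumn("account_masked", mask_account_udf(col("account_number")))

    # Equivalent Spark SQL usage of the registered UDF.
    masked_sql = spark.sql(
        "SELECT order_id, mask_account(account_number) AS account_masked FROM retail_db.orders"
    )

    masked.show(5)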