Sreevani Pedaballi
Development
Florida, United States
Skills
Data Engineering
About
Sreevani Pedaballi's skills align with Programmers (Information and Communication Technology). Sreevani also has skills associated with System Developers and Analysts (Information and Communication Technology). Sreevani Pedaballi has 9 years of work experience, with 3 years of management experience, including a low-level position.
Work Experience
Senior Data Engineer
TIAA, New Jersey
April 2023 - Present
- Worked as an individual contributor while leading an offshore team on AWS pipeline implementation.
- Designed and built the pipeline to publish data from DynamoDB to Delta Lake.
- Implemented Glue ETL to enrich prospect data with data from other source systems and generate the golden prospect record.
- Set up the Glue infrastructure, including the S3 bucket, IAM roles and policies, and the security group needed to launch the Glue job.
- Replaced the DynamoDB Scan API with the AWS Glue DynamoDB export connector to extract data from DynamoDB.
- Implemented a Python module to load CDP Ranger policy and audit logs into an S3 bucket.
- Worked on Cloudera CDP configuration, including the AMI catalog, CDP environment registration, ID Broker mappings, and creation of the Data Lake and Data Hub.
- Worked on AWS Glue and Scalr integration, which included creating the Terraform configuration for the Glue database catalog and publishing it to Scalr.
- Created tables and Global Secondary Indexes to store and manage the transient and prospect data.
- Good knowledge of provisioning DynamoDB resources in an efficient and cost-effective way.
- Worked on the Glue schedule trigger defined via the bogiefile, which is consumed by the Jenkins one pipeline to create the trigger in the specified AWS environment.
- Technologies Used: Spark, Python, S3, AWS Glue, DynamoDB, CloudWatch, Scalr, Cloudera CDP and SQL
- Client: Commonwealth Bank, Australia
Big Data Engineer
Commonwealth Bank
July 2018 - June 2019
- Orchestrated the Redshift ETL using AWS Glue and Step Functions.
- Implemented a Python script to read data from S3 and load it into an Amazon Redshift table using Amazon Redshift Spectrum.
- Implemented the aggregation query and exported the results to an Amazon S3 location via the UNLOAD command.
- Configured the AWS Step Function to start a series of AWS Glue Python Shell jobs, each with parameters for obtaining database connection information from AWS Secrets Manager and a SQL file from S3.
- Configured notifications to an Amazon Simple Notification Service (SNS) topic in case of pipeline failure.
- Collected and aggregated large amounts of web log data from sources such as web servers, mobile and IoT devices using AWS Kinesis Firehose, and stored the data in S3 for analysis.
- Implemented a Lambda handler function to load web log details into Amazon DynamoDB.
- Monitored jobs using CloudWatch logs and metrics to debug failures and optimize job performance.
- Used partitioning, bucketing, map-side joins and parallel execution to optimize Hive queries.
- Technologies Used: S3, AWS Glue, Athena, Redshift, Redshift Spectrum, Lambda, Step Functions, CloudWatch, Kinesis Streams, DynamoDB, SNS, Secrets Manager, HDFS, EMR, Hive, HQL, Spark, Scala and Python.
Big Data Engineer
Lloyds Bank
January 2016 - June 2018
- Used Agile Scrum to manage the full development life cycle of the project.
- Involved in business requirement gathering, analysis, feasibility research, estimation and enhancements.
- Developed Sqoop scripts to import and export data from relational sources, and handled incremental loading of transaction data by date.
- Implemented Spark RDD transformations and actions for business analysis.
- Configured Spark Streaming to receive real-time data from Apache Kafka and store the stream data in HDFS using Scala.
- Used Spark SQL to read data from external sources and processed the data using Scala.
- Developed Unix shell scripts to load large numbers of files into HDFS from the Linux file system.
- Used Spark SQL to process large volumes of structured data.
- Worked with Spark to improve the performance of existing Hadoop algorithms using Spark Context, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
- Optimized MapReduce jobs to use HDFS efficiently by applying various compression mechanisms.
- Technologies Used: HDFS, MapReduce, Kafka, Sqoop, Hive, HQL, Spark, SQL, Java, Scala and Python.
Senior Software Engineer
Verizon
July 2014 - December 2015
- Technologies Used: C#, .Net, ASP.Net, ADO.Net, VB, Web Services, WCF, MVC 4.0, SQL Server 2012, jQuery
Lead Engineer
Kohl's
May 2013 - July 2014
- Technologies Used: .Net 3.0, ASP.Net, C# .Net, WCF, XML, HTML, jQuery, SQL Server 2005, MVC, Ektron CMS, Skipjack payment gateway
Lead Engineer
General Motors
February 2010 - December 2011
- Technologies Used: .Net 3.0, ASP.Net, C# .Net, WCF, XML, HTML, jQuery, SQL Server 2005