
Sohail Anjum

Development
Punjab, Pakistan

Skills

Data Engineering
Apache Spark
AWS (Amazon Web Services)
Azure Databricks
PostgreSQL
MySQL
Microsoft SQL Server
Hadoop

About

Sohail Anjum's skills align with Programmers (Information and Communication Technology). He also has skills associated with Database Specialists (Information and Communication Technology). Sohail Anjum has 12 years of work experience.

Accomplishments

Professional Summary
• Overall, 11 years of IT experience across Big Data, ETL/ELT, SQL, Python, Scala, C#, and Java. Interested and passionate about working in Big Data environments and data analytics.
• Expertise in AWS services such as AWS Glue, AWS Lambda, Amazon RDS, Amazon Redshift, Amazon DynamoDB, Amazon S3, AWS Data Pipeline, AWS CodeCommit, Amazon CloudWatch, AWS IAM, AWS IoT Analytics, etc.
• Designed and developed ETL/ELT data pipelines with AWS Glue and Lambda.
• Utilized the AWS Glue Data Catalog to maintain a centralized metadata repository, enabling efficient data discovery and cataloging.
• Implemented and optimized complex data transformations on unstructured, semi-structured, and structured data using AWS Glue and PySpark data frames.
• Developed crawlers in AWS Glue to automate the discovery and cataloging of metadata from diverse data sources.
• Integrated AWS Glue with other AWS services, including Amazon S3, RDS, and Redshift, to create scalable and flexible data architectures.
• Strong analytical and data aggregation skills using PySpark, advanced SQL, DuckDB, and Pandas.
• Developed data pipelines using Databricks Delta, a data lakehouse platform built on Apache Spark that provides ACID transactions and scalable analytics.
• Performed data ingestion from different data sources, such as S3 bulk files and databases, using PySpark/JDBC.
• Hands-on experience with the Glue Catalog, Athena, Hadoop, and Hive tables.
• Configured automatic workload management in Amazon Redshift.
• Implemented PySpark streaming to pick up data from Kafka topics and feed it into the data pipeline.
• Designed and implemented databases, tables, relationships, and stored procedures.
• Developed custom scripts and applications using the AWS Data Wrangler API to automate data ingestion, cleaning, and transformation tasks, resulting in faster data processing times and improved data quality.
• Designed interactive dashboards and reports in Microsoft Power BI, enhancing data-driven insights.
• Implemented DAX measures and relationships for custom KPIs to track key business metrics.
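A minimal sketch of the Glue/PySpark ETL pattern described above; the database, table, and bucket names are hypothetical placeholders, not actual project assets:

    import sys
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from pyspark.sql import functions as F

    # Standard Glue job boilerplate
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    spark = glue_context.spark_session
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read a table registered in the Glue Data Catalog (names are illustrative)
    dyf = glue_context.create_dynamic_frame.from_catalog(
        database="sales_db", table_name="raw_orders")

    # Convert to a Spark DataFrame and apply transformations
    df = dyf.toDF()
    clean = (df.dropDuplicates(["order_id"])
               .withColumn("order_date", F.to_date("order_date"))
               .filter(F.col("amount") > 0))

    # Write the result back to S3 as partitioned Parquet
    (clean.write.mode("overwrite")
          .partitionBy("order_date")
          .parquet("s3://example-bucket/curated/orders/"))

    job.commit()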

Work Experience

Software Engineer

Trees Technologies
April 2013 - March 2024
  Key Responsibilities:
  • Coordinated with the Technical Lead on current programming tasks.
  • Collaborated with other programmers to design and implement features.
  • Quickly produced well-organized, optimized, and documented source code.
  • Created and documented software tools required by artists or other developers.
  • Debugged existing source code and polished feature sets.
  • Contributed to technical design documentation.
  • Worked independently when required.
  Tools / Environment: C#.NET, ASP.NET, Web

Senior Data Engineer

S&P Global Market Intelligence
February 2016 - December 2021
  Key Responsibilities:
  • Experienced in designing and deploying data pipelines and analytics solutions using Databricks on AWS.
  • Developed ETL workflows using Databricks and AWS services such as Glue and Lambda.
  • Expertise in performance tuning and optimization of PySpark-based applications and clusters on AWS using Databricks, the Spark UI, and CloudWatch metrics.
  • Implemented complex business transformations using PySpark and stored the results in Delta/Parquet format, performed
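A small sketch of the Delta/Parquet write pattern referenced above, assuming a Spark environment with Delta Lake available; bucket paths and column names are illustrative:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("delta-transform").getOrCreate()

    # Read raw records from S3 (bucket and layout are illustrative)
    raw = spark.read.parquet("s3://example-bucket/raw/trades/")

    # Apply a business transformation
    curated = (raw.filter(F.col("status") == "settled")
                  .withColumn("trade_ts", F.to_timestamp("trade_ts")))

    # Persist the result as a Delta table (use format("parquet") for plain Parquet)
    (curated.write.format("delta")
            .mode("overwrite")
            .save("s3://example-bucket/curated/trades/"))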

Software Engineer

F3 Technologies/Health Care Technologies
March 2014 - January 2016
  Key Responsibilities:
  • Analyzed and implemented best coding practices in the project code.
  • Full knowledge of how to design, program, implement, and maintain.
  • Identified and developed areas for revision in current projects.
  • Executed and implemented software tests.
  • Developed quality assurance procedures for software projects.
  • Coordinated efforts and cooperated with other developers, designers, system and business analysts, etc.
  • Broke down bigger tasks into smaller, easily

Senior Data Engineer

DPL
December 2021 - Present
  Key Responsibilities:
  • Built and programmed scalable data pipelines using AWS Glue and Lambda functions to process large volumes of data from various sources.
  • Utilized AWS Glue for seamless data integration, transformation, and loading into data lakes, databases, and warehouses.
  • Designed and implemented multiple databases using Amazon RDS to support the company's web and mobile applications and IoT data.
  • Implemented data processing pipelines using PySpark, a Python library for Apache Spark, to handle large-scale data transformations efficiently.
  • Utilized AWS Glue for ETL processes, allowing seamless data integration, transformation, and loading into data lakes and warehouses.
  • Designed and optimized data workflows to ensure scalability and performance in a cloud environment.
  • Collaborated with cross-functional teams to understand data requirements and implemented solutions for complex data processing tasks.
  • Improved database performance by implementing optimization techniques such as indexing, query optimization, and database tuning.
  • Hands-on experience with NoSQL databases using Amazon DynamoDB, utilizing features such as partitions, global secondary indexes, and streams.
  • Designed and maintained serverless ETL workflows using AWS Step Functions and AWS Lambda to transform data in the data lake into consumable formats for analytics and reporting.
  • Designed and implemented real-time data streaming solutions using Amazon Kinesis and AWS Lambda to process and analyze streaming data.
  • Designed interactive dashboards and reports in Microsoft Power BI, enhancing data-driven insights.
  • Implemented DAX measures and relationships for custom KPIs to track key business metrics.
  • Implemented security best practices for AWS services using AWS Identity and Access Management (IAM) and VPC security groups.
  • Experienced in using the Nexus Scrum framework to scale Agile Scrum practices across multiple teams, resulting in improved alignment, coordination, and delivery of complex software products.
  • Expertise in Jira for project management, with experience in creating and managing tasks, user stories, bugs, and epics, resulting in increased visibility and efficiency in software development workflows.
  Tools / Environment: AWS services (Glue, Lambda, Step Functions, Data Pipeline, IoT Core, IoT Analytics, RDS for MySQL, CodeCommit), Microsoft Power BI, PySpark, Pandas, NumPy, VPC, subnets, Nexus Scrum, Jira, etc.
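A minimal sketch of an AWS Lambda handler consuming records from an Amazon Kinesis stream, as in the real-time streaming solution described above; treating the payload as JSON is an assumption:

    import base64
    import json

    def lambda_handler(event, context):
        """Decode Kinesis records and hand the parsed payloads to downstream logic."""
        processed = 0
        for record in event["Records"]:
            # Kinesis delivers each payload base64-encoded
            payload = base64.b64decode(record["kinesis"]["data"])
            message = json.loads(payload)
            # Downstream processing (e.g., enrichment, forwarding) would go here
            print(message)
            processed += 1
        return {"processed": processed}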

Education

University of Arid Agriculture

BS(CS)
January 2008 - January 2012