Waheed Ahmad
Development
Punjab, Pakistan
Skills
Data Engineering
About
Waheed Ahmad's skills align with Database Specialists (Information and Communication Technology). He also has skills associated with System Developers and Analysts (Information and Communication Technology). He has 6 years of work experience.
Work Experience
Addo AI
June 2022 - Present
- Project: One of the largest airlines in Asia
  - Built a scalable serverless Data Warehouse (DW) platform on top of GCP BigQuery and Talend.
  - Communicated with multiple business stakeholders to understand business requirements and map them onto Source Matrix (SMX) documents.
  - Deployed ETL jobs on Talend Cloud; implemented CDC pipelines and compaction jobs.
  - Managed metadata using BigQuery; logged and monitored ETL pipelines using Talend's built-in tools.
  - Technology Stack: GCP BigQuery, Google Cloud Storage, Compute Instances, Talend Cloud, Talend Data Integration
- Project: Crisis 24 - United Kingdom
  - Developed a templatized ETL framework for fetching tables from SQL Server.
  - Implemented a migration pipeline leveraging Infrastructure as Code (boto3) to automate the creation of required resources and generate Glue jobs from S3-based scripts.
  - Developed ETL templates using AppFlow for efficient data transfer from S3 to Salesforce.
  - Technology Stack: Glue, Crawlers, Data Catalog, S3, Athena, Salesforce
- Project: Large Provincial Government Organization in Pakistan
  - Designed and implemented a metadata-driven data ingestion framework that ingested 23+ TB of data from 46 different sources into Hive and Vertica.
  - Designed and implemented a CDC pipeline using Spark.
  - Implemented job orchestration through Airflow.
  - Developed the serving layer and data lake in Hive.
  - Deployed an alerting and monitoring system for the ETL jobs.
  - Technology Stack: Hadoop, Spark, Hive, ElasticSearch, Logstash, Kibana, Great Expectations, Airflow
- Project: Large European telco in Pakistan
  - Designed and implemented a metadata-driven data ingestion framework that ingested 100+ TB of data from 15 different sources into Hive and Vertica.
  - Implemented job orchestration through Talend and SLJM.
  - Developed the serving layer and data lake in Hive.
  - Translated and migrated Teradata jobs into Vertica SQL.
  - Monitored existing metrics, analyzed data, and stood up big-data analytical tools.
  - Technology Stack: Hadoop, Spark, Hive, Vertica, SLJM
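The metadata-driven ingestion frameworks mentioned above typically follow one pattern: source definitions live in a metadata table, and a generic driver renders each entry into a concrete ingestion job. A minimal sketch in plain Python, where every source name, field, and script name is a hypothetical illustration rather than the actual framework:

```python
# Minimal sketch of a metadata-driven ingestion driver.
# Each metadata entry describes one source; the driver turns it into a
# concrete job spec (here, a spark-submit command string).
# All source names, fields, and scripts below are hypothetical.

SOURCE_METADATA = [
    {"source": "crm_orders", "target": "staging.crm_orders",
     "load": "cdc", "watermark_col": "updated_at"},
    {"source": "web_logs", "target": "staging.web_logs",
     "load": "full", "watermark_col": None},
]

def build_job(meta: dict) -> str:
    """Render one metadata row into a spark-submit invocation."""
    args = [
        "spark-submit", "ingest_generic.py",
        "--source", meta["source"],
        "--target", meta["target"],
        "--mode", meta["load"],
    ]
    if meta["load"] == "cdc":
        # Incremental (CDC) loads carry the column used to detect changed rows.
        args += ["--watermark-col", meta["watermark_col"]]
    return " ".join(args)

for meta in SOURCE_METADATA:
    print(build_job(meta))
```

Adding a new source then means adding one metadata row, not writing a new pipeline, which is how a single framework can scale to dozens of sources.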
ETL Developer / Data Engineer
Government of Pakistan
September 2018 - May 2022
- Developed ETL pipelines using Spark (primarily Spark SQL Datasets) to perform transformations on 28+ ingested data streams.
- Extracted data from a variety of sources, including Oracle, Hive, and HDFS, using OJDBC connectors, SFTP, and Hadoop scripts; transformed and unified input data according to a common table schema.
- Wrote spark-submit jobs to load the transformed data into output tables, and bash scripts to deploy the pipelines to production.
- Created custom logic in Spark to handle highly denormalized database schemas while ensuring type safety; created nested case classes and developed flattening logic to convert nested Datasets into flattened DataFrames with 254+ fields.
- Designed table schemas for data sinks to retrieve data in the most time-efficient way; chose partitioning and bucketing columns based on the required table queries.
- Performed unit testing, integration testing, exception handling, and logging for all ETLs and transformation functionality added.
- Technology Stack: Hadoop, Spark, Hive, Vertica, SLJM
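The flattening step described above (nested case classes converted into wide, flat DataFrames) can be illustrated outside Spark with a small recursive routine over nested records. This is a sketch of the general idea, not the actual production code, and all field names are hypothetical:

```python
# Sketch of flattening nested records into flat column names: the same
# idea as converting nested Spark Datasets into wide DataFrames.
# Field names are hypothetical.

def flatten(record: dict, prefix: str = "") -> dict:
    """Recursively flatten a nested dict, joining key paths with '_'."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Descend into nested structures, carrying the path as a prefix.
            flat.update(flatten(value, prefix=name + "_"))
        else:
            flat[name] = value
    return flat

row = {"id": 1, "customer": {"name": "A", "address": {"city": "Lahore"}}}
print(flatten(row))
# {'id': 1, 'customer_name': 'A', 'customer_address_city': 'Lahore'}
```

Because each nested path becomes one deterministic column name, the output schema stays stable no matter how deeply the source records nest, which is what makes a 254-field flat table manageable.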