Sameer Pant

Development
Bagmati, Nepal

Skills

Data Engineering

About

Sameer Pant's skills align with System Developers and Analysts (Information and Communication Technology). Sameer also has skills associated with Database Specialists (Information and Communication Technology). Sameer Pant appears to be a low-to-mid level candidate, with 2 years of experience.

Work Experience

Data Engineer

FlyingSpark Infotech
September 2022 - January 2024
  • Designed and set up an enterprise data lake to support clients' business cases, including analytics, reporting, and processing of high-volume, rapidly changing data.
  • Managed large-scale data lakes on Delta Lake, implementing versioned data storage, ACID transactions, and schema evolution to improve data reliability and streamline data processing workflows.
  • Designed and implemented scalable data processing pipelines on Databricks, using Spark clusters for data analytics, which improved performance and resource utilization.
  • Developed and deployed machine learning models using MLlib and MLflow, contributing to data-driven decision-making across the organization.
  • Built a data warehouse and production solutions using Apache Spark, AWS Glue, Amazon SageMaker, and Amazon Redshift, applying a range of analytical approaches, algorithms, and machine learning models.
  • Integrated and optimized real-time streaming workflows using Amazon Kinesis, implementing scalable and resilient data pipelines for capturing, processing, and analyzing high-frequency data. This improved the organization's ability to act on real-time insights in strategic decision-making.
  • Modified the Apache Spark 3.0, 3.1, and 3.3 and Hadoop source code, replacing shell scripts with language-native code in core, Spark SQL, MLlib, Structured Streaming, and cluster management, and integrating build additions to improve benchmark results for a custom Spark build.
  • Managed Airflow orchestration and data infrastructure on cloud platforms.
  • Replicated an existing Informatica/Cloudera-based data lake system in AWS, with real-time analytics driven by machine learning approaches throughout the data flows, and led a team in developing, maintaining, and deploying ETL pipelines for the cloud infrastructure.
  • Analyzed advanced SQL queries, data flows, and control flows to optimize them and support efficient machine learning solutions.
  Technology Involved: Apache Spark, Apache Airflow, AWS Glue, AWS Redshift, AWS S3, AWS MWAA, AWS Lambda, PySpark, etc.
  Languages Involved: Scala, Java, and Python

Associate Big Data Developer

Ekbana Solutions, Japan
November 2021 - August 2022
  • Developed a customized Apache Kafka build in Scala and Python to meet specific business needs, applying algorithmic and heuristic approaches such as load balancing.
  • Managed Apache Cassandra clusters extensively, handling overall cluster operations, data quality, data versioning, and related issues.
  • Managed and maintained a data lake, including migrations and database modifications.
  • Used a range of data engineering technologies and tools, such as Apache Spark, Airflow, Cassandra, Kubernetes, and Grafana.
  • Focused on maintaining data flows for machine learning teams.
  • Worked on a confidential cryptography-based project, working with C++ and Python for testing components.
  • Proficient in shell scripting.
  Technology Involved: Apache Spark, Apache Kafka, Apache Cassandra, Prometheus, Grafana, Amazon Web Services, DigitalOcean, Kubernetes, etc.
  Languages Involved: Java, Scala, and Python

Freelance Data Scientist

  • Freelanced intermittently, delivering insights through advanced analytics and predictive modeling.
  • Collaborated with diverse clients, translating business objectives into data-driven solutions.
  • Proficient in Python and tools such as TensorFlow, PyTorch, scikit-learn, Spark MLlib, XGBoost, and Tableau.

Education

Pokhara University

Bachelor's in Software Engineering
January 2017 - January 2022

ISc

January 2015 - January 2017