Suraj Alimineti
Development
Virginia, United States
Skills
Python
About
Suraj Alimineti's skills align with System Developers and Analysts (Information and Communication Technology). Suraj also has skills associated with Database Specialists (Information and Communication Technology). Suraj Alimineti has 6 years of work experience.
Work Experience
Software Development Engineer
Amazon Web Services
January 2022 - December 2023
- Led design and implementation of data pipelines using AWS Glue, Apache Spark, and AWS Lambda, ensuring efficient ETL of large datasets.
- Developed and maintained CI/CD pipelines using AWS CodePipeline to automate service updates.
- Engineered the service data plane to ingest resource data from several AWS services for both in-region and out-of-region use cases.
- Utilized Zeppelin, Jupyter notebooks, and Spark-Shell to develop, test, and analyze Spark jobs before scheduling customized Spark jobs.
- Developed custom ETL solutions, including batch processing and real-time data ingestion pipelines, using PySpark and shell scripting to move data in and out of Hadoop.
- Created and launched AWS Resource Explorer, a platform that enabled AWS customers to search and discover resources among trillions of AWS assets with minimal latency.
- Developed and deployed machine learning models using Python and AWS SageMaker, resulting in a 15% increase in prediction accuracy and a 20% reduction in model training time.
- Streamlined the service's performance monitoring procedures by writing and deploying automated performance testing scripts in Python.
- Conceptualized and coded Hydra integration tests in Python to exercise ingestion and sourcing of resources from data sources.
- Facilitated automation of CI/CD workflows as part of the data migration process using Concord, Looper, and Docker.
- Utilized Spark Streaming and Kinesis to fetch live stream data, enabling real-time analytics and decision-making.
Graduate Teaching Assistant
George Mason University
January 2020 - December 2021
- Provided weekly guidance and assessment for a class of 200 students as Graduate Teaching Assistant, ensuring learner participation and assessing student performance.
- Assisted students in carrying out experiments using Capstone software and in analyzing and interpreting the data using machine learning (ML).
- Gave hands-on guidance to students in the School of Business at GMU, helping them understand R, SQL, and Python and supporting learning in financial analysis and relevant resources for the business course FNAN-430.
- Instructed the Machine Learning laboratory, facilitating practical ML experience on real data.
Research Assistant as Data Analyst
George Mason University
January 2020 - December 2021
- Performed statistical analysis on diverse data sets, producing insightful reports, resolving data integrity issues, and delivering strategic recommendations for future project deliverables.
- Utilized advanced software, including SAS, Python, Tableau, and Excel, for data validation, integration, problem-solving, and analysis.
- Generated dashboards using Tableau and Power BI for executive reporting, improving trend analysis efficiency by 20%.
Data Engineer
Apna Cabs (Dhanush Travelomedia Pvt Ltd)
January 2018 - December 2019
- Owned all procedures related to data cleaning, integration, and analysis.
- Developed PySpark procedures, functions, and packages to streamline the loading of data, enhancing overall data processing efficiency.
- Developed Spark applications using PySpark and Spark-SQL for data extraction, transformation, and aggregation from various file formats, uncovering valuable insights into customer usage patterns.
- Designed interactive dashboards for visualizing insights about user data and website performance using Tableau.
- Implemented data processing and transformation techniques using Azure Databricks, PySpark, and Spark SQL queries on a large-scale customer transactions dataset comprising millions of records.
- Conducted data curation using Hive, optimizing data processing capabilities.
- Improved trend analysis efficiency by 20% through creation of insightful Tableau dashboards.
- Enhanced data efficiency by 15% by generating interactive Google Analytics dashboards for in-depth marketing pipeline validations.
- Performed predictive modeling and exploratory analysis on over 100K customer databases using R, Tableau, and SQL.
Data Engineer
TriWest Healthcare Alliance
January 2023 - Present
- Build and implement data ingestion processes using Azure Data Factory, PySpark, and Scala.
- Implemented and optimized data pipelines using Apache Kafka, ensuring real-time data streaming and processing capabilities.
- Utilized Hive and Hadoop technologies to perform large-scale data analysis, identifying trends and patterns for actionable insights.
- Proficient in PySpark and Spark SQL for handling big data transformation and processing in Azure Databricks.
- Load data from multiple sources into SQL databases for analysis and reporting purposes.
- Developed PySpark jobs to run data transformations and actions, processing over 2 TB of data and loading it into SQL Server.
- Orchestrated design and implementation of data pipelines using Python, Spark, and Scala to extract, transform, and load (ETL) data from various sources into a data warehouse, leading to a 30% improvement in data processing efficiency.
- Led migration of legacy reporting systems to QlikView, boosting report generation speed by 30% and reducing maintenance costs by 20%.