Sai Muppidi
Development
NC, United States
Skills
DevOps
About
SAI MUPPIDI's skills align with Programmers (Information and Communication Technology). SAI also has skills associated with Consultants and Specialists (Information and Communication Technology). SAI MUPPIDI has 6 years of work experience.
Work Experience
Site Reliability Engineer
CodaMetrix
March 2023 - Present
- Designed and deployed multiple applications across most of the AWS stack (EC2, Route53, S3, RDS, DynamoDB, SNS, SQS, EBS, Lambda, Redshift), focusing on high availability, fault tolerance, and auto-scaling using CloudFormation.
- Migrated applications to AWS Cloud and designed, built, and deployed them on the AWS stack (including EC2, Route53, S3, RDS, SNS, and IAM), focusing on high availability and fault tolerance.
- Used IAM to create and manage AWS users and groups, with permissions to allow and deny their access to AWS resources.
- Configured access for inbound and outbound traffic to RDS DB services, DynamoDB tables, and EBS volumes, and set alarms for notifications or automated actions.
- Worked on AWS Route53 to register domain names, route internet traffic for domains, and monitor health checks of resources.
- Supported AWS cloud instances and configured Elastic IPs and storage.
- Deployed configuration management and provisioning to AWS using Terraform; automated the deployment of EBS volumes onto EC2 instances and automated the complete deployment environment on AWS.
- Used AWS Security Hub to aggregate and prioritize security findings from various sources, including AWS GuardDuty, AWS Config, and third-party solutions.
- Implemented AWS Key Management Service (KMS) to create and control the encryption keys used to encrypt data.
- Built end-to-end CI/CD pipelines in GitHub Actions to retrieve code, compile applications, run tests, and push build artifacts to an artifact repository.
- Built and maintained CI/CD pipelines and integrated CI/CD tools (Azure DevOps Pipelines and GitHub Actions) with containerization tools (Docker, Kubernetes) for the applications.
- Managed a CI/CD pipeline using GitHub Actions to automate the build, test, and deployment of applications to a production environment.
- This pipeline triggered on push to the main branch and included steps for building the application; running unit, integration, and end-to-end tests; building the Docker image; and deploying the application to a Kubernetes cluster (EKS) using Helm charts.
- Deployed microservices on AWS VPC and an AWS EKS cluster for scalable and efficient deployment.
- Wrote Terraform templates focused on maintaining highly available, fault-tolerant, and auto-scalable resources.
- Hands-on experience using AWS serverless services such as Lambda as a back-end service and as middleware for log processing.
- Created Terraform templates for provisioning Virtual Networks, Auto Scaling, and App Gateway, and used the terraform graph command to visualize the execution plan.
- Worked with HashiCorp Vault to secure credentials, tokens, and API keys.
- Maintained Terraform state files to preserve infrastructure state and prevent conflicts.
- Designed and implemented scalable, available, and maintainable state management strategies for Terraform, allowing efficient management and scaling of infrastructure.
- Implemented Terraform deployments using Terragrunt to create reusable infrastructure templates and modules, and corrected existing Terraform modules that had version conflicts, which provided better control and added missing capabilities.
- Wrote and tested Sentinel policies for Terraform Enterprise.
- Performed branching, merging, and release activities in Git.
- Used GitHub Enterprise as version control to store source code and used Git for branching and merging operations.
- Used Gradle and Maven as build tools for testing the application manually and running JUnit test suites and Selenium test cases; wrote Gradle and Maven scripts to automate the build process.
- Created local and virtual repositories in Artifactory (Nexus3) for project and release builds, and handled repository management in Maven to share snapshots and releases of internal projects using Artifactory.
- Created alarms and trigger points in CloudWatch based on thresholds and monitored server performance, CPU utilization, and disk usage.
- Wrote shell scripts to automate the creation of cloud resources (resource groups, web applications, AWS storage and tables, firewall rules) and used Python scripts to automate day-to-day administrative tasks.
- Created backup procedures using shell scripts and wrote PowerShell scripts for archiving and moving older log files to AWS storage.
- Managed containers using Docker by writing Dockerfiles and setting up automated builds on Docker Hub.
- Worked with Docker container snapshots, including attaching to a running container, removing images, managing directory structures, and managing containers.
- Set up the Docker daemon, Docker client, Docker Hub, and Docker registries, and handled multiple images by storing them in containers to deploy.
- Developed procedures to unify, streamline, and automate applications.
- Deployed Docker-containerized applications onto a Kubernetes cluster managed by AWS Elastic Kubernetes Service (EKS).
- Managed Kubernetes deployments by creating objects that ensure high availability and scalability, using the Horizontal Pod Autoscaler and resource management techniques.
- Implemented manifests and Helm charts to deploy microservice applications into Kubernetes clusters.
- Deployed AWS Kubernetes clusters and created numerous Kubernetes object files in YAML.
- These object files included Pods, Deployments, ReplicaSets, DaemonSets, StatefulSets, Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), ServiceAccounts, load balancers, Ingress Controller (NGINX), and resource and health check probes.
- Designed Helm charts to deploy applications on AWS EKS clusters, including writing custom charts for internal applications and adding manifest files to the chart templates to apply custom configurations.
- Used Istio (service mesh) to enable dynamic service discovery and implement traffic management such as traffic shadowing and splitting, ensuring reliable communication between services.
- Defined ingress and egress routing rules connecting external HTTP and HTTPS requests with internal services and individual pods using the Istio Ingress Gateway, and configured a default backend as part of the Ingress controller.
- Wrote Ansible playbooks for infrastructure management tasks in Linux environments and deployed code to development, test, and production environments, using Ansible and Chef to automate deployment processes.
- Automated infrastructure activities such as continuous deployment, application server setup, and stack monitoring using Ansible playbooks, creating Ansible roles from Ansible Galaxy collections.
- Monitored Kubernetes clusters by integrating them with Prometheus (PromQL) via Helm charts for metrics collection and Grafana dashboards for viewing metrics; configured Prometheus exporters to collect and expose metrics from different types of applications and services.
- Integrated Loki as a data source in Grafana for logs, enabling efficient log collection, storage, and visualization, which made it quicker to troubleshoot problems and identify trends in our systems.
- Integrated ArgoCD notifications and Grafana alerts with Slack, centralizing critical alerts and notifications in one place and enabling real-time visibility and response. This helped the team identify and resolve problems quickly, reducing downtime and improving overall system reliability.
- Used Jira as the issue tracking tool to plan, track, and manage application development.
- Integrated Jira with Git to track code changes and progress.
Environment: AWS, Terraform, Docker, Kubernetes, Helm, Istio, Git, GitHub Actions, ArgoCD, Maven, Nexus3, Ansible, Shell Scripting, Python, Snyk, Prometheus, Grafana, Jira, ClickUp, Slack, YAML, JSON.
AWS Cloud/DevOps Engineer
Allstate | Northfield Township, IL
August 2019 - December 2021
- Created and configured AWS resources such as VPCs, subnets, security groups, S3 buckets, and EC2 instances, using both AMIs and custom images, and set up launch templates customized for specific applications.
- Created IAM users to manage access to AWS resources; each user was assigned a unique set of security credentials providing access to services such as S3, EC2, and RDS, allowing effective control of access to AWS resources.
- Managed user privileges and groups, created custom roles and policies for user and resource management, and constructed system performance metrics using AWS CloudWatch and AWS CloudTrail.
- Designed AWS CloudFormation templates to create custom-sized VPCs, subnets, and NAT to ensure successful deployment of web applications and database templates, and leveraged AWS Migrate and VM Import/Export to migrate apps from VMware to AWS.
- Provisioned AWS App Mesh for network traffic controls on EC2 instances, ECS, EKS, and AWS Fargate, and integrated CloudWatch monitoring to automatically export data to Splunk and the ELK stack.
- Built log automation on AWS using Lambda with CloudWatch Logs, S3 access logs, Classic ELB access logs, Application ELB access logs, CloudFront access logs, and Redshift logs, setting up triggers manually or automatically so that Datadog can manage new logs as they are observed.
- Used AWS Route53 to manage DNS zones and assign public DNS names to Elastic Load Balancer IPs, connecting user requests to AWS infrastructure backed by Memcached and AWS ElastiCache, and configured ALB and NLB for routing traffic between zones.
- Involved in the design and implementation of a continuous integration (CI) and continuous delivery (CD) pipeline using Jenkins and CloudFormation.
- Modeled and automated the end-to-end CI/CD pipeline, which involved creating a continuous integration server with tools such as Jenkins, Maven, Subversion, Git, Ant, and SonarQube.
- Implemented cluster services using Docker and Amazon Elastic Kubernetes Service (EKS) to manage local deployments in Kubernetes by building a self-hosted Kubernetes cluster using a Jenkins CI/CD pipeline.
- Deployed self-managed Kubernetes clusters on Amazon EC2 instances using kOps.
- Leveraged deployments in Kubernetes by creating custom Helm charts and deploying containerized microservice applications integrated with AWS services.
- Used the Istio service mesh with the Kubernetes stack for microservice deployment, which provides automatic metrics, logs, and traces for all traffic within a Kubernetes cluster, including cluster ingress and egress.
- Managed Kubernetes charts using Helm; created reproducible builds of Kubernetes applications, managed Kubernetes manifest files, and managed releases of Helm packages.
- Created Docker images and maintained a Docker registry for effective management of the application lifecycle; managed Docker container snapshots on artifact repositories such as JFrog, Nexus, and ECR, and managed directory structures on AWS.
- Designed Terraform templates to create custom-sized VPCs, subnets, NAT, and custom EC2 instances to ensure successful deployment of web applications, database templates, and migration from traditional to cloud environments.
- Set up infrastructure and configuration IaC to spin up HashiCorp Vault for external secret management and Kube auth roles mapping in the EKS cluster using Terraform.
- Built Datadog monitors using Terraform to alert on CPU and memory utilization of services running on the EKS cluster.
- Wrote Python, Groovy, and shell scripts for CI/CD using Jenkins and Git in GitHub, and to improve application security.
- Worked with the Maven build tool and deployed artifacts to JFrog.
- Used Maven to build deployable artifacts (WAR and JAR) from source code.
- Worked with Maven in Java environments, authoring pom.xml files for Java projects and managing Maven repositories.
- Wrote Ansible playbooks that serve as the entry point for Ansible provisioning, with automation defined through tasks in YAML, to set up a continuous delivery pipeline, and ran Ansible scripts to provision dev servers.
- Developed Ansible playbooks using Python SSH as a wrapper to manage AWS node configurations and tested them on AWS instances using Python.
- Implemented practices such as centralized logging and server monitoring using Nagios and Ansible.
- Used Ansible for configuration management of hosted instances within AWS, and created playbooks for automation tasks such as copying files, changing permissions, configuring settings, and creating specific folders.
- Managed the Maven repository using Nexus to automate the build process and share snapshots and releases of internal projects.
- Integrated Maven with Bitbucket to deploy project-related tags and troubleshot issues raised by developers while merging or rebasing their source code.
- Developed custom scripts using Python, shell, and cron jobs to automate the deployment process, task scheduling, and system backups.
- Used Jira points to create a detailed project plan and to identify potential risks and mitigation strategies.
Environment: AWS, Istio, Lambda, CloudFormation, Fargate, TCP/IP network protocols, Docker, Kubernetes, Ansible, Nagios, Maven, Ant, Jenkins, DynamoDB, Git, Jira, Linux.
DevOps Engineer
Cybage Software Pvt Ltd | Hyderabad, India
September 2018 - July 2019
- Worked with Azure Virtual Networks, including subnets, DHCP address blocks, Azure network settings, DNS settings, security policies, and routing; deployed Azure IaaS virtual machines and cloud services (PaaS role instances) into secure virtual networks and subnets using reusable Terraform modules.
- Managed Azure compute services, Auto Scaling, Elastic Load Balancing, horizontal and vertical scaling, VM Scale Sets, Application Gateway, Network Security Groups, web roles, worker roles, and scaling/management.
- Created cloud modules for interacting with Azure services, providing tools to create and orchestrate infrastructure on Azure, and automated cloud-native applications in Azure using Azure microservices.
- Implemented Azure Site Recovery (ASR) and Azure Backup: installed and configured the Azure Backup agent and virtual machine backup, enabled Azure virtual machine backup from the vault, and configured ASR.
- Migrated on-premises instances to Azure Cloud using an ARM subscription along with Azure Site Recovery.
- Built and installed servers through ARM templates and worked on Azure networking, including VPN and ExpressRoute, as well as Azure DNS and load balancers.
- Implemented and managed Azure Key Vault to securely store and manage sensitive data, including secrets, keys, and certificates, resulting in enhanced application security, streamlined credential management, and a notable decrease in security incidents related to unauthorized access and data breaches.
- Integrated Azure with Docker Enterprise Edition to generate Azure VM Scale Sets for autoscaling, Azure load balancing, and Azure storage.
- Managed Azure Container Registry to store private Docker images, which are deployed through Azure Pipelines in Dev, Test, Stage, and Production environments.
- Used Azure Monitor to collect metrics and logs and configured it to track performance and maintain security; also used Splunk to collect metrics, queries, and logs.
- Created an Azure CI pipeline for continuous integration tests using Azure DevOps, retrieved the pipeline's secrets from Azure Key Vault, ran the pipeline to build the artifacts, and then created an Azure CD pipeline to deploy the built artifacts and release to production.
- Worked with Terraform templates to automate Azure IaaS virtual machines using Terraform modules and deployed virtual machine scale sets in the production environment.
- Administered and maintained Jenkins and Jenkins slaves on Windows and Linux (Debian/Ubuntu); created Jenkins slaves and set up jobs on the master to run on slaves.
- Worked on Jenkins plugins for integrating with tools such as Slack, SonarQube, and Gradle for better collaboration and monitoring, and integrated Jenkins with Bitbucket and Jira for end-to-end automation of the software development life cycle.
- Designed and implemented complex workflows for building, testing, and deploying applications using Jenkinsfile and DSL syntax, and maintained Groovy scripts for Jenkins pipeline automation, resulting in streamlined CI/CD processes and increased productivity.
- Prepared a proof of concept (PoC) for implementing Kubernetes in the environment by containerizing test microservices, and used Kubernetes to orchestrate the deployment, scaling, and management of containers.
- Used Kubernetes to provide a platform for automating deployment, scaling, and operations of application containers across clusters of hosts.
- Developed and implemented custom Prometheus exporters to monitor Kubernetes resources and application metrics, providing deep insight into system performance and behavior; configured Prometheus alerting rules to proactively detect and resolve issues, ensuring high availability of Kubernetes workloads and applications, with alerts integrated into Slack channels.
- Worked with Azure AD, creating new AD users and groups and defining roles, policies, and identity providers; created alarms and trigger points in Azure Monitor based on thresholds and monitored server performance, CPU utilization, and disk usage.
- Leveraged AppDynamics Analytics to gain insight into application behavior and trends over time, enabling proactive optimization and resource allocation.
- Used Bitbucket to onboard new team members and give them access to the project codebase, and worked on a feature-branch workflow in Bitbucket for streamlined development and code review.
- Implemented policies based on the Microsoft Defender for Cloud recommendations provided by Azure Security Center to establish a secure environment and maintain compliance with industry standards and regulations, ensuring that the infrastructure is protected from potential threats.
- Used Chef to automate server provisioning and configuration management, reducing manual setup time and ensuring consistency.
- Used Datadog monitoring features to proactively detect and troubleshoot issues in container-based deployments using Docker and Kubernetes, and analyzed metrics and logs in Datadog to resolve critical incidents quickly and keep services highly available.
- Automated incident management using ServiceNow and used Splunk for log analysis by installing Splunk heavy forwarders inside the network; used the Splunk SaaS solution to create custom dashboards.
- Created PowerShell scripts to automate the creation of Azure Cloud systems, including end-to-end infrastructure, virtual machines, storage, and firewall rules.
Environment: Azure, Docker, Kubernetes, Jenkins, Azure Pipelines, Datadog, Splunk, Bitbucket, Git, Prometheus, Chef, AppDynamics, Jira, ServiceNow, PowerShell scripts, Windows, Linux.
System Administrator
CA Technologies
June 2017 - August 2018
- Involved in configuration, installation, implementation, maintenance, and support of Linux servers (Ubuntu, RHEL, CentOS, Fedora).
- Installed operating systems on multiple machines using Kickstart and performed server updates, patching, upgrades, and package installations using RPM and YUM.
- Installed SSH and configured key-based authentication.
- Worked with storage volume managers such as LVM, SVM, and Veritas Volume Manager to create disk groups and volume groups, and used RAID technology for backup and recovery.
- Administered Linux servers for functions including Apache/Tomcat servers, mail servers, and MySQL databases in both development and production.
- Configured and maintained NFS, LDAP, HTTP, and DNS on Linux servers.
- Monitored and managed the performance of ESX servers and virtual machines.
- Performed administrative and management tasks using shell scripts written in Bash, Python, and C shell, with crontab in Linux to automate the tasks.
- Used monitoring tools such as Nagios for general disk space usage and conducted system performance monitoring and tuning.
- Monitored system activities such as CPU, memory, and disk space to avoid performance issues.
- Developed bash/cron shell scripts to automate resource and job monitoring and alerting, and deployed the scripts to be executed as checks by Nagios in both Windows and Linux environments.
- Hands-on experience in installation and VF configuration of SSH to enable secure access to servers, and in antivirus, deployment of VMs, snapshots, and templates.
- Responsible for keeping servers up and running and for providing direct user support for technical issues related to Linux and Windows systems.
- Maintained network and Windows virtual machines using PowerShell scripts.