DevOps/SRE Engineer at Playsdev
November 2022 - Present
An outsourcing company specializing in providing DevOps services.
Non-project Activities:
Conducted screening and internal interviews for internship positions, selected candidates, and mentored and helped train interns.
Project 0. SRE Engineer, Insurance Company (Production Operations & Incident Management).
Managed production CI/CD pipelines with automated deployment, rollback strategies, and secrets management using HashiCorp Vault. Maintained monitoring and logging infrastructure (Prometheus, Grafana, ELK stack).
Handled on-call duties and production incidents: performed rapid diagnosis, implemented fixes and rollbacks, conducted post-incident root cause analysis (RCA) from both technical and business perspectives, and documented findings in runbooks and playbooks.
Identified and resolved system bottlenecks and performance issues, optimized Java applications (JVM tuning, memory optimization), and provided technical support to development teams.
Maintained high availability and reliability of production systems, reducing incident frequency and improving system stability.
Project 1. DevOps Engineer, Mobile Travel Application.
Migrated test infrastructure from AWS to on-premises VMs and set up dev, test, and production environments with zero downtime.
Deployed and managed a Kubernetes cluster on VMs and configured access via a highly available external load balancer.
Optimized Dockerfiles for Python microservices (reduced image size by ~30%) and migrated from Docker Compose to Kubernetes using Helm charts.
Set up a comprehensive monitoring stack (Prometheus + VictoriaMetrics + Grafana), logging (Loki + Promtail + ELK), and alerting via Telegram.
Deployed self-hosted GitLab in Kubernetes and automated CI/CD pipelines in GitLab, reducing deployment time significantly through library and dependency caching.
Deployed and administered MinIO, Harbor, PostgreSQL, Redis, Kafka, Cassandra, Debezium CDC, Jaeger, Sentry, and the EFK stack.
Wrote comprehensive infrastructure documentation and runbooks for team knowledge sharing.
Project 2. DevOps Engineer, Fintech.
Developed and optimized Dockerfiles for Java and Node.js applications, implementing multi-stage builds and best practices.
Created and maintained CI/CD pipelines in Jenkins for building, testing, and deploying applications to Nexus and OpenShift.
Administered Bitbucket, Nexus, Confluence, and SonarQube, and used HashiCorp Vault for secrets management.
Configured, deployed, and debugged microservices in OpenShift, ensuring scalability and reliability.
Set up an Istio service mesh for internal traffic routing and migrated to newer service mesh versions with zero downtime.
Configured sidecar containers for monitoring (Prometheus exporters) and logging (Fluent Bit).
Conducted load testing with Apache JMeter, visualized results in Grafana, and provided performance recommendations.
Wrote internal documentation and conducted team training sessions on DevOps practices and tools.
Project 3. DevOps Engineer, Infrastructure Migration to AWS (Terraform & CI/CD).
Designed and implemented Terraform infrastructure modules with remote state management, environment separation (dev/staging/prod), and best practices for code organization and reusability.
Migrated infrastructure from on-premises to AWS, writing Terraform code to provision environments (EC2, VPC, EKS, S3, RDS) and test benches.
Wrote Ansible playbooks for automated server configuration and service deployment.
Set up and operated a managed Kubernetes cluster (EKS) with high availability.
Created and configured Helm charts for application deployment to Kubernetes.
Built CI/CD pipelines in GitLab with automated build, test, deployment, and rollback capabilities; migrated from Jenkins with improved reliability and faster deployment cycles.
Established observability: Prometheus and Grafana for metrics and alerting, and the EFK stack (Elasticsearch, Fluentd, Kibana) for log aggregation and analysis.
Conducted load testing, optimized application performance in Kubernetes, and initiated the transition to GitOps using Argo CD.
Configured WireGuard VPN for secure developer access and provided ongoing team support.
Project 4. DevOps Engineer, Group of Commercial Websites (Azure Migration & Production Operations).
Migrated production infrastructure from AWS to Azure with a zero-downtime strategy, managing Azure VMs, App Service, ACR, CDN, Storage Accounts, SQL Databases, and Virtual Networks.
Designed and implemented Terraform modules for reusable infrastructure components, managed Terraform state with remote backend, and applied infrastructure-as-code best practices (modularization, versioning, environment separation).
Built comprehensive CI/CD pipelines in Azure DevOps with automated build, testing, deployment, and rollback capabilities; implemented secrets management using Azure Key Vault and secure artifact handling.
Established a full observability stack: configured Azure Monitor for metrics and alerting, integrated Grafana dashboards for visualization, set up the ELK stack (Kibana) for log aggregation and analysis, and deployed Sentry for error tracking.
Handled production incidents: performed root cause analysis, implemented hotfixes and rollbacks, created runbooks and playbooks for common scenarios, reducing mean time to resolution (MTTR) by 40%.
Managed DNS, Cloudflare security (WAF rules, DDoS protection), optimized PHP servers for performance, and maintained 99.9% uptime SLA.
Conducted performance testing, reduced cloud costs by ~30%, and provided ongoing developer support and troubleshooting.