Shengxu Sun

Senior DevOps Architect | AWS SA-Pro | Kubernetes (CKA/CKS) | Multi-cloud | Observability (Elastic / Grafana / OTel) | 0→1 Infrastructure Builder | MBA

https://www.linkedin.com/in/shengxu-sun/

PROFESSIONAL SUMMARY

Lead-level DevOps Architect with end-to-end ownership of cloud and platform infrastructure.

Specialized in 0→1 platform architecture, Kubernetes at scale, multi-cloud strategy, full-stack observability, and CI/CD automation.

Consistently delivered highly scalable, secure, cost-efficient platforms, enabling rapid product delivery, system reliability, and business expansion.

AWS Certified Solutions Architect - Professional, CKA, CKS, MBA.

CORE COMPETENCIES

  • Cloud & Infrastructure: AWS, Azure, GCP, Aliyun; VPC, HA/DR, cost optimization.
  • Kubernetes & Containers: Kubernetes, containerd, Helm, AKS/EKS.
  • CI/CD & Automation: Jenkins, Harbor, GitLab, Argo CD, Terraform, Ansible.
  • Observability: Elastic Stack, Grafana Stack, OTel, APM, alerting.
  • Security & Reliability: RBAC, network policies, container security.
  • Networking & Edge: OpenResty, APISIX, Cloudflare, CloudFront.
  • Other Tools: Jira, Confluence, Microsoft 365 admin, HAProxy, Nginx, Bash, Python.

PROFESSIONAL EXPERIENCE

Senior DevOps Architect — Wizlah Ventures (Dec 2020 – Feb 2025)

Cloud Architecture & Cost Optimization

  • Architected and spearheaded the 0→1 migration to a multi-cloud (Azure & AWS) environment, utilizing Terraform (IaC) for automated provisioning; achieved >50% cost reduction through resource rightsizing, auto-scaling, and innovative resource utilization strategies.
  • Collaborated with cross-functional teams (PMs, Devs, Designers) to align cloud architecture with business objectives, facilitating seamless cloud adoption and 99.9% uptime for mission-critical systems.

Platform Engineering & CI/CD

  • Independently designed and optimized the company-wide CI/CD ecosystem (Jenkins, GitLab, Harbor, Nexus); standardized deployment pipelines for 15+ microservices, enabling fully automated, zero-touch delivery and significantly cutting release cycles.
  • Architected a standardized developer platform by integrating SSO (OpenLDAP to Entra ID) for unified authentication and Nacos for dynamic service discovery; drove process consistency across multi-cloud environments and significantly improved developer productivity through automated environment bootstrapping.
  • Developed Python/Bash scripts to extend IaC capabilities and automate routine maintenance tasks.

Full-Stack Observability & Reliability

  • Established a unified observability framework (Elastic Stack, Grafana, OpenTelemetry) for proactive incident detection; rapidly resolved complex technical issues to minimize downtime, reducing MTTR and enhancing system reliability.
  • Managed edge networking and security via OpenResty, APISIX, and CDNs; researched and adopted emerging cloud-native tools to keep performance aligned with evolving business needs.

Security, Governance & Leadership

  • Defined and enforced a comprehensive cloud security model across multi-cloud VPC/VNet architectures, incorporating IAM least-privilege access, K8s RBAC, and OPA policies.
  • Managed TLS/SSL certificate lifecycles and implemented data encryption (at rest and in transit) to align with security best practices and ensure data integrity.
  • Owned hybrid infrastructure (VMs, File Servers, Microsoft 365) and authored technical documentation; established operational standards to ensure system maintainability.
  • Provided technical mentorship to junior staff and led knowledge-sharing initiatives to improve team efficiency and process standardization.

Senior DevOps (Contract) — Infinite Computer Solutions (Aug 2025 – Nov 2025)

  • Implemented and maintained Infrastructure as Code (IaC) using Terraform for automated cloud provisioning and Ansible playbooks for consistent configuration management and application deployment.
  • Developed and managed robust CI/CD pipelines using Jenkins and GitLab workflows, ensuring secure and efficient deployment processes across development and production environments.
  • Collaborated with Data Science teams to operationalize machine learning models, building event-driven automation to streamline workflows and improve operational efficiency.
  • Monitored system performance and resolved complex technical issues to ensure high availability (HA) and scalability of production-grade cloud workloads.
  • Enforced security best practices and documented architectural decisions, ensuring process standardization and system maintainability in alignment with industry standards.

NOC Engineer — Orion Consultancy (Mar 2018 – Dec 2020)

  • Managed and maintained multi-cloud infrastructure across AWS, GCP, and Aliyun, ensuring high availability and performance across diverse regional environments.
  • Built and administered proactive monitoring solutions using Prometheus and Grafana to track infrastructure health and network latency, facilitating rapid incident response and performance tuning.
  • Leveraged Ansible for automated configuration management and streamlined infrastructure deployment, ensuring environment consistency and reducing manual intervention.
  • Optimized traffic management and system resilience by configuring and maintaining high-availability load balancers and reverse proxies (HAProxy, Nginx, Squid).
  • Enhanced infrastructure security by implementing DDoS protection via Fail2Ban with a centralized database for coordinated threat mitigation.
  • Automated SSL/TLS certificate lifecycle management using Let’s Encrypt, ensuring continuous encryption and reducing operational overhead.

Service Delivery Engineer — AsiaCloud Solutions (2014–2016, 2017–2018)

  • Delivered end-to-end managed services and deployed mission-critical infrastructure, including Windows Servers, NAS, and network switches, to optimize system performance for enterprise clients.
  • Administered regional IT infrastructure across multiple APAC locations, resolving complex connectivity issues and ensuring consistent service delivery across distributed regional offices.
  • Implemented proactive security monitoring and system hardening measures while collaborating with cross-functional teams to standardize operational procedures.
  • Developed custom applications to automate internal workflows, significantly enhancing operational efficiency and improving data management for client teams.
  • Achievement: Led a comprehensive office relocation project, independently designing and executing the migration of servers and network infrastructure to ensure a timely setup with minimal business disruption.

IT Support & SysAdmin — Ley Choon (2013–2014)

  • Daily support, server/backup management, ERP DB maintenance.

Project Supervisor — Foxconn (2011–2013)

  • Led 100–120 staff; handled SOP/KPI; launched new Nintendo RMA project.
  • Achievement: Directed a team that successfully launched a new project, met the customer’s requirements, and generated new profit for the company.

Product Engineer — Foxconn (2009–2011)

  • Fault analysis, customer handling, test automation.
  • Achievement: Reduced manpower by 40%.

EDUCATION

  • Master of Business Administration (MBA) — Jinan University (2022–2024)
  • Bachelor of Engineering, Computer Science — SDUT (2005–2009)