Shengxu Sun
Senior DevOps Architect | AWS SA-Pro | Kubernetes (CKA/CKS) | Multi-cloud | Observability (Elastic / Grafana / OTel) | 0→1 Infrastructure Builder | MBA
https://www.linkedin.com/in/shengxu-sun/
PROFESSIONAL SUMMARY
Lead-level DevOps Architect with end-to-end ownership of cloud and platform infrastructure.
Specialized in 0→1 platform architecture, Kubernetes at scale, multi-cloud strategy, full-stack observability, and CI/CD automation.
Consistently delivered highly scalable, secure, cost-efficient platforms, enabling rapid product delivery, system reliability, and business expansion.
AWS Certified Solutions Architect - Professional, CKA, CKS, MBA.
CORE COMPETENCIES
- Cloud & Infrastructure: AWS, Azure, GCP, Aliyun; VPC, HA/DR, cost optimization.
- Kubernetes & Containers: Kubernetes, containerd, Helm, AKS/EKS.
- CI/CD & Automation: Jenkins, Harbor, GitLab, Argo CD, Terraform, Ansible.
- Observability: Elastic Stack, Grafana Stack, OTel, APM, alerting.
- Security & Reliability: RBAC, network policies, container security.
- Networking & Edge: OpenResty, APISIX, Cloudflare, CloudFront.
- Other Tools: Jira, Confluence, Microsoft 365 admin, HAProxy, Nginx, Bash, Python.
PROFESSIONAL EXPERIENCE
Senior DevOps Architect — Wizlah Ventures (Dec 2020 – Feb 2025)
Cloud Architecture & Cost Optimization
- Architected and led the 0→1 migration to a multi-cloud (Azure and AWS) environment, using Terraform (IaC) for automated provisioning; cut costs by more than 50% through resource rightsizing, auto-scaling, and improved resource utilization.
- Collaborated with cross-functional teams (PMs, Devs, Designers) to align cloud architecture with business objectives, facilitating seamless cloud adoption and 99.9% uptime for mission-critical systems.
Platform Engineering & CI/CD
- Independently designed and optimized the company-wide CI/CD ecosystem (Jenkins, GitLab, Harbor, Nexus); standardized deployment pipelines for 15+ microservices, enabling fully automated, zero-touch delivery and significantly cutting release cycles.
- Architected a standardized developer platform by integrating SSO (OpenLDAP to Entra ID) for unified authentication and Nacos for dynamic service discovery; drove process consistency across multi-cloud environments and significantly improved developer productivity through automated environment bootstrapping.
- Developed Python/Bash scripts to extend IaC capabilities and automate routine maintenance tasks.
Full-Stack Observability & Reliability
- Established a unified observability framework (Elastic Stack, Grafana, OpenTelemetry) for proactive incident detection; resolved complex production issues quickly, reducing MTTR and improving system reliability.
- Managed edge networking and security via OpenResty, APISIX, and CDNs; researched and adopted emerging cloud-native tools to keep performance aligned with evolving business needs.
Security, Governance & Leadership
- Defined and enforced a comprehensive cloud security model across multi-cloud VPC/VNet architectures, incorporating IAM least-privilege access, K8s RBAC, and OPA policies.
- Managed TLS/SSL certificate lifecycles and implemented data encryption (at rest and in transit) to align with security best practices and ensure data integrity.
- Owned hybrid infrastructure (VMs, File Servers, Microsoft 365) and authored technical documentation; established operational standards to ensure system maintainability.
- Provided technical mentorship to junior staff and led knowledge-sharing initiatives to improve team efficiency and process standardization.
Senior DevOps (Contract) — Infinite Computer Solutions (Aug 2025 – Nov 2025)
- Implemented and maintained Infrastructure as Code (IaC) using Terraform for automated cloud provisioning and Ansible playbooks for consistent configuration management and application deployment.
- Developed and managed robust CI/CD pipelines using Jenkins and GitLab workflows, ensuring secure and efficient deployment processes across development and production environments.
- Collaborated with Data Science teams to operationalize machine learning models, building event-driven automation to streamline workflows and improve operational efficiency.
- Monitored system performance and resolved complex technical issues to ensure high availability (HA) and scalability of production-grade cloud workloads.
- Enforced security best practices and documented architectural decisions, ensuring process standardization and system maintainability in alignment with industry standards.
NOC Engineer — Orion Consultancy (Mar 2018 – Dec 2020)
- Managed and maintained multi-cloud infrastructure across AWS, GCP, and Aliyun, ensuring high availability and performance across diverse regional environments.
- Built and administered proactive monitoring solutions using Prometheus and Grafana to track infrastructure health and network latency, facilitating rapid incident response and performance tuning.
- Leveraged Ansible for automated configuration management and streamlined infrastructure deployment, ensuring environment consistency and reducing manual intervention.
- Optimized traffic management and system resilience by configuring and maintaining high-availability load balancers and reverse proxies (HAProxy, Nginx, Squid).
- Enhanced infrastructure security by implementing DDoS protection via Fail2Ban with a centralized database for coordinated threat mitigation.
- Automated SSL/TLS certificate lifecycle management using Let’s Encrypt, ensuring continuous encryption and reducing operational overhead.
Service Delivery Engineer — AsiaCloud Solutions (2014–2016, 2017–2018)
- Delivered end-to-end managed services and deployed mission-critical infrastructure, including Windows Servers, NAS, and network switches, to optimize system performance for enterprise clients.
- Administered regional IT infrastructure across multiple APAC locations, resolving complex connectivity issues and ensuring consistent service delivery across distributed regional offices.
- Implemented proactive security monitoring and system hardening measures while collaborating with cross-functional teams to standardize operational procedures.
- Developed custom applications to automate internal workflows, significantly enhancing operational efficiency and improving data management for client teams.
- Achievement: Led a comprehensive office relocation project, independently designing and executing the migration of servers and network infrastructure to ensure a timely setup with minimal business disruption.
IT Support & SysAdmin — Ley Choon (2013–2014)
- Daily support, server/backup management, ERP DB maintenance.
Project Supervisor — Foxconn (2011–2013)
- Led 100–120 staff; managed SOPs and KPIs; launched a new Nintendo RMA project.
- Achievement: Directed a team that successfully launched a new project, met the customer's requirements, and generated new profit for the company.
Product Engineer — Foxconn (2009–2011)
- Fault analysis, customer handling, test automation.
- Achievement: Reduced required manpower by 40%.
EDUCATION
- Master of Business Administration (MBA) — Jinan University (2022–2024)
- Bachelor of Engineering, Computer Science — SDUT (2005–2009)
