The mission of the OpsDev team is to strengthen TechOps' ability to control and manage massive resources and traffic in a highly efficient, accurate, and consistent way.
The team provides production-grade software, intelligent engines, and stable system architectures, and is devoted to building a DevOps ecosystem that integrates all resources and tools and eliminates the gap between Ops and Dev.
The main scope covers the Global Traffic Schedule and Management Platform (NLB, ALB, GSLB, Hybrid CDN, DNS, etc.), the Hybrid Cloud Resource Schedule and Management Platform (Bromo, Hybrid Cloud Management, Mesos, Kubernetes, Container, Physical Server, VM, CI/CD, etc.), and Internal Systems (CMDB, SPACE, TOC, etc.).
Job Description:
Design and develop the Shopee compute platform and related toolchains.
Improve the Shopee compute platform’s availability, stability, security, and extensibility; ensure the platform runs smoothly both during big campaigns and in daily routine operations.
Improve resource utilization of the Shopee compute platform; optimize the scheduling model for co-locating online services and batch jobs at large scale.
Enhance workload isolation on the Shopee compute platform; Improve the resource control of containers and virtual machines in memory bandwidth, disk IO, and network QoS.
Make the Shopee compute platform easy to use and maintain; optimize processes in the Shopee compute platform and reduce its learning cost based on daily support feedback and business requirements.
Develop and implement automation and engineering solutions; detect and fix potential problems in advance via chaos engineering and regular fire drills, and react quickly to incidents, reducing unnecessary manual operations and improving response time.
Requirements:
Bachelor’s or higher degree in Computer Science, Engineering, Information Systems, or related fields.
Hands-on experience with at least one of the following programming languages: Go, Python, C++, or Java.
In-depth understanding of Linux internals, such as cgroups v2, namespaces, KVM, etc.
Familiarity with Linux dynamic tracing and performance profiling; experience with software troubleshooting.
Experience with OpenStack, Kubernetes, Service Mesh (Preferred).
Experience with Hybrid Cloud Platform development (Preferred).
Experience in the design and development of large-scale DevOps systems (Preferred).
Contributions to open-source projects (Preferred).