Head of Systems & Operations

Salary undisclosed

Location: Bukit Bintang, KL

BASIC QUALIFICATIONS
• Bachelor’s degree in Computer Science or a related technical field.
• Proven experience (10+ years) in an operations or technology leadership role within the IT or cloud services industry.
• At least 5 years in a leadership role.
• Strong understanding of GPU technologies and cloud computing principles.
• Demonstrated experience in managing complex IT systems and operational processes.
• Exceptional analytical and troubleshooting skills.
• Understanding of Kubernetes environments and the ability to debug them.
• Familiarity with energy-efficient computing and sustainable data center operations.
• Proven ability to manage priorities in a dynamic, fast-paced environment.

JOB DESCRIPTION
We are looking for a Head of Systems & Operations to lead the technical team that develops, manages, and operates the GPU cluster infrastructure. The role is responsible for overseeing and optimizing the operational framework of our GPU cluster and service offerings. It combines strategic leadership with hands-on management to ensure the seamless integration of technology systems, operational processes, and customer service delivery. The ideal candidate will have a strong background in IT operations, cloud services, and team leadership within a technology-driven environment.

Key Responsibilities:
• Oversee the design, implementation, and maintenance of IT systems that support operational activities, ensuring high availability and performance of GPU resources.
• Provide technical guidance across complex infrastructure projects.
• Develop and execute operational strategies that align with the company’s goals for GPU-as-a-Service, focusing on scalability, efficiency, and reliability.
• Lead and mentor a diverse team of technology professionals, fostering a culture of innovation, accountability, and continuous improvement.
• Manage relationships with key vendors and third-party service providers to ensure compliance with service level agreements (SLAs) and industry standards.
• Identify opportunities for process improvement across operations, and implement best practices to enhance productivity, reduce costs, and improve service quality.
• Work closely with product development, sales, and marketing teams to ensure seamless integration of services and alignment with customer needs.
• Ensure all operations comply with relevant laws, regulations, and industry standards related to data protection and service delivery.

Desired Skills:
• Hands-on expertise in and comprehensive knowledge of CPU/GPU clusters and platforms.
• Exceptional communication skills, with the ability to discuss both technical and non-technical topics with diverse audiences.
• Strong interpersonal skills, with a proven ability to develop professional relationships across business and technical teams.
• Ability to manage multiple projects simultaneously while maintaining attention to detail.
• Knowledge of operating and managing processes in CPU/GPU clusters.
• Strategic thinking, with the ability to implement innovative solutions that drive business success.
• Excellent documentation skills to effectively articulate technical designs, issues, procedures, and assessments.