AI Delivery & Operations Engineer
RM 6,000 - RM 7,999 / month
Job Responsibilities:
1. Handle the installation, deployment, testing, and daily operations of AI computing clusters, ensuring system stability and high availability.
2. Develop and maintain auxiliary tools for AI cluster management, including fault diagnosis, alerting systems, and emergency response mechanisms.
3. Be familiar with and proficient in a wide range of computer hardware and software, and independently perform setup, debugging, and troubleshooting.
4. Set up and maintain LANs; perform basic maintenance and troubleshooting of networking devices, applying strong knowledge of network security.
5. Optimize AI system performance, troubleshoot issues, and ensure data integrity and security.
6. Monitor project progress, ensure smooth delivery, and handle proper documentation and handovers.

Requirements:
1. Bachelor's degree or higher in Computer Science, Information Technology, Computer Engineering, or a related field.
2. Minimum 2 years of hands-on experience in IT operations or technical delivery roles.
3. Familiar with distributed systems, microservices architecture, and containerized deployment technologies; hands-on experience with industry-standard tools such as Kubernetes and Docker.
4. Passion for AI and large-scale model systems. Curious, self-motivated, and a good team communicator.
5. Familiar with Linux OS and kernel-level modules (networking, storage, memory, file systems, etc.).
6. Strong scripting skills in Shell; Python proficiency is a strong advantage.
7. Able to work under pressure, with good interpersonal and communication skills.
8. Proficiency in Mandarin (spoken and written) is required for effective communication with internal teams and clients.