Site Reliability Engineer - Hardware

xAI
Full-time
Memphis, TN
Posted a month ago

Job Description

xAI is seeking a hardware-focused Site Reliability Engineer to manage firmware, hardware specifications, vendor relations, and failure analysis. The role involves proactively identifying and resolving hardware issues, managing return merchandise authorizations (RMAs), and evaluating emerging technologies to support datacenter operations.

Responsibilities

  • Analyze firmware and hardware specifications
  • Investigate and diagnose hardware failures
  • Manage vendor relationships and RMA processes
  • Collaborate with Datacenter Operations Technicians
  • Research and evaluate next-generation hardware
  • Develop monitoring tools and processes
  • Document failure modes and evaluations
  • Participate in on-call rotations

Requirements

  • Bachelor's degree in Systems Engineering, Electrical Engineering, Computer Science, or related field
  • 5+ years of experience in hardware reliability engineering
  • Expertise in firmware analysis and hardware specifications
  • Strong experience with RMA processes
  • Ability to diagnose complex hardware failures
  • Familiarity with datacenter hardware components
  • Proficiency in scripting languages (Python, Bash)
  • Excellent problem-solving skills

Benefits

  • No benefits