Job Description
Responsibilities
- Run System Validation and SKU Qualification
- Work with development teams to optimize software and hardware configurations; collaborate with architects and developers to enhance the performance of existing systems.
- Develop node/rack system-level test cases, test requirements against system-level requirements, and verify functionality at the node/rack level.
- Lead cross-boundary issue triage, debug, and resolution.
- Participate in concept design discussions, gather node/rack system-level requirements, clarify interfaces, and provide feedback on future design requirements to help develop robust, high-performance cloud hardware solutions.
- Architect cloud test scenarios and guide software engineers to automate test flows.
- Work with ODMs, other system engineers, internal design teams and customers to develop validation execution plans for new technologies and MSFT IP features.
- Work with ODMs, silicon vendors, component suppliers, and internal design teams on cross-boundary triaging, debugging, and resolving issues.
- Review system engineering and validation coverage throughout the lifecycle of the program and publish plan vs. progress reports.
- Collaborate with internal and external partners to ensure systems meet significant quality, reliability, and service level requirements for a cloud environment.
- Develop performance testing frameworks and tools.
- Conduct system performance testing to ensure reliability, capacity, and scalability; monitor system performance over time to catch degradations before they impact customers.
Qualifications
Required Qualifications:
- BS/MS in Electrical/Computer Engineering or related degree
- 0 to 3 years of experience in server systems/platforms design, development, and validation, or proven experience in performance engineering
- Hands-on experience in server hardware architecture, design, and development with a solid understanding of hardware, firmware, and OS interfaces, and/or hands-on experience in system performance engineering
- Strong knowledge of performance metrics and benchmarking.
- Strong understanding of software and hardware performance factors.
- Strong technical communication skills (verbal and written) to interface with cross-functional technical leads within and/or outside of the organization.
- Advanced troubleshooting and debugging skills.
- Familiarity with networking, power, rack device management, and remote access environments
- Experience with performance benchmarking tools such as SPEC workloads and LINPACK
- Experience with GPUs and various networking standards, including InfiniBand
- Experience with Windows and Linux operating systems, test automation, and hyperscale testing covering hundreds of systems under test
- Understanding of and experience with device drivers, including debugging issues related to their interactions with HW subsystems
Preferred Qualifications:
- Experience in evaluating off-the-shelf OEM hardware designs, HW/FW/OS interactions, platform config trade-offs, performance tuning, and optimizations.
- Understanding of how standard server interfaces, such as PCIe, SATA, and memory, work with their respective software stacks.
- Functional knowledge of secure boot, attestation, FW update & recovery on server platform architectures.
- Experience in platform HW and FW security capabilities (RoT) and implementations.
- Familiarity with NIST 800 standards as they pertain to FW support for secure update and secure recovery.
- Experience in server platform HW design or server platform validation, with knowledge of system-level firmware, is an added advantage.
- Experience in platform-level test architecture and usage of debug tools such as ITP, Arium, ARM JTAG tools, or equivalent.
- Volume hardware test and debug expertise
- Experience debugging complex system-level issues, with the ability to root-cause problems and identify potential fixes down to board hardware, signal integrity, CPLD, thermal, firmware, and OS components.
- Knowledge of Python or other scripting languages is a plus.