HBM/Memory Validation Engineer
Advanced Micro Devices
- Pulau Pinang
- Permanent
- Full-time
- HBM Validation & Debug
- Develop and execute post-silicon validation plans for HBM (HBM2E / HBM3 / HBM3E) subsystems on AMD data center GPUs
- Validate HBM PHY, memory controller, and full memory stack behavior, including:
- Initialization, training, and calibration
- Timing, voltage, and frequency margins
- Power management states and transitions
- Perform system-level stress testing under AI, HPC, and graphics workloads
- Silicon Bring-Up Support
- Participate in first silicon bring-up, including board power-on and early HBM access
- Debug HBM-related silicon issues using:
- Memory test tools
- Firmware/driver instrumentation and internal debug hooks
- Drive root cause analysis, identify design or integration gaps, and work with design teams on fixes
- Automation & Tool Development
- Develop and maintain validation scripts and automation (Python, C/C++, shell)
- Enhance test coverage, regression efficiency, and debug visibility
- Enable reuse of validation infrastructure across GPU programs
- Cross-Functional Collaboration
- Collaborate with:
- HBM PHY and controller designers
- GPU architecture and RTL teams
- Platform hardware and SI/PI teams
- Firmware, driver, and system software teams
- Provide clear debug documentation and validation sign-off reports
- 2–6 years of experience (aligned to E2 / early MTS level) in one or more of:
- Memory validation, silicon validation, or post-silicon debug
- DRAM, HBM, DDR, or high-speed interfaces
- Strong understanding of:
- Memory subsystems and SoC/GPU architectures
- Signal integrity and high-speed interfaces
- Hands-on experience with:
- Lab equipment (oscilloscope, logic analyzer, BERT, power supplies)
- Linux-based validation environments
- Programming/scripting experience in Python, C/C++, or similar
- Experience with HBM2E / HBM3 / HBM3E validation or debug
- Exposure to high-performance GPU or accelerator platforms
- Understanding of:
- Memory training algorithms
- Error detection and correction (ECC, retry, RAS)
- Power/performance tuning
- Experience working with global teams across time zones
- Strong documentation and technical communication skills
- Bachelor’s degree in Electrical Engineering, Computer Engineering, or a related field