• Worked on the Wafer-scale Machine Learning Accelerator, the largest chip ever built. By keeping communication on-chip, it improves machine learning performance while reducing power consumption.
• Tested the Accelerator system using Python, Assembly, C++, and SystemVerilog.
• Wrote low-level C/C++ with register programming for a custom instruction set architecture.
• Created directed and random testbenches, debugged regression failures, and ran full-chip gate-level simulations.
• Analyzed code coverage reports and used checkers and assertions.
• Created register specifications and models.