Burlingame, California, United States
I have worked on multiple software teams within quadric: graph compiler (TVM-based), SDK, and applications. I tend to work on software problems that require mathematical analysis and careful attention to the numerical accuracy of algorithms.
Implemented optimized algorithms for quadric's parallel GPNPU processor across a range of domains, including numerical linear algebra (Gaussian elimination / LU decomposition, Cholesky decomposition), signal processing (MVDR and delay-and-sum beamforming), and neural networks (convolutional layers).
Developed tooling for numerical accuracy checks that take into account the numerical conditioning and stability of algorithms. This work extends numerical analysis concepts from the numerical linear algebra literature, which is written from the viewpoint of floating-point arithmetic, to fixed-point arithmetic.
Spent a year leading the 8-person applications team. My team was responsible for building and supporting application pipelines, which included implementing core algorithms with quadric's SDK, neural network layers, image processing, linear algebra routines, and numerical accuracy testing.
Worked on the integration and development of the floating-point fused multiply-add (FMA) operation, the first floating-point operation introduced to quadric's architecture. Mathematically analyzed the worst-case numerical accuracy of various dot-product methods while accounting for the conversion to fixed point.
Creating a CPU backend in a fork of the onnxruntime library to simulate the numerical behavior of quadric's processor. This will be used for numerical validation studies.