Kuan Zhou
I am currently a machine learning engineer focusing on ML/AI systems, including distributed training (with US patents), inference service performance, Kubernetes-based AI platform engineering, and MLOps. I also have a keen interest in building AI applications that leverage generative AI and in understanding the mathematics and physics behind neural networks.
Before immersing myself in AI systems, I worked on scientific research in physics. I developed mathematical analysis, research, and programming skills during my undergraduate studies in Physics (thesis: computational simulation for NMR-based quantum computing systems, advised by Prof. Xinhua Peng and Prof. Jiangfeng Du) at the University of Science and Technology of China and my PhD in Computational Physics (thesis: electronic properties modeling of two-dimensional materials, advised by Prof. Roger Lake) at the LATTE lab at the University of California, Riverside.
My journey from physics to ML/AI started with reading news about ML/AI, attending ML/AI seminars in Prof. Linli Xu's group, taking ML courses in the CS department, participating in Kaggle competitions, and completing the Insight data science bootcamp. My passion for math and physics was ignited in high school by reading inspiring stories about Albert Einstein and Richard Feynman and by participating in Math and Physics Olympiads.
In my spare time, I enjoy films, music, hiking, camping, biking, traveling, trying new foods, and spending time with my family, including our two cats, Gemma (orange tabby) and Nova (ragdoll).
Passion
Exploring the synergy between science and technology, building AI applications, understanding the math and physics behind neural networks.
Tech Stack
Proficient in, familiar with, or able to contribute after a brief learning period
Programming Languages
Python, Golang, C/C++, Java, JavaScript/TypeScript, Rust
AI Frameworks
PyTorch, HF Transformers, JAX, TensorFlow, Triton, CUDA
Distributed Systems
Torch Distributed, Megatron-LM, DeepSpeed
ML Platforms
Docker, gRPC, Kubernetes, Istio, OpenTelemetry, Kubebuilder
MLOps
MLflow, Weights & Biases, BentoML, Flyte, Kubeflow, Hydra
ML Compilers
MLIR, LLVM, TVM
Model Serving
vLLM, Triton Inference Server, Text Generation Inference
AI applications
Electron, Swift/SwiftUI, Streamlit
Frontend
React, NextJS, Material UI, TailwindCSS, FastAPI
Databases
PostgreSQL, BoltDB, SQL
Scientific Tools
Mathematica, Julia, Matlab, LaTeX
Others
Bazel, Mermaid, Pybind, Pydantic, JsonSchema, Spark, Hadoop, ORTools, Numba
Experience
Principal Engineer - Machine Learning | SambaNova Systems
April 2020 - Present | PALO ALTO, CA
- Tech lead in containerizing and deploying generative AI models onto the Kubernetes platform SambaStudio
  - Led a team of 5+ engineers to deploy foundation-model-based solutions to business customers
  - Prototyped the generative AI model deployment pipeline and Kubernetes platform
  - Built general and extensive infrastructure for continuous model integration and deployment
  - Standardized the model bring-up and integration procedure by refactoring ML applications
- Co-designed and co-developed distributed learning infrastructure for extremely large models
  - Overlapping gradient synchronization in machine learning
  - System for executing an application on heterogeneous reconfigurable processors
  - System of heterogeneous reconfigurable processors for the data parallel execution of applications
- Contributed to core features of the SambaNova AI framework
  - Designed, implemented, and maintained a binary data extractor as a bridge between compiler and runtime
  - Refactored and upgraded the AI framework codebase to support functional-programming-style dataflow execution
  - Implemented various deep learning operators end to end, from low-level compiler kernels to the AI framework
  - Optimized performance of deep learning models (HIPNN etc.) based on the SambaNova AI framework and dataflow architecture
  - Integrated TensorBoard into the SambaNova AI framework as a visualization and accuracy debugging tool
Software Engineer - Machine Learning | Petuum Inc.
February 2019 - March 2020 | SUNNYVALE, CA
- Leveraged OCR engines and deep learning models to process logistics bills automatically with 0.87 accuracy
- Collaborated on implementing various anomaly detection models for equipment health prediction
- Contributed to machine learning pipeline refactoring and model improvement across various use cases
Artificial Intelligence Fellow | Insight Data Science
June 2018 - September 2018 | SAN FRANCISCO, CA
- Architected SketchTML, which takes in several hand-drawn sketches and produces an interactive HTML website
- Leveraged the pix2code framework to build a more robust image captioning model with different styles
- Improved the BLEU score to as high as 0.88 through inventive data augmentation methods and weighted loss functions
Education
PhD in Computational Physics | University of California, Riverside
September 2013 - December 2018 | RIVERSIDE, CA
BSc in Physics | University of Science and Technology of China
Zhongyao Zhao Applied Physics Elite Class
August 2009 - June 2013 | HEFEI, CHINA