- Anycost GANs for Interactive Image Synthesis and Editing
J. Lin, R. Zhang, F. Ganz, S. Han, J. Zhu
CVPR’21
- Efficient and Robust LiDAR-Based End-to-End Navigation
Z. Liu, A. Amini, S. Zhu, S. Karaman, S. Han, D. Rus
ICRA’21
- IOS: Inter-Operator Scheduler for High-Performance CNN Acceleration
Y. Ding, L. Zhu, Z. Jia, G. Pekhimenko, S. Han
MLSys’21
[code]
- SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
H. Wang, Z. Zhang, S. Han
HPCA’21
- MCUNet: Tiny Deep Learning on IoT Devices
J. Lin, W. Chen, Y. Lin, J. Cohn, C. Gan, S. Han
NeurIPS’20, spotlight
[MIT News][website][Wired][Stacey on IoT][Morning Brew][IBM][Analytics Insight]
- Tiny Transfer Learning: Reduce Memory, Not Parameters for Efficient On-Device Learning
H. Cai, C. Gan, L. Zhu, S. Han
NeurIPS’20
[website]
- Differentiable Augmentation for Data-Efficient GAN Training
S. Zhao, Z. Liu, J. Lin, J. Zhu, S. Han
NeurIPS’20
[VentureBeat][blog][code][website][talk]
- Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
H. Tang∗, Z. Liu∗, S. Zhao, Y. Lin, J. Lin, H. Wang, S. Han
ECCV’20
- DataMix: Efficient Privacy-Preserving Edge-Cloud Inference
Z. Liu∗, Z. Wu∗, C. Gan, L. Zhu, S. Han
ECCV’20
- HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
H. Wang, Z. Wu, Z. Liu, H. Cai, L. Zhu, C. Gan, S. Han
ACL’20
[paper][website][code]
- GAN Compression: Learning Efficient Architectures for Conditional GANs
M. Li, J. Lin, Y. Ding, Z. Liu, J. Zhu, S. Han
CVPR’20
[paper][website][demo][tutorial]
- APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
T. Wang, K. Wang, H. Cai, J. Lin, Z. Liu, S. Han
CVPR’20
[paper][code][video]
- GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning
H. Wang, K. Wang, J. Yang, L. Shen, N. Sun, H.-S. Lee, S. Han
Design Automation Conference (DAC), 2020.
[paper]
- SpArch: Efficient Architecture for Sparse Matrix Multiplication
Z. Zhang, H. Wang, S. Han, W.J. Dally
International Symposium on High-Performance Computer Architecture (HPCA), 2020.
[paper][slides][website][2min talk][full talk]
- Once-for-All: Train One Network and Specialize It for Efficient Deployment
H. Cai, C. Gan, T. Wang, Z. Zhang, S. Han
International Conference on Learning Representations (ICLR), 2020.
Also appeared at the TinyML Summit, SysML workshop, and CVPR workshop 2020.
[paper][website][code][tutorial][poster][ICLR talk][tinyML talk][CVPR talk][tutorial][MIT news][Qualcomm news][Venture Beat][news][news][news][news][news][news]
- Lite Transformer with Long-Short Range Attention
Z. Wu, Z. Liu, J. Lin, Y. Lin, S. Han
International Conference on Learning Representations (ICLR), 2020.
[paper][website][code][slides]
- Point-Voxel CNN for Efficient 3D Deep Learning
Z. Liu, H. Tang, Y. Lin, S. Han
Neural Information Processing Systems (NeurIPS), 2019. Spotlight
[paper][deployment on MIT Driverless][playlist][talk][slides][code][website]
- Deep Leakage from Gradients
L. Zhu, Z. Liu, S. Han
Neural Information Processing Systems (NeurIPS), 2019.
[paper][website][code][poster][colab]
- TSM: Temporal Shift Module for Efficient Video Understanding
J. Lin, C. Gan, S. Han
International Conference on Computer Vision (ICCV), 2019.
[paper][demo][code][poster][MIT News][Engadget][MIT Technology Review][NVIDIA News][NVIDIA Jetson Developer Forum]
[industry integration: @NVIDIA, @Baidu]
- HAQ: Hardware-Aware Automated Quantization with Mixed Precision
K. Wang, Z. Liu, Y. Lin, J. Lin, S. Han
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. Oral presentation.
[paper][slides][poster][code][video][BibTex]
- ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
H. Cai, L. Zhu, S. Han
International Conference on Learning Representations (ICLR), 2019.
[paper][code][demo][poster][MIT news][IEEE Spectrum][BibTex]
[industry integration: @AWS, @Facebook]
- Defensive Quantization: When Efficiency Meets Robustness
J. Lin, C. Gan, S. Han
International Conference on Learning Representations (ICLR), 2019.
[paper][poster][MIT News][BibTex]
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Y. He, J. Lin, Z. Liu, H. Wang, J. Li, S. Han
European Conference on Computer Vision (ECCV), 2018
[paper][poster][website][code][BibTex]
Past publications at Stanford:
- Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
Yujun Lin, Song Han, Huizi Mao, Yu Wang, William J. Dally
International Conference on Learning Representations (ICLR), April 2018.
- Efficient Sparse-Winograd Convolutional Neural Networks
Xingyu Liu, Jeff Pool, Song Han, William J. Dally
International Conference on Learning Representations (ICLR), April 2018.
- DSD: Dense-Sparse-Dense Training for Deep Neural Networks
Song Han, Jeff Pool, Sharan Narang, Huizi Mao, Shijian Tang, Erich Elsen, Bryan Catanzaro, John Tran, William J. Dally
International Conference on Learning Representations (ICLR), April 2017.
- Trained Ternary Quantization
Chenzhuo Zhu, Song Han, Huizi Mao, William J. Dally
International Conference on Learning Representations (ICLR), April 2017.
- Software-Hardware Co-Design for Efficient Neural Network Acceleration
Kaiyuan Guo, Song Han, Song Yao, Yu Wang, Yuan Xie, Huazhong Yang
Hot Chips special issue of IEEE Micro, March/April 2017.
- ESE: Efficient Speech Recognition Engine for Sparse LSTM on FPGA
Song Han, Junlong Kang, Huizi Mao, Yubin Li, Dongliang Xie, Hong Luo, Yu Wang, Huazhong Yang, William J. Dally
NIPS workshop on Efficient Methods for Deep Neural Networks (EMDNN), Dec 2016, Best Paper Honorable Mention.
International Symposium on Field-Programmable Gate Arrays (FPGA), Feb 2017, Best Paper Award.
- Compressing and Regularizing Deep Neural Networks: Improving Prediction Accuracy Using Deep Compression and DSD Training
Song Han
O’Reilly, Nov 2016.
- EIE: Efficient Inference Engine on Compressed Deep Neural Network
Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark Horowitz, William J. Dally
International Symposium on Computer Architecture (ISCA), June 2016; Hot Chips, Aug 2016.
- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han, Huizi Mao, William J. Dally
NIPS Deep Learning Symposium, December 2015.
International Conference on Learning Representations (ICLR), May 2016, Best Paper Award.
- Learning both Weights and Connections for Efficient Neural Networks
Song Han, Jeff Pool, John Tran, William J. Dally
Advances in Neural Information Processing Systems (NIPS), December 2015.
- SqueezeNet: AlexNet-Level Accuracy with 50x Fewer Parameters and <0.5MB Model Size
Forrest Iandola, Song Han, Matthew Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer
arXiv 2016.
- Angel-Eye: A Complete Design Flow for Mapping CNN onto Customized Hardware
Kaiyuan Guo, Lingzhi Sui, Jiantao Qiu, Song Yao, Song Han, Yu Wang, Huazhong Yang
IEEE Computer Society Annual Symposium on VLSI (ISVLSI), July 2016.
- Hardware-friendly Convolutional Neural Network with Even-number Filter Size
Song Yao, Song Han, Kaiyuan Guo, Jianqiao Wangni, Yu Wang, William J. Dally
International Conference on Learning Representations Workshop, May 2016.
Services
- On-device Intelligence Workshop at MLSys’20, Organizing Committee
- ICCAD’19, Program Committee
- ICLR’19, Area Chair
- HPCA’18, Program Committee
Invited Talks
- Bandwidth-Efficient Deep Learning with Algorithm and Hardware Co-Design
- ISSCC’19 forum on “Intelligence at the Edge: How Can We Make Machine Learning More Energy Efficient?”, Feb 2019
- Hardware-centric AutoML: Design Automation for Efficient Deep Learning Computing
- Facebook, Feb 2019
- Xilinx, Feb 2019
- Intel, Feb 2019
- Samsung, Feb 2019
- NVIDIA, Feb 2019
- SONY, Feb 2019
- MediaTek, Jan 2019
- TSMC, Jan 2019
- Samsung, Dec 2018
- SenseTime, Dec 2018
- Google, Dec 2018
- Bandwidth-Efficient Deep Learning on Edge Devices
- SONY, San Jose, November 2017
- Renesas, Santa Clara, November 2017
- Qualcomm, San Diego, December 2017
- CloudMinds, Santa Clara, December 2017
- Samsung AI Summit, Mountain View, January 2017
- Efficient Methods and Hardware for Deep Learning
- Faculty interviews at MIT, Princeton, UC Berkeley, UT Austin, etc., March 2017
- PhD thesis defense, Stanford, June 2017
- Deep Learning – Tutorial and Recent Trends
- Conference tutorial at FPGA’17, Monterey. [slides]
- Deep Compression: A Deep Neural Network Compression Pipeline
- Conference talk at ICLR, Puerto Rico, May 2016.
- GPU Technology Conference (GTC), San Jose, March 2016.
- Google, Mountain View, March 2015.
- Stanford Computer System Colloquium, January 2016.
- EIE: Efficient Inference Engine on Compressed Deep Neural Network
- Conference talk at ISCA, Korea, June 2016.
- Movidius, San Mateo, April 2016.
- HP Labs, Palo Alto, February 2016.
- Apple, Cupertino, December 2015.
- Stanford SystemX Fall Conference, Stanford, November 2015.
- Techniques for Efficient Implementation of Deep Neural Networks
- Embedded Vision Alliance Member Meeting, March 2016.
- Deep Compression, DSD Training and EIE: Deep Neural Network Model Compression, Regularization and Hardware Acceleration
- O’Reilly Artificial Intelligence Conference, New York, Sep 2016.
- Facebook, Menlo Park, Aug 2016.
- Tesla, Palo Alto, Aug 2016.
- Xilinx, Santa Clara, Aug 2016.
- OpenAI, San Francisco, Aug 2016. [video]
- Microsoft Research, Redmond, June 2016.
- Apple, Cupertino, June 2016.
Research Interests
Efficient AI on the edge; AutoML; model compression, gradient compression, compact model design, sparsity, auto pruning, auto quantization, neural architecture search, efficient video recognition, efficient 3D point cloud, efficient transformers, specialized models, sparse hardware accelerators, accelerating compressed neural networks, FPGA.