Knowledge Distillation: A Survey
Authors:
Jianping Gou,
Baosheng Yu,
Stephen John Maybank,
Dacheng Tao
Abstract:
In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. The great success of deep learning is mainly due to its scalability to encode large-scale data and to maneuver billions of model parameters. However, it is a challenge to deploy these cumbersome deep models on devices with limited resources, e.g., mobile phones and embedded devices, not only because of the high computational complexity but also the large storage requirements. To this end, a variety of model compression and acceleration techniques have been developed. As a representative type of model compression and acceleration, knowledge distillation effectively learns a small student model from a large teacher model. It has received rapidly increasing attention from the community. This paper provides a comprehensive survey of knowledge distillation from the perspectives of knowledge categories, training schemes, teacher-student architecture, distillation algorithms, performance comparison and applications. Furthermore, challenges in knowledge distillation are briefly reviewed and comments on future research are discussed and forwarded.
Submitted 20 May, 2021; v1 submitted 9 June, 2020;
originally announced June 2020.
Thou Shalt Not Reject the P-value
Authors:
Oliver Y. Chén,
Raúl G. Saraiva,
Guy Nagels,
Huy Phan,
Tom Schwantje,
Hengyi Cao,
Jiangtao Gou,
Jenna M. Reinen,
Bin Xiong,
Bangdong Zhi,
Xiaojun Wang,
Maarten de Vos
Abstract:
Since its debut in the 18th century, the P-value has been an important part of hypothesis testing-based scientific discoveries. As the statistical engine accelerates, questions are beginning to be raised, asking to what extent scientific discoveries based on P-values are reliable and reproducible, and the voice calling for adjusting the significance level or banning the P-value has been increasingly heard. Inspired by these questions and discussions, here we enquire into the useful roles and misuses of the P-value in scientific studies. For common misuses and misinterpretations, we provide modest recommendations for practitioners. Additionally, we compare statistical significance with clinical relevance. In parallel, we review the Bayesian alternatives for seeking evidence. Finally, we discuss the promises and risks of using meta-analysis to pool P-values from multiple studies to aggregate evidence. Taken together, the P-value underpins a useful probabilistic decision-making system and provides evidence at a continuous scale. But its interpretation must be contextual, considering the scientific question, experimental design (including the model specification, sample size, and significance level), statistical power, effect size, and reproducibility.
Submitted 28 July, 2022; v1 submitted 17 February, 2020;
originally announced February 2020.