-
Effectiveness of Function Matching in Driving Scene Recognition
Authors:
Shingo Yashima
Abstract:
Knowledge distillation is an effective approach for training compact recognizers required in autonomous driving. Recent studies on image classification have shown that matching student and teacher on a wide range of data points is critical for improving performance in distillation. This concept (called function matching) is suitable for driving scene recognition, where generally an almost infinite amount of unlabeled data is available. In this study, we experimentally investigate the impact of using such a large amount of unlabeled data for distillation on the performance of student models in structured prediction tasks for autonomous driving. Through extensive experiments, we demonstrate that the performance of the compact student model can be improved dramatically and even match the performance of the large-scale teacher by knowledge distillation with massive unlabeled data.
Submitted 20 August, 2022;
originally announced August 2022.
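The function-matching idea in the abstract above (the student is trained to agree with the teacher on many unlabeled inputs) can be illustrated with a minimal sketch. It assumes a plain classification setting with a KL-divergence objective, which is not necessarily the loss used in the paper's structured prediction experiments; the names `softmax`, `distillation_kl`, and `function_matching_loss` are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, with an optional softening temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_kl(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) between the two predictive distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def function_matching_loss(teacher, student, unlabeled_inputs, temperature=1.0):
    """Average teacher-student KL over unlabeled inputs; no labels are needed.

    `teacher` and `student` are assumed to be callables mapping an input to logits.
    """
    losses = [
        distillation_kl(teacher(x), student(x), temperature)
        for x in unlabeled_inputs
    ]
    return sum(losses) / len(losses)
```

Because the loss only compares the two models' outputs, any unlabeled driving footage could in principle serve as `unlabeled_inputs` in such a setup.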
-
Feature Space Particle Inference for Neural Network Ensembles
Authors:
Shingo Yashima,
Teppei Suzuki,
Kohta Ishikawa,
Ikuro Sato,
Rei Kawakami
Abstract:
Ensembles of deep neural networks demonstrate improved performance over single models. For enhancing the diversity of ensemble members while keeping their performance, particle-based inference methods offer a promising approach from a Bayesian perspective. However, the best way to apply these methods to neural networks is still unclear: seeking samples from the weight-space posterior suffers from inefficiency due to the over-parameterization issues, while seeking samples directly from the function-space posterior often results in serious underfitting. In this study, we propose optimizing particles in the feature space where the activation of a specific intermediate layer lies to address the above-mentioned difficulties. Our method encourages each member to capture distinct features, which is expected to improve ensemble prediction robustness. Extensive evaluation on real-world datasets shows that our model significantly outperforms the gold-standard Deep Ensembles on various metrics, including accuracy, calibration, and robustness. Code is available at https://github.com/DensoITLab/featurePI .
Submitted 2 June, 2022;
originally announced June 2022.
-
Exponential Convergence Rates of Classification Errors on Learning with SGD and Random Features
Authors:
Shingo Yashima,
Atsushi Nitanda,
Taiji Suzuki
Abstract:
Although kernel methods are widely used in many learning problems, they have poor scalability to large datasets. To address this problem, sketching and stochastic gradient methods are the most commonly used techniques to derive efficient large-scale learning algorithms. In this study, we consider solving a binary classification problem using random features and stochastic gradient descent. In recent research, an exponential convergence rate of the expected classification error under the strong low-noise condition has been shown. We extend these analyses to a random features setting, analyzing the error induced by the approximation of random features in terms of the distance between the generated hypothesis including population risk minimizers and empirical risk minimizers when using general Lipschitz loss functions, to show that an exponential convergence of the expected classification error is achieved even if random features approximation is applied. Additionally, we demonstrate that the convergence rate does not depend on the number of features and there is a significant computational benefit in using random features in classification problems because of the strong low-noise condition.
Submitted 2 June, 2022; v1 submitted 13 November, 2019;
originally announced November 2019.
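The setting in the abstract above (binary classification with random features trained by stochastic gradient descent) can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' algorithm or analysis: it assumes random Fourier features for a Gaussian kernel exp(-gamma * ||x - y||^2), the logistic loss as one example of a Lipschitz loss, a constant step size, and L2 regularization. All function names and parameters (`make_random_features`, `sgd_train`, `predict`, `gamma`, `lr`, `lam`) are hypothetical.

```python
import math
import random


def make_random_features(dim, num_features, gamma, rng):
    """Random Fourier feature map approximating exp(-gamma * ||x - y||^2)."""
    std = math.sqrt(2.0 * gamma)  # frequencies are drawn from N(0, 2 * gamma * I)
    ws = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(num_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def phi(x):
        return [
            scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(ws, bs)
        ]

    return phi


def sgd_train(samples, phi, num_features, steps, lr, lam, rng):
    """Plain SGD on the L2-regularized logistic loss; labels are +1 or -1."""
    theta = [0.0] * num_features
    for _ in range(steps):
        x, y = rng.choice(samples)  # one random example per step
        z = phi(x)
        margin = y * sum(t * zi for t, zi in zip(theta, z))
        # d/d(theta) log(1 + exp(-margin)) = -y * z / (1 + exp(margin))
        coeff = -y / (1.0 + math.exp(margin))
        theta = [t - lr * (coeff * zi + lam * t) for t, zi in zip(theta, z)]
    return theta


def predict(theta, phi, x):
    """Predicted label: the sign of the learned linear function in feature space."""
    score = sum(t * zi for t, zi in zip(theta, phi(x)))
    return 1 if score >= 0.0 else -1
```

Each SGD step costs time proportional to the number of features rather than to the number of training points, which is the usual computational motivation for using random features in place of an exact kernel.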
-
Shearing-induced contact pattern formation in hydrogels sliding in polymer solution
Authors:
Shintaro Yashima,
Satoshi Hirayama,
Takayuki Kurokawa,
Thomas Salez,
Haruna Takefuji,
Wei Hong,
Jian Ping Gong
Abstract:
The contact of a hydrogel during the rotational shearing on glass surface in concentrated polymer solution was observed in situ. Dynamic contact patterns that rotate in-phase with the rotational shearing of the gel were observed for the first time. The contact patterns with a periodicity in the circumferential direction appeared and developed into fine characters with the shearing time. The patterns appeared more quickly at elevated sliding velocity, polymer concentration, and normal pressure. Furthermore, the softness of the gel also substantially influenced the character of the patterns. The pattern formation was discussed in terms of the non-linear rheology of the polymer solution at the rotational soft interface.
Submitted 16 February, 2019; v1 submitted 18 January, 2019;
originally announced January 2019.
-
Normal contact and friction of rubber with model randomly rough surfaces
Authors:
S. Yashima,
V. Romero,
E. Wandersman,
C. Frétigny,
M. K. Chaudhury,
A. Chateauminois,
A. M. Prevost
Abstract:
We report on normal contact and friction measurements of model multicontact interfaces formed between smooth surfaces and substrates textured with a statistical distribution of spherical micro-asperities. Contacts are either formed between a rigid textured lens and a smooth rubber, or a flat textured rubber and a smooth rigid lens. Measurements of the real area of contact $A$ versus normal load $P$ are performed by imaging the light transmitted at the microcontacts. For both interfaces, $A(P)$ is found to be sub-linear with a power law behavior. Comparison to two multi-asperity contact models, which extend the Greenwood-Williamson model (J. Greenwood, J. Williamson, Proc. Royal Soc. London Ser. A 295, 300 (1966)) by taking into account the elastic interaction between asperities at different length scales, is performed, and allows their validation for the first time. We find that long range elastic interactions arising from the curvature of the nominal surfaces are the main source of the non-linearity of $A(P)$. At a shorter range, and except for very low pressures, the pressure dependence of both density and area of micro-contacts remains well described by Greenwood-Williamson's model, which neglects any interaction between asperities. In addition, in steady sliding, friction measurements reveal that the mean shear stress at the scale of the asperities is systematically larger than that found for a macroscopic contact between a smooth lens and a rubber. This suggests that frictional stresses measured at macroscopic length scales may not be simply transposed to microscopic multicontact interfaces.
Submitted 5 January, 2017;
originally announced January 2017.