-
Relationship between Batch Size and Number of Steps Needed for Nonconvex Optimization of Stochastic Gradient Descent using Armijo Line Search
Authors:
Yuki Tsukada,
Hideaki Iiduka
Abstract:
While stochastic gradient descent (SGD) can use various learning rates, such as constant or diminishing rates, previous numerical results have shown that SGD performs better than other deep learning optimizers when it uses learning rates given by line search methods. In this paper, we perform a convergence analysis on SGD with a learning rate given by an Armijo line search for nonconvex optimization, indicating that the upper bound of the expectation of the squared norm of the full gradient becomes small when the number of steps and the batch size are large. Next, we show that, for SGD with the Armijo-line-search learning rate, the number of steps needed for nonconvex optimization is a monotone decreasing convex function of the batch size; that is, the number of steps needed for nonconvex optimization decreases as the batch size increases. Furthermore, we show that the stochastic first-order oracle (SFO) complexity, which is the stochastic gradient computation cost, is a convex function of the batch size; that is, there exists a critical batch size that minimizes the SFO complexity. Finally, we provide numerical results that support our theoretical results. The numerical results indicate that the number of steps needed for training deep neural networks decreases as the batch size increases and that there exist critical batch sizes that can be estimated from the theoretical results.
Submitted 1 February, 2024; v1 submitted 25 July, 2023;
originally announced July 2023.
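Illustration: the abstract above refers to SGD whose learning rate is chosen by an Armijo line search. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a backtracking rule that shrinks the step until the mini-batch loss satisfies the sufficient-decrease condition loss(x - eta*g) <= loss(x) - c*eta*||g||^2. The names `armijo_step`, `make_batch_fns`, and `sgd_armijo`, the constants `eta0`, `c`, and `shrink`, and the one-parameter least-squares toy problem (which is convex, unlike the nonconvex setting of the paper) are all assumptions made for illustration.

```python
import random


def armijo_step(loss, grad, x, eta0=1.0, c=1e-4, shrink=0.5, max_backtracks=50):
    """Take one gradient step whose size is found by Armijo backtracking.

    Starting from eta0, the step size is multiplied by `shrink` until
    loss(x - eta * g) <= loss(x) - c * eta * ||g||^2 holds.
    Returns (new_point, accepted_step_size).
    """
    g = grad(x)
    g_sq = sum(gi * gi for gi in g)  # squared norm of the gradient
    f0 = loss(x)
    eta = eta0
    for _ in range(max_backtracks):
        x_new = [xi - eta * gi for xi, gi in zip(x, g)]
        if loss(x_new) <= f0 - c * eta * g_sq:
            return x_new, eta
        eta *= shrink
    return x, 0.0  # no acceptable step found; keep the current point


def make_batch_fns(batch):
    """Mini-batch loss and gradient for the toy model b ~ w * a (hypothetical)."""
    def loss(w):
        return sum((w[0] * a - b) ** 2 for a, b in batch) / (2 * len(batch))

    def grad(w):
        return [sum((w[0] * a - b) * a for a, b in batch) / len(batch)]

    return loss, grad


def sgd_armijo(data, w0, batch_size, steps, seed=0):
    """SGD where each step size comes from a line search on the current batch."""
    rng = random.Random(seed)
    w = list(w0)
    for _ in range(steps):
        batch = rng.sample(data, batch_size)
        loss, grad = make_batch_fns(batch)  # same batch for loss and gradient
        w, _ = armijo_step(loss, grad, w)
    return w


if __name__ == "__main__":
    # Noise-free synthetic data generated with true slope 3.0 (an assumption).
    data = [(k / 10.0, 3.0 * k / 10.0) for k in range(1, 21)]
    print(sgd_armijo(data, [0.0], batch_size=5, steps=200))
```

In this sketch the line search is evaluated on the same mini-batch that produced the gradient, so a larger `batch_size` makes each step more expensive but less noisy, which is the trade-off behind the critical batch size discussed in the abstract.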
-
Cost-effective search for lower-error region in material parameter space using multifidelity Gaussian process modeling
Authors:
Shion Takeno,
Yuhki Tsukada,
Hitoshi Fukuoka,
Toshiyuki Koyama,
Motoki Shiga,
Masayuki Karasuyama
Abstract:
Information regarding precipitate shapes is critical for estimating material parameters. Hence, we considered estimating a region of material parameter space in which a computational model produces precipitates having shapes similar to those observed in the experimental images. This region, called the lower-error region (LER), reflects intrinsic information of the material contained in the precipitate shapes. However, the computational cost of LER estimation can be high because the accurate computation of the model is required many times to better explore parameters. To overcome this difficulty, we used Gaussian-process-based multifidelity modeling, in which training data can be sampled from multiple computations with different accuracy levels (fidelities). Lower-fidelity samples may have lower accuracy, but their computational cost is lower than that of higher-fidelity samples. Our proposed sampling procedure iteratively determines the most cost-effective pair of a point and a fidelity level for enhancing the accuracy of LER estimation. We demonstrated the efficiency of our method through estimation of the interface energy and lattice mismatch between MgZn2 and α-Mg phases in an Mg-based alloy. The results showed that the sampling cost required to obtain accurate LER estimation could be drastically reduced.
Submitted 15 March, 2020;
originally announced March 2020.
-
Multi-fidelity Bayesian Optimization with Max-value Entropy Search and its parallelization
Authors:
Shion Takeno,
Hitoshi Fukuoka,
Yuhki Tsukada,
Toshiyuki Koyama,
Motoki Shiga,
Ichiro Takeuchi,
Masayuki Karasuyama
Abstract:
In a standard setting of Bayesian optimization (BO), the objective function evaluation is assumed to be highly expensive. Multi-fidelity Bayesian optimization (MFBO) accelerates BO by incorporating lower-fidelity observations available with a lower sampling cost. In this paper, we focus on the information-based approach, which is a popular and empirically successful approach in BO. For MFBO, however, existing information-based methods are plagued by difficulty in estimating the information gain. We propose an approach based on max-value entropy search (MES), which greatly facilitates computations by considering the entropy of the optimal function value instead of the optimal input point. We show that, in our multi-fidelity MES (MF-MES), most of the additional computations, compared with the usual MES, are reduced to analytical computations. Although an additional numerical integration is necessary for the information across different fidelities, this is only in a one-dimensional space and can be performed efficiently and accurately. Further, we also propose parallelization of MF-MES. Since there exist a variety of different sampling costs, queries typically occur asynchronously in MFBO. We show that similar simple computations can be derived for asynchronous parallel MFBO. We demonstrate the effectiveness of our approach by using benchmark datasets and a real-world application to materials science data.
Submitted 12 February, 2020; v1 submitted 24 January, 2019;
originally announced January 2019.
-
An Epistemic Approach to Compositional Reasoning about Anonymity and Privacy
Authors:
Yasuyuki Tsukada,
Hideki Sakurada,
Ken Mano,
Yoshifumi Manabe
Abstract:
In this paper, we present an epistemic logic approach to the compositionality of several privacy-related information-hiding/disclosure properties. The properties considered here are anonymity, privacy, onymity, and identity. Our initial observation reveals that anonymity and privacy are not necessarily sequentially compositional; this means that even though a system comprising several sequential phases satisfies a certain unlinkability property in each phase, the entire system does not always enjoy a desired unlinkability property. We show that the compositionality can be guaranteed provided that the phases of the system satisfy what we call the independence assumptions. More specifically, we develop a series of theoretical case studies of what assumptions are sufficient to guarantee the sequential compositionality of various degrees of anonymity, privacy, onymity, and/or identity properties. Similar results for parallel composition are also discussed.
Submitted 23 October, 2013;
originally announced October 2013.