-
Distilled ChatGPT Topic & Sentiment Modeling with Applications in Finance
Authors:
Olivier Gandouet,
Mouloud Belbahri,
Armelle Jezequel,
Yuriy Bodjov
Abstract:
In this study, ChatGPT is utilized to create streamlined models that generate easily interpretable features. These features are then used to evaluate financial outcomes from earnings calls. We detail a training approach that merges knowledge distillation and transfer learning, resulting in lightweight topic and sentiment classification models without significant loss in accuracy. These models are…
▽ More
In this study, ChatGPT is utilized to create streamlined models that generate easily interpretable features. These features are then used to evaluate financial outcomes from earnings calls. We detail a training approach that merges knowledge distillation and transfer learning, resulting in lightweight topic and sentiment classification models without significant loss in accuracy. These models are assessed through a dataset annotated by experts. The paper also delves into two practical case studies, highlighting how the generated features can be effectively utilized in quantitative investing scenarios.
△ Less
Submitted 4 March, 2024;
originally announced March 2024.
-
A Twin Neural Model for Uplift
Authors:
Mouloud Belbahri,
Olivier Gandouet,
Alejandro Murua,
Vahid Partovi Nia
Abstract:
Uplift is a particular case of conditional treatment effect modeling. Such models deal with cause-and-effect inference for a specific factor, such as a marketing intervention or a medical treatment. In practice, these models are built on individual data from randomized clinical trials where the goal is to partition the participants into heterogeneous groups depending on the uplift. Most existing a…
▽ More
Uplift is a particular case of conditional treatment effect modeling. Such models deal with cause-and-effect inference for a specific factor, such as a marketing intervention or a medical treatment. In practice, these models are built on individual data from randomized clinical trials where the goal is to partition the participants into heterogeneous groups depending on the uplift. Most existing approaches are adaptations of random forests for the uplift case. Several split criteria have been proposed in the literature, all relying on maximizing heterogeneity. However, in practice, these approaches are prone to overfitting. In this work, we bring a new vision to uplift modeling. We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk. Our solution is developed for a specific twin neural network architecture allowing to jointly optimize the marginal probabilities of success for treated and control individuals. We show that this model is a generalization of the uplift logistic interaction model. We modify the stochastic gradient descent algorithm to allow for structured sparse solutions. This helps training our uplift models to a great extent. We show our proposed method is competitive with the state-of-the-art in simulation setting and on real data from large scale randomized experiments.
△ Less
Submitted 11 May, 2021;
originally announced May 2021.
-
How Does Batch Normalization Help Binary Training?
Authors:
Eyyüb Sari,
Mouloud Belbahri,
Vahid Partovi Nia
Abstract:
Binary Neural Networks (BNNs) are difficult to train, and suffer from drop of accuracy. It appears in practice that BNNs fail to train in the absence of Batch Normalization (BatchNorm) layer. We find the main role of BatchNorm is to avoid exploding gradients in the case of BNNs. This finding suggests that the common initialization methods developed for full-precision networks are irrelevant to BNN…
▽ More
Binary Neural Networks (BNNs) are difficult to train, and suffer from drop of accuracy. It appears in practice that BNNs fail to train in the absence of Batch Normalization (BatchNorm) layer. We find the main role of BatchNorm is to avoid exploding gradients in the case of BNNs. This finding suggests that the common initialization methods developed for full-precision networks are irrelevant to BNNs. We build a theoretical study on the role of BatchNorm in binary training, backed up by numerical experiments.
△ Less
Submitted 29 April, 2020; v1 submitted 18 September, 2019;
originally announced September 2019.
-
Active Learning for High-Dimensional Binary Features
Authors:
Ali Vahdat,
Mouloud Belbahri,
Vahid Partovi Nia
Abstract:
Erbium-doped fiber amplifier (EDFA) is an optical amplifier/repeater device used to boost the intensity of optical signals being carried through a fiber optic communication system. A highly accurate EDFA model is important because of its crucial role in optical network management and optimization. The input channels of an EDFA device are treated as either on or off, hence the input features are bi…
▽ More
Erbium-doped fiber amplifier (EDFA) is an optical amplifier/repeater device used to boost the intensity of optical signals being carried through a fiber optic communication system. A highly accurate EDFA model is important because of its crucial role in optical network management and optimization. The input channels of an EDFA device are treated as either on or off, hence the input features are binary. Labeled training data is very expensive to collect for EDFA devices, therefore we devise an active learning strategy suitable for binary variables to overcome this issue. We propose to take advantage of sparse linear models to simplify the predictive model. This approach simultaneously improves prediction and accelerates active learning query generation. We show the performance of our proposed active learning strategies on simulated data and real EDFA data.
△ Less
Submitted 11 June, 2019; v1 submitted 5 February, 2019;
originally announced February 2019.
-
Foothill: A Quasiconvex Regularization for Edge Computing of Deep Neural Networks
Authors:
Mouloud Belbahri,
Eyyüb Sari,
Sajad Darabi,
Vahid Partovi Nia
Abstract:
Deep neural networks (DNNs) have demonstrated success for many supervised learning tasks, ranging from voice recognition, object detection, to image classification. However, their increasing complexity might yield poor generalization error that make them hard to be deployed on edge devices. Quantization is an effective approach to compress DNNs in order to meet these constraints. Using a quasiconv…
▽ More
Deep neural networks (DNNs) have demonstrated success for many supervised learning tasks, ranging from voice recognition, object detection, to image classification. However, their increasing complexity might yield poor generalization error that make them hard to be deployed on edge devices. Quantization is an effective approach to compress DNNs in order to meet these constraints. Using a quasiconvex base function in order to construct a binary quantizer helps training binary neural networks (BNNs) and adding noise to the input data or using a concrete regularization function helps to improve generalization error. Here we introduce foothill function, an infinitely differentiable quasiconvex function. This regularizer is flexible enough to deform towards $L_1$ and $L_2$ penalties. Foothill can be used as a binary quantizer, as a regularizer, or as a loss. In particular, we show this regularizer reduces the accuracy gap between BNNs and their full-precision counterpart for image classification on ImageNet.
△ Less
Submitted 23 May, 2019; v1 submitted 18 January, 2019;
originally announced January 2019.
-
Regularized Binary Network Training
Authors:
Sajad Darabi,
Mouloud Belbahri,
Matthieu Courbariaux,
Vahid Partovi Nia
Abstract:
There is a significant performance gap between Binary Neural Networks (BNNs) and floating point Deep Neural Networks (DNNs). We propose to improve the binary training method, by introducing a new regularization function that encourages training weights around binary values. In addition, we add trainable scaling factors to our regularization functions. Additionally, an improved approximation of the…
▽ More
There is a significant performance gap between Binary Neural Networks (BNNs) and floating point Deep Neural Networks (DNNs). We propose to improve the binary training method, by introducing a new regularization function that encourages training weights around binary values. In addition, we add trainable scaling factors to our regularization functions. Additionally, an improved approximation of the derivative of the sign activation function in the backward computation. These modifications are based on linear operations that are easily implementable into the binary training framework. Experimental results on ImageNet shows our method outperforms the traditional BNN method and XNOR-net.
△ Less
Submitted 21 April, 2020; v1 submitted 31 December, 2018;
originally announced December 2018.