-
Uniform Inference in High-Dimensional Threshold Regression Models
Authors:
Jiatong Li,
Hongqiang Yan
Abstract:
We develop uniform inference for high-dimensional threshold regression parameters and valid inference for the threshold parameter. We first establish oracle inequalities for prediction errors and $\ell_1$ estimation errors for the Lasso estimator of the slope parameters and the threshold parameter, allowing for heteroskedastic non-subgaussian error terms and non-subgaussian covariates. Next, we derive the asymptotic distribution of tests involving an increasing number of slope parameters by debiasing (or desparsifying) the scaled Lasso estimator. The asymptotic distribution of tests without the threshold effect is identical to that with a fixed effect. Moreover, we perform valid inference for the threshold parameter using a subsampling method. Finally, we conduct simulation studies to demonstrate the performance of our method in finite samples.
Submitted 11 April, 2024;
originally announced April 2024.
-
Early Adoption of Generative AI by Global Business Leaders: Insights from an INSEAD Alumni Survey
Authors:
Jason P Davis,
Jian Bai Li
Abstract:
How are new technologies like generative AI quickly adopted and used by executive and managerial leaders to create value in organizations? A survey of INSEAD's global alumni base revealed several intriguing insights into perceptions of and engagement with generative AI across a broad spectrum of demographics, industries, and geographies. Notably, there is a prevailing optimism about the role of generative AI in enhancing productivity and innovation, as evidenced by the 90% of respondents who are excited about its time-saving and efficiency benefits. Analysis revealed different attitudes about adoption and use across demographic variables. Younger respondents are significantly more excited about generative AI and more likely to be using it at work and in personal life than older participants. Those in Europe have a somewhat more distant view of generative AI than those in North America and Asia, in that they see the gains as more likely to be captured by organizations than by individuals, and are less likely to be using it in professional and personal contexts than those in North America and Asia. This may also be related to the fact that those in Europe are more likely to be working in Financial Services and less likely to be working in Information Technology industries than those in North America and Asia. Despite this, those in Europe are more likely to see AGI happening faster than those in North America, although this may reflect less interaction with generative AI in personal and professional contexts. These findings collectively underscore the complex and multifaceted perceptions of generative AI's role in society, pointing to both its promising potential and the challenges it presents.
Submitted 6 April, 2024;
originally announced April 2024.
-
MTRGL: Effective Temporal Correlation Discerning through Multi-modal Temporal Relational Graph Learning
Authors:
Junwei Su,
Shan Wu,
**hui Li
Abstract:
In this study, we explore the synergy of deep learning and financial market applications, focusing on pair trading. This market-neutral strategy is integral to quantitative finance and is apt for advanced deep-learning techniques. A pivotal challenge in pair trading is discerning temporal correlations among entities, necessitating the integration of diverse data modalities. Addressing this, we introduce a novel framework, Multi-modal Temporal Relation Graph Learning (MTRGL). MTRGL combines time series data and discrete features into a temporal graph and employs a memory-based temporal graph neural network. This approach reframes temporal correlation identification as a temporal graph link prediction task, which has shown empirical success. Our experiments on real-world datasets confirm the superior performance of MTRGL, emphasizing its promise in refining automated pair trading strategies.
Submitted 5 February, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
A Day-to-Day Dynamical Approach to the Most Likely User Equilibrium Problem
Authors:
Jiayang Li,
Qianni Wang,
Liyang Feng,
Jun Xie,
Yu Marco Nie
Abstract:
The lack of a unique user equilibrium (UE) route flow in traffic assignment has posed a significant challenge to many transportation applications. The maximum-entropy principle, which advocates for the consistent selection of the most likely solution as a representative, is often used to address the challenge. Built on a recently proposed day-to-day (DTD) discrete-time dynamical model called cumulative logit (CULO), this study provides a new behavioral underpinning for the maximum-entropy UE (MEUE) route flow. It has been proven that CULO can reach a UE state without presuming travelers are perfectly rational. Here, we further establish that CULO always converges to the MEUE route flow if (i) travelers have zero prior information about routes and thus are forced to give all routes an equal choice probability, or (ii) all travelers gather information from the same source such that the so-called general proportionality condition is satisfied. Thus, CULO may be used as a practical solution algorithm for the MEUE problem. To put this idea into practice, we propose to eliminate the route enumeration requirement of the original CULO model through an iterative route discovery scheme. We also examine the discrete-time versions of four popular continuous-time dynamical models and compare them to CULO. The analysis shows that the replicator dynamic is the only one that has the potential to reach the MEUE solution with some regularity. The analytical results are confirmed through numerical experiments.
Submitted 15 January, 2024;
originally announced January 2024.
-
Review on Decarbonizing the Transportation Sector in China: Overview, Analysis, and Perspectives
Authors:
Jiewei Li,
Ling **,
Han Deng,
Lin Yang
Abstract:
This review identifies challenges and effective strategies to decarbonize China's rapidly growing transportation sector, currently the third largest carbon emitter, considering China's commitment to peak carbon emissions by 2030 and achieve carbon neutrality by 2060. Key challenges include rising travel demand, unreached peak car ownership, declining bus ridership, gaps between energy technology research and practical application, and limited institutional capacity for decarbonization. This review categorizes current decarbonization measures, strategies, and policies in China's transportation sector using the "Avoid, Shift, Improve" framework, complemented by a novel strategic vector of "Institutional Capacity & Technology Development" to capture broader development perspectives. This comprehensive analysis aims to facilitate informed decision-making and promote collaborative strategies for China's transition to a sustainable transportation future.
Submitted 1 October, 2023;
originally announced October 2023.
-
Blockchain-based Decentralized Co-governance: Innovations and Solutions for Sustainable Crowdfunding
Authors:
Bingyou Chen,
Yu Luo,
Jieni Li,
Yujian Li,
Ying Liu,
Fan Yang,
Junge Bo,
Yanan Qiao
Abstract:
This thesis provides an in-depth exploration of the Decentralized Co-governance Crowdfunding (DCC) Ecosystem, a novel solution addressing prevailing challenges in conventional crowdfunding methods faced by MSMEs and innovative projects. Among the problems it seeks to mitigate are high transaction costs, lack of transparency, fraud, and inefficient resource allocation. Leveraging a comprehensive review of the existing literature on crowdfunding economic activities and blockchain's impact on organizational governance, we propose a transformative socio-economic model based on digital tokens and decentralized co-governance. This ecosystem is marked by a tripartite community structure - the Labor, Capital, and Governance communities - each contributing uniquely to the ecosystem's operation. Our research unfolds the evolution of the DCC ecosystem through distinct phases, offering a novel understanding of socioeconomic dynamics in a decentralized digital world. It also delves into the intricate governance mechanism of the ecosystem, ensuring integrity, fairness, and a balanced distribution of value and wealth.
Submitted 2 June, 2023; v1 submitted 1 June, 2023;
originally announced June 2023.
-
A Scoping Review of Internal Migration and Left-behind Children's Wellbeing in China
Authors:
**kai Li
Abstract:
The well-being of migrants' children faces several challenges related to physical, mental, and educational risks, which may hinder human capital accumulation and further development. In rural China, due to the restrictions of the Hukou registration system, nearly 9 million left-behind children (LBC) lacked parental care and supervision in 2020 because their parents had migrated internally for work. Through a systematic scoping review, this study provides a comprehensive literature summary and concludes that parental migration has overall negative effects on LBC's physical, mental (especially for left-behind girls), and educational outcomes (especially for left-behind boys). Notably, migration by both parents, or by the mother, may exacerbate LBC's disadvantages. Furthermore, remittances from migrants and more family-level and social support may help mitigate the negative influence. Finally, we put forward theoretical and practical implications that may shed light on potential research directions. Further studies, especially quantitative studies, are needed to conduct longitudinal surveys, account for the ongoing Hukou reform in China, and focus simultaneously on left-behind children and migrant children.
Submitted 7 May, 2023;
originally announced May 2023.
-
Hedonic Prices and Quality Adjusted Price Indices Powered by AI
Authors:
Patrick Bajari,
Zhihao Cen,
Victor Chernozhukov,
Manoj Manukonda,
Suhas Vijaykumar,
** Wang,
Ramon Huerta,
Junbo Li,
Ling Leng,
George Monokroussos,
Shan Wan
Abstract:
Accurate, real-time measurements of price index changes using electronic records are essential for tracking inflation and productivity in today's economic environment. We develop empirical hedonic models that can process large amounts of unstructured product data (text, images, prices, quantities) and output accurate hedonic price estimates and derived indices. To accomplish this, we generate abstract product attributes, or "features," from text descriptions and images using deep neural networks, and then use these attributes to estimate the hedonic price function. Specifically, we convert textual information about the product to numeric features using large language models based on transformers, trained or fine-tuned using product descriptions, and convert the product image to numeric features using a residual network model. To produce the estimated hedonic price function, we again use a multi-task neural network trained to predict a product's price in all time periods simultaneously. To demonstrate the performance of this approach, we apply the models to Amazon's data for first-party apparel sales and estimate hedonic prices. The resulting models have high predictive accuracy, with $R^2$ ranging from $80\%$ to $90\%$. Finally, we construct the AI-based hedonic Fisher price index, chained at the year-over-year frequency. We contrast the index with the CPI and other electronic indices.
Submitted 28 April, 2023;
originally announced May 2023.
-
Wardrop Equilibrium Can Be Boundedly Rational: A New Behavioral Theory of Route Choice
Authors:
Jiayang Li,
Zhaoran Wang,
Yu Marco Nie
Abstract:
As one of the most fundamental concepts in transportation science, Wardrop equilibrium (WE) has always had a relatively weak behavioral underpinning. To strengthen this foundation, one must reckon with bounded rationality in human decision-making processes, such as the lack of accurate information, limited computing power, and sub-optimal choices. This retreat from behavioral perfectionism in the literature, however, was typically accompanied by a conceptual modification of WE. Here, we show that giving up perfect rationality need not force a departure from WE. On the contrary, WE can be reached with global stability in a routing game played by boundedly rational travelers. We achieve this result by developing a day-to-day (DTD) dynamical model that mimics how travelers gradually adjust their route valuations, hence choice probabilities, based on past experiences. Our model, called cumulative logit (CumLog), resembles the classical DTD models but makes a crucial change: whereas the classical models assume routes are valued based on the cost averaged over historical data, ours values the routes based on the cost accumulated. To describe route choice behaviors, the CumLog model only uses two parameters, one accounting for the rate at which the future route cost is discounted in the valuation relative to the past ones and the other describing the sensitivity of route choice probabilities to valuation differences. We prove that CumLog always converges to WE, regardless of the initial point, as long as the behavioral parameters satisfy certain mild conditions. Our theory thus upholds WE's role as a benchmark in transportation systems analysis. It also resolves the theoretical challenge posed by Harsanyi's instability problem by explaining why equally good routes at WE are selected with different probabilities.
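The idea of valuing routes by accumulated rather than averaged cost can be illustrated with a toy simulation. The following is a minimal sketch and not the authors' exact formulation: the two-route network, its linear cost functions, the constant step size `step`, and the logit parameter `sensitivity` are all assumptions chosen for illustration.

```python
import math


def route_costs(flows):
    """Hypothetical linear costs on two parallel routes (unit total demand)."""
    x1, x2 = flows
    return [1.0 + 2.0 * x1, 2.0 + 1.0 * x2]


def logit(valuations, sensitivity):
    """Logit choice probabilities; a lower valuation (cost) is more attractive."""
    base = min(valuations)  # shift by the minimum for numerical stability
    weights = [math.exp(-sensitivity * (v - base)) for v in valuations]
    total = sum(weights)
    return [w / total for w in weights]


def simulate(days=200, step=0.5, sensitivity=1.0):
    """Each day, add the experienced route costs to the running valuations."""
    valuations = [0.0, 0.0]
    for _ in range(days):
        probs = logit(valuations, sensitivity)
        costs = route_costs(probs)  # flows equal choice probabilities here
        valuations = [v + step * c for v, c in zip(valuations, costs)]
    return logit(valuations, sensitivity)


if __name__ == "__main__":
    flows = simulate()
    print(flows, route_costs(flows))
```

In this toy network the two route costs are equal when the first route carries two thirds of the demand, so the simulated flows can be compared against that split to check that the accumulated-cost dynamic settles where both routes cost the same.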
Submitted 6 February, 2024; v1 submitted 5 April, 2023;
originally announced April 2023.
-
Macro carbon price prediction with support vector regression and Paris accord targets
Authors:
**hui Li
Abstract:
Achieving carbon neutrality is an urgent task for society because of the threat of global warming, and carbon trading is an essential market mechanism for meeting carbon reduction targets. Macro carbon price prediction is vital to effective management and decision-making in the carbon market. We focus on the EU carbon market and choose the oil price, coal price, gas price, and DAX index as the four market factors for predicting the carbon price. We also select the carbon emission targets from the Paris Accord as the political factor, reflecting a macro view of carbon price prediction. We then use these five factors as inputs to predict the yearly carbon price in 2030 with support vector regression models. We use grid search and cross-validation to guard the prediction performance of our models. We believe this model will be widely applicable to macro carbon price prediction.
Submitted 29 November, 2022;
originally announced December 2022.
-
Simultaneous Inference of a Partially Linear Model in Time Series
Authors:
Jiaqi Li,
Likai Chen,
Kun Ho Kim,
Tianwei Zhou
Abstract:
We introduce a new methodology to conduct simultaneous inference of the nonparametric component in partially linear time series regression models where the nonparametric part is a multivariate unknown function. In particular, we construct a simultaneous confidence region (SCR) for the multivariate function by extending the high-dimensional Gaussian approximation to dependent processes with continuous index sets. Our results allow for a more general dependence structure compared to previous works and are widely applicable to a variety of linear and nonlinear autoregressive processes. We demonstrate the validity of our proposed methodology by examining the finite-sample performance in the simulation study. Finally, an application in time series, the forward premium regression, is presented, where we construct the SCR for the foreign exchange risk premium from the exchange rate and macroeconomic data.
Submitted 2 September, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
Synthetic Principal Component Design: Fast Covariate Balancing with Synthetic Controls
Authors:
Yi** Lu,
Jia** Li,
Lexing Ying,
Jose Blanchet
Abstract:
The optimal design of experiments typically involves solving an NP-hard combinatorial optimization problem. In this paper, we aim to develop a globally convergent and practically efficient optimization algorithm. Specifically, we consider a setting where the pre-treatment outcome data is available and the synthetic control estimator is invoked. The average treatment effect is estimated via the difference between the weighted average outcomes of the treated and control units, where the weights are learned from the observed data. Under this setting, we surprisingly observed that the optimal experimental design problem could be reduced to a so-called phase synchronization problem. We solve this problem via a normalized variant of the generalized power method with spectral initialization. On the theoretical side, we establish the first global optimality guarantee for experiment design when pre-treatment data is sampled from certain data-generating processes. Empirically, we conduct extensive experiments to demonstrate the effectiveness of our method on both the US Bureau of Labor Statistics data and the Abadie-Diamond-Hainmueller California Smoking Data. In terms of the root mean square error, our algorithm surpasses the random design by a large margin.
Submitted 28 November, 2022;
originally announced November 2022.
-
How a Brand's Social Activism Impacts Consumers' Brand Evaluations: The Role of Brand Relationship Norms
Authors:
**g**g Li,
Nicole Montgomery,
Reza Mousavi
Abstract:
With the proliferation of social activism online, brands face heightened pressure from consumers to publicly address these issues. Yet, the optimal brand response strategy (i.e., whether and how to respond) in these contexts remains unclear. This research investigates consumers' reactions to brand response strategies (e.g., engage vs. not) during social activism and offers potentially effective responses that brands can employ to engage in these issues. By analyzing real-world data collected from Twitter and conducting four randomized experiments, this research discovers that brand relationship type (exchange, communal) affects consumers' brand evaluations in the wake of social activism. Communal (vs. exchange) brands are evaluated less favorably when they do not respond or utilize a low-empathy response. This difference is attenuated when brands employ a high-empathy response. These findings are attributable to consumers' perceptions of whether the brand's response strategy complies with relationship norms during social activism. The effects persist across activism events that vary in their political polarization. This research contributes to the literatures on brand engagement in social activism, brand relationships, and crisis communication. The findings also offer guidance to practitioners on crafting response strategies during social activism and aid activists in securing brand support for societal benefits.
Submitted 6 September, 2023; v1 submitted 19 October, 2022;
originally announced October 2022.
-
A Data Science Pipeline for Algorithmic Trading: A Comparative Study of Applications for Finance and Cryptoeconomics
Authors:
Luyao Zhang,
Tianyu Wu,
Saad Lahrichi,
Carlos-Gustavo Salas-Flores,
Jiayi Li
Abstract:
Recent advances in Artificial Intelligence (AI) have made algorithmic trading play a central role in finance. However, current research and applications are disconnected information islands. We propose a generally applicable pipeline for designing, programming, and evaluating the algorithmic trading of stock and crypto assets. Moreover, we demonstrate how our data science pipeline works with respect to four conventional algorithms: the moving average crossover, volume-weighted average price, sentiment analysis, and statistical arbitrage algorithms. Our study offers a systematic way to program, evaluate, and compare different trading strategies. Furthermore, we implement our algorithms through object-oriented programming in Python3, which serves as open-source software for future academic research and applications.
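Of the four conventional algorithms named above, the moving average crossover is the simplest to illustrate. The following is a minimal sketch using only the Python standard library; the function names, window lengths, and price series are hypothetical and are not taken from the authors' open-source pipeline.

```python
def moving_average(prices, window):
    """Simple moving average; None until `window` observations are available."""
    averages = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]  # drop the value leaving the window
        averages.append(running / window if i >= window - 1 else None)
    return averages


def crossover_signals(prices, short_window=2, long_window=4):
    """Emit (index, 'buy') when the short average crosses above the long
    average, and (index, 'sell') when it crosses below."""
    fast = moving_average(prices, short_window)
    slow = moving_average(prices, long_window)
    signals = []
    for i in range(1, len(prices)):
        if None in (fast[i - 1], slow[i - 1], fast[i], slow[i]):
            continue  # not enough history yet
        if fast[i - 1] <= slow[i - 1] and fast[i] > slow[i]:
            signals.append((i, "buy"))
        elif fast[i - 1] >= slow[i - 1] and fast[i] < slow[i]:
            signals.append((i, "sell"))
    return signals


if __name__ == "__main__":
    # Hypothetical price series: a dip, a recovery, then another decline.
    series = [10, 9, 8, 7, 6, 7, 8, 9, 10, 9, 8, 7, 6]
    print(crossover_signals(series))
```

On this illustrative series the sketch signals a buy during the recovery and a sell once prices turn down again. A real strategy would also need transaction costs, position sizing, and out-of-sample evaluation, which this sketch omits.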
Submitted 29 June, 2022;
originally announced June 2022.
-
Budget-Constrained Auctions with Unassured Priors: Strategic Equivalence and Structural Properties
Authors:
Zhaohua Chen,
Mingwei Yang,
Chang Wang,
Jicheng Li,
Zheng Cai,
Yukun Ren,
Zhihua Zhu,
Xiaotie Deng
Abstract:
In today's online advertising markets, it is common for advertisers to set long-term budgets. Correspondingly, advertising platforms adopt budget control methods to ensure that advertisers' payments lie within their budgets. Most budget control methods rely on the value distributions of advertisers. However, due to the complex advertising landscape and potential privacy concerns, the platform hardly learns advertisers' true priors. Thus, it is crucial to understand how budget control auction mechanisms perform under unassured priors.
This work answers this problem from multiple aspects. We consider the unassured prior game among the seller and all buyers induced by different mechanisms in the stochastic model. We restrict the parameterized mechanisms to satisfy the budget-extracting condition, which maximizes the seller's revenue by extracting buyers' budgets as effectively as possible. Our main result shows that the Bayesian revenue-optimal mechanism and the budget-extracting bid-discount first-price mechanism yield the same set of Nash equilibrium outcomes in the unassured prior game. This implies that simple mechanisms can be as robust as the optimal mechanism under unassured priors in the budget-constrained setting. In the symmetric case, we further show that all these five (budget-extracting) mechanisms share the same set of possible outcomes. We further dig into the structural properties of these mechanisms. We characterize sufficient and necessary conditions on the budget-extracting parameter tuple for bid-discount/pacing first-price auctions. Meanwhile, when buyers do not take strategic behaviors, we exploit the dominance relationships of these mechanisms by revealing their intrinsic structures.
Submitted 10 February, 2024; v1 submitted 31 March, 2022;
originally announced March 2022.
-
NumHTML: Numeric-Oriented Hierarchical Transformer Model for Multi-task Financial Forecasting
Authors:
Linyi Yang,
Jiazheng Li,
Ruihai Dong,
Yue Zhang,
Barry Smyth
Abstract:
Financial forecasting has been an important and active area of machine learning research because of the challenges it presents and the potential rewards that even minor improvements in prediction accuracy or forecasting may entail. Traditionally, financial forecasting has heavily relied on quantitative indicators and metrics derived from structured financial statements. Earnings conference call data, including text and audio, is an important source of unstructured data that has been used for various prediction tasks using deep learning and related approaches. However, current deep learning-based methods are limited in the way that they deal with numeric data; numbers are typically treated as plain-text tokens without taking advantage of their underlying numeric structure. This paper describes a numeric-oriented hierarchical transformer model to predict stock returns and financial risk using multi-modal aligned earnings calls data by taking advantage of the different categories of numbers (monetary, temporal, percentages, etc.) and their magnitude. We present the results of a comprehensive evaluation of NumHTML against several state-of-the-art baselines using a real-world publicly available dataset. The results indicate that NumHTML significantly outperforms the current state-of-the-art across a variety of evaluation metrics and that it has the potential to offer significant financial gains in a practical trading context.
Submitted 5 January, 2022;
originally announced January 2022.
-
Efficient Likelihood-based Estimation via Annealing for Dynamic Structural Macrofinance Models
Authors:
Andras Fulop,
Jeremy Heng,
Junye Li
Abstract:
Most solved dynamic structural macrofinance models are non-linear and/or non-Gaussian state-space models with high-dimensional and complex structures. We propose an annealed controlled sequential Monte Carlo method that delivers numerically stable and low variance estimators of the likelihood function. The method relies on an annealing procedure to gradually introduce information from observations and constructs globally optimal proposal distributions by solving associated optimal control problems that yield zero variance likelihood estimators. To perform parameter inference, we develop a new adaptive SMC$^2$ algorithm that employs likelihood estimators from annealed controlled sequential Monte Carlo. We provide a theoretical stability analysis that elucidates the advantages of our methodology and asymptotic results concerning the consistency and convergence rates of our SMC$^2$ estimators. We illustrate the strengths of our proposed methodology by estimating two popular macrofinance models: a non-linear new Keynesian dynamic stochastic general equilibrium model and a non-linear non-Gaussian consumption-based long-run risk model.
Submitted 4 January, 2022;
originally announced January 2022.
-
Solving the Data Sparsity Problem in Predicting the Success of the Startups with Machine Learning Methods
Authors:
Dafei Yin,
**g Li,
Gaosheng Wu
Abstract:
Predicting the success of startup companies is of great importance for both startup companies and investors. It is difficult due to the lack of available data and appropriate general methods. With data platforms like Crunchbase aggregating the information of startup companies, it is possible to predict with machine learning algorithms. Existing research suffers from the data sparsity problem as most early-stage startup companies do not have much data available to the public. We try to leverage the recent algorithms to solve this problem. We investigate several machine learning algorithms with a large dataset from Crunchbase. The results suggest that LightGBM and XGBoost perform best and achieve 53.03% and 52.96% F1 scores. We interpret the predictions from the perspective of feature contribution. We construct portfolios based on the models and achieve high success rates. These findings have substantial implications on how machine learning methods can help startup companies and investors.
Submitted 15 December, 2021;
originally announced December 2021.
-
Dynamic Selection in Algorithmic Decision-making
Authors:
** Li,
Ye Luo,
Xiaowei Zhang
Abstract:
This paper identifies and addresses dynamic selection problems in online learning algorithms with endogenous data. In a contextual multi-armed bandit model, a novel bias (self-fulfilling bias) arises because the endogeneity of the data influences the choices of decisions, affecting the distribution of future data to be collected and analyzed. We propose an instrumental-variable-based algorithm to correct for the bias. It obtains true parameter values and attains low (logarithmic-like) regret levels. We also prove a central limit theorem for statistical inference. To establish the theoretical properties, we develop a general technique that untangles the interdependence between data and actions.
Submitted 27 September, 2023; v1 submitted 27 August, 2021;
originally announced August 2021.
-
Causal Reinforcement Learning: An Instrumental Variable Approach
Authors:
** Li,
Ye Luo,
Xiaowei Zhang
Abstract:
In the standard data analysis framework, data is first collected (once for all), and then data analysis is carried out. Moreover, the data-generating process is typically assumed to be exogenous. This approach is natural when the data analyst has no impact on how the data is generated. The advancement of digital technology, however, has facilitated firms to learn from data and make decisions at the same time. As these decisions generate new data, the data analyst -- a business manager or an algorithm -- also becomes the data generator. This interaction generates a new type of bias -- reinforcement bias -- that exacerbates the endogeneity problem in static data analysis. Causal inference techniques ought to be incorporated into reinforcement learning to address such issues.
Submitted 2 September, 2022; v1 submitted 5 March, 2021;
originally announced March 2021.
-
Algorithmic subsampling under multiway clustering
Authors:
Harold D. Chiang,
Jiatong Li,
Yuya Sasaki
Abstract:
This paper proposes a novel method of algorithmic subsampling (data sketching) for multiway cluster dependent data. We establish a new uniform weak law of large numbers and a new central limit theorem for the multiway algorithmic subsample means. Consequently, we discover an additional advantage of the algorithmic subsampling that it allows for robustness against potential degeneracy, and even non-Gaussian degeneracy, of the asymptotic distribution under multiway clustering. Simulation studies support this novel result, and demonstrate that inference with the algorithmic subsampling entails more accuracy than that without the algorithmic subsampling. Applying these basic asymptotic theories, we derive the consistency and the asymptotic normality for the multiway algorithmic subsampling generalized method of moments estimator and for the multiway algorithmic subsampling M-estimator. We illustrate an application to scanner data.
Submitted 30 October, 2022; v1 submitted 28 February, 2021;
originally announced March 2021.
-
The Impact of COVID-19 and Policy Responses on Australian Income Distribution and Poverty
Authors:
****g Li,
Yogi Vidyattama,
Hai Anh La,
Riyana Miranti,
Denisa M Sologon
Abstract:
This paper undertakes a near real-time analysis of the income distribution effects of the COVID-19 crisis in Australia to understand the ongoing changes in the income distribution as well as the impact of policy responses. By semi-parametrically combining incomplete observed data from three different sources, namely, the Monthly Longitudinal Labour Force Survey, the Survey of Income and Housing and the administrative payroll data, we estimate the impact of COVID-19 and the associated policy responses on the Australian income distribution between February and June 2020, covering the immediate periods before and after the initial outbreak. Our results suggest that despite the growth in unemployment, the Gini of the equalised disposable income inequality dropped by nearly 0.03 points since February. The reduction is because of the additional wage subsidies and welfare supports offered as part of the policy response, offsetting a potential surge in income inequality. Additionally, the poverty rate, which could have doubled in the absence of the government response, also reduced by 3 to 4 percentage points. The result shows the effectiveness of temporary policy measures in maintaining both the living standards and the level of income inequality. However, the heavy reliance on the support measures raises the possibility that the changes in the income distribution may be reversed and even substantially worsened should the measures be withdrawn.
Submitted 8 September, 2020;
originally announced September 2020.
-
Permutation-based tests for discontinuities in event studies
Authors:
Federico A. Bugni,
Jia Li,
Qiyuan Li
Abstract:
We propose using a permutation test to detect discontinuities in an underlying economic model at a known cutoff point. Relative to the existing literature, we show that this test is well suited for event studies based on time-series data. The test statistic measures the distance between the empirical distribution functions of observed data in two local subsamples on the two sides of the cutoff. Critical values are computed via a standard permutation algorithm. Under a high-level condition that the observed data can be coupled by a collection of conditionally independent variables, we establish the asymptotic validity of the permutation test, allowing the sizes of the local subsamples to either be fixed or grow to infinity. In the latter case, we also establish that the permutation test is consistent. We demonstrate that our high-level condition can be verified in a broad range of problems in the infill asymptotic time-series setting, which justifies using the permutation test to detect jumps in economic variables such as volatility, trading activity, and liquidity. These potential applications are illustrated in an empirical case study for selected FOMC announcements during the ongoing COVID-19 pandemic.
Submitted 10 July, 2022; v1 submitted 19 July, 2020;
originally announced July 2020.
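The permutation procedure described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of the sup (Kolmogorov-Smirnov type) distance between the two empirical distribution functions, the function names `ks_distance` and `permutation_test`, and the defaults `n_perm` and `seed` are all assumptions made for illustration, since the abstract does not specify which distance is used.

```python
import random
from bisect import bisect_right


def ks_distance(left, right):
    """Largest gap between the empirical distribution functions of two samples.

    Assumption: a sup-norm distance; the abstract only says "distance".
    """
    a, b = sorted(left), sorted(right)
    # The sup over the real line is attained at one of the observed points.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def permutation_test(left, right, n_perm=999, seed=0):
    """Return (observed statistic, permutation p-value).

    left, right: observations in the local subsamples on each side of the cutoff.
    """
    rng = random.Random(seed)
    observed = ks_distance(left, right)
    pooled = list(left) + list(right)
    n_left = len(left)
    exceed = 0
    for _ in range(n_perm):
        # Randomly reassign pooled observations to the two sides of the cutoff.
        rng.shuffle(pooled)
        if ks_distance(pooled[:n_left], pooled[n_left:]) >= observed:
            exceed += 1
    # Counting the observed arrangement itself keeps the p-value strictly positive.
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value
```

A small p-value suggests that the distributions on the two sides of the cutoff differ. In this sketch the local subsamples are simply passed in by the caller; how they are selected around the cutoff is outside its scope.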
-
Mercury-related health benefits from retrofitting coal-fired power plants in China
Authors:
Jiashuo Li,
Sili Zhou,
Wendong Wei,
Jianchuan Qi,
Yumeng Li,
Bin Chen,
Ning Zhang,
Dabo Guan,
Haoqi Qian,
Xiaohui Wu,
Jiawen Miao,
Long Chen,
Sai Liang,
Kuishuang Feng
Abstract:
China has implemented retrofitting measures in coal-fired power plants (CFPPs) to reduce air pollution through small unit shutdown (SUS), the installation of air pollution control devices (APCDs) and power generation efficiency (PGE) improvement. The reductions in highly toxic Hg emissions and their related health impacts by these measures have not been well studied. To refine mitigation options, we evaluated the health benefits of reduced Hg emissions via retrofitting measures during China's 12th Five-Year Plan by combining plant-level Hg emission inventories with the China Hg Risk Source-Tracking Model. We found that the measures reduced Hg emissions by 23.5 tons (approximately 1/5 of that from CFPPs in 2010), preventing 0.0021 points of per-foetus intelligence quotient (IQ) decrements and 114 deaths from fatal heart attacks. These benefits were dominated by CFPP shutdowns and APCD installations. Provincial health benefits were largely attributable to Hg reductions in other regions. We also demonstrated the necessity of considering human health impacts, rather than just Hg emission reductions, in selecting Hg control devices. This study also suggests that Hg control strategies should consider various factors, such as CFPP locations, population densities and trade-offs between reductions of total Hg (THg) and Hg$^{2+}$.
Submitted 14 May, 2020;
originally announced May 2020.
-
Strategically Simple Mechanisms
Authors:
Tilman Borgers,
Jiangtao Li
Abstract:
We define and investigate a property of mechanisms that we call "strategic simplicity," and that is meant to capture the idea that, in strategically simple mechanisms, strategic choices require limited strategic sophistication. We define a mechanism to be strategically simple if choices can be based on first-order beliefs about the other agents' preferences and first-order certainty about the other agents' rationality alone, and there is no need for agents to form higher-order beliefs, because such beliefs are irrelevant to the optimal strategies. All dominant strategy mechanisms are strategically simple. But many more mechanisms are strategically simple. In particular, strategically simple mechanisms may be more flexible than dominant strategy mechanisms in the bilateral trade problem and the voting problem.
Submitted 3 December, 2018;
originally announced December 2018.
-
Moment Inequalities in the Context of Simulated and Predicted Variables
Authors:
Hiroaki Kaido,
Jiaxuan Li,
Marc Rysman
Abstract:
This paper explores the effects of simulated moments on the performance of inference methods based on moment inequalities. Commonly used confidence sets for parameters are level sets of criterion functions whose boundary points may depend on sample moments in an irregular manner. Due to this feature, simulation errors can affect the performance of inference in non-standard ways. In particular, a (first-order) bias due to the simulation errors may remain in the estimated boundary of the confidence set. We demonstrate, through Monte Carlo experiments, that simulation errors can significantly reduce the coverage probabilities of confidence sets in small samples. The size distortion is particularly severe when the number of inequality restrictions is large. These results highlight the danger of ignoring the sampling variations due to the simulation errors in moment inequality models. Similar issues arise when using predicted variables in moment inequality models. We propose a method for properly correcting for these variations based on regularizing the intersection of moments in parameter space, and we show that our proposed method performs well theoretically and in practice.
Submitted 10 April, 2018;
originally announced April 2018.