Hybrid Approach to Parallel Stochastic Gradient Descent

Authors: Aakash Sudhirbhai Vora, Dhrumil Chetankumar Joshi, Aksh Kantibhai Patel

Abstract: Stochastic Gradient Descent is used for large datasets to train models to reduce the training time. On top of that, data parallelism is widely used as a method to efficiently train neural networks using multiple worker nodes in parallel. Synchronous and asynchronous approaches to data parallelism are used by most systems to train the model in parallel. However, both of them have their drawbacks. We propose a third approach to data parallelism which is a hybrid between synchronous and asynchronous approaches, using both approaches to train the neural network. When the threshold function is selected appropriately to gradually shift all parameter aggregation from asynchronous to synchronous, we show that in a given time period our hybrid approach outperforms both asynchronous and synchronous approaches.

Submitted 27 June, 2024; originally announced July 2024.

arXiv:2307.02644

Achievable Rates for Information Extraction from a Strategic Sender

Authors: Anuj S. Vora, Ankur A. Kulkarni

Abstract: We consider a setting of non-cooperative communication where a receiver wants to recover randomly generated sequences of symbols that are observed by a strategic sender. The sender aims to maximize an average utility that may not align with the recovery criterion of the receiver, whereby the received signals may not be truthful. We pose this problem as a sequential game between the sender and the receiver with the receiver as the leader and determine 'achievable strategies' for the receiver that attain arbitrarily small probability of error for large blocklengths. We show the existence of such achievable strategies under a sufficient condition on the utility of the sender. For the case of the binary alphabet, this condition is also necessary, in the absence of which, the probability of error goes to one for all choices of strategies of the receiver. We show that for reliable recovery, the receiver chooses to correctly decode only a subset of messages received from the sender and deliberately makes an error on messages outside this subset. Due to this decoding strategy, despite a clean channel, our setting exhibits a notion of maximum rate of communication above which the probability of error may not vanish asymptotically and in certain cases, may even tend to one. For the case of the binary alphabet, the maximum rate may be strictly less than unity for certain classes of utilities.

Submitted 5 July, 2023; originally announced July 2023.

Comments: Submitted to IEEE Transactions on Information Theory

arXiv:2010.15008

Optimal Questionnaires for Screening of Strategic Agents

Authors: Anuj S. Vora, Ankur A. Kulkarni

Abstract: During the COVID-19 pandemic, the health authorities at airports and train stations try to screen and identify the travellers possibly exposed to the virus. However, many individuals avoid getting tested and hence may misreport their travel history. This is a challenge for the health authorities who wish to ascertain the truly susceptible cases in spite of this strategic misreporting. We investigate the problem of questioning travellers to classify them for further testing when the travellers are strategic or are unwilling to reveal their travel histories. We show there are fundamental limits to how many travel histories the health authorities can recover.

Submitted 28 October, 2020; originally announced October 2020.

Comments: Longer version of our paper submitted to ICASSP 2021

MSC Class: 91A28; 94D99

arXiv:2006.10641

Shannon meets Myerson: Information Extraction from a Strategic Sender

Authors: Anuj S. Vora, Ankur A. Kulkarni

Abstract: We study a setting where a receiver must design a questionnaire to recover a sequence of symbols known to a strategic sender, whose utility may not be incentive compatible. We allow the receiver the possibility of selecting the alternatives presented in the questionnaire, and thereby linking decisions across the components of the sequence. We show that, despite the strategic sender and the noise in the channel, the receiver can recover exponentially many sequences, but also that exponentially many sequences are unrecoverable even by the best strategy. We define the growth rate of the number of recovered sequences as the information extraction capacity. A generalization of the Shannon capacity, it characterizes the optimal amount of communication resources required. We derive bounds leading to an exact evaluation of the information extraction capacity in many cases. Our results form the building blocks of a novel, noncooperative regime of communication involving a strategic sender.

Submitted 15 September, 2022; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: Submitted to Games and Economic Behaviour

arXiv:1907.05324

Minimax Theorems for Finite Blocklength Lossy Joint Source-Channel Coding over an AVC

Authors: Anuj S. Vora, Ankur A. Kulkarni

Abstract: Motivated by applications in the security of cyber-physical systems, we pose the finite blocklength communication problem in the presence of a jammer as a zero-sum game between the encoder-decoder team and the jammer, by allowing the communicating team as well as the jammer only locally randomized strategies. The communicating team's problem is non-convex under locally randomized codes, and hence, in general, a minimax theorem need not hold for this game. However, we show that approximate minimax theorems hold in the sense that the minimax and maximin values of the game approach each other asymptotically. In particular, for rates strictly below a critical threshold, both the minimax and maximin values approach zero, and for rates strictly above it, they both approach unity. We then show a second order minimax theorem, i.e., for rates approaching the threshold along a specific scaling, the minimax and maximin values approach the same constant value, which is neither zero nor one. Critical to these results is our derivation of finite blocklength bounds on the minimax and maximin values of the game and our derivation of second order dispersion-based bounds.

Submitted 11 July, 2019; originally announced July 2019.

Comments: Under review with Problems of Information Transmission

MSC Class: 94A15; 91A99

Showing 1–5 of 5 results for author: Vora, A S