
Impact of Geographic Diversity on Citation of Collaborative Research

Authors: Cian Naik, Cassidy R. Sugimoto, Vincent Larivière, Chenlei Leng, Weisi Guo

Abstract: Diversity in human capital is widely seen as critical to creating holistic and high-quality research, especially in areas that engage with diverse cultures, environments, and challenges. Quantification of diverse academic collaborations and their effect on research quality is lacking, especially at international scale and across different domains. Here, we present the first effort to measure the impact of geographic diversity in coauthorships on the citation of their papers across different academic domains. Our results unequivocally show that geographic coauthor diversity improves paper citation, but very long distance collaborations have variable impact. We also discover "well-trodden" collaboration circles that yield much less impact than similar travel distances. These relationships are observed to exist across different subject areas, but with varying strengths. These findings can help academics identify new opportunities from a diversity perspective, as well as inform funders on areas that require additional mobility support.

Submitted 25 April, 2022; originally announced April 2022.

Comments: 26 pages, 16 figures

arXiv:2203.09675 [pdf, other]

Fast Bayesian Coresets via Subsampling and Quasi-Newton Refinement

Authors: Cian Naik, Judith Rousseau, Trevor Campbell

Abstract: Bayesian coresets approximate a posterior distribution by building a small weighted subset of the data points. Any inference procedure that is too computationally expensive to be run on the full posterior can instead be run inexpensively on the coreset, with results that approximate those on the full data. However, current approaches are limited by either a significant run-time or the need for the user to specify a low-cost approximation to the full posterior. We propose a Bayesian coreset construction algorithm that first selects a uniformly random subset of data, and then optimizes the weights using a novel quasi-Newton method. Our algorithm is a simple-to-implement, black-box method that does not require the user to specify a low-cost posterior approximation. It is the first to come with a general high-probability bound on the KL divergence of the output coreset posterior. Experiments demonstrate that our method provides significant improvements in coreset quality against alternatives with comparable construction times, with far less storage cost and user input required.

Submitted 15 January, 2023; v1 submitted 17 March, 2022; originally announced March 2022.

arXiv:1912.07323 [pdf]

Analysis of Software Engineering for Agile Machine Learning Projects

Authors: Kushal Singla, Joy Bose, Chetan Naik

Abstract: The number of machine learning, artificial intelligence or data science related software engineering projects using Agile methodology is increasing. However, there are very few studies on how such projects work in practice. In this paper, we analyze project issue tracking data taken from Scrum (a popular tool for Agile) for several machine learning projects. We compare this data with corresponding data from non-machine learning projects, in an attempt to analyze how machine learning projects are executed differently from normal software engineering projects. On analysis, we find that machine learning project issues use different kinds of words to describe issues, have a higher number of exploratory or research-oriented tasks as compared to implementation tasks, and have a higher number of issues in the product backlog after each sprint, denoting that it is more difficult to estimate the duration of machine learning project related tasks in advance. After analyzing this data, we propose a few ways in which Agile machine learning projects can be better logged and executed, given their differences from normal software engineering projects.

Submitted 16 December, 2019; originally announced December 2019.

Comments: 5 pages, 8 figures, INDICON conference

ACM Class: D.2

arXiv:1910.09679 [pdf, other]

Sparse Networks with Core-Periphery Structure

Authors: Cian Naik, François Caron, Judith Rousseau

Abstract: We propose a statistical model for graphs with a core-periphery structure. To do this we define a precise notion of what it means for a graph to have this structure, based on the sparsity properties of the subgraphs of core and periphery nodes. We present a class of sparse graphs with such properties, and provide methods to simulate from this class, and to perform posterior inference. We demonstrate that our model can detect core-periphery structure in simulated and real-world networks.

Submitted 21 October, 2019; originally announced October 2019.

MSC Class: Primary: 62F15; 05C80. Secondary: 60G55

arXiv:1906.01149 [pdf, other]

Improving Long Distance Slot Carryover in Spoken Dialogue Systems

Authors: Tongfei Chen, Chetan Naik, Hua He, Pushpendre Rastogi, Lambert Mathias

Abstract: Tracking the state of the conversation is a central component in task-oriented spoken dialogue systems. One such approach for tracking the dialogue state is slot carryover, where a model makes a binary decision on whether a slot from the context is relevant to the current turn. Previous work on the slot carryover task used models that made independent decisions for each slot. A close analysis of the results shows that this approach results in poor performance over longer context dialogues. In this paper, we propose to jointly model the slots. We propose two neural network architectures, one based on pointer networks that incorporate slot ordering information, and the other based on transformer networks that use a self-attention mechanism to model the slot interdependencies. Our experiments on an internal dialogue benchmark dataset and on the public DSTC2 dataset demonstrate that our proposed models are able to resolve longer distance slot references and are able to achieve competitive performance.

Submitted 3 June, 2019; originally announced June 2019.

Comments: Accepted at ACL 2019 workshop on NLP for Conversational AI (NLP4ConvAI)

arXiv:1904.06972 [pdf, other]

Efficient Feature Selection of Power Quality Events using Two Dimensional (2D) Particle Swarms

Authors: Faizal Hafiz, Akshya Swain, Chirag Naik, Nitish Patel

Abstract: A novel two-dimensional (2D) learning framework has been proposed to address the feature selection problem in Power Quality (PQ) events. Unlike the existing feature selection approaches, the proposed 2D learning explicitly incorporates the information about the subset cardinality (i.e., the number of features) as an additional learning dimension to effectively guide the search process. The efficacy of this approach has been demonstrated considering fourteen distinct classes of PQ events which conform to the IEEE Standard 1159. The search performance of the 2D learning approach has been compared to the other six well-known feature selection wrappers by considering two induction algorithms: Naive Bayes (NB) and k-Nearest Neighbors (k-NN). Further, the robustness of the selected/reduced feature subsets has been investigated considering seven different levels of noise. The results of this investigation convincingly demonstrate that the proposed 2D learning can identify significantly better and robust feature subsets for PQ events.

Submitted 15 April, 2019; originally announced April 2019.

arXiv:1811.11161 [pdf, other]

Cross-Lingual Approaches to Reference Resolution in Dialogue Systems

Authors: Amr Sharaf, Arpit Gupta, Hancheng Ge, Chetan Naik, Lambert Mathias

Abstract: In the slot-filling paradigm, where a user can refer back to slots in the context during the conversation, the goal of the contextual understanding system is to resolve the referring expressions to the appropriate slots in the context. In this paper, we build on the context carryover system (Naik et al., 2018), which provides a scalable multi-domain framework for resolving references. However, scaling this approach across languages is not a trivial task, due to the large demand on acquisition of annotated data in the target language. Our main focus is on cross-lingual methods for reference resolution as a way to alleviate the need for annotated data in the target language. In the cross-lingual setup, we assume there is access to annotated resources as well as a well trained model in the source language and little to no annotated data in the target language. In this paper, we explore three different approaches for cross-lingual transfer: delexicalization as data augmentation, multilingual embeddings and machine translation. We compare these approaches both in a low resource setting and in a large resource setting. Our experiments show that multilingual embeddings and delexicalization via data augmentation have a significant impact in the low resource setting, but the gains diminish as the amount of available data in the target language increases. Furthermore, when combined with machine translation we can get performance very close to actual live data in the target language, with only 25% of the data projected into the target language.

Submitted 27 November, 2018; originally announced November 2018.

Comments: Accepted at NIPS 2018 Conversational AI Workshop

arXiv:1808.01150 [pdf, other]

doi 10.1016/j.patcog.2017.11.027

A Two-Dimensional (2-D) Learning Framework for Particle Swarm based Feature Selection

Authors: Faizal Hafiz, Akshya Swain, Nitish Patel, Chirag Naik

Abstract: This paper proposes a new generalized two dimensional learning approach for particle swarm based feature selection. The core idea of the proposed approach is to include the information about the subset cardinality into the learning framework by extending the dimension of the velocity. The 2D-learning framework retains all the key features of the original PSO, despite the extra learning dimension. Most of the popular variants of PSO can easily be adapted into this 2D learning framework for feature selection problems. The efficacy of the proposed learning approach has been evaluated considering several benchmark datasets and two induction algorithms: Naive-Bayes and k-Nearest Neighbor. The results of the comparative investigation, including the time-complexity analysis with GA, ACO and five other PSO variants, illustrate that the proposed 2D learning approach gives feature subsets with relatively smaller cardinality and better classification performance with shorter run times.

Submitted 3 August, 2018; originally announced August 2018.

Journal ref: Elsevier - Pattern Recognition, Volume 76, 2018, Pages 416-433

arXiv:1806.01773 [pdf, other]

doi 10.21437/Interspeech.2018-1035

Contextual Slot Carryover for Disparate Schemas

Authors: Chetan Naik, Arpit Gupta, Hancheng Ge, Lambert Mathias, Ruhi Sarikaya

Abstract: In the slot-filling paradigm, where a user can refer back to slots in the context during a conversation, the goal of the contextual understanding system is to resolve the referring expressions to the appropriate slots in the context. In large-scale multi-domain systems, this presents two challenges: scaling to a very large and potentially unbounded set of slot values, and dealing with diverse schemas. We present a neural network architecture that addresses the slot value scalability challenge by reformulating the contextual interpretation as a decision to carry over a slot from a set of possible candidates. To deal with heterogeneous schemas, we introduce a simple data-driven method for transforming the candidate slots. Our experiments show that our approach can scale to multiple domains and provides competitive results over a strong baseline.

Submitted 5 June, 2018; originally announced June 2018.

Comments: Accepted at Interspeech 2018

Showing 1–9 of 9 results for author: Naik, C