-
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark
Authors:
David Romero,
Chenyang Lyu,
Haryo Akbarianto Wibowo,
Teresa Lynn,
Injy Hamed,
Aditya Nanda Kishore,
Aishik Mandal,
Alina Dragonetti,
Artem Abzaliev,
Atnafu Lambebo Tonja,
Bontu Fufa Balcha,
Chenxi Whitehouse,
Christian Salamea,
Dan John Velasco,
David Ifeoluwa Adelani,
David Le Meur,
Emilio Villa-Cueva,
Fajri Koto,
Fauzan Farooqui,
Frederico Belcavello,
Ganzorig Batnasan,
Gisela Vallejo,
Grainne Caulfield,
Guido Ivetta,
Haiyue Song
, et al. (50 additional authors not shown)
Abstract:
Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used to test the ability of vision-language models to understand and reason on knowledge present in both visual and textual data. However, most of the current VQA models use datasets that are primarily focused on English and a few major world languages, with images that are typically Western-centric. While recent efforts have tried to increase the number of languages covered on VQA datasets, they still lack diversity in low-resource languages. More importantly, although these datasets often extend their linguistic range via translation or some other approaches, they usually keep images the same, resulting in narrow cultural representation. To address these limitations, we construct CVQA, a new Culturally-diverse multilingual Visual Question Answering benchmark, designed to cover a rich set of languages and cultures, where we engage native speakers and cultural experts in the data collection process. As a result, CVQA includes culturally-driven images and questions from across 28 countries on four continents, covering 26 languages with 11 scripts, providing a total of 9k questions. We then benchmark several Multimodal Large Language Models (MLLMs) on CVQA, and show that the dataset is challenging for the current state-of-the-art models. This benchmark can serve as a probing evaluation suite for assessing the cultural capability and bias of multimodal models and hopefully encourage more research efforts toward increasing cultural awareness and linguistic diversity in this field.
Submitted 9 June, 2024;
originally announced June 2024.
-
Advancing Anomaly Detection in Computational Workflows with Active Learning
Authors:
Krishnan Raghavan,
George Papadimitriou,
Hongwei Jin,
Anirban Mandal,
Mariam Kiran,
Prasanna Balaprakash,
Ewa Deelman
Abstract:
A computational workflow, also known as workflow, consists of tasks that are executed in a certain order to attain a specific computational campaign. Computational workflows are commonly employed in science domains, such as physics, chemistry, and genomics, to complete large-scale experiments in distributed and heterogeneous computing environments. However, running computations at such a large scale makes the workflow applications prone to failures and performance degradation, which can slow down or stall execution and ultimately lead to workflow failure. Learning how these workflows behave under normal and anomalous conditions can help us identify the causes of degraded performance and subsequently trigger appropriate actions to resolve them. However, learning in such circumstances is a challenging task because of the large volume of high-quality historical data needed to train accurate and reliable models. Generating such datasets not only takes a lot of time and effort but also requires a lot of resources to be devoted to data generation for training purposes. Active learning is a promising approach to this problem. It is an approach where the data is generated as required by the machine learning model and thus it can potentially reduce the training data needed to derive accurate models. In this work, we present an active learning approach that is supported by an experimental framework, Poseidon-X, that utilizes a modern workflow management system and two cloud testbeds. We evaluate our approach using three computational workflows. For one workflow we run an end-to-end live active learning experiment; for the other two we evaluate our active learning algorithms using pre-captured data traces provided by the Flow-Bench benchmark. Our findings indicate that active learning not only saves resources, but also improves the accuracy of the detection of anomalies.
Submitted 9 May, 2024;
originally announced May 2024.
-
Quantum State Compression with Polar Codes
Authors:
Jack Weinberg,
Avijit Mandal,
Henry D. Pfister
Abstract:
In the quantum compression scheme proposed by Schumacher, Alice compresses a message that Bob decompresses. In that approach, there is some probability of failure and, even when successful, some distortion of the state. For sufficiently large blocklengths, both of these imperfections can be made arbitrarily small while achieving a compression rate that asymptotically approaches the source coding bound. However, direct implementation of Schumacher compression suffers from poor circuit complexity. In this paper, we consider a slightly different approach based on classical syndrome source coding. The idea is to use a linear error-correcting code and treat the message to be compressed as an error pattern. If the message is a correctable error (i.e., a coset leader) then Alice can use the error-correcting code to convert her message to a corresponding quantum syndrome. An implementation of this based on polar codes is described and simulated. As in classical source coding based on polar codes, Alice maps the information into the "frozen" qubits that constitute the syndrome. To decompress, Bob utilizes a quantum version of successive cancellation coding.
Submitted 28 February, 2024;
originally announced February 2024.
-
Attentive Fusion: A Transformer-based Approach to Multimodal Hate Speech Detection
Authors:
Atanu Mandal,
Gargi Roy,
Amit Barman,
Indranil Dutta,
Sudip Kumar Naskar
Abstract:
With the recent surge and exponential growth of social media usage, scrutinizing social media content for the presence of any hateful content is of utmost importance. Researchers have been diligently working since the past decade on distinguishing between content that promotes hatred and content that does not. Traditionally, the main focus has been on analyzing textual content. However, recent research attempts have also commenced into the identification of audio-based content. Nevertheless, studies have shown that relying solely on audio or text-based content may be ineffective, as recent upsurge indicates that individuals often employ sarcasm in their speech and writing. To overcome these challenges, we present an approach to identify whether a speech promotes hate or not utilizing both audio and textual representations. Our methodology is based on the Transformer framework that incorporates both audio and text sampling, accompanied by our very own layer called "Attentive Fusion". The results of our study surpassed previous state-of-the-art techniques, achieving an impressive macro F1 score of 0.927 on the Test Set.
Submitted 19 January, 2024;
originally announced January 2024.
-
Polar Codes for CQ Channels: Decoding via Belief-Propagation with Quantum Messages
Authors:
Avijit Mandal,
S. Brandsen,
Henry D. Pfister
Abstract:
This paper considers the design and decoding of polar codes for general classical-quantum (CQ) channels. It focuses on decoding via belief-propagation with quantum messages (BPQM) and, in particular, the idea of paired-measurement BPQM (PM-BPQM) decoding. Since the PM-BPQM decoder admits a classical density evolution (DE) analysis, one can use DE to design a polar code for any CQ channel and then efficiently compute the trade-off between code rate and error probability. We have also implemented and tested a classical simulation of our PM-BPQM decoder for polar codes. While the decoder can be implemented efficiently on a quantum computer, simulating the decoder on a classical computer actually has exponential complexity. Thus, simulation results for the decoder are somewhat limited and are included primarily to validate our theoretical results.
Submitted 13 January, 2024;
originally announced January 2024.
-
Self-supervised Learning for Anomaly Detection in Computational Workflows
Authors:
Hongwei Jin,
Krishnan Raghavan,
George Papadimitriou,
Cong Wang,
Anirban Mandal,
Ewa Deelman,
Prasanna Balaprakash
Abstract:
Anomaly detection is the task of identifying abnormal behavior of a system. Anomaly detection in computational workflows is of special interest because of its wide implications in various domains such as cybersecurity, finance, and social networks. However, anomaly detection in computational workflows (often modeled as graphs) is a relatively unexplored problem and poses distinct challenges. For instance, when anomaly detection is performed on graph data, the complex interdependency of nodes and edges, the heterogeneity of node attributes, and edge types must be accounted for. Although the use of graph neural networks can help capture complex inter-dependencies, the scarcity of labeled anomalous examples from workflow executions is still a significant challenge. To address this problem, we introduce an autoencoder-driven self-supervised learning (SSL) approach that learns a summary statistic from unlabeled workflow data and estimates the normal behavior of the computational workflow in the latent space. In this approach, we combine generative and contrastive learning objectives to detect outliers in the summary statistics. We demonstrate that by estimating the distribution of normal behavior in the latent space, we can outperform state-of-the-art anomaly detection methods on our benchmark datasets.
Submitted 2 October, 2023;
originally announced October 2023.
-
Biased Attention: Do Vision Transformers Amplify Gender Bias More than Convolutional Neural Networks?
Authors:
Abhishek Mandal,
Susan Leavy,
Suzanne Little
Abstract:
Deep neural networks used in computer vision have been shown to exhibit many social biases such as gender bias. Vision Transformers (ViTs) have become increasingly popular in computer vision applications, outperforming Convolutional Neural Networks (CNNs) in many tasks such as image classification. However, given that research on mitigating bias in computer vision has primarily focused on CNNs, it is important to evaluate the effect of a different network architecture on the potential for bias amplification. In this paper we therefore introduce a novel metric to measure bias in architectures, Accuracy Difference. We examine bias amplification when models belonging to these two architectures are used as a part of large multimodal models, evaluating the different image encoders of Contrastive Language Image Pretraining, which is an important model used in many generative models such as DALL-E and Stable Diffusion. Our experiments demonstrate that architecture can play a role in amplifying social biases due to the different techniques employed by the models for feature extraction and embedding as well as their different learning properties. This research found that ViTs amplified gender bias to a greater extent than CNNs.
Submitted 15 September, 2023;
originally announced September 2023.
-
Gender Bias in Multimodal Models: A Transnational Feminist Approach Considering Geographical Region and Culture
Authors:
Abhishek Mandal,
Suzanne Little,
Susan Leavy
Abstract:
Deep learning based visual-linguistic multimodal models such as Contrastive Language Image Pre-training (CLIP) have become increasingly popular recently and are used within text-to-image generative models such as DALL-E and Stable Diffusion. However, gender and other social biases have been uncovered in these models, and this has the potential to be amplified and perpetuated through AI systems. In this paper, we present a methodology for auditing multimodal models that consider gender, informed by concepts from transnational feminism, including regional and cultural dimensions. Focusing on CLIP, we found evidence of significant gender bias with varying patterns across global regions. Harmful stereotypical associations were also uncovered related to visual cultural cues and labels such as terrorism. Levels of gender bias uncovered within CLIP for different regions aligned with global indices of societal gender equality, with those from the Global South reflecting the highest levels of gender bias.
Submitted 10 September, 2023;
originally announced September 2023.
-
Automatic Historical Stock Price Dataset Generation Using Python
Authors:
Arunima Mandal,
Yuanhang Shao,
Xiuwen Liu
Abstract:
With the dynamic political and economic environments, the ever-changing stock markets generate large amounts of data daily. Acquiring up-to-date data is crucial to enhancing predictive precision in stock price behavior studies. However, preparing the dataset manually can be challenging and time-demanding. The stock market analysis usually revolves around specific indices such as S&P500, Nasdaq, Dow Jones, the New York Stock Exchange (NYSE), etc. It is necessary to analyze all the companies of any particular index. While raw data are accessible from diverse financial websites, these resources are tailored for individual company data retrieval and there is a big gap between what is available and what is needed to generate large datasets. Python emerges as a valuable tool for comprehensively collecting all constituent stocks within a given index. While certain online sources offer code snippets for limited dataset generation, a comprehensive and unified script is yet to be developed and publicly available. Therefore, we present a comprehensive and consolidated code resource that facilitates the extraction of updated datasets for any particular time period and for any specific stock market index and closes the gap. The code is available at https://github.com/amp1590/automatic_stock_data_collection.
Submitted 25 August, 2023;
originally announced August 2023.
-
Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI
Authors:
Hangjie Shi,
Leslie Ball,
Govind Thattai,
Desheng Zhang,
Lucy Hu,
Qiaozi Gao,
Suhaila Shakiah,
Xiaofeng Gao,
Aishwarya Padmakumar,
Bofei Yang,
Cadence Chung,
Dinakar Guthy,
Gaurav Sukhatme,
Karthika Arumugam,
Matthew Wen,
Osman Ipek,
Patrick Lange,
Rohan Khanna,
Shreyas Pansare,
Vasu Sharma,
Chao Zhang,
Cris Flagg,
Daniel Pressel,
Lavina Vaz,
Luke Dai
, et al. (17 additional authors not shown)
Abstract:
The Alexa Prize program has empowered numerous university students to explore, experiment, and showcase their talents in building conversational agents through challenges like the SocialBot Grand Challenge and the TaskBot Challenge. As conversational agents increasingly appear in multimodal and embodied contexts, it is important to explore the affordances of conversational interaction augmented with computer vision and physical embodiment. This paper describes the SimBot Challenge, a new challenge in which university teams compete to build robot assistants that complete tasks in a simulated physical environment. This paper provides an overview of the SimBot Challenge, which included both online and offline challenge phases. We describe the infrastructure and support provided to the teams including Alexa Arena, the simulated environment, and the ML toolkit provided to teams to accelerate their building of vision and language models. We summarize the approaches the participating teams took to overcome research challenges and extract key lessons learned. Finally, we provide analysis of the performance of the competing SimBots during the competition.
Submitted 9 August, 2023;
originally announced August 2023.
-
Semantic Equivalence of e-Commerce Queries
Authors:
Aritra Mandal,
Daniel Tunkelang,
Zhe Wu
Abstract:
Search query variation poses a challenge in e-commerce search, as equivalent search intents can be expressed through different queries with surface-level differences. This paper introduces a framework to recognize and leverage query equivalence to enhance searcher and business outcomes. The proposed approach addresses three key problems: mapping queries to vector representations of search intent, identifying nearest neighbor queries expressing equivalent or similar intent, and optimizing for user or business objectives. The framework utilizes both surface similarity and behavioral similarity to determine query equivalence. Surface similarity involves canonicalizing queries based on word inflection, word order, compounding, and noise words. Behavioral similarity leverages historical search behavior to generate vector representations of query intent. An offline process is used to train a sentence similarity model, while an online nearest neighbor approach supports processing of unseen queries. Experimental evaluations demonstrate the effectiveness of the proposed approach, outperforming popular sentence transformer models and achieving a Pearson correlation of 0.85 for query similarity. The results highlight the potential of leveraging historical behavior data and training models to recognize and utilize query equivalence in e-commerce search, leading to improved user experiences and business outcomes. Further advancements and benchmark datasets are encouraged to facilitate the development of solutions for this critical problem in the e-commerce domain.
Submitted 7 August, 2023;
originally announced August 2023.
-
Deep ANN-based Touch-less 3D Pad for Digit Recognition
Authors:
Pramit Kumar Pal,
Debarshi Dutta,
Attreyee Mandal,
Dipshika Das
Abstract:
The Covid-19 pandemic has changed the way humans interact with their environment. Common touch surfaces such as elevator switches and ATM switches are hazardous to touch as they are used by countless people every day, increasing the chance of getting infected. So, a need for touch-less interaction with machines arises. In this paper, we propose a method of recognizing the ten decimal digits (0-9) by writing the digits in the air near a sensing printed circuit board using a human hand. We captured the movement of the hand by a sensor based on projective capacitance and classified it into digits using an Artificial Neural Network. Our method does not use pictures, which significantly reduces the computational requirements and preserves users' privacy. Thus, the proposed method can be easily implemented in public places.
Submitted 15 July, 2023;
originally announced July 2023.
-
Flow-Bench: A Dataset for Computational Workflow Anomaly Detection
Authors:
George Papadimitriou,
Hongwei Jin,
Cong Wang,
Rajiv Mayani,
Krishnan Raghavan,
Anirban Mandal,
Prasanna Balaprakash,
Ewa Deelman
Abstract:
A computational workflow, also known as workflow, consists of tasks that must be executed in a specific order to attain a specific goal. Often, in fields such as biology, chemistry, physics, and data science, among others, these workflows are complex and are executed in large-scale, distributed, and heterogeneous computing environments prone to failures and performance degradation. Therefore, anomaly detection for workflows is an important paradigm that aims to identify unexpected behavior or errors in workflow execution. This crucial task to improve the reliability of workflow executions can be further assisted by machine learning-based techniques. However, such application is limited, in large part, due to the lack of open datasets and benchmarking. To address this gap, we make the following contributions in this paper: (1) we systematically inject anomalies and collect raw execution logs from workflows executing on distributed infrastructures; (2) we summarize the statistics of new datasets, and provide insightful analyses; (3) we convert workflows into tabular, graph and text data, and benchmark with supervised and unsupervised anomaly detection techniques correspondingly. The presented dataset and benchmarks allow examining the effectiveness and efficiency of scientific computational workflows and identifying potential research opportunities for improvement and generalization. The dataset and benchmark code are publicly available at https://poseidon-workflows.github.io/FlowBench/ under the MIT License.
Submitted 13 June, 2024; v1 submitted 16 June, 2023;
originally announced June 2023.
-
Multimodal Composite Association Score: Measuring Gender Bias in Generative Multimodal Models
Authors:
Abhishek Mandal,
Susan Leavy,
Suzanne Little
Abstract:
Generative multimodal models based on diffusion models have seen tremendous growth and advances in recent years. Models such as DALL-E and Stable Diffusion have become increasingly popular and successful at creating images from texts, often combining abstract ideas. However, like other deep learning models, they also reflect social biases they inherit from their training data, which is often crawled from the internet. Manually auditing models for biases can be very time and resource consuming and is further complicated by the unbounded and unconstrained nature of inputs these models can take. Research into bias measurement and quantification has generally focused on small single-stage models working on a single modality. Thus the emergence of multistage multimodal models requires a different approach. In this paper, we propose Multimodal Composite Association Score (MCAS) as a new method of measuring gender bias in multimodal generative models. Evaluating both DALL-E 2 and Stable Diffusion using this approach uncovered the presence of gendered associations of concepts embedded within the models. We propose MCAS as an accessible and scalable method of quantifying potential bias for models with different modalities and a range of potential biases.
Submitted 26 April, 2023;
originally announced April 2023.
-
Alexa Arena: A User-Centric Interactive Platform for Embodied AI
Authors:
Qiaozi Gao,
Govind Thattai,
Suhaila Shakiah,
Xiaofeng Gao,
Shreyas Pansare,
Vasu Sharma,
Gaurav Sukhatme,
Hangjie Shi,
Bofei Yang,
Desheng Zheng,
Lucy Hu,
Karthika Arumugam,
Shui Hu,
Matthew Wen,
Dinakar Guthy,
Cadence Chung,
Rohan Khanna,
Osman Ipek,
Leslie Ball,
Kate Bland,
Heather Rocker,
Yadunandana Rao,
Michael Johnston,
Reza Ghanadan,
Arindam Mandal
, et al. (2 additional authors not shown)
Abstract:
We introduce Alexa Arena, a user-centric simulation platform for Embodied AI (EAI) research. Alexa Arena provides a variety of multi-room layouts and interactable objects, for the creation of human-robot interaction (HRI) missions. With user-friendly graphics and control mechanisms, Alexa Arena supports the development of gamified robotic tasks readily accessible to general human users, thus opening a new venue for high-efficiency HRI data collection and EAI system evaluation. Along with the platform, we introduce a dialog-enabled instruction-following benchmark and provide baseline results for it. We make Alexa Arena publicly available to facilitate research in building generalizable and assistive embodied agents.
Submitted 7 June, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Catch Me If You Can: Semi-supervised Graph Learning for Spotting Money Laundering
Authors:
Md. Rezaul Karim,
Felix Hermsen,
Sisay Adugna Chala,
Paola de Perthuis,
Avikarsha Mandal
Abstract:
Money laundering is the process where criminals use financial services to move massive amounts of illegal money to untraceable destinations and integrate them into legitimate financial systems. It is crucial to identify such activities accurately and reliably in order to enforce anti-money laundering (AML). Despite tremendous AML efforts, only a tiny fraction of illicit activities are prevented. Existing approaches have attempted to detect money laundering from a given graph of money transfers between accounts of a bank. In particular, some approaches employ structural and behavioural dynamics of dense subgraph detection, thereby not taking into consideration that money laundering involves high-volume flows of funds through chains of bank accounts. Some approaches model the transactions in the form of multipartite graphs to detect the complete flow of money from source to destination. However, existing approaches yield lower detection accuracy, making them less reliable. In this paper, we employ semi-supervised graph learning techniques on graphs of financial transactions in order to identify nodes involved in potential money laundering. Experimental results suggest that our approach can spot money laundering from real and synthetic transaction graphs.
Submitted 24 February, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
mmDrive: mmWave Sensing for Live Monitoring and On-Device Inference of Dangerous Driving
Authors:
Argha Sen,
Avijit Mandal,
Prasenjit Karmakar,
Anirban Das,
Sandip Chakraborty
Abstract:
Detecting dangerous driving has been of critical interest for the past few years. However, a practical yet minimally intrusive solution remains challenging as existing technologies heavily rely on visual features or physical proximity. With this motivation, we explore the feasibility of purely using mmWave radars to detect dangerous driving behaviors. We first study characteristics of dangerous driving and find some unique patterns of range-doppler caused by 9 typical dangerous driving actions. We then develop a novel Fused-CNN model to detect dangerous driving instances from regular driving and classify 9 different dangerous driving actions. Through extensive experiments with 5 volunteer drivers in real driving environments, we observe that our system can distinguish dangerous driving actions with an average accuracy of > 95%. We also compare our models with existing state-of-the-art baselines to establish their significance.
Submitted 21 September, 2023; v1 submitted 19 January, 2023;
originally announced January 2023.
-
ExpresSense: Exploring a Standalone Smartphone to Sense Engagement of Users from Facial Expressions Using Acoustic Sensing
Authors:
Pragma Kar,
Shyamvanshikumar Singh,
Avijit Mandal,
Samiran Chattopadhyay,
Sandip Chakraborty
Abstract:
Facial expressions have been considered a metric reflecting a person's engagement with a task. While the evolution of expression detection methods is consequential, the foundation remains mostly on image processing techniques that suffer from occlusion, ambient light, and privacy concerns. In this paper, we propose ExpresSense, a lightweight application for standalone smartphones that relies on near-ultrasound acoustic signals for detecting users' facial expressions. ExpresSense has been tested on different users in lab-scaled and large-scale studies for both posed as well as natural expressions. By achieving a classification accuracy of ~75% over various basic expressions, we discuss the potential of a standalone smartphone to sense expressions through acoustic sensing.
Submitted 17 January, 2023;
originally announced January 2023.
-
Understanding EEG signals for subject-wise Definition of Armoni Activities
Authors:
Kislay Raj,
Aditya Singh,
Abhishek Mandal,
Teerath Kumar,
Arunabha M. Roy
Abstract:
In a growing world of technology, psychological disorders have become a challenge to be solved. The methods used for cognitive stimulation are very conventional and based on one-way communication, which relies only on the material or method used for training an individual. They do not use any kind of feedback from the individual to analyze the progress of the training process. We propose a closed-loop methodology to improve the cognitive state of a person with intellectual disability (ID). We use a platform named 'Armoni' to provide training to intellectually disabled individuals. The learning is performed in a closed loop by using feedback in the form of change in affective state. For feedback to Armoni, an EEG (electroencephalograph) headband is used. All changes in the EEG are observed and classified against the change in the mean and standard deviation of all frequency bands of the signal. This comparison helps in defining every activity with respect to changes in brain signals. In this paper, we discuss the treatment of the EEG signal and its definition against the different activities of Armoni. We have tested it on 6 different systems with different age groups and cognitive levels.
Submitted 26 April, 2023; v1 submitted 3 January, 2023;
originally announced January 2023.
-
A Revenue Function for Comparison-Based Hierarchical Clustering
Authors:
Aishik Mandal,
Michaël Perrot,
Debarghya Ghoshdastidar
Abstract:
Comparison-based learning addresses the problem of learning when, instead of explicit features or pairwise similarities, one only has access to comparisons of the form: "Object $A$ is more similar to $B$ than to $C$." Recently, it has been shown that, in Hierarchical Clustering, single and complete linkage can be directly implemented using only such comparisons while several algorithms have been proposed to emulate the behaviour of average linkage. Hence, finding hierarchies (or dendrograms) using only comparisons is a well understood problem. However, evaluating their meaningfulness when no ground-truth nor explicit similarities are available remains an open question.
In this paper, we bridge this gap by proposing a new revenue function that allows one to measure the goodness of dendrograms using only comparisons. We show that this function is closely related to Dasgupta's cost for hierarchical clustering that uses pairwise similarities. On the theoretical side, we use the proposed revenue function to resolve the open problem of whether one can approximately recover a latent hierarchy using few triplet comparisons. On the practical side, we present principled algorithms for comparison-based hierarchical clustering based on the maximisation of the revenue and we empirically compare them with existing methods.
Submitted 2 April, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Towards Maximizing Nonlinear Delay Sensitive Rewards in Queuing Systems
Authors:
Sushmitha Shree S,
Avijit Mandal,
Avhishek Chatterjee,
Krishna Jagannathan
Abstract:
We consider maximizing the long-term average reward in a single server queue, where the reward obtained for a job is a non-increasing function of its sojourn time. The motivation behind this work comes from multiple applications, including quantum information processing and multimedia streaming. We introduce a new service discipline, shortest predicted sojourn time (SPST), which, in simulations, performs better than well-known disciplines. We also present some limited analytical guarantees for this highly intricate problem.
Submitted 1 November, 2022;
originally announced November 2022.
-
Pre-trained Language Models for the Legal Domain: A Case Study on Indian Law
Authors:
Shounak Paul,
Arpan Mandal,
Pawan Goyal,
Saptarshi Ghosh
Abstract:
NLP in the legal domain has seen increasing success with the emergence of Transformer-based Pre-trained Language Models (PLMs) pre-trained on legal text. PLMs trained over European and US legal text are available publicly; however, legal text from other domains (countries), such as India, has many distinguishing characteristics. With the rapidly increasing volume of Legal NLP applications in various countries, it has become necessary to pre-train such LMs over legal text of other countries as well. In this work, we attempt to investigate pre-training in the Indian legal domain. We re-train (continue pre-training) two popular legal PLMs, LegalBERT and CaseLawBERT, on Indian legal data, as well as train a model from scratch with a vocabulary based on Indian legal text. We apply these PLMs over three benchmark legal NLP tasks -- Legal Statute Identification from facts, Semantic Segmentation of Court Judgment Documents, and Court Appeal Judgment Prediction -- over both Indian and non-Indian (EU, UK) datasets. We observe that our approach not only enhances performance on the new domain (Indian texts) but also over the original domain (European and UK texts). We also conduct explainability experiments for a qualitative comparison of all these different PLMs.
Submitted 15 May, 2023; v1 submitted 13 September, 2022;
originally announced September 2022.
-
Bandwidth-Hard Functions from Random Permutations
Authors:
Rishiraj Bhattacharyya,
Avradip Mandal
Abstract:
ASIC hash engines are specifically optimized for parallel computations of cryptographic hashes and thus a natural environment for mounting brute-force attacks on hash functions. Two fundamental advantages of ASICs over general purpose computers are the area advantage and the energy efficiency. The memory-hard functions approach the problem by reducing the area advantage of ASICs compared to general-purpose computers. Traditionally, memory-hard functions have been analyzed in the (parallel) random oracle model. However, as the memory-hard security game is multi-stage, indifferentiability does not apply and instantiating the random oracle becomes a non-trivial problem. Chen and Tessaro (CRYPTO 2019) considered this issue and showed how random oracles should be instantiated in the context of memory-hard functions. The Bandwidth-Hard functions, introduced by Ren and Devadas (TCC 2017), aim to provide ASIC resistance by reducing the energy advantage of ASICs. In particular, bandwidth-hard functions provide ASIC resistance by guaranteeing high run time energy cost if the available cache is not large enough. Previously, bandwidth-hard functions have been analyzed in the parallel random oracle model. In this work, we show how those random oracles can be instantiated using random permutations in the context of bandwidth-hard functions. Our results are generic and valid for any hard-to-pebble graphs.
Submitted 23 July, 2022;
originally announced July 2022.
-
Human Brains Can't Detect Fake News: A Neuro-Cognitive Study of Textual Disinformation Susceptibility
Authors:
Cagri Arisoy,
Anuradha Mandal,
Nitesh Saxena
Abstract:
The spread of digital disinformation (aka "fake news") is arguably one of the most significant threats on the Internet, and it can cause individual and societal harm on a large scale. The susceptibility to fake news attacks hinges on whether Internet users perceive a fake news article/snippet to be legitimate after reading it. In this paper, we attempt to garner an in-depth understanding of users' susceptibility to text-centric fake news attacks via a neuro-cognitive methodology. We investigate the neural underpinnings relevant to fake/real news through EEG. We run an experiment with human users to pursue a thorough investigation of users' perception and cognitive processing of fake/real news. We analyze the neural activity associated with the fake/real news detection task for different categories of news articles. Our results show there may be no statistically significant or automatically inferable differences in the way the human brain processes the fake vs. real news, while marked differences are observed when people are subject to (real/fake) news vs. resting state and even between some different categories of fake news. This neuro-cognitive finding may help to justify users' susceptibility to fake news attacks, as also confirmed from the behavioral analysis. In other words, the fake news articles may seem almost indistinguishable from the real news articles in both behavioral and neural domains. Our work serves to dissect the fundamental neural phenomena underlying fake news attacks and explains users' susceptibility to these attacks through the limits of human biology. We believe this could be a notable insight for researchers and practitioners, suggesting that human detection of fake news might be ineffective, which may also have an adverse impact on the design of automated detection approaches that crucially rely upon human labeling of text articles for building training models.
Submitted 18 July, 2022;
originally announced July 2022.
-
Belief Propagation with Quantum Messages for Symmetric Classical-Quantum Channels
Authors:
S. Brandsen,
Avijit Mandal,
Henry D. Pfister
Abstract:
Belief propagation (BP) is a classical algorithm that approximates the marginal distribution associated with a factor graph by passing messages between adjacent nodes in the graph. It gained popularity in the 1990's as a powerful decoding algorithm for LDPC codes. In 2016, Renes introduced a belief propagation with quantum messages (BPQM) and described how it could be used to decode classical codes defined by tree factor graphs that are sent over the classical-quantum pure-state channel. In this work, we propose an extension of BPQM to general binary-input symmetric classical-quantum (BSCQ) channels based on the implementation of a symmetric "paired measurement". While this new paired-measurement BPQM (PMBPQM) approach is suboptimal in general, it provides a concrete BPQM decoder that can be implemented with local operations.
Submitted 11 July, 2022;
originally announced July 2022.
-
Data Integrity Error Localization in Networked Systems with Missing Data
Authors:
Yufeng Xin,
Shih-Wen Fu,
Anirban Mandal,
Ryan Tanaka,
Mats Rynge,
Karan Vahi,
Ewa Deelman
Abstract:
Most recent network failure diagnosis systems focused on data center networks where complex measurement systems can be deployed to derive routing information and ensure network coverage in order to achieve accurate and fast fault localization. In this paper, we target wide-area networks that support data-intensive distributed applications. We first present a new multi-output prediction model that directly maps the application level observations to localize the system component failures. In reality, this application-centric approach may face the missing data challenge as some input (feature) data to the inference models may be missing due to incomplete or lost measurements in wide area networks. We show that the presented prediction model naturally allows the multivariate imputation to recover the missing data. We evaluate multiple imputation algorithms and show that the prediction performance can be improved significantly in a large-scale network. As far as we know, this is the first study on the missing data issue and applying imputation techniques in network failure localization.
Submitted 5 July, 2022;
originally announced July 2022.
-
ORDSIM: Ordinal Regression for E-Commerce Query Similarity Prediction
Authors:
Md. Ahsanul Kabir,
Mohammad Al Hasan,
Aritra Mandal,
Daniel Tunkelang,
Zhe Wu
Abstract:
The query similarity prediction task is generally solved by regression-based models with square loss. Such a model is agnostic of absolute similarity values and penalizes the regression error at all ranges of similarity values at the same scale. However, to boost an e-commerce platform's monetization, it is important to predict high-level similarity more accurately than low-level similarity, as highly similar queries retrieve items according to user intent, whereas moderately similar queries retrieve related items, which may not lead to a purchase. Regression models fail to customize their loss function to concentrate around the high-similarity band, resulting in poor performance in the query similarity prediction task. We address the above challenge by considering the query prediction as an ordinal regression problem, and thereby propose a model, ORDSIM (ORDinal Regression for SIMilarity Prediction). ORDSIM exploits variable-width buckets to model ordinal loss, which penalizes errors in high-level similarity harshly, and thus enables the regression model to obtain better prediction results for high similarity values. We evaluate ORDSIM on a dataset of over 10 million e-commerce queries from the eBay platform and show that ORDSIM achieves substantially smaller prediction error compared to the competing regression methods on this dataset.
Submitted 13 March, 2022;
originally announced March 2022.
-
Understanding User Perspectives on Prompts for Brief Reflection on Troubling Emotions
Authors:
Ananya Bhattacharjee,
Pan Chen,
Linjia Zhou,
Abhijoy Mandal,
Jai Aggarwal,
Katie O'Leary,
Anne Hsu,
Alex Mariakakis,
Joseph Jay Williams
Abstract:
We investigate users' perspectives on an online reflective question activity (RQA) that prompts people to externalize their underlying emotions on a troubling situation. Inspired by principles of cognitive behavioral therapy, our 15-minute activity encourages self-reflection without a human or automated conversational partner. A deployment of our RQA on Amazon Mechanical Turk suggests that people perceive several benefits from our RQA, including structured awareness of their thoughts and problem-solving around managing their emotions. Quantitative evidence from a randomized experiment suggests people find that our RQA makes them feel less worried by their selected situation and worth the minimal time investment. A further two-week technology probe deployment with 11 participants indicates that people see benefits to doing this activity repeatedly, although the activity may get monotonous over time. In summary, this work demonstrates the promise of online reflection activities that carefully leverage principles of psychology in their design.
Submitted 20 December, 2021;
originally announced December 2021.
-
Representation Learning for Conversational Data using Discourse Mutual Information Maximization
Authors:
Bishal Santra,
Sumegh Roychowdhury,
Aishik Mandal,
Vasu Gurram,
Atharva Naik,
Manish Gupta,
Pawan Goyal
Abstract:
Although many pretrained models exist for text or images, there have been relatively fewer attempts to train representations specifically for dialog understanding. Prior works usually relied on finetuned representations based on generic text representation models like BERT or GPT-2. But such language modeling pretraining objectives do not take the structural information of conversational text into consideration. Although generative dialog models can learn structural features too, we argue that the structure-unaware word-by-word generation is not suitable for effective conversation modeling. We empirically demonstrate that such representations do not perform consistently across various dialog understanding tasks. Hence, we propose a structure-aware Mutual Information based loss-function DMI (Discourse Mutual Information) for training dialog-representation models, that additionally captures the inherent uncertainty in response prediction. Extensive evaluation on nine diverse dialog modeling tasks shows that our proposed DMI-based models outperform strong baselines by significant margins.
Submitted 3 May, 2022; v1 submitted 4 December, 2021;
originally announced December 2021.
-
Physics-based Mesh Deformation with Haptic Feedback and Material Anisotropy
Authors:
Avirup Mandal,
Parag Chaudhuri,
Subhasis Chaudhuri
Abstract:
We present a physics-based framework to simulate porous, deformable materials and interactive tools with haptic feedback that can reshape them. In order to allow the material to be moulded non-homogeneously, we propose an algorithm to change the material properties of the object depending on its water content. We present a multi-resolution, multi-timescale simulation framework to enable stable visual and haptic feedback at interactive rates. We test our model for physical consistency, accuracy, interactivity and appeal through a user study and quantitative performance evaluation.
Submitted 23 February, 2022; v1 submitted 8 December, 2021;
originally announced December 2021.
-
Is Attention always needed? A Case Study on Language Identification from Speech
Authors:
Atanu Mandal,
Santanu Pal,
Indranil Dutta,
Mahidas Bhattacharya,
Sudip Kumar Naskar
Abstract:
Language Identification (LID) is a crucial preliminary process in the field of Automatic Speech Recognition (ASR) that involves the identification of a spoken language from audio samples. Contemporary systems that can process speech in multiple languages require users to expressly designate one or more languages prior to utilization. The LID task assumes a significant role in scenarios where ASR systems are unable to comprehend the spoken language in multilingual settings, leading to unsuccessful speech recognition outcomes. The present study introduces convolutional recurrent neural network (CRNN) based LID, designed to operate on the Mel-frequency Cepstral Coefficient (MFCC) characteristics of audio samples. Furthermore, we replicate certain state-of-the-art methodologies, specifically the Convolutional Neural Network (CNN) and Attention-based Convolutional Recurrent Neural Network (CRNN with attention), and conduct a comparative analysis with our CRNN-based approach. We conducted comprehensive evaluations on thirteen distinct Indian languages and our model resulted in over 98% classification accuracy. The LID model exhibits high-performance levels ranging from 97% to 100% for languages that are linguistically similar. The proposed LID model exhibits a high degree of extensibility to additional languages and demonstrates a strong resistance to noise, achieving 91.2% accuracy in a noisy setting when applied to a European Language (EU) dataset.
Submitted 25 October, 2023; v1 submitted 5 October, 2021;
originally announced October 2021.
-
Knowledge-Aware Neural Networks for Medical Forum Question Classification
Authors:
Soumyadeep Roy,
Sudip Chakraborty,
Aishik Mandal,
Gunjan Balde,
Prakhar Sharma,
Anandhavelu Natarajan,
Megha Khosla,
Shamik Sural,
Niloy Ganguly
Abstract:
Online medical forums have become a predominant platform for answering health-related information needs of consumers. However, with a significant rise in the number of queries and the limited availability of experts, it is necessary to automatically classify medical queries based on a consumer's intention, so that these questions may be directed to the right set of medical experts. Here, we develop a novel medical knowledge-aware BERT-based model (MedBERT) that explicitly gives more weightage to medical concept-bearing words, and utilizes domain-specific side information obtained from a popular medical knowledge base. We also contribute a multi-label dataset for the Medical Forum Question Classification (MFQC) task. MedBERT achieves state-of-the-art performance on two benchmark datasets and performs very well in low resource settings.
Submitted 27 September, 2021;
originally announced September 2021.
-
Optimizing Age-of-Information in Adversarial Environments with Channel State Information
Authors:
Avijit Mandal,
Rajarshi Bhattacharjee,
Abhishek Sinha
Abstract:
This paper considers a multi-user downlink scheduling problem with access to the channel state information at the transmitter (CSIT) to minimize the Age-of-Information (AoI) in a non-stationary environment. The non-stationary environment is modelled using a novel adversarial framework. In this setting, we propose a greedy scheduling policy, called MA-CSIT, that takes into account the current channel state information. We establish a finite upper bound on the competitive ratio achieved by the MA-CSIT policy for a small number of users and show that the proposed policy has a better performance guarantee than a recently proposed greedy scheduler that operates without CSIT. In particular, we show that access to the additional channel state information improves the competitive ratio from 8 to 2 in the two-user case and from 18 to 8/3 in the three-user case. Finally, we carry out extensive numerical simulations to quantify the advantage of knowing CSIT in order to minimize the Age-of-Information for an arbitrary number of users.
Submitted 2 October, 2021; v1 submitted 25 September, 2021;
originally announced September 2021.
-
Alexa Conversations: An Extensible Data-driven Approach for Building Task-oriented Dialogue Systems
Authors:
Anish Acharya,
Suranjit Adhikari,
Sanchit Agarwal,
Vincent Auvray,
Nehal Belgamwar,
Arijit Biswas,
Shubhra Chandra,
Tagyoung Chung,
Maryam Fazel-Zarandi,
Raefer Gabriel,
Shuyang Gao,
Rahul Goel,
Dilek Hakkani-Tur,
Jan Jezabek,
Abhay Jha,
Jiun-Yu Kao,
Prakash Krishnan,
Peter Ku,
Anuj Goyal,
Chien-Wei Lin,
Qing Liu,
Arindam Mandal,
Angeliki Metallinou,
Vishal Naik,
Yi Pan
, et al. (6 additional authors not shown)
Abstract:
Traditional goal-oriented dialogue systems rely on various components such as natural language understanding, dialogue state tracking, policy learning and response generation. Training each component requires annotations which are hard to obtain for every new domain, limiting scalability of such systems. Similarly, rule-based dialogue systems require extensive writing and maintenance of rules and do not scale either. End-to-End dialogue systems, on the other hand, do not require module-specific annotations but need a large amount of data for training. To overcome these problems, in this demo, we present Alexa Conversations, a new approach for building goal-oriented dialogue systems that is scalable, extensible as well as data efficient. The components of this system are trained in a data-driven manner, but instead of collecting annotated conversations for training, we generate them using a novel dialogue simulator based on a few seed dialogues and specifications of APIs and entities provided by the developer. Our approach provides out-of-the-box support for natural conversational phenomena like entity sharing across turns or users changing their mind during conversation without requiring developers to provide any such dialogue flows. We exemplify our approach using a simple pizza ordering task and showcase its value in reducing the developer burden for creating a robust experience. Finally, we evaluate our system using a typical movie ticket booking task and show that the dialogue simulator is an essential component of the system that leads to over 50% improvement in turn-level action signature prediction accuracy.
Submitted 19 April, 2021;
originally announced April 2021.
-
Remeshing-Free Graph-Based Finite Element Method for Ductile and Brittle Fracture
Authors:
Avirup Mandal,
Parag Chaudhuri,
Subhasis Chaudhuri
Abstract:
Fracture produces new mesh fragments that introduce additional degrees of freedom in the system dynamics. Existing finite element method (FEM) based solutions suffer from an explosion in computational cost as the system matrix size increases. We solve this problem by presenting a graph-based FEM model for fracture simulation that is remeshing-free and easily scales to high-resolution meshes. Our algorithm models fracture on the graph induced in a volumetric mesh with tetrahedral elements. We relabel the edges of the graph using a computed damage variable to initialize and propagate fracture. We prove that non-linear, hyper-elastic strain energy is expressible entirely in terms of the edge lengths of the induced graph. This allows us to reformulate the system dynamics for the relabeled graph without changing the size of system dynamics matrix and thus prevents the computational cost from blowing up. The fractured surface has to be reconstructed explicitly only for visualization purposes. We simulate standard laboratory experiments from structural mechanics and compare the results with corresponding real-world experiments. We fracture objects made of a variety of brittle and ductile materials, and show that our technique offers stability and speed that is unmatched in current literature.
Submitted 8 January, 2022; v1 submitted 27 March, 2021;
originally announced March 2021.
-
Mining Scientific Workflows for Anomalous Data Transfers
Authors:
Huy Tu,
George Papadimitriou,
Mariam Kiran,
Cong Wang,
Anirban Mandal,
Ewa Deelman,
Tim Menzies
Abstract:
Modern scientific workflows are data-driven and are often executed on distributed, heterogeneous, high-performance computing infrastructures. Anomalies and failures in the workflow execution cause loss of scientific productivity and inefficient use of the infrastructure. Hence, detecting, diagnosing, and mitigating these anomalies are immensely important for reliable and performant scientific workflows. Since these workflows rely heavily on high-performance network transfers that require strict QoS constraints, accurately detecting anomalous network performance is crucial to ensure reliable and efficient workflow execution. To address this challenge, we have developed X-FLASH, a network anomaly detection tool for faulty TCP workflow transfers. X-FLASH incorporates novel hyperparameter tuning and data mining approaches for improving the performance of the machine learning algorithms to accurately classify the anomalous TCP packets. X-FLASH leverages XGBoost as an ensemble model and couples XGBoost with a sequential optimizer, FLASH, borrowed from search-based Software Engineering to learn the optimal model parameters. X-FLASH found configurations that outperformed the existing approach by up to 28\%, 29\%, and 40\% (relative) for F-measure, G-score, and recall, respectively, in fewer than 30 evaluations. Given (1) the large improvement and (2) the simplicity of the tuning, we recommend that future research include an additional tuning study as a new standard, at least in the area of scientific workflow anomaly detection.
Submitted 22 March, 2021;
originally announced March 2021.
-
Blueprint: Cyberinfrastructure Center of Excellence
Authors:
Ewa Deelman,
Anirban Mandal,
Angela P. Murillo,
Jarek Nabrzyski,
Valerio Pascucci,
Robert Ricci,
Ilya Baldin,
Susan Sons,
Laura Christopherson,
Charles Vardeman,
Rafael Ferreira da Silva,
Jane Wyngaard,
Steve Petruzza,
Mats Rynge,
Karan Vahi,
Wendy R. Whitcup,
Josh Drake,
Erik Scott
Abstract:
In 2018, NSF funded an effort to pilot a Cyberinfrastructure Center of Excellence (CI CoE or Center) that would serve the cyberinfrastructure (CI) needs of the NSF Major Facilities (MFs) and large projects with advanced CI architectures. The goal of the CI CoE Pilot project (Pilot) effort was to develop a model and a blueprint for such a CoE by engaging with the MFs, understanding their CI needs, understanding the contributions the MFs are making to the CI community, and exploring opportunities for building a broader CI community. This document summarizes the results of community engagements conducted during the first two years of the project and describes the identified CI needs of the MFs. To better understand MFs' CI, the Pilot has developed and validated a model of the MF data lifecycle that follows the data generation and management within a facility and gained an understanding of how this model captures the fundamental stages that the facilities' data passes through from the scientific instruments to the principal investigators and their teams, to the broader collaborations and the public. The Pilot also aimed to understand what CI workforce development challenges the MFs face while designing, constructing, and operating their CI and what solutions they are exploring and adopting within their projects. Based on the needs of the MFs in the data lifecycle and workforce development areas, this document outlines a blueprint for a CI CoE that will learn about and share the CI solutions designed, developed, and/or adopted by the MFs, provide expertise to the largest NSF projects with advanced and complex CI architectures, and foster a community of CI practitioners and researchers.
Submitted 6 March, 2021;
originally announced March 2021.
-
Ising-Based Louvain Method: Clustering Large Graphs with Specialized Hardware
Authors:
Pouya Rezazadeh Kalehbasti,
Hayato Ushijima-Mwesigwa,
Avradip Mandal,
Indradeep Ghosh
Abstract:
Recent advances in specialized hardware for solving optimization problems, such as quantum computers, quantum annealers, and CMOS annealers, give rise to new ways of solving complex real-world problems. However, given current and near-term hardware limitations, the number of variables required to express a large real-world problem easily exceeds the hardware capabilities, so hybrid methods are usually developed in order to utilize the hardware. In this work, we advocate for the development of hybrid methods that are built on top of the frameworks of existing state-of-the-art heuristics, thereby improving these methods. We demonstrate this by building on the so-called Louvain method, which is one of the most popular algorithms for the community detection problem, and develop an Ising-based Louvain method. The proposed method outperforms two state-of-the-art community detection algorithms in clustering several small to large-scale graphs. The results show promise in adapting the same optimization approach to other unsupervised learning heuristics to improve their performance.
Submitted 6 December, 2020;
originally announced December 2020.
-
Binary matrix factorization on special purpose hardware
Authors:
Osman Asif Malik,
Hayato Ushijima-Mwesigwa,
Arnab Roy,
Avradip Mandal,
Indradeep Ghosh
Abstract:
Many fundamental problems in data mining can be reduced to one or more NP-hard combinatorial optimization problems. Recent advances in novel technologies such as quantum and quantum-inspired hardware promise a substantial speedup for solving these problems compared to when using general purpose computers but often require the problem to be modeled in a special form, such as an Ising or quadratic unconstrained binary optimization (QUBO) model, in order to take advantage of these devices. In this work, we focus on the important binary matrix factorization (BMF) problem which has many applications in data mining. We propose two QUBO formulations for BMF. We show how clustering constraints can easily be incorporated into these formulations. The special purpose hardware we consider is limited in the number of variables it can handle which presents a challenge when factorizing large matrices. We propose a sampling based approach to overcome this challenge, allowing us to factorize large rectangular matrices. In addition to these methods, we also propose a simple baseline algorithm which outperforms our more sophisticated methods in a few situations. We run experiments on the Fujitsu Digital Annealer, a quantum-inspired complementary metal-oxide-semiconductor (CMOS) annealer, on both synthetic and real data, including gene expression data. These experiments show that our approach is able to produce more accurate BMFs than competing methods.
Submitted 7 January, 2022; v1 submitted 16 October, 2020;
originally announced October 2020.
-
Novel Randomized Placement for FPGA Based Robust ROPUF with Improved Uniqueness
Authors:
Arjun Singh Chauhan,
Vineet Sahula,
Atanendu Sekhar Mandal
Abstract:
Physical unclonable functions (PUFs) are used to provide software as well as hardware security for cyber-physical systems. They have been used for performing significant cryptography tasks such as generating keys, device authentication, securing against IP piracy, and producing the root of trust as well. However, they fall short on the reliability metric. We present a novel approach for improving the reliability as well as the uniqueness of the field-programmable gate array (FPGA) based ring oscillator PUF (ROPUF) and derive a random number, consuming a very small area (< 1%) with respect to look-up tables (LUTs). We use a frequency profiling method for distributing frequency variations in ring oscillators (ROs), spatially placed all across the FPGA floor. We are able to spot suitable locations for RO mapping, which leads to enhanced ROPUF reliability. We have evaluated the proposed methodology on ** has been observed on average, and (iii) in randomness, signified by passing the NIST test suite. The response generated through the ROPUF passes all the applicable relevant tests of the NIST uniformity statistical test suite.
Submitted 7 June, 2020;
originally announced June 2020.
-
Periodicity of lively quantum walks on cycles with generalized Grover coin
Authors:
Rohit Sarma Sarkar,
Amrita Mandal,
Bibhas Adhikari
Abstract:
In this paper we extend the study of three-state lively quantum walks on cycles by considering the coin operator as a linear sum of permutation matrices, which is a generalization of the Grover matrix. First we provide a complete characterization of orthogonal matrices of order $3\times 3$ that are linear sums of permutation matrices. Consequently, we determine several groups of complex, real and rational orthogonal matrices. We establish that an orthogonal matrix of order $3\times 3$ is a linear sum of permutation matrices if and only if it is permutative. Finally we determine the period of lively quantum walks on cycles when the coin operator belongs to the group of orthogonal (real) linear sums of permutation matrices.
Submitted 30 March, 2020; v1 submitted 29 March, 2020;
originally announced March 2020.
-
Ising-based Consensus Clustering on Specialized Hardware
Authors:
Eldan Cohen,
Avradip Mandal,
Hayato Ushijima-Mwesigwa,
Arnab Roy
Abstract:
The emergence of specialized optimization hardware such as CMOS annealers and adiabatic quantum computers carries the promise of solving hard combinatorial optimization problems more efficiently in hardware. Recent work has focused on formulating different combinatorial optimization problems as Ising models, the core mathematical abstraction used by a large number of these hardware platforms, and evaluating the performance of these models when solved on specialized hardware. An interesting area of application is data mining, where combinatorial optimization problems underlie many core tasks. In this work, we focus on consensus clustering (clustering aggregation), an important combinatorial problem that has received much attention over the last two decades. We present two Ising models for consensus clustering and evaluate them using the Fujitsu Digital Annealer, a quantum-inspired CMOS annealer. Our empirical evaluation shows that our approach outperforms existing techniques and is a promising direction for future research.
Submitted 3 March, 2020;
originally announced March 2020.
-
Compressed Quadratization of Higher Order Binary Optimization Problems
Authors:
Avradip Mandal,
Arnab Roy,
Sarvagya Upadhyay,
Hayato Ushijima-Mwesigwa
Abstract:
Recent hardware advances in quantum and quantum-inspired annealers promise substantial speedup for solving NP-hard combinatorial optimization problems compared to general-purpose computers. These special-purpose hardware devices are built for solving hard instances of Quadratic Unconstrained Binary Optimization (QUBO) problems. In terms of the number of variables and precision, these devices are usually resource-constrained, and they work either in Ising space {-1,1} or in Boolean space {0,1}. Many naturally occurring problem instances are higher-order in nature. The known method to reduce the degree of a higher-order optimization problem uses Rosenberg's polynomial. The method works in Boolean space by reducing the degree of one term by introducing one extra variable. In this work, we prove that in Ising space the degree reduction of one term requires the introduction of two variables. Our proposed method of degree reduction works directly in Ising space, as opposed to converting an Ising polynomial to Boolean space and applying the previously known Rosenberg's polynomial. For sparse higher-order Ising problems, this results in a more compact representation of the resultant QUBO problem, which is crucial for utilizing resource-constrained QUBO solvers.
Submitted 2 January, 2020;
originally announced January 2020.
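The Boolean-space reduction mentioned in the abstract (one extra variable per reduced term, via Rosenberg's polynomial) can be illustrated with a minimal sketch. This is the textbook Rosenberg penalty, not the paper's Ising-space method; the function names and the penalty weight of 2 are assumptions chosen for illustration.

```python
from itertools import product


def rosenberg_penalty(x1: int, x2: int, y: int) -> int:
    """Rosenberg penalty: zero iff y == x1 * x2, at least 1 otherwise."""
    return x1 * x2 - 2 * x1 * y - 2 * x2 * y + 3 * y


def cubic(x1: int, x2: int, x3: int) -> int:
    """A single degree-3 term over Boolean variables."""
    return x1 * x2 * x3


def quadratized(x1: int, x2: int, x3: int, y: int, weight: int = 2) -> int:
    """Degree-2 surrogate: y stands in for x1 * x2, enforced by the penalty.

    The weight (hypothetical choice) must exceed the magnitude of the
    coefficient of the term being reduced, which is 1 here.
    """
    return y * x3 + weight * rosenberg_penalty(x1, x2, y)


def min_over_aux(x1: int, x2: int, x3: int, weight: int = 2) -> int:
    """Minimize the surrogate over the auxiliary variable y."""
    return min(quadratized(x1, x2, x3, y, weight) for y in (0, 1))


# Minimizing over y recovers the cubic term on every Boolean assignment.
agrees = all(min_over_aux(*x) == cubic(*x) for x in product((0, 1), repeat=3))
```

Each application of the penalty removes one degree from one term at the cost of one auxiliary Boolean variable, which is the variable overhead the abstract contrasts with the Ising-space case.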
-
Leveraging Special-Purpose Hardware for Local Search Heuristics
Authors:
Xiaoyuan Liu,
Hayato Ushijima-Mwesigwa,
Avradip Mandal,
Sarvagya Upadhyay,
Ilya Safro,
Arnab Roy
Abstract:
As we approach the physical limits predicted by Moore's law, a variety of specialized hardware is emerging to tackle specialized tasks in different domains. Within combinatorial optimization, adiabatic quantum computers, CMOS annealers, and optical parametric oscillators are a few of the emerging specialized hardware technologies aimed at solving optimization problems. In terms of mathematical framework, the Ising optimization model unifies all of these emerging special-purpose hardware devices. In other words, they are all designed to solve optimization problems expressed in the Ising model or equivalently as a quadratic unconstrained binary optimization model. Due to a variety of constraints specific to each type of hardware, they usually suffer from a major challenge: the number of variables that the hardware can handle is very limited. Given that large-scale practical problems, including problems in operations research, combinatorial scientific computing, data science and network science, require significantly more variables to model than these devices provide, we are likely to see cloud-based deployments of these devices become available for parallel and shared access. Thus hybrid techniques that combine hardware and software must be developed to utilize these technologies. Local search meta-heuristics are one of the approaches to tackling large-scale problems. However, a general optimization step within local search is not traditionally formulated in the Ising form. In this work, we propose a new meta-heuristic to model local search in the Ising form for the special-purpose hardware devices. We demonstrate that our method takes the limitations of the Ising model and current hardware into account and utilizes a given hardware device more efficiently than previous approaches, while also producing high-quality solutions compared to other well-known meta-heuristics.
Submitted 28 November, 2020; v1 submitted 21 November, 2019;
originally announced November 2019.
-
Improving IT Support by Enhancing Incident Management Process with Multi-modal Analysis
Authors:
Atri Mandal,
Shivali Agarwal,
Nikhil Malhotra,
Giriprasad Sridhara,
Anupama Ray,
Daivik Swarup
Abstract:
The IT support services industry is going through a major transformation with AI becoming commonplace. There has been a lot of effort in the direction of automation at every human touchpoint in the IT support processes. Incident management is one such process, which has been a beacon process for AI-based automation. The vision is to automate the process from the time an incident/ticket arrives until it is resolved and closed. While text is the primary mode of communicating the incidents, there has been a growing trend of using alternate modalities like images to communicate the problem. A large fraction of IT support tickets today contain attached image data in the form of screenshots, log messages, invoices and so on. These attachments help to explain the problem better, which aids in faster resolution. For anybody who aspires to provide AI-based IT support, it is essential to build systems which can handle multi-modal content. In this paper we present how incident management in the IT support domain can be made much more effective using multi-modal analysis. The information extracted from different modalities is correlated to enrich the information in the ticket and used for better ticket routing and resolution. We evaluate our system using about 25000 real tickets containing attachments from selected problem areas. Our results demonstrate significant improvements in both routing and resolution with the use of multi-modal ticket analysis compared to only text-based analysis.
Submitted 4 August, 2019;
originally announced August 2019.
-
Custom Execution Environments with Containers in Pegasus-enabled Scientific Workflows
Authors:
Karan Vahi,
Mats Rynge,
George Papadimitriou,
Duncan A. Brown,
Rajiv Mayani,
Rafael Ferreira da Silva,
Ewa Deelman,
Anirban Mandal,
Eric Lyons,
Michael Zink
Abstract:
Science reproducibility is a cornerstone feature in scientific workflows. In most cases, this has been implemented as a way to exactly reproduce the computational steps taken to reach the final results. While these steps are often completely described, including the input parameters, datasets, and codes, the environment in which these steps are executed is only described at a higher level with endpoints and operating system name and versions. Though this may be sufficient for reproducibility in the short term, systems evolve and are replaced over time, breaking the underlying workflow reproducibility. A natural solution to this problem is containers, as they are well defined, have a lifetime independent of the underlying system, and can be user-controlled so that they can provide custom environments if needed. This paper highlights some unique challenges that may arise when using containers in distributed scientific workflows. Further, this paper explores how the Pegasus Workflow Management System implements container support to address such challenges.
Submitted 20 May, 2019;
originally announced May 2019.
-
Personalized Ranking in eCommerce Search
Authors:
Grigor Aslanyan,
Aritra Mandal,
Prathyusha Senthil Kumar,
Amit Jaiswal,
Manojkumar Rangasamy Kannadasan
Abstract:
We address the problem of personalization in the context of eCommerce search. Specifically, we develop personalization ranking features that use in-session context to augment a generic ranker optimized for conversion and relevance. We use a combination of latent features learned from item co-clicks in historic sessions and content-based features that use item title and price. Personalization in search has been discussed extensively in the existing literature. The novelty of our work is combining and comparing content-based and content-agnostic features and showing that they complement each other to result in a significant improvement of the ranker. Moreover, our technique does not require an explicit re-ranking step, does not rely on learning user profiles from long term search behavior, and does not involve complex modeling of query-item-user features. Our approach captures item co-click propensity using lightweight item embeddings. We experimentally show that our technique significantly outperforms a generic ranker in terms of Mean Reciprocal Rank (MRR). We also provide anecdotal evidence for the semantic similarity captured by the item embeddings on the eBay search engine.
Submitted 30 April, 2019;
originally announced May 2019.
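For readers unfamiliar with the evaluation metric named in the abstract, Mean Reciprocal Rank (MRR) can be sketched as follows. This is the standard definition, not the paper's evaluation code; the function name, argument layout, and the toy item identifiers are assumptions for illustration.

```python
def mean_reciprocal_rank(ranked_lists, relevant_items):
    """Average over queries of 1 / (rank of the first relevant item).

    ranked_lists: one ranked list of item ids per query (best first).
    relevant_items: one set of relevant item ids per query.
    A query with no relevant item in its ranking contributes 0.
    """
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranking, relevant in zip(ranked_lists, relevant_items):
        for position, item in enumerate(ranking, start=1):
            if item in relevant:
                total += 1.0 / position
                break
    return total / len(ranked_lists)


# Toy sessions (hypothetical ids): first relevant hit at rank 2, rank 1, and never.
rankings = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
relevant = [{"b"}, {"x"}, {"missing"}]
mrr = mean_reciprocal_rank(rankings, relevant)
```

A ranker that moves the clicked or purchased item closer to the top of the list raises the reciprocal rank for that query, so a higher MRR indicates that relevant items appear earlier on average.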
-
Advancing the State of the Art in Open Domain Dialog Systems through the Alexa Prize
Authors:
Chandra Khatri,
Behnam Hedayatnia,
Anu Venkatesh,
Jeff Nunn,
Yi Pan,
Qing Liu,
Han Song,
Anna Gottardi,
Sanjeev Kwatra,
Sanju Pancholi,
Ming Cheng,
Qinglang Chen,
Lauren Stubel,
Karthik Gopalakrishnan,
Kate Bland,
Raefer Gabriel,
Arindam Mandal,
Dilek Hakkani-Tur,
Gene Hwang,
Nate Michel,
Eric King,
Rohit Prasad
Abstract:
Building open domain conversational systems that allow users to have engaging conversations on topics of their choice is a challenging task. Alexa Prize was launched in 2016 to tackle the problem of achieving natural, sustained, coherent and engaging open-domain dialogs. In the second iteration of the competition in 2018, university teams advanced the state of the art by using context in dialog models, leveraging knowledge graphs for language understanding, handling complex utterances, building statistical and hierarchical dialog managers, and leveraging model-driven signals from user responses. The 2018 competition also included the provision of a suite of tools and models to the competitors including the CoBot (conversational bot) toolkit, topic and dialog act detection models, conversation evaluators, and a sensitive content detection model so that the competing teams could focus on building knowledge-rich, coherent and engaging multi-turn dialog systems. This paper outlines the advances developed by the university teams as well as the Alexa Prize team to achieve the common goal of advancing the science of Conversational AI. We address several key open-ended problems such as conversational speech recognition, open domain natural language understanding, commonsense reasoning, statistical dialog management, and dialog evaluation. These collaborative efforts have driven improved experiences by Alexa users to an average rating of 3.61, the median duration of 2 mins 18 seconds, and average turns to 14.6, increases of 14%, 92%, 54% respectively since the launch of the 2018 competition. For conversational speech recognition, we have improved our relative Word Error Rate by 55% and our relative Entity Error Rate by 34% since the launch of the Alexa Prize. Socialbots improved in quality significantly more rapidly in 2018, in part due to the release of the CoBot toolkit.
Submitted 27 December, 2018;
originally announced December 2018.
-
Detecting Offensive Content in Open-domain Conversations using Two Stage Semi-supervision
Authors:
Chandra Khatri,
Behnam Hedayatnia,
Rahul Goel,
Anushree Venkatesh,
Raefer Gabriel,
Arindam Mandal
Abstract:
As open-ended human-chatbot interaction becomes commonplace, sensitive content detection gains importance. In this work, we propose a two stage semi-supervised approach to bootstrap large-scale data for automatic sensitive language detection from publicly available web resources. We explore various data selection methods including 1) using a blacklist to rank online discussion forums by the level of their sensitiveness followed by randomly sampling utterances and 2) training a weakly supervised model in conjunction with the blacklist for scoring sentences from online discussion forums to curate a dataset. Our data collection strategy is flexible and allows the models to detect implicit sensitive content for which manual annotations may be difficult. We train models using publicly available annotated datasets as well as using the proposed large-scale semi-supervised datasets. We evaluate the performance of all the models on Twitter and Toxic Wikipedia comments testsets as well as on a manually annotated spoken language dataset collected during a large scale chatbot competition. Results show that a model trained on this collected data outperforms the baseline models by a large margin on both in-domain and out-of-domain testsets, achieving an F1 score of 95.5% on an out-of-domain testset compared to a score of 75% for models trained on public datasets. We also showcase that large scale two stage semi-supervision generalizes well across multiple classes of sensitivities such as hate speech, racism, sexual and pornographic content, etc. without even providing explicit labels for these classes, leading to an average recall of 95.5% versus the models trained using annotated public datasets which achieve an average recall of 73.2% across seven sensitive classes on out-of-domain testsets.
Submitted 30 November, 2018;
originally announced November 2018.
-
Flexible and Scalable State Tracking Framework for Goal-Oriented Dialogue Systems
Authors:
Rahul Goel,
Shachi Paul,
Tagyoung Chung,
Jeremie Lecomte,
Arindam Mandal,
Dilek Hakkani-Tur
Abstract:
Goal-oriented dialogue systems typically rely on components specifically developed for a single task or domain. This limits such systems in two different ways: If there is an update in the task domain, the dialogue system usually needs to be updated or completely re-trained. It is also harder to extend such dialogue systems to different and multiple domains. The dialogue state tracker in conventional dialogue systems is one such component - it is usually designed to fit a well-defined application domain. For example, it is common for a state variable to be a categorical distribution over a manually-predefined set of entities (Henderson et al., 2013), resulting in an inflexible and hard-to-extend dialogue system. In this paper, we propose a new approach for dialogue state tracking that can generalize well over multiple domains without incorporating any domain-specific knowledge. Under this framework, discrete dialogue state variables are learned independently and the information of a predefined set of possible values for dialogue state variables is not required. Furthermore, it enables adding arbitrary dialogue context as features and allows for multiple values to be associated with a single state variable. These characteristics make it much easier to expand the dialogue state space. We evaluate our framework using the widely used dialogue state tracking challenge data set (DSTC2) and show that our framework yields competitive results with other state-of-the-art results despite incorporating little domain knowledge. We also show that this framework can benefit from widely available external resources such as pre-trained word embeddings.
Submitted 30 November, 2018;
originally announced November 2018.