-
Mobile Network Configuration Recommendation using Deep Generative Graph Neural Network
Authors:
Shirwan Piroti,
Ashima Chawla,
Tahar Zanouda
Abstract:
There is a vast number of configurable parameters in a Radio Access Telecom Network. A significant number of these parameters are configured by Radio Node or cell based on their deployment setting. Traditional methods rely on domain knowledge for individual parameter configuration, often leading to sub-optimal results. To improve this, a framework using a Deep Generative Graph Neural Network (GNN) is proposed. It encodes the network into a graph, extracts subgraphs for each RAN node, and employs a Siamese GNN (S-GNN) to learn embeddings. The framework recommends configuration values for a multitude of parameters and detects misconfigurations, handling both network expansion and existing cell reconfiguration. Tested on real-world data, the model surpasses baselines, demonstrating accuracy, generalizability, and robustness against concept drift.
Submitted 7 June, 2024;
originally announced June 2024.
-
PhonologyBench: Evaluating Phonological Skills of Large Language Models
Authors:
Ashima Suvarna,
Harshita Khandelwal,
Nanyun Peng
Abstract:
Phonology, the study of speech's structure and pronunciation rules, is a critical yet often overlooked component in Large Language Model (LLM) research. LLMs are widely used in various downstream applications that leverage phonology such as educational tools and poetry generation. Moreover, LLMs can potentially learn imperfect associations between orthographic and phonological forms from the training data. Thus, it is imperative to benchmark the phonological skills of LLMs. To this end, we present PhonologyBench, a novel benchmark consisting of three diagnostic tasks designed to explicitly test the phonological skills of LLMs in English: grapheme-to-phoneme conversion, syllable counting, and rhyme word generation. Despite having no access to speech data, LLMs showcased notable performance on the PhonologyBench tasks. However, we observe a significant gap of 17% and 45% on Rhyme Word Generation and Syllable counting, respectively, when compared to humans. Our findings underscore the importance of studying LLM performance on phonological tasks that inadvertently impact real-world applications. Furthermore, we encourage researchers to choose LLMs that perform well on the phonological task that is closely related to the downstream application since we find that no single model consistently outperforms the others on all the tasks.
Submitted 5 April, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Survey of Bias In Text-to-Image Generation: Definition, Evaluation, and Mitigation
Authors:
Yixin Wan,
Arjun Subramonian,
Anaelia Ovalle,
Zongyu Lin,
Ashima Suvarna,
Christina Chance,
Hritik Bansal,
Rebecca Pattichis,
Kai-Wei Chang
Abstract:
The recent advancement of large and powerful models with Text-to-Image (T2I) generation abilities -- such as OpenAI's DALLE-3 and Google's Gemini -- enables users to generate high-quality images from textual prompts. However, it has become increasingly evident that even simple prompts could cause T2I models to exhibit conspicuous social bias in generated images. Such bias might lead to both allocational and representational harms in society, further marginalizing minority groups. Noting this problem, a large body of recent works has been dedicated to investigating different dimensions of bias in T2I systems. However, an extensive review of these studies is lacking, hindering a systematic understanding of current progress and research gaps. We present the first extensive survey on bias in T2I generative models. In this survey, we review prior studies on dimensions of bias: Gender, Skintone, and Geo-Culture. Specifically, we discuss how these works define, evaluate, and mitigate different aspects of bias. We found that: (1) while gender and skintone biases are widely studied, geo-cultural bias remains under-explored; (2) most works on gender and skintone bias investigated occupational association, while other aspects are less frequently studied; (3) almost all gender bias works overlook non-binary identities in their studies; (4) evaluation datasets and metrics are scattered, with no unified framework for measuring biases; and (5) current mitigation methods fail to resolve biases comprehensively. Based on current limitations, we point out future research directions that contribute to human-centric definitions, evaluations, and mitigation of biases. We hope to highlight the importance of studying biases in T2I systems, as well as encourage future efforts to holistically understand and tackle biases, building fair and trustworthy T2I technologies for everyone.
Submitted 1 May, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
Authors:
Hritik Bansal,
Ashima Suvarna,
Gantavya Bhatt,
Nanyun Peng,
Kai-Wei Chang,
Aditya Grover
Abstract:
A common technique for aligning large language models (LLMs) relies on acquiring human preferences by comparing multiple generations conditioned on a fixed context. This only leverages the pairwise comparisons when the generations are placed in an identical context. However, such conditional rankings often fail to capture the complex and multidimensional aspects of human preferences. In this work, we revisit the traditional paradigm of preference acquisition and propose a new axis that is based on eliciting preferences jointly over the instruction-response pairs. While prior preference optimizations are designed for conditional ranking protocols (e.g., DPO), our proposed preference acquisition protocol introduces DOVE, a new preference optimization objective that upweights the joint probability of the chosen instruction-response pair over the rejected instruction-response pair. Interestingly, we find that the LLM trained with joint instruction-response preference data using DOVE outperforms the LLM trained with DPO by 5.2% and 3.3% win-rate for the summarization and open-ended dialogue datasets, respectively. Our findings reveal that joint preferences over instruction and response pairs can significantly enhance the alignment of LLMs by tapping into a broader spectrum of human preference elicitation. The data and code are available at https://github.com/Hritikbansal/dove.
Submitted 30 March, 2024;
originally announced April 2024.
-
Improving Event Definition Following For Zero-Shot Event Detection
Authors:
Zefan Cai,
Po-Nien Kung,
Ashima Suvarna,
Mingyu Derek Ma,
Hritik Bansal,
Baobao Chang,
P. Jeffrey Brantingham,
Wei Wang,
Nanyun Peng
Abstract:
Existing approaches on zero-shot event detection usually train models on datasets annotated with known event types, and prompt them with unseen event definitions. These approaches yield sporadic successes, yet generally fall short of expectations. In this work, we aim to improve zero-shot event detection by training models to better follow event definitions. We hypothesize that a diverse set of event types and definitions are the key for models to learn to follow event definitions while existing event extraction datasets focus on annotating many high-quality examples for a few event types. To verify our hypothesis, we construct an automatically generated Diverse Event Definition (DivED) dataset and conduct comparative studies. Our experiments reveal that a large number of event types (200) and diverse event definitions can significantly boost event extraction performance; on the other hand, the performance does not scale with over ten examples per event type. Beyond scaling, we incorporate event ontology information and hard-negative samples during training, further boosting the performance. Based on these findings, we fine-tuned a LLaMA-2-7B model on our DivED dataset, yielding performance that surpasses SOTA large language models like GPT-3.5 across three open benchmarks on zero-shot event detection.
Submitted 4 March, 2024;
originally announced March 2024.
-
Examining the simulation-to-reality gap of a wheel loader digging in deformable terrain
Authors:
Koji Aoshima,
Martin Servin
Abstract:
We investigate how well a physics-based simulator can replicate a real wheel loader performing bucket filling in a pile of soil. The comparison is made using field test time series of the vehicle motion and actuation forces, loaded mass, and total work. The vehicle was modeled as a rigid multibody system with frictional contacts, driveline, and linear actuators. For the soil, we tested discrete element models of different resolutions, with and without multiscale acceleration. The spatio-temporal resolution ranged between 50-400 mm and 2-500 ms, and the computational speed was between 1/10,000 to 5 times faster than real-time. The simulation-to-reality gap was found to be around 10% and exhibited a weak dependence on the level of fidelity, e.g., compatible with real-time simulation. Furthermore, the sensitivity of an optimized force feedback controller under transfer between different simulation domains was investigated. The domain bias was observed to cause a performance reduction of 5% despite the domain gap being about 15%.
Submitted 27 April, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
World Modeling for Autonomous Wheel Loaders
Authors:
Koji Aoshima,
Arvid Fälldin,
Eddie Wadbro,
Martin Servin
Abstract:
This paper presents a method for learning world models for wheel loaders performing automatic loading actions on a pile of soil. Data-driven models were learned to output the resulting pile state, loaded mass, time, and work for a single loading cycle given inputs that include a heightmap of the initial pile shape and action parameters for an automatic bucket-filling controller. Long-horizon planning of sequential loading in a dynamically changing environment is thus enabled as repeated model inference. The models, consisting of deep neural networks, were trained on data from 3D multibody dynamics simulation of over 10,000 random loading actions in gravel piles of different shapes. The accuracy and inference time for predicting the loading performance and the resulting pile state were, on average, 95% in 1.2 ms and 97% in 4.5 ms, respectively. Long-horizon predictions were found feasible over 40 sequential loading actions.
Submitted 28 May, 2024; v1 submitted 21 September, 2023;
originally announced September 2023.
-
Deep Curvilinear Editing: Commutative and Nonlinear Image Manipulation for Pretrained Deep Generative Model
Authors:
Takehiro Aoshima,
Takashi Matsubara
Abstract:
Semantic editing of images is the fundamental goal of computer vision. Although deep learning methods, such as generative adversarial networks (GANs), are capable of producing high-quality images, they often do not have an inherent way of editing generated images semantically. Recent studies have investigated a way of manipulating the latent variable to determine the images to be generated. However, methods that assume linear semantic arithmetic have certain limitations in terms of the quality of image editing, whereas methods that discover nonlinear semantic pathways provide non-commutative editing, which is inconsistent when applied in different orders. This study proposes a novel method called deep curvilinear editing (DeCurvEd) to determine semantic commuting vector fields on the latent space. We theoretically demonstrate that owing to commutativity, the editing of multiple attributes depends only on the quantities and not on the order. Furthermore, we experimentally demonstrate that compared to previous methods, the nonlinear and commutative nature of DeCurvEd facilitates the disentanglement of image attributes and provides higher-quality editing.
Submitted 29 August, 2023; v1 submitted 26 November, 2022;
originally announced November 2022.
-
Learning Hierarchy Aware Features for Reducing Mistake Severity
Authors:
Ashima Garg,
Depanshu Sani,
Saket Anand
Abstract:
Label hierarchies are often available a priori as part of a biological taxonomy or language datasets such as WordNet. Several works exploit these to learn hierarchy-aware features so that the classifier makes semantically meaningful mistakes while maintaining or reducing the overall error. In this paper, we propose a novel approach for learning Hierarchy Aware Features (HAF) that leverages classifiers at each level of the hierarchy that are constrained to generate predictions consistent with the label hierarchy. The classifiers are trained by minimizing a Jensen-Shannon Divergence with target soft labels obtained from the fine-grained classifiers. Additionally, we employ a simple geometric loss that constrains the feature space geometry to capture the semantic structure of the label space. HAF is a training-time approach that reduces the severity of mistakes while maintaining top-1 error, thereby addressing the problem of cross-entropy loss that treats all mistakes as equal. We evaluate HAF on three hierarchical datasets and achieve state-of-the-art results on the iNaturalist-19 and CIFAR-100 datasets. The source code is available at https://github.com/07Agarg/HAF
Submitted 26 July, 2022;
originally announced July 2022.
-
ETMA: Efficient Transformer Based Multilevel Attention framework for Multimodal Fake News Detection
Authors:
Ashima Yadav,
Shivani Gaba,
Haneef Khan,
Ishan Budhiraja,
Akansha Singh,
Krishan Kant Singh
Abstract:
In this new digital era, social media has created a severe impact on the lives of people. In recent times, fake news content on social media has become one of the major challenging problems for society. The dissemination of fabricated and false news articles includes multimodal data in the form of text and images. The previous methods have mainly focused on unimodal analysis. Moreover, for multimodal analysis, researchers fail to keep the unique characteristics corresponding to each modality. This paper aims to overcome these limitations by proposing an Efficient Transformer based Multilevel Attention (ETMA) framework for multimodal fake news detection, which comprises the following components: visual attention-based encoder, textual attention-based encoder, and joint attention-based learning. Each component utilizes the different forms of attention mechanism and uniquely deals with multimodal data to detect fraudulent content. The efficacy of the proposed network is validated by conducting several experiments on four real-world fake news datasets: Twitter, Jruvika Fake News Dataset, Pontes Fake News Dataset, and Risdal Fake News Dataset using multiple evaluation metrics. The results show that the proposed method outperforms the baseline methods on all four datasets. Further, the computation time of the model is also lower than the state-of-the-art methods.
Submitted 13 March, 2023; v1 submitted 15 June, 2022;
originally announced June 2022.
-
HIERMATCH: Leveraging Label Hierarchies for Improving Semi-Supervised Learning
Authors:
Ashima Garg,
Shaurya Bagga,
Yashvardhan Singh,
Saket Anand
Abstract:
Semi-supervised learning approaches have emerged as an active area of research to combat the challenge of obtaining large amounts of annotated data. Towards the goal of improving the performance of semi-supervised learning methods, we propose a novel framework, HIERMATCH, a semi-supervised approach that leverages hierarchical information to reduce labeling costs and performs as well as a vanilla semi-supervised learning method. Hierarchical information is often available as prior knowledge in the form of coarse labels (e.g., woodpeckers) for images with fine-grained labels (e.g., downy woodpeckers or golden-fronted woodpeckers). However, the use of supervision using coarse category labels to improve semi-supervised techniques has not been explored. In the absence of fine-grained labels, HIERMATCH exploits the label hierarchy and uses coarse class labels as a weak supervisory signal. Additionally, HIERMATCH is a generic approach to improve any semi-supervised learning framework; we demonstrate this using our results on the recent state-of-the-art techniques MixMatch and FixMatch. We evaluate the efficacy of HIERMATCH on two benchmark datasets, namely CIFAR-100 and NABirds. HIERMATCH can reduce the usage of fine-grained labels by 50% on CIFAR-100 with only a marginal drop of 0.59% in top-1 accuracy as compared to MixMatch. Code: https://github.com/07Agarg/HIERMATCH
Submitted 21 December, 2021; v1 submitted 29 October, 2021;
originally announced November 2021.
-
Simulation-Based Optimization of High-Performance Wheel Loading
Authors:
Koji Aoshima,
Martin Servin,
Eddie Wadbro
Abstract:
Having smart and autonomous earthmoving in mind, we explore high-performance wheel loading in a simulated environment. This paper introduces a wheel loader simulator that combines contacting 3D multibody dynamics with a hybrid continuum-particle terrain model, supporting realistic digging forces and soil displacements at real-time performance. A total of 270,000 simulations are run with different loading actions, pile slopes, and soil to analyze how they affect the loading performance. The results suggest that the preferred digging actions should preserve and exploit a steep pile slope. High digging speed favors high productivity, while energy-efficient loading requires a lower dig speed.
Submitted 29 September, 2021; v1 submitted 30 July, 2021;
originally announced July 2021.
-
A Survey of Knowledge Graph Embedding and Their Applications
Authors:
Shivani Choudhary,
Tarun Luthra,
Ashima Mittal,
Rajat Singh
Abstract:
Knowledge graph embedding provides a versatile technique for representing knowledge. These techniques can be used in a variety of applications such as knowledge graph completion to predict missing information, recommender systems, question answering, query expansion, etc. The information embedded in a knowledge graph, though structured, is challenging to consume in a real-world application. Knowledge graph embedding enables the real-world application to consume information to improve performance. Knowledge graph embedding is an active research area. Most of the embedding methods focus on structure-based information. Recent research has extended the boundary to include text-based information and image-based information in entity embedding. Efforts have been made to enhance the representation with context information. This paper traces the growth in the field of KG embedding from simple translation-based models to enrichment-based models. This paper also covers the utility of knowledge graphs in real-world applications.
Submitted 16 July, 2021;
originally announced July 2021.
-
Mining Trends of COVID-19 Vaccine Beliefs on Twitter with Lexical Embeddings
Authors:
Harshita Chopra,
Aniket Vashishtha,
Ridam Pal,
Ashima,
Ananya Tyagi,
Tavpritesh Sethi
Abstract:
Social media plays a pivotal role in disseminating news globally and acts as a platform for people to express their opinions on various topics. A wide variety of views accompanies COVID-19 vaccination drives across the globe, often colored by emotions, which change along with rising cases, approval of vaccines, and multiple factors discussed online. This study aims at analyzing the temporal evolution of different Emotion categories: Hesitation, Rage, Sorrow, Anticipation, Faith, and Contentment with Influencing Factors: Vaccine Rollout, Misinformation, Health Effects, and Inequities as lexical categories created from Tweets belonging to five countries with vital vaccine roll-out programs, namely, India, United States of America, Brazil, United Kingdom, and Australia. We extracted a corpus of nearly 1.8 million Twitter posts related to COVID-19 vaccination. Using cosine distance from selected seed words, we expanded the vocabulary of each category and tracked the longitudinal change in their strength from June 2020 to April 2021. We used community detection algorithms to find modules in positive correlation networks. Our findings suggest that tweets expressing hesitancy towards vaccines contain the highest mentions of health-related effects in all countries. Our results indicated that the patterns of hesitancy were variable across geographies and can help us learn targeted interventions. We also observed a significant change in the linear trends of categories like hesitation and contentment before and after approval of vaccines. Negative emotions like rage and sorrow gained the highest importance in the alluvial diagram. They formed a significant module with all the influencing factors in April 2021, when India observed the second wave of COVID-19 cases. The relationship between Emotions and Influencing Factors was found to be variable across the countries.
Submitted 20 July, 2021; v1 submitted 2 April, 2021;
originally announced April 2021.
-
A Deep Multi-Level Attentive network for Multimodal Sentiment Analysis
Authors:
Ashima Yadav,
Dinesh Kumar Vishwakarma
Abstract:
Multimodal sentiment analysis has attracted increasing attention with broad application prospects. Existing methods focus on a single modality, which fails to capture social media content spanning multiple modalities. Moreover, in multi-modal learning, most of the works have focused on simply combining the two modalities, without exploring the complicated correlations between them. This results in unsatisfactory performance for multimodal sentiment classification. Motivated by the status quo, we propose a Deep Multi-Level Attentive network, which exploits the correlation between image and text modalities to improve multimodal learning. Specifically, we generate the bi-attentive visual map along the spatial and channel dimensions to magnify the CNN's representation power. Then we model the correlation between the image regions and semantics of the word by extracting the textual features related to the bi-attentive visual features by applying semantic attention. Finally, self-attention is employed to automatically fetch the sentiment-rich multimodal features for the classification. We conduct extensive evaluations on four real-world datasets, namely, MVSA-Single, MVSA-Multiple, Flickr, and Getty Images, which verify the superiority of our method.
Submitted 15 December, 2020;
originally announced December 2020.
-
A Deep Language-independent Network to analyze the impact of COVID-19 on the World via Sentiment Analysis
Authors:
Ashima Yadav,
Dinesh Kumar Vishwakarma
Abstract:
Towards the end of 2019, Wuhan experienced an outbreak of novel coronavirus, which soon spread all over the world, resulting in a deadly pandemic that infected millions of people around the globe. The government and public health agencies followed many strategies to counter the fatal virus. However, the virus severely affected the social and economic lives of the people. In this paper, we extract and study the opinion of people from the top five worst affected countries by the virus, namely USA, Brazil, India, Russia, and South Africa. We propose a deep language-independent Multilevel Attention-based Conv-BiGRU network (MACBiG-Net), which includes an embedding layer, word-level encoded attention, and a sentence-level encoded attention mechanism to extract the positive, negative, and neutral sentiments. The embedding layer encodes the sentence sequence into a real-valued vector. The word-level and sentence-level encoding is performed by a 1D Conv-BiGRU based mechanism, followed by word-level and sentence-level attention, respectively. We further develop a COVID-19 Sentiment Dataset by crawling the tweets from Twitter. Extensive experiments on our proposed dataset demonstrate the effectiveness of the proposed MACBiG-Net. Also, attention-weights visualization and in-depth results analysis show that the proposed network has effectively captured the sentiments of the people.
Submitted 20 November, 2020;
originally announced November 2020.
-
DeCaf: Diagnosing and Triaging Performance Issues in Large-Scale Cloud Services
Authors:
Chetan Bansal,
Sundararajan Renganathan,
Ashima Asudani,
Olivier Midy,
Mathru Janakiraman
Abstract:
Large scale cloud services use Key Performance Indicators (KPIs) for tracking and monitoring performance. They usually have Service Level Objectives (SLOs) baked into the customer agreements which are tied to these KPIs. Dependency failures, code bugs, infrastructure failures, and other problems can cause performance regressions. It is critical to minimize the time and manual effort in diagnosing and triaging such issues to reduce customer impact. The large volume of logs and the mixed types of attributes (categorical, continuous) in the logs make diagnosis of regressions non-trivial.
In this paper, we present the design, implementation and experience from building and deploying DeCaf, a system for automated diagnosis and triaging of KPI issues using service logs. It uses machine learning along with pattern mining to help service owners automatically root cause and triage performance issues. We present the learnings and results from case studies on two large scale cloud services in Microsoft where DeCaf successfully diagnosed 10 known and 31 unknown issues. DeCaf also automatically triages the identified issues by leveraging historical data. Our key insights are that for any such diagnosis tool to be effective in practice, it should a) scale to large volumes of service logs and attributes, b) support different types of KPIs and ranking functions, c) be integrated into the DevOps processes.
Submitted 2 February, 2020; v1 submitted 11 October, 2019;
originally announced October 2019.
-
Swarm Robots Inspired by Friendship Formation Process
Authors:
Takeshi Kano,
Naoki Matsui,
Eiichi Naito,
Takenobu Aoshima,
Akio Ishiguro
Abstract:
Swarm robotic systems are systems in which multiple robots having simple functionality perform tasks through their cooperation, and are advantageous in that they can exhibit non-trivial macroscopic functions such as adaptability, fault tolerance, and scalability. We previously proposed a simple model of swarm formation inspired by friendship formation process in human society, and demonstrated via simulation that various non-trivial patterns emerge. In this study, we examine the applicability of the proposed model to a swarm robotic system. As a first step, we developed five robots and demonstrated via real-world experiments that the simulation results can be largely reproduced.
Submitted 11 August, 2018;
originally announced August 2018.
-
Judging a Book by its Description : Analyzing Gender Stereotypes in the Man Bookers Prize Winning Fiction
Authors:
Nishtha Madaan,
Sameep Mehta,
Shravika Mittal,
Ashima Suvarna
Abstract:
The presence of gender stereotypes in many aspects of society is a well-known phenomenon. In this paper, we focus on studying and quantifying such stereotypes and bias in the Man Bookers Prize winning fiction. We consider 275 books shortlisted for the Man Bookers Prize between 1969 and 2017. The gender bias is analyzed by semantic modeling of book descriptions on Goodreads. This reveals the pervasiveness of gender bias and stereotypes in the books across features such as the occupations, introductions and actions associated with the characters.
Submitted 25 July, 2018;
originally announced July 2018.
-
Revisiting the Vector Space Model: Sparse Weighted Nearest-Neighbor Method for Extreme Multi-Label Classification
Authors:
Tatsuhiro Aoshima,
Kei Kobayashi,
Mihoko Minami
Abstract:
Machine learning has played an important role in information retrieval (IR) in recent times. In search engines, for example, query keywords are accepted and documents are returned in order of relevance to the given query; this can be cast as a multi-label ranking problem in machine learning. Generally, the number of candidate documents is extremely large (from several thousand to several million); thus, the classifier must handle many labels. This problem is referred to as extreme multi-label classification (XMLC). In this paper, we propose a novel approach to XMLC termed the Sparse Weighted Nearest-Neighbor Method. This technique can be derived as a fast implementation of state-of-the-art (SOTA) one-versus-rest linear classifiers for very sparse datasets. In addition, we show that the classifier can be written as a sparse generalization of a representer theorem with a linear kernel. Furthermore, our method can be viewed as the vector space model used in IR. Finally, we show that the Sparse Weighted Nearest-Neighbor Method can process data points in real time on XMLC datasets with equivalent performance to SOTA models, with a single thread and smaller storage footprint. In particular, our method exhibits superior performance to the SOTA models on a dataset with 3 million labels.
Submitted 12 February, 2018;
originally announced February 2018.
-
Path Integral Networks: End-to-End Differentiable Optimal Control
Authors:
Masashi Okada,
Luca Rigazio,
Takenobu Aoshima
Abstract:
In this paper, we introduce Path Integral Networks (PI-Net), a recurrent network representation of the Path Integral optimal control algorithm. The network includes both system dynamics and cost models, used for optimal control based planning. PI-Net is fully differentiable, learning both dynamics and cost models end-to-end by back-propagation and stochastic gradient descent. Because of this, PI-Net can learn to plan. PI-Net has several advantages: it can generalize to unseen states thanks to planning, it can be applied to continuous control tasks, and it allows for a wide variety of learning schemes, including imitation and reinforcement learning. Preliminary experimental results show that PI-Net, trained by imitation learning, can mimic control demonstrations for two simulated problems: a linear system and a pendulum swing-up problem. We also show that PI-Net is able to learn dynamics and cost models latent in the demonstrations.
Submitted 29 June, 2017;
originally announced June 2017.
-
Support vector machine and its bias correction in high-dimension, low-sample-size settings
Authors:
Yugo Nakayama,
Kazuyoshi Yata,
Makoto Aoshima
Abstract:
In this paper, we consider asymptotic properties of the support vector machine (SVM) in high-dimension, low-sample-size (HDLSS) settings. We show that the hard-margin linear SVM holds a consistency property in which misclassification rates tend to zero as the dimension goes to infinity under certain severe conditions. We show that the SVM is very biased in HDLSS settings and its performance is affected by the bias directly. In order to overcome such difficulties, we propose a bias-corrected SVM (BC-SVM). We show that the BC-SVM gives preferable performances in HDLSS settings. We also discuss the SVMs in multiclass HDLSS settings. Finally, we check the performance of the classifiers in actual data analyses.
Submitted 26 February, 2017;
originally announced February 2017.
-
Improvised Broadcast Algorithm for Wireless Networks
Authors:
Ashima Goel,
Debasis Das
Abstract:
Broadcasting is an important issue in wireless networks, especially in dynamic wireless networks. In dynamic wireless networks, node density and mobility are high, which gives rise to several problems during broadcasting. The two major problems are the Broadcast Storm Problem and the Disconnected Network Problem. In a highly dense network, if information is flooded in a loop, it can lead to a broadcast storm. The broadcast storm may eventually crash the entire network and lead to loss of information. Mobility of the nodes may lead to the Disconnected Network Problem: if the sending and receiving nodes move at different speeds, the connection between them is lost as soon as the receiver moves out of communication range. In this paper, we try to solve both problems with our proposed algorithms.
Submitted 27 October, 2015;
originally announced October 2015.
-
Software Cloning in Extreme Programming Environment
Authors:
Ginika Mahajan,
Ashima
Abstract:
Software systems evolve over time as new functions are added and existing functions are modified. Through this evolution, the structure of software becomes more complex, and so the understandability and maintainability of software systems deteriorate day by day. These are not only important but also among the most expensive activities in software development. Refactoring has often been applied to software to improve them. One of the targets of refactoring is to limit Code Cloning, because it hinders software maintenance and affects its quality. In order to cope with constant changes, refactoring is seen as an essential component of Extreme Programming. Agile Methods use refactoring as a key practice and are the first choice for developing clone-free code. This paper summarizes my overview talk on software cloning analysis. It first discusses the notion of code cloning, types of clones, reasons, its consequences and analysis. It highlights Code Cloning in the Extreme Programming Environment and finds Clone Detection to be an effective tool for Refactoring.
Submitted 21 August, 2014;
originally announced August 2014.
-
On the Existence of Hamiltonian Paths for History Based Pivot Rules on Acyclic Unique Sink Orientations of Hypercubes
Authors:
Yoshikazu Aoshima,
David Avis,
Theresa Deering,
Yoshitake Matsumoto,
Sonoko Moriyama
Abstract:
An acyclic USO on a hypercube is formed by directing its edges in such a way that the digraph is acyclic and each face of the hypercube has a unique sink and a unique source. A path to the global sink of an acyclic USO can be modeled as pivoting in a unit hypercube of the same dimension with an abstract objective function, and vice versa. In such a way, Zadeh's 'least entered rule' and other history based pivot rules can be applied to the problem of finding the global sink of an acyclic USO. In this paper we present some theoretical and empirical results on the existence of acyclic USOs for which the various history based pivot rules can be made to follow a Hamiltonian path. In particular, we develop an algorithm that can enumerate all such paths up to dimension 6 using efficient pruning techniques. We show that Zadeh's original rule admits Hamiltonian paths up to dimension 9 at least, and prove that most of the other rules do not for all dimensions greater than 5.
Submitted 24 May, 2012; v1 submitted 13 October, 2011;
originally announced October 2011.