-
Deep Oscillatory Neural Network
Authors:
Nurani Rajagopal Rohan,
Vigneswaran C,
Sayan Ghosh,
Kishore Rajendran,
Gaurav A,
V Srinivasa Chakravarthy
Abstract:
We propose a novel, brain-inspired deep neural network model known as the Deep Oscillatory Neural Network (DONN). Deep neural networks such as Recurrent Neural Networks possess sequence processing capabilities, but their internal states are not designed to exhibit brain-like oscillatory activity. With this motivation, the DONN is designed to have oscillatory internal dynamics. Neurons of the DONN are either nonlinear neural oscillators or traditional neurons with sigmoidal or ReLU activation. The neural oscillator used in the model is the Hopf oscillator, with the dynamics described in the complex domain. Input can be presented to the neural oscillator in three possible modes. The sigmoid and ReLU neurons also use complex-valued extensions. All the weight stages are also complex-valued. Training follows the general principle of changing the weights to minimize the output error and therefore broadly resembles complex backpropagation. A generalization of DONN to convolutional networks, known as the Oscillatory Convolutional Neural Network, is also proposed. The two proposed oscillatory networks are applied to a variety of benchmark problems in signal and image/video processing. The performance of the proposed models is either comparable or superior to published results on the same data sets.
Submitted 6 May, 2024;
originally announced May 2024.
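The DONN abstract above names the Hopf oscillator, described in the complex domain, as its oscillatory unit. As a rough illustration only, the sketch below integrates the canonical Hopf normal form dz/dt = (mu + i*omega - |z|^2) z with forward Euler. The parameter values, the function name `simulate_hopf`, and the absence of any input term are assumptions made here; the abstract does not specify the paper's exact equations or its three input modes.

```python
import math


def simulate_hopf(mu=1.0, omega=2.0 * math.pi, z0=0.1 + 0.0j, dt=1e-3, steps=20000):
    """Integrate the canonical Hopf normal form with forward Euler.

    mu    : bifurcation parameter (assumed value); for mu > 0 the state
            settles onto a limit cycle of radius about sqrt(mu)
    omega : angular frequency in rad/s (assumed value)
    z0    : initial complex state (assumed value)
    """
    z = z0
    trajectory = [z]
    for _ in range(steps):
        # dz/dt = (mu + i*omega - |z|^2) * z
        dz = (mu + 1j * omega - abs(z) ** 2) * z
        z = z + dt * dz
        trajectory.append(z)
    return trajectory


traj = simulate_hopf()
final_amplitude = abs(traj[-1])
print(f"final amplitude: {final_amplitude:.3f}")
```

Starting from a small amplitude, the state spirals outward and settles near radius sqrt(mu). Forward Euler adds a small amplitude bias that shrinks as `dt` is reduced.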
-
RE-GrievanceAssist: Enhancing Customer Experience through ML-Powered Complaint Management
Authors:
Venkatesh C,
Harshit Oberoi,
Anurag Kumar Pandey,
Anil Goyal,
Nikhil Sikka
Abstract:
In recent years, digital platform companies have faced increasing challenges in managing customer complaints, driven by widespread consumer adoption. This paper introduces an end-to-end pipeline, named RE-GrievanceAssist, designed specifically for real estate customer complaint management. The pipeline consists of three key components: i) a response/no-response ML model using TF-IDF vectorization and an XGBoost classifier; ii) a user type classifier using a fastText classifier; iii) an issue/sub-issue classifier using TF-IDF vectorization and an XGBoost classifier. The pipeline has been deployed as a batch job in Databricks, resulting in a 40% reduction in overall manual effort and a monthly cost reduction of Rs 1,50,000 since August 2023.
Submitted 29 April, 2024;
originally announced April 2024.
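The RE-GrievanceAssist abstract above relies on TF-IDF vectorization as the feature step for two of its classifiers. The sketch below shows one common TF-IDF variant (length-normalized term frequency times a smoothed inverse document frequency) in plain Python. The whitespace tokenizer, the smoothing formula, and the sample complaints are assumptions made for illustration; the abstract does not state which variant or library the authors used. The downstream XGBoost classifier is omitted.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed IDF, log((1 + N) / (1 + df)) + 1, for each term."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(doc, idf):
    """Return a sparse {term: weight} vector; terms unseen during fitting are dropped."""
    tokens = [t for t in tokenize(doc) if t in idf]
    if not tokens:
        return {}
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in counts.items()}


# Hypothetical complaint texts, not taken from the paper.
complaints = [
    "refund not received for booking",
    "agent not responding to calls",
    "thanks for the quick response",
]
idf = fit_idf(complaints)
vec = tfidf_vector("refund not received", idf)
print(vec)
```

Terms that occur in fewer documents (here "refund") receive a larger weight than terms shared across documents (here "not"), which is the property a downstream classifier exploits.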
-
RE-RecSys: An End-to-End system for recommending properties in Real-Estate domain
Authors:
Venkatesh C,
Harshit Oberoi,
Anil Goyal,
Nikhil Sikka
Abstract:
We propose an end-to-end real-estate recommendation system, RE-RecSys, which has been productionized in a real-world industry setting. We assign each user to one of 4 categories based on available historical data: i) cold-start users; ii) short-term users; iii) long-term users; and iv) short-long term users. For cold-start users, we propose a novel rule-based engine that is based on the popularity of the locality and user preferences. For short-term users, we propose to use a content-filtering model which recommends properties based on the recent interactions of users. For long-term and short-long term users, we propose a novel approach combining content and collaborative filtering, which can be easily productionized in a real-world scenario. Moreover, based on the conversion rate, we have designed a novel weighting scheme for the different impressions made by users on the platform, used for training the content and collaborative models. Finally, we show the efficiency of the proposed pipeline, RE-RecSys, on a real-world property and clickstream dataset collected from a leading real-estate platform in India. We show that the proposed pipeline is deployable in a real-world scenario with an average latency of <40 ms while serving 1000 rpm.
Submitted 25 April, 2024;
originally announced April 2024.
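The RE-RecSys abstract above routes each user to a different recommender depending on which of four categories the user falls into. The sketch below illustrates only that dispatch step. The two boolean signals, the function names, and the strategy labels are hypothetical; the abstract does not say how the categories are actually determined.

```python
def categorize_user(has_recent_activity, has_long_history):
    """Map two hypothetical history signals onto the four categories named in the abstract."""
    if has_recent_activity and has_long_history:
        return "short-long term"
    if has_recent_activity:
        return "short-term"
    if has_long_history:
        return "long-term"
    return "cold-start"


# Strategy per category, following the pairing described in the abstract.
STRATEGY = {
    "cold-start": "rule-based engine (locality popularity and user preferences)",
    "short-term": "content filtering on recent interactions",
    "long-term": "combined content and collaborative filtering",
    "short-long term": "combined content and collaborative filtering",
}


def pick_strategy(has_recent_activity, has_long_history):
    return STRATEGY[categorize_user(has_recent_activity, has_long_history)]


print(pick_strategy(False, False))
print(pick_strategy(True, True))
```

Keeping the categorization separate from the strategy table means a category's model can be swapped without touching the routing logic.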
-
CodePlan: Repository-level Coding using LLMs and Planning
Authors:
Ramakrishna Bairi,
Atharv Sonwane,
Aditya Kanade,
Vageesh D C,
Arun Iyer,
Suresh Parthasarathy,
Sriram Rajamani,
B. Ashok,
Shashank Shet
Abstract:
Software engineering activities such as package migration, fixing error reports from static analysis or testing, and adding type annotations or other specifications to a codebase involve pervasively editing the entire repository of code. We formulate these activities as repository-level coding tasks.
Recent tools like GitHub Copilot, which are powered by Large Language Models (LLMs), have succeeded in offering high-quality solutions to localized coding problems. Repository-level coding tasks are more involved and cannot be solved directly using LLMs, since code within a repository is inter-dependent and the entire repository may be too large to fit into the prompt. We frame repository-level coding as a planning problem and present a task-agnostic framework, called CodePlan, to solve it. CodePlan synthesizes a multi-step chain of edits (plan), where each step results in a call to an LLM on a code location with context derived from the entire repository, previous code changes and task-specific instructions. CodePlan is based on a novel combination of an incremental dependency analysis, a change may-impact analysis and an adaptive planning algorithm.
We evaluate the effectiveness of CodePlan on two repository-level tasks: package migration (C#) and temporal code edits (Python). Each task is evaluated on multiple code repositories, each of which requires inter-dependent changes to many files (between 2 and 97 files). Coding tasks of this level of complexity have not been automated using LLMs before. Our results show that CodePlan matches the ground truth better than the baselines. CodePlan is able to get 5/6 repositories to pass the validity checks (e.g., to build without errors and make correct code edits), whereas the baselines (without planning but with the same type of contextual information as CodePlan) cannot get any of the repositories to pass them.
Submitted 21 September, 2023;
originally announced September 2023.
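The CodePlan abstract above describes a plan as a chain of edits in which each change can trigger further edits at dependent code locations. One simple way to picture this is a worklist traversal over a dependency graph, sketched below. This is not the authors' algorithm: the graph here is static (CodePlan's dependency analysis is described as incremental), `edit_fn` is a stand-in for the LLM call, and all location names are hypothetical.

```python
from collections import deque


def plan_edits(seeds, dependents, edit_fn):
    """Worklist sketch of change propagation.

    seeds      : code locations that must be edited first
    dependents : dict mapping a location to locations that may be impacted by a change to it
    edit_fn    : callable(location) -> bool, True if the location was actually changed
                 (a placeholder for an LLM call plus change detection)
    Returns the ordered list of (location, changed) steps that were executed.
    """
    worklist = deque(seeds)
    scheduled = set(seeds)
    plan = []
    while worklist:
        location = worklist.popleft()
        changed = edit_fn(location)
        plan.append((location, changed))
        if changed:
            # Only a real change can impact dependent locations.
            for dep in dependents.get(location, []):
                if dep not in scheduled:
                    scheduled.add(dep)
                    worklist.append(dep)
    return plan


# Hypothetical repository: a signature change in api.py ripples outward.
dependents = {
    "api.py::connect": ["client.py::open", "tests.py::test_connect"],
    "client.py::open": ["app.py::main"],
}


def fake_edit(location):
    # Pretend every non-test location needs a change.
    return not location.startswith("tests.py")


steps = plan_edits(["api.py::connect"], dependents, fake_edit)
for location, changed in steps:
    print(location, "changed" if changed else "unchanged")
```

The `scheduled` set guarantees termination on cyclic dependency graphs, at the cost of never revisiting a location that a later edit might affect again.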
-
When Do Neural Nets Outperform Boosted Trees on Tabular Data?
Authors:
Duncan McElfresh,
Sujay Khandagale,
Jonathan Valverde,
Vishak Prasad C,
Benjamin Feuer,
Chinmay Hegde,
Ganesh Ramakrishnan,
Micah Goldblum,
Colin White
Abstract:
Tabular data is one of the most commonly used types of data in machine learning. Despite recent advances in neural nets (NNs) for tabular data, there is still an active discussion on whether or not NNs generally outperform gradient-boosted decision trees (GBDTs) on tabular data, with several recent works arguing either that GBDTs consistently outperform NNs on tabular data, or vice versa. In this work, we take a step back and question the importance of this debate. To this end, we conduct the largest tabular data analysis to date, comparing 19 algorithms across 176 datasets, and we find that the 'NN vs. GBDT' debate is overemphasized: for a surprisingly high number of datasets, either the performance difference between GBDTs and NNs is negligible, or light hyperparameter tuning on a GBDT is more important than choosing between NNs and GBDTs. A remarkable exception is the recently-proposed prior-data fitted network, TabPFN: although it is effectively limited to training sets of size 3000, we find that it outperforms all other algorithms on average, even when randomly sampling 3000 training datapoints. Next, we analyze dozens of metafeatures to determine what properties of a dataset make NNs or GBDTs better-suited to perform well. For example, we find that GBDTs are much better than NNs at handling skewed or heavy-tailed feature distributions and other forms of dataset irregularities. Our insights act as a guide for practitioners to determine which techniques may work best on their dataset. Finally, with the goal of accelerating tabular data research, we release the TabZilla Benchmark Suite: a collection of the 36 'hardest' of the datasets we study. Our benchmark suite, codebase, and all raw results are available at https://github.com/naszilla/tabzilla.
Submitted 30 October, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
An end-to-end, interactive Deep Learning based Annotation system for cursive and print English handwritten text
Authors:
Pranav Guruprasad,
Sujith Kumar S,
Vigneswaran C,
V. Srinivasa Chakravarthy
Abstract:
With the growing tendency to carry out tasks on computational devices and digital media, any method that converts a previously manual task into a digitized one is welcome. Despite the many documentation tasks that can be done online today, there are still many applications and domains where handwritten text is unavoidable, which makes the digitization of handwritten documents an essential task. Over the past decades, there has been extensive research on offline handwritten text recognition. In the recent past, most of these attempts have shifted to Machine Learning and Deep Learning based approaches. In order to design more complex and deeper networks that perform well, it is essential to have larger quantities of annotated data. Most of the databases available for offline handwritten text recognition today have been annotated either manually or semi-automatically with a lot of manual involvement. These processes are very time-consuming and prone to human error. To tackle this problem, we present a complete end-to-end pipeline that annotates offline handwritten manuscripts written in both print and cursive English, using Deep Learning and User Interaction techniques. This novel method combines a detection system built upon a state-of-the-art text detection model with a custom-made Deep Learning model for the recognition system, together with an easy-to-use interactive interface. It aims to improve the accuracy of the detection, segmentation, serialization and recognition phases, in order to ensure high-quality annotated data with minimal human interaction.
Submitted 17 April, 2023;
originally announced April 2023.
-
Speeding up NAS with Adaptive Subset Selection
Authors:
Vishak Prasad C,
Colin White,
Paarth Jain,
Sibasis Nayak,
Ganesh Ramakrishnan
Abstract:
A majority of recent developments in neural architecture search (NAS) have been aimed at decreasing the computational cost of various techniques without affecting their final performance. Towards this goal, several low-fidelity and performance prediction methods have been considered, including those that train only on subsets of the training data. In this work, we present an adaptive subset selection approach to NAS and present it as complementary to state-of-the-art NAS approaches. We uncover a natural connection between one-shot NAS algorithms and adaptive subset selection and devise an algorithm that makes use of state-of-the-art techniques from both areas. We use these techniques to substantially reduce the runtime of DARTS-PT (a leading one-shot NAS algorithm), as well as BOHB and DEHB (leading multifidelity optimization algorithms), without sacrificing accuracy. Our results are consistent across multiple datasets, and towards full reproducibility, we release our code at https: //anonymous.4open.science/r/SubsetSelection NAS-B132.
Submitted 2 November, 2022;
originally announced November 2022.
-
Exploring Alternatives to Softmax Function
Authors:
Kunal Banerjee,
Vishak Prasad C,
Rishi Raj Gupta,
Karthik Vyas,
Anushree H,
Biswajit Mishra
Abstract:
The softmax function is widely used in artificial neural networks for multiclass classification, multilabel classification, attention mechanisms, etc. However, its efficacy is often questioned in the literature. The log-softmax loss has been shown to belong to a more generic class of loss functions, called the spherical family, and its member, the log-Taylor softmax loss, is arguably the best alternative in this class. In another approach, which tries to enhance the discriminative nature of the softmax function, soft-margin softmax (SM-softmax) has been proposed as the most suitable alternative. In this work, we investigate Taylor softmax, SM-softmax and our proposed SM-Taylor softmax, an amalgamation of the earlier two functions, as alternatives to the softmax function. Furthermore, we explore the effect of expanding Taylor softmax up to ten terms (the original work proposed expanding only to two terms), along with the ramifications of considering Taylor softmax to be a finite or infinite series during backpropagation. Our experiments on the image classification task on different datasets reveal that there is always a configuration of the SM-Taylor softmax function that outperforms the normal softmax function and its other alternatives.
Submitted 23 November, 2020;
originally announced November 2020.
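The softmax-alternatives abstract above combines Taylor softmax with a soft margin. A minimal sketch of the idea, under assumptions: Taylor softmax replaces exp(z) with its truncated Taylor series, and the soft margin is modeled here by subtracting a margin from the target-class logit before normalizing. The function names, the default order of 2, and the exact placement of the margin are choices made for this illustration; the abstract itself gives no formulas.

```python
import math


def taylor_exp(z, order=2):
    """Truncated Taylor series of exp(z): sum over k = 0..order of z**k / k!."""
    return sum(z ** k / math.factorial(k) for k in range(order + 1))


def taylor_softmax(logits, order=2):
    """Normalize truncated-series values instead of true exponentials.

    For order 2 the series equals ((z + 1)**2 + 1) / 2, which is always positive,
    so the outputs form a valid probability distribution.
    """
    values = [taylor_exp(z, order) for z in logits]
    total = sum(values)
    return [v / total for v in values]


def sm_taylor_softmax(logits, target, margin=0.0, order=2):
    """Assumed SM-Taylor variant: lower the target logit by `margin`, then apply Taylor softmax."""
    shifted = [z - margin if i == target else z for i, z in enumerate(logits)]
    return taylor_softmax(shifted, order)


logits = [1.0, 2.0, 0.5]
plain = taylor_softmax(logits)
with_margin = sm_taylor_softmax(logits, target=1, margin=0.5)
print(plain)
print(with_margin)
```

With a positive margin the target class gets a lower probability for the same logits, so a training loss built on it pushes the network toward a larger gap between the target logit and the others.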
-
Spanning Tree Enumeration in 2-trees: Sequential and Parallel Perspective
Authors:
Vandhana. C,
S. Hima Bindhu,
P. Renjith,
N. Sadagopan,
B. Supraja
Abstract:
For a connected graph, a vertex separator is a set of vertices whose removal creates at least two components. A vertex separator $S$ is minimal if it contains no other separator as a strict subset, and a minimum vertex separator is a minimal vertex separator of least cardinality. A clique is a set of mutually adjacent vertices. A 2-tree is a connected graph in which every maximal clique is of size three and every minimal vertex separator is of size two. A spanning tree of a graph $G$ is a connected, acyclic subgraph of $G$ that contains every vertex of $G$. In this paper, we focus our attention on two enumeration problems, from both sequential and parallel perspectives. In particular, we consider listing all possible spanning trees of a 2-tree and listing all perfect elimination orderings of a chordal graph. As far as enumeration of spanning trees is concerned, our approach is incremental in nature and, towards this end, we work with the construction order of the 2-tree, i.e., the spanning trees on $n$ vertices are enumerated from those on $n-1$ vertices, $n \geq 4$. Further, we also present a parallel algorithm for spanning tree enumeration using $O(2^n)$ processors. To our knowledge, this paper makes the first attempt at designing a parallel algorithm for this problem. We conclude this paper by presenting sequential and parallel algorithms for enumerating all perfect elimination orderings of a chordal graph.
Submitted 18 August, 2014;
originally announced August 2014.
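The 2-tree abstract above describes enumerating spanning trees incrementally along the construction order. One way such an incremental step can work is sketched below; it is an illustration consistent with the abstract, not necessarily the authors' algorithm. When a new vertex v is attached to both ends of an existing edge (a, b), every old spanning tree T yields T + va and T + vb, and every old tree containing ab additionally yields T - ab + va + vb. Starting from a single edge as the seed is a convenience assumed here.

```python
def extend_spanning_trees(trees, a, b, v):
    """Spanning trees after attaching new vertex v to both endpoints of existing edge (a, b)."""
    edge_ab = frozenset((a, b))
    edge_va = frozenset((v, a))
    edge_vb = frozenset((v, b))
    extended = []
    for tree in trees:
        extended.append(tree | {edge_va})  # v is a leaf hanging off a
        extended.append(tree | {edge_vb})  # v is a leaf hanging off b
        if edge_ab in tree:
            # v has degree 2 and replaces the a-b connection
            extended.append((tree - {edge_ab}) | {edge_va, edge_vb})
    return extended


def enumerate_spanning_trees(base_edge, additions):
    """base_edge: (a, b); additions: list of (v, a, b) in construction order."""
    trees = [frozenset({frozenset(base_edge)})]
    for v, a, b in additions:
        trees = extend_spanning_trees(trees, a, b, v)
    return trees


# Triangle 1-2-3, then vertex 4 attached to edge (2, 3): K4 minus the edge (1, 4).
trees = enumerate_spanning_trees((1, 2), [(3, 1, 2), (4, 2, 3)])
print(len(trees))
for tree in trees:
    print(sorted(tuple(sorted(edge)) for edge in tree))
```

Each new tree arises from exactly one old tree, so the list contains no duplicates and the count follows the recurrence t(G + v) = 2 t(G) + t_ab(G), where t_ab(G) counts the old trees containing edge ab.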