-
WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents
Authors:
Michael Lutz,
Arth Bohra,
Manvel Saroyan,
Artem Harutyunyan,
Giovanni Campagna
Abstract:
In the realm of web agent research, achieving both generalization and accuracy remains a challenging problem. Due to high variance in website structure, existing approaches often fail. Moreover, existing fine-tuning and in-context learning techniques fail to generalize across multiple websites. We introduce Wilbur, an approach that uses a differentiable ranking model and a novel instruction synthesis technique to optimally populate a black-box large language model's prompt with task demonstrations from previous runs. To maximize end-to-end success rates, we also propose an intelligent backtracking mechanism that learns and recovers from its mistakes. Finally, we show that our ranking model can be trained on data from a generative auto-curriculum which samples representative goals from an LLM, runs the agent, and automatically evaluates it, with no manual annotation. Wilbur achieves state-of-the-art results on the WebVoyager benchmark, beating text-only models by 8% overall, and up to 36% on certain websites. On the same benchmark, Wilbur is within 5% of a strong multi-modal model despite only receiving textual inputs, and further analysis reveals a substantial number of failures are due to engineering challenges of operating the web.
Submitted 8 April, 2024;
originally announced April 2024.
-
BYOC: Personalized Few-Shot Classification with Co-Authored Class Descriptions
Authors:
Arth Bohra,
Govert Verkes,
Artem Harutyunyan,
Pascal Weinberger,
Giovanni Campagna
Abstract:
Text classification is a well-studied and versatile building block for many NLP applications. Yet existing approaches either require large annotated corpora to train a model or, when using large language models as a base, require a carefully crafted prompt and a long context that can fit many examples. As a result, it is not possible for end-users to build classifiers for themselves. To address this issue, we propose a novel approach to few-shot text classification using an LLM. Rather than few-shot examples, the LLM is prompted with descriptions of the salient features of each class. These descriptions are coauthored by the user and the LLM interactively: while the user annotates each few-shot example, the LLM asks relevant questions that the user answers. Examples, questions, and answers are summarized to form the classification prompt. Our experiments show that our approach yields high-accuracy classifiers, within 82% of the performance of models trained with significantly larger datasets while using only 1% of their training sets. Additionally, in a study with 30 participants, we show that end-users are able to build classifiers to suit their specific needs. The personalized classifiers show an average accuracy of 90%, which is 15% higher than the state-of-the-art approach.
Submitted 9 October, 2023;
originally announced October 2023.
-
AutoWS: Automated Weak Supervision Framework for Text Classification
Authors:
Abhinav Bohra,
Huy Nguyen,
Devashish Khatwani
Abstract:
Creating large, good-quality labeled datasets has become one of the major bottlenecks for developing machine learning applications. Multiple techniques have been developed to either decrease the dependence on labeled data (zero/few-shot learning, weak supervision) or to improve the efficiency of the labeling process (active learning). Among those, Weak Supervision has been shown to reduce labeling costs by employing hand-crafted labeling functions designed by domain experts. We propose AutoWS -- a novel framework for increasing the efficiency of the weak supervision process while decreasing the dependency on domain experts. Our method requires a small set of labeled examples per label class and automatically creates a set of labeling functions to assign noisy labels to numerous unlabeled data. Noisy labels can then be aggregated into probabilistic labels used by a downstream discriminative classifier. Our framework is fully automatic and requires no hyper-parameter specification by users. We compare our approach with different state-of-the-art work on weak supervision and noisy training. Experimental results show that our method outperforms competitive baselines.
Submitted 7 February, 2023;
originally announced February 2023.
-
ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts
Authors:
Rajdeep Mukherjee,
Abhinav Bohra,
Akash Banerjee,
Soumya Sharma,
Manjunath Hegde,
Afreen Shaikh,
Shivani Shrivastava,
Koustuv Dasgupta,
Niloy Ganguly,
Saptarshi Ghosh,
Pawan Goyal
Abstract:
Despite tremendous progress in automatic summarization, state-of-the-art methods are predominantly trained to excel in summarizing short newswire articles, or documents with strong layout biases such as scientific articles or government reports. Efficient techniques to summarize financial documents, including facts and figures, remain largely unexplored, mainly due to the unavailability of suitable datasets. In this work, we present ECTSum, a new dataset with transcripts of earnings calls (ECTs), hosted by publicly traded companies, as documents, and short expert-written telegram-style bullet point summaries derived from corresponding Reuters articles. ECTs are long unstructured documents without any prescribed length limit or format. We benchmark our dataset with state-of-the-art summarizers across various metrics evaluating the content quality and factual consistency of the generated summaries. Finally, we present a simple-yet-effective approach, ECT-BPS, to generate a set of bullet points that precisely capture the important facts discussed in the calls.
Submitted 26 October, 2022; v1 submitted 22 October, 2022;
originally announced October 2022.
-
Sustainability of large scale waste heat harvesting using thermoelectric
Authors:
Anilkumar Bohra,
Satish Vitta
Abstract:
The amount of waste heat exergy generated globally is 69.058 EJ, which can be divided into low temperature (<373 K), 30.496 EJ; medium temperature (373 K to 573 K), 14.431 EJ; and high temperature (>573 K), 24.131 EJ. These values of exergy have been used to determine the minimum number of pn junctions required to convert the exergy into electrical power. It is found that the number of junctions required to convert high temperature exergy increases from 8.22x10^11 to 24.66x10^11 when the aspect ratio of the legs increases from 0.5 cm^-1 to 1.5 cm^-1. To convert the low temperature exergy, 81.76x10^11 to 245.25x10^11 junctions will be required depending on the aspect ratio of the legs. The quantity of alloys containing elements such as Pb, Bi, Te, Sb, Se and Sn required to synthesize these junctions is therefore of the order of millions of tons, which means the quantity of elements required is of similar magnitude. The current world production of these elements, however, falls far short of this requirement, indicating significant supply chain risk. The production of these elements, even if resources are available, will emit millions of tons of CO2, showing that current alloys are non-sustainable for waste heat recovery.
Submitted 1 August, 2022;
originally announced August 2022.
-
COVID-19 Detection through Deep Feature Extraction
Authors:
Jash Dalvi,
Aziz Bohra
Abstract:
The SARS-CoV-2 virus has caused considerable tribulation for the human population. Predictive modeling that can accurately determine whether a person is infected with COVID-19 is imperative. The study proposes a novel approach that uses a deep feature extraction technique, with a pre-trained ResNet50 acting as the backbone of the network, combined with Logistic Regression as the head model. The proposed model has been trained on the Kaggle COVID-19 Radiography Dataset. The proposed model achieves a cross-validation accuracy of 100% on the COVID-19 and Normal X-Ray image classes. Similarly, when tested on all three classes combined, the proposed model achieves 98.84% accuracy.
Submitted 21 November, 2021;
originally announced November 2021.
-
Permanents of $3\times3$ Invertible Matrices Modulo $n$
Authors:
Ayush Bohra,
A. Satyanarayana Reddy
Abstract:
We count the number of elements in the set $$G_{3}(n,x) = \{M \in GL_3(\mathbb{Z}_n) \mid perm(M) \equiv x \pmod{n} \}.$$
Submitted 8 May, 2021;
originally announced May 2021.
-
Permanents of $2\times 2$ Matrices Modulo $n$
Authors:
Ayush Bohra,
A. Satyanarayana Reddy
Abstract:
In this article we compute the number of invertible $2\times 2$ matrices with integer entries modulo $n$ whose permanents are congruent modulo $n$ to a given integer $x$.
Submitted 19 March, 2021;
originally announced March 2021.
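The quantity described in this abstract can be checked by brute force for small $n$. The following is a minimal sketch and not the authors' method: it assumes that "invertible" means the determinant is a unit modulo $n$, and the function name `count_invertible_with_permanent` is hypothetical.

```python
from itertools import product
from math import gcd


def count_invertible_with_permanent(n: int, x: int) -> int:
    """Count 2x2 matrices [[a, b], [c, d]] over Z_n that are invertible
    (determinant coprime to n) and whose permanent is congruent to x mod n.

    Brute force over all n**4 matrices, so only practical for small n.
    """
    target = x % n
    count = 0
    for a, b, c, d in product(range(n), repeat=4):
        det = (a * d - b * c) % n   # determinant mod n
        perm = (a * d + b * c) % n  # permanent mod n
        if gcd(det, n) == 1 and perm == target:
            count += 1
    return count


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, [count_invertible_with_permanent(n, x) for x in range(n)])
```

Modulo 2 the permanent and the determinant coincide, so all six invertible matrices have permanent 1 and none have permanent 0. For any $n$, summing the counts over $x = 0, \dots, n-1$ gives the total number of invertible $2\times 2$ matrices modulo $n$.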
-
Boxicity of Series Parallel Graphs
Authors:
Ankur Bohra,
L. Sunil Chandran,
J. Krishnam Raju
Abstract:
The three well-known graph classes, planar graphs (P), series-parallel graphs (SP) and outer planar graphs (OP), satisfy the following proper inclusion relation: OP ⊂ SP ⊂ P. It is known that box(G) ≤ 3 if G belongs to P and box(G) ≤ 2 if G belongs to OP. Thus it is interesting to decide whether the maximum possible value of the boxicity of series-parallel graphs is 2 or 3. In this paper we construct a series-parallel graph with boxicity 3, thus resolving this question. Recently Chandran and Sivadasan showed that for any G, box(G) ≤ treewidth(G)+2. They conjecture that for any k, there exists a k-tree with boxicity k+1. (This would show that their upper bound is tight but for an additive factor of 1, since the treewidth of any k-tree equals k.) The series-parallel graph we construct in this paper is a 2-tree with boxicity 3 and is thus a first step towards proving their conjecture.
Submitted 24 September, 2005;
originally announced September 2005.