WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents

Authors: Michael Lutz, Arth Bohra, Manvel Saroyan, Artem Harutyunyan, Giovanni Campagna

Abstract: In the realm of web agent research, achieving both generalization and accuracy remains a challenging problem. Due to high variance in website structure, existing approaches often fail. Moreover, existing fine-tuning and in-context learning techniques fail to generalize across multiple websites. We introduce Wilbur, an approach that uses a differentiable ranking model and a novel instruction synthesis technique to optimally populate a black-box large language model's prompt with task demonstrations from previous runs. To maximize end-to-end success rates, we also propose an intelligent backtracking mechanism that learns and recovers from its mistakes. Finally, we show that our ranking model can be trained on data from a generative auto-curriculum which samples representative goals from an LLM, runs the agent, and automatically evaluates it, with no manual annotation. Wilbur achieves state-of-the-art results on the WebVoyager benchmark, beating text-only models by 8% overall, and up to 36% on certain websites. On the same benchmark, Wilbur is within 5% of a strong multi-modal model despite only receiving textual inputs, and further analysis reveals a substantial number of failures are due to engineering challenges of operating the web.

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2310.06111 [pdf, other]

BYOC: Personalized Few-Shot Classification with Co-Authored Class Descriptions

Authors: Arth Bohra, Govert Verkes, Artem Harutyunyan, Pascal Weinberger, Giovanni Campagna

Abstract: Text classification is a well-studied and versatile building block for many NLP applications. Yet, existing approaches require either large annotated corpora to train a model with or, when using large language models as a base, require carefully crafting the prompt as well as using a long context that can fit many examples. As a result, it is not possible for end-users to build classifiers for themselves. To address this issue, we propose a novel approach to few-shot text classification using an LLM. Rather than few-shot examples, the LLM is prompted with descriptions of the salient features of each class. These descriptions are coauthored by the user and the LLM interactively: while the user annotates each few-shot example, the LLM asks relevant questions that the user answers. Examples, questions, and answers are summarized to form the classification prompt. Our experiments show that our approach yields high accuracy classifiers, within 82% of the performance of models trained with significantly larger datasets while using only 1% of their training sets. Additionally, in a study with 30 participants, we show that end-users are able to build classifiers to suit their specific needs. The personalized classifiers show an average accuracy of 90%, which is 15% higher than the state-of-the-art approach.

Submitted 9 October, 2023; originally announced October 2023.

Comments: Accepted at EMNLP 2023 (Findings)

arXiv:2302.03297 [pdf, other]

AutoWS: Automated Weak Supervision Framework for Text Classification

Authors: Abhinav Bohra, Huy Nguyen, Devashish Khatwani

Abstract: Creating large, good-quality labeled data has become one of the major bottlenecks for developing machine learning applications. Multiple techniques have been developed to either decrease the dependence on labeled data (zero/few-shot learning, weak supervision) or to improve the efficiency of the labeling process (active learning). Among those, Weak Supervision has been shown to reduce labeling costs by employing hand-crafted labeling functions designed by domain experts. We propose AutoWS -- a novel framework for increasing the efficiency of the weak supervision process while decreasing the dependency on domain experts. Our method requires a small set of labeled examples per label class and automatically creates a set of labeling functions to assign noisy labels to numerous unlabeled data. Noisy labels can then be aggregated into probabilistic labels used by a downstream discriminative classifier. Our framework is fully automatic and requires no hyper-parameter specification by users. We compare our approach with different state-of-the-art work on weak supervision and noisy training. Experimental results show that our method outperforms competitive baselines.

Submitted 7 February, 2023; originally announced February 2023.

arXiv:2210.12467 [pdf, other]

ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts

Authors: Rajdeep Mukherjee, Abhinav Bohra, Akash Banerjee, Soumya Sharma, Manjunath Hegde, Afreen Shaikh, Shivani Shrivastava, Koustuv Dasgupta, Niloy Ganguly, Saptarshi Ghosh, Pawan Goyal

Abstract: Despite tremendous progress in automatic summarization, state-of-the-art methods are predominantly trained to excel in summarizing short newswire articles, or documents with strong layout biases such as scientific articles or government reports. Efficient techniques to summarize financial documents, including facts and figures, have largely been unexplored, mainly due to the unavailability of suitable datasets. In this work, we present ECTSum, a new dataset with transcripts of earnings calls (ECTs), hosted by publicly traded companies, as documents, and short expert-written telegram-style bullet point summaries derived from corresponding Reuters articles. ECTs are long unstructured documents without any prescribed length limit or format. We benchmark our dataset with state-of-the-art summarizers across various metrics evaluating the content quality and factual consistency of the generated summaries. Finally, we present a simple-yet-effective approach, ECT-BPS, to generate a set of bullet points that precisely capture the important facts discussed in the calls.

Submitted 26 October, 2022; v1 submitted 22 October, 2022; originally announced October 2022.

Comments: 14 pages; Accepted as a Long Paper in EMNLP 2022 (Main Conference); Codes: https://github.com/rajdeep345/ECTSum

ACM Class: I.2.7

arXiv:2208.00616 [pdf]

Sustainability of large scale waste heat harvesting using thermoelectric

Authors: Anilkumar Bohra, Satish Vitta

Abstract: The amount of waste heat exergy generated globally is 69.058 EJ, which can be divided into low temperature (<373 K), 30.496 EJ; medium temperature (373 K to 573 K), 14.431 EJ; and high temperature (>573 K), 24.131 EJ. These values of exergy have been used to determine the minimum number of pn junctions required to convert the exergy into electrical power. It is found that the number of junctions required to convert high temperature exergy increases from 8.22x10^11 to 24.66x10^11 when the aspect ratio of the legs increases from 0.5 cm^-1 to 1.5 cm^-1. To convert the low temperature exergy, 81.76x10^11 to 245.25x10^11 junctions will be required, depending on the aspect ratio of the legs. The quantity of alloys containing elements such as Pb, Bi, Te, Sb, Se and Sn required to synthesize these junctions is therefore of the order of millions of tons, which means the quantity of elements required is also of similar magnitude. The current world production of these elements, however, falls far short of this requirement, indicating significant supply chain risk. The production of these elements, even if resources are available, will emit millions of tons of CO2, showing that current alloys are non-sustainable for waste heat recovery.

Submitted 1 August, 2022; originally announced August 2022.

Comments: 32 pages, 5 figures

arXiv:2111.10762 [pdf, other]

COVID-19 Detection through Deep Feature Extraction

Authors: Jash Dalvi, Aziz Bohra

Abstract: The SARS-CoV2 virus has caused a lot of tribulation to the human population. Predictive modeling that can accurately determine whether a person is infected with COVID-19 is imperative. The study proposes a novel approach that utilizes a deep feature extraction technique, with a pre-trained ResNet50 acting as the backbone of the network, combined with Logistic Regression as the head model. The proposed model has been trained on the Kaggle COVID-19 Radiography Dataset. The proposed model achieves a cross-validation accuracy of 100% on the COVID-19 and Normal X-Ray image classes. Similarly, when tested on the combined three classes, the proposed model achieves 98.84% accuracy.

Submitted 21 November, 2021; originally announced November 2021.

arXiv:2105.03602 [pdf, ps, other]

Permanents of $3\times3$ Invertible Matrices Modulo $n$

Authors: Ayush Bohra, A. Satyanarayana Reddy

Abstract: We count the number of elements in the set $$G_{3}(n,x) = \{M \in GL_3(\mathbb{Z}_n) \mid perm(M) \equiv x \pmod{n} \}.$$

Submitted 8 May, 2021; originally announced May 2021.

Comments: 13 pages

MSC Class: 05B10; 15A15

arXiv:2103.11077 [pdf, ps, other]

Permanents of $2\times 2$ Matrices Modulo $n$

Authors: Ayush Bohra, A. Satyanarayana Reddy

Abstract: In this article we compute the number of invertible $2\times 2$ matrices with integer entries modulo $n$ whose permanents are congruent modulo $n$ to a given integer $x$.

Submitted 19 March, 2021; originally announced March 2021.

Comments: 5 pages

MSC Class: 05B10; 15A15

Journal ref: PUMP Journal of Undergraduate Research, Vol4, 2021, 141-145
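
The quantity described in the abstract above can be checked by brute force for small $n$. The following is a minimal sketch, not the authors' method (the article derives the count; this only enumerates). The function name `count_invertible_2x2_with_permanent` is hypothetical, and the sketch assumes that "invertible modulo $n$" means the determinant is a unit in $\mathbb{Z}_n$.

```python
from itertools import product
from math import gcd


def count_invertible_2x2_with_permanent(n: int, x: int) -> int:
    """Count 2x2 matrices over Z_n that are invertible and have perm ≡ x (mod n).

    Brute-force enumeration of all n**4 matrices [[a, b], [c, d]];
    only practical for small n.
    """
    target = x % n
    count = 0
    for a, b, c, d in product(range(n), repeat=4):
        det = (a * d - b * c) % n
        perm = (a * d + b * c) % n
        # Assumption: M is invertible over Z_n exactly when det(M) is a unit,
        # i.e. gcd(det, n) == 1.
        if gcd(det, n) == 1 and perm == target:
            count += 1
    return count


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, [count_invertible_2x2_with_permanent(n, x) for x in range(n)])
```

As a sanity check, summing the counts over all residues $x$ should recover the order of $GL_2(\mathbb{Z}_n)$; modulo 2 the permanent and determinant coincide, so every invertible matrix has permanent congruent to 1.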

arXiv:math/0509581 [pdf, ps, other]

Boxicity of Series Parallel Graphs

Authors: Ankur Bohra, L. Sunil Chandran, J. Krishnam Raju

Abstract: The three well-known graph classes, planar graphs (P), series-parallel graphs (SP) and outer planar graphs (OP) satisfy the following proper inclusion relation: OP ⊂ SP ⊂ P. It is known that box(G) <= 3 if G belongs to P and box(G) <= 2 if G belongs to OP. Thus it is interesting to decide whether the maximum possible value of the boxicity of series-parallel graphs is 2 or 3. In this paper we construct a series-parallel graph with boxicity 3, thus resolving this question. Recently Chandran and Sivadasan showed that for any G, box(G) <= treewidth(G)+2. They conjecture that for any k, there exists a k-tree with boxicity k+1. (This would show that their upper bound is tight but for an additive factor of 1, since the treewidth of any k-tree equals k.) The series-parallel graph we construct in this paper is a 2-tree with boxicity 3 and is thus a first step towards proving their conjecture.

Submitted 24 September, 2005; originally announced September 2005.

Comments: 10 pages, 0 figures

MSC Class: 05C62

Showing 1–9 of 9 results for author: Bohra, A