-
In-Vivo Hyperspectral Human Brain Image Database for Brain Cancer Detection
Authors:
H. Fabelo,
S. Ortega,
A. Szolna,
D. Bulters,
J. F. Pineiro,
S. Kabwama,
A. Shanahan,
H. Bulstrode,
S. Bisshopp,
B. R. Kiran,
D. Ravi,
R. Lazcano,
D. Madronal,
C. Sosa,
C. Espino,
M. Marquez,
M. De la Luz Plaza,
R. Camacho,
D. Carrera,
M. Hernandez,
G. M. Callico,
J. Morera,
B. Stanciulescu,
G. Z. Yang,
R. Salvador, et al. (3 additional authors not shown)
Abstract:
The use of hyperspectral imaging for medical applications has become more common in recent years. One of the main obstacles that researchers face when developing hyperspectral algorithms for medical applications is the lack of specific, publicly available hyperspectral medical data. The work described in this paper was developed within the framework of the European project HELICoiD (HypErspectraL Imaging Cancer Detection), whose main goal was the application of hyperspectral imaging to the delineation of brain tumors in real time during neurosurgical operations. In this paper, the methodology followed to generate the first hyperspectral database of in-vivo human brain tissues is presented. Data was acquired employing a customized hyperspectral acquisition system capable of capturing information in the Visual and Near InfraRed (VNIR) range from 400 to 1000 nm. Repeatability was assessed for the cases where two images of the same scene were captured consecutively. The analysis reveals that the system works more efficiently in the spectral range between 450 and 900 nm. A total of 36 hyperspectral images from 22 different patients were obtained. From these data, more than 300 000 spectral signatures were labeled employing a semi-automatic methodology based on the spectral angle mapper algorithm. Four different classes were defined: normal tissue, tumor tissue, blood vessel, and background elements. All the hyperspectral data has been made available in a public repository.
Submitted 16 February, 2024;
originally announced February 2024.
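The abstract above names the spectral angle mapper as the basis of its semi-automatic labeling. As a rough illustration only (not the authors' tool), the sketch below computes the angle between a pixel spectrum and per-class reference spectra and assigns the closest class if the angle is under a threshold. The function names, the `references` dictionary, and the `max_angle` threshold are assumptions made for this example.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos)


def label_by_sam(pixel, references, max_angle):
    """Return the label of the closest reference spectrum,
    or None if no reference lies within max_angle (hypothetical threshold)."""
    best_label, best_angle = None, max_angle
    for label, ref in references.items():
        angle = spectral_angle(pixel, ref)
        if angle <= best_angle:
            best_label, best_angle = label, angle
    return best_label
```

Because the angle ignores vector magnitude, two spectra that differ only by a brightness scale factor map to (nearly) the same class, which is one reason the measure is popular for reflectance data.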
-
3D printer-controlled syringe pumps for dual, active, regulable and simultaneous dispensing of reagents. Manufacturing of immunochromatographic test strips
Authors:
Gabriel Siano,
Leandro Peretti,
Juan Manuel Marquez,
Nazarena Pujato,
Leonardo Giovanini,
Claudio Berli
Abstract:
Lateral flow immunoassays (LFIA) are widely used worldwide for the detection of different analytes because they combine multiple advantages such as low production cost, simplicity, and portability, which allow biomarker detection without requiring infrastructure or highly trained personnel. Here we propose solutions for the manufacturing process of LFIA at laboratory scale, particularly for the controlled and active dispensing of the reagents in the form of the Test Lines (TL) and the Control Lines (CL). To accomplish this task, we adapted a 3D printer to also control Syringe Pumps (SP), since the proposed adaptation of a 3D printer is easy and free, and many laboratories already have one in their infrastructure. In turn, the standard function of the 3D printer can be easily restored by disconnecting the SPs and reconnecting the extruder. Additionally, the unified control of the 3D printer enables dual, active, regulable and simultaneous dispensing, four features that are typically found only in certain high-cost commercial equipment. With the proposed setup, the challenge of simultaneously dispensing at least 2 lines (CL and TL) with SPs controlled by a 3D printer was addressed, including regulation of the width of dispensed lines within experimental limits. Also, the construction of an LFIA for the detection of leptospirosis is shown as a practical example of automated reagent dispensing.
Submitted 6 February, 2024;
originally announced February 2024.
-
GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems
Authors:
Kaya Stechly,
Matthew Marquez,
Subbarao Kambhampati
Abstract:
There has been considerable divergence of opinion on the reasoning abilities of Large Language Models (LLMs). While the initial optimism that reasoning might emerge automatically with scale has been tempered thanks to a slew of counterexamples, a widespread belief in their iterative self-critique capabilities persists. In this paper, we set out to systematically investigate the effectiveness of iterative prompting of LLMs in the context of Graph Coloring, a canonical NP-complete reasoning problem that is related to propositional satisfiability as well as practical problems like scheduling and allocation. We present a principled empirical study of the performance of GPT-4 in solving graph coloring instances or verifying the correctness of candidate colorings. In iterative modes, we experiment with the model critiquing its own answers and an external correct reasoner verifying proposed solutions. In both cases, we analyze whether the content of the criticisms actually affects bottom line performance. The study seems to indicate that (i) LLMs are bad at solving graph coloring instances; (ii) they are no better at verifying a solution, and thus are not effective in iterative modes with LLMs critiquing LLM-generated solutions; and (iii) the correctness and content of the criticisms, whether by LLMs or external solvers, seem largely irrelevant to the performance of iterative prompting. We show that the observed increase in effectiveness is largely due to the correct solution being fortuitously present in the top-k completions of the prompt (and being recognized as such by an external verifier). Our results thus call into question claims about the self-critiquing capabilities of state of the art LLMs.
Submitted 18 October, 2023;
originally announced October 2023.
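The abstract above contrasts LLM self-critique with an external correct verifier for graph coloring. Checking a candidate coloring is cheap even though finding one is NP-complete; a minimal sketch of such a sound checker might look like the following. The function names and the edge-list/dictionary representation are assumptions for illustration and are not taken from the paper.

```python
def coloring_violations(edges, coloring):
    """Return every edge whose endpoints are uncolored or share a color."""
    bad = []
    for u, v in edges:
        if u not in coloring or v not in coloring or coloring[u] == coloring[v]:
            bad.append((u, v))
    return bad


def is_valid_coloring(edges, coloring, num_colors):
    """True if the coloring uses at most num_colors colors and violates no edge."""
    uses_too_many = len(set(coloring.values())) > num_colors
    return not uses_too_many and not coloring_violations(edges, coloring)
```

The list returned by `coloring_violations` is the kind of detailed feedback that could be passed back to a generator; a binary verdict corresponds to `is_valid_coloring` alone.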
-
Can Large Language Models Really Improve by Self-critiquing Their Own Plans?
Authors:
Karthik Valmeekam,
Matthew Marquez,
Subbarao Kambhampati
Abstract:
There have been widespread claims about Large Language Models (LLMs) being able to successfully verify or self-critique their candidate solutions in reasoning problems in an iterative mode. Intrigued by those claims, in this paper we set out to investigate the verification/self-critiquing abilities of large language models in the context of planning. We evaluate a planning system that employs LLMs for both plan generation and verification. We assess the verifier LLM's performance against ground-truth verification, the impact of self-critiquing on plan generation, and the influence of varying feedback levels on system performance. Using GPT-4, a state-of-the-art LLM, for both generation and verification, our findings reveal that self-critiquing appears to diminish plan generation performance, especially when compared to systems with external, sound verifiers, and that the LLM verifiers in that system produce a notable number of false positives, compromising the system's reliability. Additionally, the nature of feedback, whether binary or detailed, showed minimal impact on plan generation. Collectively, our results cast doubt on the effectiveness of LLMs in a self-critiquing, iterative framework for planning tasks.
Submitted 12 October, 2023;
originally announced October 2023.
-
Graph-Based Analysis and Visualisation of Mobility Data
Authors:
Rafael Martínez Márquez,
Giuseppe Patanè
Abstract:
Urban mobility forecast and analysis can be addressed through grid-based and graph-based models. However, graph-based representations have the advantage of more realistically depicting the mobility networks and being more robust since they allow the implementation of Graph Theory machinery, enhancing the analysis and visualisation of mobility flows. We define two types of mobility graphs: Region Adjacency graphs and Origin-Destination graphs. Several node centrality metrics of graphs are applied to identify the most relevant nodes of the network in terms of graph connectivity. Additionally, the Perron vector associated with a strongly connected graph is applied to define a circulation function on the mobility graph. Such node values are visualised in the geographically embedded graphs, showing clustering patterns within the network. Since mobility graphs can be directed or undirected, we define several Graph Laplacians for both cases and show that these matrices and their spectral properties provide insightful information for network analysis. The computation of node centrality metrics and Perron-induced circulation functions for three different geographical regions demonstrates that basic elements from Graph Theory applied to mobility networks can lead to structure analysis for graphs of different connectivity, size, and orientation properties.
Submitted 10 October, 2023;
originally announced October 2023.
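The abstract above uses the Perron vector of a strongly connected graph to define a circulation function. One standard construction, assumed here and not confirmed to be the one used in the paper, takes the stationary distribution φ of the random walk with transition matrix P and sets F(u, v) = φ(u)·P(u, v). The sketch below approximates φ by power iteration on a lazy walk (which has the same stationary distribution but also converges on periodic graphs); all names and parameters are hypothetical.

```python
def perron_vector(adj, iterations=1000, tol=1e-12):
    """Stationary distribution of the random walk on a strongly connected
    weighted digraph given as an adjacency matrix (list of lists).
    Assumes every node has positive out-weight, which strong connectivity implies."""
    n = len(adj)
    out = [sum(row) for row in adj]
    P = [[adj[i][j] / out[i] for j in range(n)] for i in range(n)]
    phi = [1.0 / n] * n
    for _ in range(iterations):
        stepped = [sum(phi[i] * P[i][j] for i in range(n)) for j in range(n)]
        # Lazy step: average with the previous iterate to avoid oscillation.
        new = [0.5 * phi[j] + 0.5 * stepped[j] for j in range(n)]
        delta = max(abs(a - b) for a, b in zip(new, phi))
        phi = new
        if delta < tol:
            break
    return phi, P


def circulation(adj):
    """Edge values F[u][v] = phi[u] * P[u][v]; inflow equals outflow at every node."""
    phi, P = perron_vector(adj)
    n = len(adj)
    return [[phi[u] * P[u][v] for v in range(n)] for u in range(n)]
```

For an origin-destination matrix of trip counts, `adj[i][j]` would hold the number of trips from region i to region j, and the resulting φ values are what one would map onto the geographic nodes.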
-
On the Planning Abilities of Large Language Models : A Critical Investigation
Authors:
Karthik Valmeekam,
Matthew Marquez,
Sarath Sreedharan,
Subbarao Kambhampati
Abstract:
Intrigued by the claims of emergent reasoning capabilities in LLMs trained on general web corpora, in this paper, we set out to investigate their planning capabilities. We aim to evaluate (1) the effectiveness of LLMs in generating plans autonomously in commonsense planning tasks and (2) the potential of LLMs in LLM-Modulo settings where they act as a source of heuristic guidance for external planners and verifiers. We conduct a systematic study by generating a suite of instances on domains similar to the ones employed in the International Planning Competition and evaluate LLMs in two distinct modes: autonomous and heuristic. Our findings reveal that LLMs' ability to generate executable plans autonomously is rather limited, with the best model (GPT-4) having an average success rate of ~12% across the domains. However, the results in the LLM-Modulo setting show more promise. In the LLM-Modulo setting, we demonstrate that LLM-generated plans can improve the search process for underlying sound planners and additionally show that external verifiers can help provide feedback on the generated plans and back-prompt the LLM for better plan generation.
Submitted 6 November, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
On the Planning Abilities of Large Language Models (A Critical Investigation with a Proposed Benchmark)
Authors:
Karthik Valmeekam,
Sarath Sreedharan,
Matthew Marquez,
Alberto Olmo,
Subbarao Kambhampati
Abstract:
Intrigued by the claims of emergent reasoning capabilities in LLMs trained on general web corpora, in this paper, we set out to investigate their planning capabilities. We aim to evaluate (1) how good LLMs are by themselves in generating and validating simple plans in commonsense planning tasks (of the type that humans are generally quite good at) and (2) how good LLMs are in being a source of heuristic guidance for other agents, either AI planners or human planners, in their planning tasks. To investigate these questions in a systematic rather than anecdotal manner, we start by developing a benchmark suite based on the kinds of domains employed in the International Planning Competition. On this benchmark, we evaluate LLMs in three modes: autonomous, heuristic and human-in-the-loop. Our results show that LLMs' ability to autonomously generate executable plans is quite meager, averaging only about 3% success rate. The heuristic and human-in-the-loop modes show slightly more promise. In addition to these results, we also make our benchmark and evaluation tools available to support investigations by the research community.
Submitted 13 February, 2023;
originally announced February 2023.
-
Towards customizable reinforcement learning agents: Enabling preference specification through online vocabulary expansion
Authors:
Utkarsh Soni,
Nupur Thakur,
Sarath Sreedharan,
Lin Guan,
Mudit Verma,
Matthew Marquez,
Subbarao Kambhampati
Abstract:
There is a growing interest in developing automated agents that can work alongside humans. In addition to completing the assigned task, such an agent will undoubtedly be expected to behave in a manner that is preferred by the human. This requires the human to communicate their preferences to the agent. To achieve this, current approaches either require the users to specify the reward function or learn the preference interactively from queries that ask the user to compare behaviors. The former approach can be challenging if the internal representation used by the agent is inscrutable to the human, while the latter is unnecessarily cumbersome for the user if their preference can be specified more easily in symbolic terms. In this work, we propose PRESCA (PREference Specification through Concept Acquisition), a system that allows users to specify their preferences in terms of concepts that they understand. PRESCA maintains a set of such concepts in a shared vocabulary. If the relevant concept is not in the shared vocabulary, then it is learned. To make learning a new concept more feedback efficient, PRESCA leverages causal associations between the target concept and concepts that are already known. In addition, we use a novel data augmentation approach to further reduce required feedback. We evaluate PRESCA by using it on a Minecraft environment and show that it can effectively align the agent with the user's preference.
Submitted 31 January, 2023; v1 submitted 26 October, 2022;
originally announced October 2022.
-
Deep Optical Coding Design in Computational Imaging
Authors:
Henry Arguello,
Jorge Bacca,
Hasindu Kariyawasam,
Edwin Vargas,
Miguel Marquez,
Ramith Hettiarachchi,
Hans Garcia,
Kithmini Herath,
Udith Haputhanthri,
Balpreet Singh Ahluwalia,
Peter So,
Dushan N. Wadduwage,
Chamira U. S. Edussooriya
Abstract:
Computational optical imaging (COI) systems leverage optical coding elements (CE) in their setups to encode a high-dimensional scene in a single or multiple snapshots and decode it by using computational algorithms. The performance of COI systems highly depends on the design of their main components: the CE pattern and the computational method used to perform a given task. Conventional approaches rely on random patterns or analytical designs to set the distribution of the CE. However, the available data and algorithm capabilities of deep neural networks (DNNs) have opened a new horizon in CE data-driven designs that jointly consider the optical encoder and computational decoder. Specifically, by modeling the COI measurements through a fully differentiable image formation model that considers the physics-based propagation of light and its interaction with the CEs, the parameters that define the CE and the computational decoder can be optimized in an end-to-end (E2E) manner. Moreover, by optimizing just CEs in the same framework, inference tasks can be performed from pure optics. This work surveys the recent advances in CE data-driven design and provides guidelines on how to parametrize different optical elements to include them in the E2E framework. Since the E2E framework can handle different inference applications by changing the loss function and the DNN, we present low-level tasks such as spectral imaging reconstruction or high-level tasks such as pose estimation with privacy preserving enhanced by using optimal task-based optical architectures. Finally, we illustrate classification and 3D object recognition applications performed at the speed of light using all-optics DNNs.
Submitted 17 August, 2022; v1 submitted 27 June, 2022;
originally announced July 2022.
-
PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change
Authors:
Karthik Valmeekam,
Matthew Marquez,
Alberto Olmo,
Sarath Sreedharan,
Subbarao Kambhampati
Abstract:
Generating plans of action and reasoning about change have long been considered core competences of intelligent agents. It is thus no surprise that evaluating the planning and reasoning capabilities of large language models (LLMs) has become a hot topic of research. Most claims about LLM planning capabilities are, however, based on common sense tasks, where it becomes hard to tell whether LLMs are planning or merely retrieving from their vast world knowledge. There is a strong need for systematic and extensible planning benchmarks with sufficient diversity to evaluate whether LLMs have innate planning capabilities. Motivated by this, we propose PlanBench, an extensible benchmark suite based on the kinds of domains used in the automated planning community, especially in the International Planning Competition, to test the capabilities of LLMs in planning or reasoning about actions and change. PlanBench provides sufficient diversity in both the task domains and the specific planning capabilities. Our studies also show that on many critical capabilities, including plan generation, LLM performance falls quite short, even with the SOTA models. PlanBench can thus function as a useful marker of progress of LLMs in planning and reasoning.
Submitted 25 November, 2023; v1 submitted 21 June, 2022;
originally announced June 2022.
-
PrivHAR: Recognizing Human Actions From Privacy-preserving Lens
Authors:
Carlos Hinojosa,
Miguel Marquez,
Henry Arguello,
Ehsan Adeli,
Li Fei-Fei,
Juan Carlos Niebles
Abstract:
The accelerated use of digital cameras prompts an increasing concern about privacy and security, particularly in applications such as action recognition. In this paper, we propose an optimizing framework to provide robust visual privacy protection along the human action recognition pipeline. Our framework parameterizes the camera lens to successfully degrade the quality of the videos to inhibit privacy attributes and protect against adversarial attacks while maintaining relevant features for activity recognition. We validate our approach with extensive simulations and hardware experiments.
Submitted 29 January, 2023; v1 submitted 8 June, 2022;
originally announced June 2022.