
Showing 1–50 of 77 results for author: Castro, D

Searching in archive cs.
  1. arXiv:2407.01302  [pdf, other]

    cs.CV cs.AI cs.RO

    Robot Instance Segmentation with Few Annotations for Grasping

    Authors: Moshe Kimhi, David Vainshtein, Chaim Baskin, Dotan Di Castro

    Abstract: The ability of robots to manipulate objects relies heavily on their aptitude for visual perception. In domains characterized by cluttered scenes and high object variability, most methods call for vast labeled datasets, laboriously hand-annotated, with the aim of training capable models. Once deployed, the challenge of generalizing to unfamiliar objects implies that the model must evolve alongside…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.16093  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Towards Natural Language-Driven Assembly Using Foundation Models

    Authors: Omkar Joglekar, Tal Lancewicki, Shir Kozlovsky, Vladimir Tchuiev, Zohar Feldman, Dotan Di Castro

    Abstract: Large Language Models (LLMs) and strong vision models have enabled rapid research and development in the field of Vision-Language-Action models that enable robotic control. The main objective of these methods is to develop a generalist policy that can control robots with various embodiments. However, in industrial robotic applications such as automated assembly and disassembly, some tasks, such as…

    Submitted 23 June, 2024; originally announced June 2024.

  3. arXiv:2406.04449  [pdf, other]

    cs.CL cs.CV

    MAIRA-2: Grounded Radiology Report Generation

    Authors: Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Anton Schwaighofer, Sam Bond-Taylor, Maximilian Ilse, Fernando Pérez-García, Valentina Salvatelli, Harshita Sharma, Felix Meissen, Mercy Ranjit, Shaury Srivastav, Julia Gong, Fabian Falck, Ozan Oktay, Anja Thieme, Matthew P. Lungren, Maria Teodora Wetscherek, Javier Alvarez-Valle, Stephanie L. Hyland

    Abstract: Radiology reporting is a complex task that requires detailed image understanding, integration of multiple inputs, including comparison with prior imaging, and precise language generation. This makes it ideal for the development and use of generative multimodal models. Here, we extend report generation to include the localisation of individual findings on the image - a task we call grounded report…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 44 pages, 20 figures

  4. arXiv:2406.02158  [pdf, other]

    cs.CV cs.LG

    Radar Spectra-Language Model for Automotive Scene Parsing

    Authors: Mariia Pushkareva, Yuri Feldman, Csaba Domokos, Kilian Rambach, Dotan Di Castro

    Abstract: Radar sensors are low cost, long-range, and weather-resilient. Therefore, they are widely used for driver assistance functions, and are expected to be crucial for the success of autonomous driving in the future. In many perception tasks only pre-processed radar point clouds are considered. In contrast, radar spectra are a raw form of radar measurements and contain more information than radar point…

    Submitted 4 June, 2024; originally announced June 2024.

  5. arXiv:2405.11065  [pdf, other]

    cs.MS cs.DC cs.SE

    Enabling mixed-precision with the help of tools: A Nekbone case study

    Authors: Yanxiang Chen, Pablo de Oliveira Castro, Paolo Bientinesi, Roman Iakymchuk

    Abstract: Mixed-precision computing has the potential to significantly reduce the cost of exascale computations, but determining when and how to implement it in programs can be challenging. In this article, we consider Nekbone, a mini-application for the CFD solver Nek5000, as a case study, and propose a methodology for enabling mixed-precision with the help of computer arithmetic tools and roofline model.…

    Submitted 17 May, 2024; originally announced May 2024.

  6. arXiv:2405.05299  [pdf, other]

    cs.HC cs.AI

    Challenges for Responsible AI Design and Workflow Integration in Healthcare: A Case Study of Automatic Feeding Tube Qualification in Radiology

    Authors: Anja Thieme, Abhijith Rajamohan, Benjamin Cooper, Heather Groombridge, Robert Simister, Barney Wong, Nicholas Woznitza, Mark Ames Pinnock, Maria Teodora Wetscherek, Cecily Morrison, Hannah Richardson, Fernando Pérez-García, Stephanie L. Hyland, Shruthi Bannur, Daniel C. Castro, Kenza Bouzid, Anton Schwaighofer, Mercy Ranjit, Harshita Sharma, Matthew P. Lungren, Ozan Oktay, Javier Alvarez-Valle, Aditya Nori, Stephen Harris, Joseph Jacob

    Abstract: Nasogastric tubes (NGTs) are feeding tubes that are inserted through the nose into the stomach to deliver nutrition or medication. If not placed correctly, they can cause serious harm, even death to patients. Recent AI developments demonstrate the feasibility of robustly detecting NGT placement from Chest X-ray images to reduce risks of sub-optimally or critically placed NGTs being missed or delay…

    Submitted 8 May, 2024; originally announced May 2024.

    ACM Class: H.5.m; I.2.m

  7. arXiv:2403.15821  [pdf, other]

    cs.SE

    Local Features: Enhancing Variability Modeling in Software Product Lines

    Authors: David de Castro, Alejandro Cortiñas, Miguel R. Luaces, Oscar Pedreira, Ángeles Saavedra Places

    Abstract: Context and motivation: Software Product Lines (SPL) enable the creation of software product families with shared core components using feature models to model variability. Choosing features from a feature model to generate a product may not be sufficient in certain situations because the application engineer may need to be able to decide on configuration time the system's elements to which a cert…

    Submitted 23 March, 2024; originally announced March 2024.

  8. Multimodal Healthcare AI: Identifying and Designing Clinically Relevant Vision-Language Applications for Radiology

    Authors: Nur Yildirim, Hannah Richardson, Maria T. Wetscherek, Junaid Bajwa, Joseph Jacob, Mark A. Pinnock, Stephen Harris, Daniel Coelho de Castro, Shruthi Bannur, Stephanie L. Hyland, Pratik Ghosh, Mercy Ranjit, Kenza Bouzid, Anton Schwaighofer, Fernando Pérez-García, Harshita Sharma, Ozan Oktay, Matthew Lungren, Javier Alvarez-Valle, Aditya Nori, Anja Thieme

    Abstract: Recent advances in AI combine large language models (LLMs) with vision encoders that bring forward unprecedented technical capabilities to leverage for a wide range of healthcare applications. Focusing on the domain of radiology, vision-language models (VLMs) achieve good performance results for tasks such as generating radiology findings based on a patient's medical image, or answering visual que…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: to appear at CHI 2024

  9. arXiv:2402.11996  [pdf, other]

    cs.CV cs.LG

    ISCUTE: Instance Segmentation of Cables Using Text Embedding

    Authors: Shir Kozlovsky, Omkar Joglekar, Dotan Di Castro

    Abstract: In the field of robotics and automation, conventional object recognition and instance segmentation methods face a formidable challenge when it comes to perceiving Deformable Linear Objects (DLOs) like wires, cables, and flexible tubes. This challenge arises primarily from the lack of distinct attributes such as shape, color, and texture, which calls for tailored solutions to achieve precise identi…

    Submitted 27 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  10. arXiv:2402.04046  [pdf, other]

    cs.SI cs.AI cs.LG

    Generative Modeling of Graphs via Joint Diffusion of Node and Edge Attributes

    Authors: Nimrod Berman, Eitan Kosman, Dotan Di Castro, Omri Azencot

    Abstract: Graph generation is integral to various engineering and scientific disciplines. Nevertheless, existing methodologies tend to overlook the generation of edge attributes. However, we identify critical applications where edge attributes are essential, making prior methods potentially unsuitable in such contexts. Moreover, while trivial adaptations are available, empirical investigations reveal their…

    Submitted 6 February, 2024; originally announced February 2024.

  11. arXiv:2401.10815  [pdf, other]

    cs.CV

    RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision

    Authors: Fernando Pérez-García, Harshita Sharma, Sam Bond-Taylor, Kenza Bouzid, Valentina Salvatelli, Maximilian Ilse, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Matthew P. Lungren, Maria Wetscherek, Noel Codella, Stephanie L. Hyland, Javier Alvarez-Valle, Ozan Oktay

    Abstract: Language-supervised pre-training has proven to be a valuable method for extracting semantically meaningful features from images, serving as a foundational element in multimodal systems within the computer vision and medical imaging domains. However, resulting features are limited by the information contained within the text. This is particularly problematic in medical imaging, where radiologists'…

    Submitted 19 January, 2024; originally announced January 2024.

  12. PIM-STM: Software Transactional Memory for Processing-In-Memory Systems

    Authors: André Lopes, Daniel Castro, Paolo Romano

    Abstract: Processing-In-Memory (PIM) is a novel approach that augments existing DRAM memory chips with lightweight logic. By allowing to offload computations to the PIM system, this architecture allows for circumventing the data-bottleneck problem that affects many modern workloads. This work tackles the problem of how to build efficient software implementations of the Transactional Memory (TM) abstraction…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: To be published in 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2 (ASPLOS '24), April 27-May 1, 2024, La Jolla, CA, USA

    ACM Class: B.3.3; B.8.2; C.4; D.1.3

  13. arXiv:2401.06890  [pdf, other]

    cs.LG

    An Axiomatic Approach to Model-Agnostic Concept Explanations

    Authors: Zhili Feng, Michal Moshkovitz, Dotan Di Castro, J. Zico Kolter

    Abstract: Concept explanation is a popular approach for examining how human-interpretable concepts impact the predictions of a model. However, most existing methods for concept explanations are tailored to specific models. To address this issue, this paper focuses on model-agnostic measures. Specifically, we propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivit…

    Submitted 12 January, 2024; originally announced January 2024.

  14. arXiv:2312.12865  [pdf, other]

    cs.CV cs.AI

    RadEdit: stress-testing biomedical vision models via diffusion image editing

    Authors: Fernando Pérez-García, Sam Bond-Taylor, Pedro P. Sanchez, Boris van Breugel, Daniel C. Castro, Harshita Sharma, Valentina Salvatelli, Maria T. A. Wetscherek, Hannah Richardson, Matthew P. Lungren, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay, Maximilian Ilse

    Abstract: Biomedical imaging datasets are often small and biased, meaning that real-world performance of predictive models can be substantially lower than expected from internal testing. This work proposes using generative image editing to simulate dataset shifts and diagnose failure modes of biomedical vision models; this can be used in advance of deployment to assess readiness, potentially reducing cost a…

    Submitted 3 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

  15. arXiv:2311.13668  [pdf, other]

    cs.CL cs.AI cs.CV

    MAIRA-1: A specialised large multimodal model for radiology report generation

    Authors: Stephanie L. Hyland, Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Mercy Ranjit, Anton Schwaighofer, Fernando Pérez-García, Valentina Salvatelli, Shaury Srivastav, Anja Thieme, Noel Codella, Matthew P. Lungren, Maria Teodora Wetscherek, Ozan Oktay, Javier Alvarez-Valle

    Abstract: We present a radiology-specific multimodal model for the task for generating radiological reports from chest X-rays (CXRs). Our work builds on the idea that large language model(s) can be equipped with multimodal capabilities through alignment with pre-trained vision encoders. On natural images, this has been shown to allow multimodal models to gain image understanding and description capabilities…

    Submitted 26 April, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

    Comments: 18 pages, 9 tables, 5 figures. v2 adds test IDs and image encoder citation. v3 fixes error in NPV/specificity

  16. arXiv:2310.14573  [pdf, other]

    cs.CL

    Exploring the Boundaries of GPT-4 in Radiology

    Authors: Qianchu Liu, Stephanie Hyland, Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Maria Teodora Wetscherek, Robert Tinn, Harshita Sharma, Fernando Pérez-García, Anton Schwaighofer, Pranav Rajpurkar, Sameer Tajdin Khanna, Hoifung Poon, Naoto Usuyama, Anja Thieme, Aditya V. Nori, Matthew P. Lungren, Ozan Oktay, Javier Alvarez-Valle

    Abstract: The recent success of general-domain large language models (LLMs) has significantly changed the natural language processing paradigm towards a unified foundation model across domains and applications. In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-s…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 main

  17. arXiv:2308.01990  [pdf, other]

    cs.CR

    From Prompt Injections to SQL Injection Attacks: How Protected is Your LLM-Integrated Web Application?

    Authors: Rodrigo Pedro, Daniel Castro, Paulo Carreira, Nuno Santos

    Abstract: Large Language Models (LLMs) have found widespread applications in various domains, including web applications, where they facilitate human interaction via chatbots with natural language interfaces. Internally, aided by an LLM-integration middleware such as Langchain, user prompts are translated into SQL queries used by the LLM to provide meaningful responses to users. However, unsanitized user pr…

    Submitted 15 August, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

    Comments: 12 pages, 3 figures, 3 tables, 5 listings

  18. arXiv:2307.16526  [pdf, other]

    cs.LG cs.AI cs.CV

    No Fair Lunch: A Causal Perspective on Dataset Bias in Machine Learning for Medical Imaging

    Authors: Charles Jones, Daniel C. Castro, Fabio De Sousa Ribeiro, Ozan Oktay, Melissa McCradden, Ben Glocker

    Abstract: As machine learning methods gain prominence within clinical decision-making, addressing fairness concerns becomes increasingly urgent. Despite considerable work dedicated to detecting and ameliorating algorithmic bias, today's methods are deficient with potentially harmful consequences. Our causal perspective sheds new light on algorithmic bias, highlighting how different sources of dataset bias m…

    Submitted 31 July, 2023; originally announced July 2023.

  19. arXiv:2306.13630  [pdf, other]

    cs.RO cs.AI cs.LG

    Offline Skill Graph (OSG): A Framework for Learning and Planning using Offline Reinforcement Learning Skills

    Authors: Ben-ya Halevy, Yehudit Aperstein, Dotan Di Castro

    Abstract: Reinforcement Learning has received wide interest due to its success in competitive games. Yet, its adoption in everyday applications is limited (e.g. industrial, home, healthcare, etc.). In this paper, we address this limitation by presenting a framework for planning over offline skills and solving complex tasks in real-world environments. Our framework is comprised of three modules that together…

    Submitted 23 June, 2023; originally announced June 2023.

  20. arXiv:2304.05763  [pdf, ps, other]

    cs.GT math.DS

    Learning coordination through new actions

    Authors: Sofia B. S. D. Castro

    Abstract: We provide a novel approach to achieving a desired outcome in a coordination game: the original 2x2 game is embedded in a 2x3 game where one of the players may use a third action. For a large set of payoff values only one of the Nash equilibria of the original 2x2 game is stable under replicator dynamics. We show that this Nash equilibrium is the ω-limit of all initial conditions in the interior o…

    Submitted 19 January, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

    MSC Class: 34C99; 37C75; 91A05; 91A10; 91A22

  21. arXiv:2303.15827  [pdf, other]

    cs.LG math.NA stat.ML

    CONFIDE: Contextual Finite Differences Modelling of PDEs

    Authors: Ori Linial, Orly Avner, Dotan Di Castro

    Abstract: We introduce a method for inferring an explicit PDE from a data sample generated by previously unseen dynamics, based on a learned context. The training phase integrates knowledge of the form of the equation with a differential scheme, while the inference phase yields a PDE that fits the data sample and enables both signal prediction and data explanation. We include results of extensive experiment…

    Submitted 7 June, 2024; v1 submitted 28 March, 2023; originally announced March 2023.

  22. arXiv:2303.01274  [pdf]

    cs.CV cs.LG

    Measuring axiomatic soundness of counterfactual image models

    Authors: Miguel Monteiro, Fabio De Sousa Ribeiro, Nick Pawlowski, Daniel C. Castro, Ben Glocker

    Abstract: We present a general framework for evaluating image counterfactuals. The power and flexibility of deep generative models make them valuable tools for learning mechanisms in structural causal models. However, their flexibility makes counterfactual identifiability impossible in the general case. Motivated by these issues, we revisit Pearl's axiomatic definition of counterfactuals to determine the ne…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: Counterfactual inference, Generative Models, Computer Vision, Published in ICLR 2023

    Journal ref: The Eleventh International Conference on Learning Representations, ICLR 2023

  23. arXiv:2301.04558  [pdf, other]

    cs.CV cs.CL

    Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing

    Authors: Shruthi Bannur, Stephanie Hyland, Qianchu Liu, Fernando Pérez-García, Maximilian Ilse, Daniel C. Castro, Benedikt Boecking, Harshita Sharma, Kenza Bouzid, Anja Thieme, Anton Schwaighofer, Maria Wetscherek, Matthew P. Lungren, Aditya Nori, Javier Alvarez-Valle, Ozan Oktay

    Abstract: Self-supervised learning in vision-language processing exploits semantic alignment between imaging and text modalities. Prior work in biomedical VLP has mostly relied on the alignment of single image and report pairs even though clinical notes commonly refer to prior images. This does not only introduce poor alignment between the modalities but also a missed opportunity to exploit rich self-superv…

    Submitted 16 March, 2023; v1 submitted 11 January, 2023; originally announced January 2023.

    Comments: To appear in CVPR 2023

  24. arXiv:2211.01886  [pdf, other]

    cs.CV

    Analysing the effectiveness of a generative model for semi-supervised medical image segmentation

    Authors: Margherita Rosnati, Fabio De Sousa Ribeiro, Miguel Monteiro, Daniel Coelho de Castro, Ben Glocker

    Abstract: Image segmentation is important in medical imaging, providing valuable, quantitative information for clinical decision-making in diagnosis, therapy, and intervention. The state-of-the-art in automated segmentation remains supervised learning, employing discriminative models such as U-Net. However, training these models requires access to large amounts of manually labelled data which is often diffi…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: Accepted at ML4H 2022

    Report number: https://proceedings.mlr.press/v193/rosnati22a.html

  25. arXiv:2208.10950  [pdf, other]

    cs.CV cs.LG

    Deep Structural Causal Shape Models

    Authors: Rajat Rasal, Daniel C. Castro, Nick Pawlowski, Ben Glocker

    Abstract: Causal reasoning provides a language to ask important interventional and counterfactual questions beyond purely statistical association. In medical imaging, for example, we may want to study the causal effect of genetic, environmental, or lifestyle factors on the normal and pathological variation of anatomical phenotypes. However, while anatomical shape models of 3D surface meshes, extracted from…

    Submitted 23 August, 2022; originally announced August 2022.

    Comments: Accepted in 2nd Causality in Vision Workshop at ECCV 2022

  26. AG2U -- Autonomous Grading Under Uncertainties

    Authors: Yakov Miron, Yuval Goldfracht, Chana Ross, Dotan Di Castro, Itzik Klein

    Abstract: Surface grading, the process of leveling an uneven area containing pre-dumped sand piles, is an important task in the construction site pipeline. This labour-intensive process is often carried out by a dozer, a key machinery tool at any construction site. Current attempts to automate surface grading assume perfect localization. However, in real-world scenarios, this assumption fails, as agents are…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: 8 Pages

    Report number: ras.ral.22-2218.3966ab9e

    Journal ref: in IEEE Robotics and Automation Letters, vol. 8, no. 1, pp. 65-72, Jan. 2023

  27. arXiv:2207.01375  [pdf, other]

    cs.CV cs.AI

    GraphVid: It Only Takes a Few Nodes to Understand a Video

    Authors: Eitan Kosman, Dotan Di Castro

    Abstract: We propose a concise representation of videos that encode perceptually meaningful features into graphs. With this representation, we aim to leverage the large amount of redundancies in videos and save computations. First, we construct superpixel-based graph representations of videos by considering superpixels as graph nodes and create spatial and temporal connections between adjacent superpixels.…

    Submitted 20 July, 2022; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV2022 (Oral)

  28. arXiv:2206.06091  [pdf, other]

    cs.RO cs.AI cs.LG

    Towards Autonomous Grading In The Real World

    Authors: Yakov Miron, Chana Ross, Yuval Goldfracht, Chen Tessler, Dotan Di Castro

    Abstract: In this work, we aim to tackle the problem of autonomous grading, where a dozer is required to flatten an uneven area. In addition, we explore methods for bridging the gap between a simulated environment and real scenarios. We design both a realistic physical simulation and a scaled real prototype environment mimicking the real dozer dynamics and sensory information. We establish heuristics and le…

    Submitted 25 July, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: 7 pages, Accepted to IEEE-IROS2022

  29. Making the Most of Text Semantics to Improve Biomedical Vision--Language Processing

    Authors: Benedikt Boecking, Naoto Usuyama, Shruthi Bannur, Daniel C. Castro, Anton Schwaighofer, Stephanie Hyland, Maria Wetscherek, Tristan Naumann, Aditya Nori, Javier Alvarez-Valle, Hoifung Poon, Ozan Oktay

    Abstract: Multi-modal data abounds in biomedicine, such as radiology images and reports. Interpreting this data at scale is essential for improving clinical care and accelerating clinical research. Biomedical text with its complex semantics poses additional challenges in vision--language modelling compared to the general domain, and previous work has used insufficiently adapted models that lack domain-speci…

    Submitted 21 July, 2022; v1 submitted 20 April, 2022; originally announced April 2022.

    Comments: To appear in ECCV 2022. Code: https://aka.ms/biovil-code Dataset: https://aka.ms/ms-cxr Demo Notebook: https://aka.ms/biovil-demo-notebook

    Journal ref: Computer Vision - ECCV 2022, LNCS vol 13696, pp 1-21

  30. arXiv:2203.01153  [pdf, other]

    cs.RO cs.AI

    InsertionNet 2.0: Minimal Contact Multi-Step Insertion Using Multimodal Multiview Sensory Input

    Authors: Oren Spector, Vladimir Tchuiev, Dotan Di Castro

    Abstract: We address the problem of devising the means for a robot to rapidly and safely learn insertion skills with just a few human interventions and without hand-crafted rewards or demonstrations. Our InsertionNet version 2.0 provides an improved technique to robustly cope with a wide range of use-cases featuring different shapes, colors, initial poses, etc. In particular, we present a regression-based m…

    Submitted 2 March, 2022; originally announced March 2022.

    Comments: Accepted to ICRA 2022, InsertionNet 1.0 : https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9420246

  31. arXiv:2112.10877  [pdf, other]

    cs.RO cs.AI cs.LG

    AGPNet -- Autonomous Grading Policy Network

    Authors: Chana Ross, Yakov Miron, Yuval Goldfracht, Dotan Di Castro

    Abstract: In this work, we establish heuristics and learning strategies for the autonomous control of a dozer grading an uneven area studded with sand piles. We formalize the problem as a Markov Decision Process, design a simulation which demonstrates agent-environment interactions and finally compare our simulator to a real dozer prototype. We use methods from reinforcement learning, behavior cloning and c…

    Submitted 20 December, 2021; originally announced December 2021.

    Comments: 7 pages, paper submitted to IEEE International Conference on Robotics and Automation

  32. arXiv:2111.05694  [pdf, other]

    cs.LG cs.AI

    LSP : Acceleration and Regularization of Graph Neural Networks via Locality Sensitive Pruning of Graphs

    Authors: Eitan Kosman, Joel Oren, Dotan Di Castro

    Abstract: Graph Neural Networks (GNNs) have emerged as highly successful tools for graph-related tasks. However, real-world problems involve very large graphs, and the compute resources needed to fit GNNs to those problems grow rapidly. Moreover, the noisy nature and size of real-world graphs cause GNNs to over-fit if not regularized properly. Surprisingly, recent works show that large graphs often involve…

    Submitted 10 November, 2021; originally announced November 2021.

  33. arXiv:2111.01510  [pdf, other]

    cs.RO cs.AI

    A Hybrid Approach for Learning to Shift and Grasp with Elaborate Motion Primitives

    Authors: Zohar Feldman, Hanna Ziesche, Ngo Anh Vien, Dotan Di Castro

    Abstract: Many possible fields of application of robots in real world settings hinge on the ability of robots to grasp objects. As a result, robot grasping has been an active field of research for many years. With our publication we contribute to the endeavor of enabling robots to grasp, with a particular focus on bin picking applications. Bin picking is especially challenging due to the often cluttered and…

    Submitted 2 November, 2021; originally announced November 2021.

  34. arXiv:2110.00445  [pdf, ps, other]

    stat.ML cs.LG

    Sim and Real: Better Together

    Authors: Shirli Di Castro Shashua, Dotan Di Castro, Shie Mannor

    Abstract: Simulation is used extensively in autonomous systems, particularly in robotic manipulation. By far, the most common approach is to train a controller in simulation, and then use it as an initial starting point for the real system. We demonstrate how to learn simultaneously from both simulation and interaction with the real environment. We propose an algorithm for balancing the large number of samp…

    Submitted 5 October, 2021; v1 submitted 1 October, 2021; originally announced October 2021.

  35. Active label cleaning for improved dataset quality under resource constraints

    Authors: Melanie Bernhardt, Daniel C. Castro, Ryutaro Tanno, Anton Schwaighofer, Kerem C. Tezcan, Miguel Monteiro, Shruthi Bannur, Matthew Lungren, Aditya Nori, Ben Glocker, Javier Alvarez-Valle, Ozan Oktay

    Abstract: Imperfections in data annotation, known as label noise, are detrimental to the training of machine learning models and have an often-overlooked confounding effect on the assessment of model performance. Nevertheless, employing experts to remove label noise by fully re-annotating large datasets is infeasible in resource-constrained settings, such as healthcare. This work advocates for a data-driven…

    Submitted 10 February, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

    Comments: Accepted for publication in Nature Communications

    Journal ref: Nature Communications 13 (2022) 1161

  36. arXiv:2108.04706  [pdf, other]

    cs.CV cs.RO

    BIDCD -- Bosch Industrial Depth Completion Dataset

    Authors: Adam Botach, Yuri Feldman, Yakov Miron, Yoel Shapiro, Dotan Di Castro

    Abstract: We introduce BIDCD -- the Bosch Industrial Depth Completion Dataset. BIDCD is a new RGBD dataset of metallic industrial objects, collected with a depth camera mounted on a robotic manipulator. The main purpose of this dataset is to facilitate the training of domain-specific depth completion models, to be used in logistics and manufacturing tasks. We trained a State-of-the-Art depth completion mode…

    Submitted 4 October, 2021; v1 submitted 10 August, 2021; originally announced August 2021.

  37. arXiv:2107.12674  [pdf, other]

    cs.CV cs.LG

    Vision-Guided Forecasting -- Visual Context for Multi-Horizon Time Series Forecasting

    Authors: Eitan Kosman, Dotan Di Castro

    Abstract: Autonomous driving gained huge traction in recent years, due to its potential to change the way we commute. Much effort has been put into trying to estimate the state of a vehicle. Meanwhile, learning to forecast the state of a vehicle ahead introduces new capabilities, such as predicting dangerous situations. Moreover, forecasting brings new supervision opportunities by learning to predict richer…

    Submitted 26 September, 2021; v1 submitted 27 July, 2021; originally announced July 2021.

    Comments: To be presented in the ROAD challenge & SRVU workshop (ICCV2021)

  38. arXiv:2107.06618  [pdf, other]

    eess.IV cs.CV cs.LG

    Hierarchical Analysis of Visual COVID-19 Features from Chest Radiographs

    Authors: Shruthi Bannur, Ozan Oktay, Melanie Bernhardt, Anton Schwaighofer, Rajesh Jena, Besmira Nushi, Sharan Wadhwani, Aditya Nori, Kal Natarajan, Shazad Ashraf, Javier Alvarez-Valle, Daniel C. Castro

    Abstract: Chest radiography has been a recommended procedure for patient triaging and resource management in intensive care units (ICUs) throughout the COVID-19 pandemic. The machine learning efforts to augment this workflow have been long challenged due to deficiencies in reporting, model evaluation, and failure mode analysis. To address some of those shortcomings, we model radiological features with a hum…

    Submitted 14 July, 2021; originally announced July 2021.

    Comments: Presented at ICML 2021 Workshop on Interpretable Machine Learning in Healthcare

  39. arXiv:2104.14223  [pdf, other]

    cs.RO cs.AI cs.LG

    InsertionNet -- A Scalable Solution for Insertion

    Authors: Oren Spector, Dotan Di Castro

    Abstract: Complicated assembly processes can be described as a sequence of two main activities: grasping and insertion. While general grasping solutions are common in industry, insertion is still only applicable to small subsets of problems, mainly ones involving simple shapes in fixed locations and in which the variations are not taken into consideration. Recently, RL approaches with prior knowledge (e.g.,…

    Submitted 29 April, 2021; originally announced April 2021.

    Comments: Qualitative results can be found in our supplementary video on our website: https://sites.google.com/view/insertionnet/

  40. arXiv:2104.01646  [pdf, other]

    cs.LG math.OC

    SOLO: Search Online, Learn Offline for Combinatorial Optimization Problems

    Authors: Joel Oren, Chana Ross, Maksym Lefarov, Felix Richter, Ayal Taitler, Zohar Feldman, Christian Daniel, Dotan Di Castro

    Abstract: We study combinatorial problems with real world applications such as machine scheduling, routing, and assignment. We propose a method that combines Reinforcement Learning (RL) and planning. This method can equally be applied to both the offline, as well as online, variants of the combinatorial problem, in which the problem components (e.g., jobs in scheduling problems) are not known in advance, bu…

    Submitted 18 May, 2021; v1 submitted 4 April, 2021; originally announced April 2021.

  41. arXiv:2009.01657  [pdf, other]

    eess.IV cs.LG

    A free web service for fast COVID-19 classification of chest X-Ray images

    Authors: Jose David Bermudez Castro, Ricardo Rei, Jose E. Ruiz, Pedro Achanccaray Diaz, Smith Arauco Canchumuni, Cristian Muñoz Villalobos, Felipe Borges Coelho, Leonardo Forero Mendoza, Marco Aurelio C. Pacheco

    Abstract: The coronavirus outbreak became a major concern for society worldwide. Technological innovation and ingenuity are essential to fight COVID-19 pandemic and bring us one step closer to overcome it. Researchers over the world are working actively to find available alternatives in different fields, such as the Healthcare System, pharmaceutic, health prevention, among others. With the rise of artificia…

    Submitted 27 August, 2020; originally announced September 2020.

    Comments: 14 pages, 12 figures

  42. arXiv:2008.07861  [pdf, other]

    cs.CV

    Depth Completion with RGB Prior

    Authors: Yuri Feldman, Yoel Shapiro, Dotan Di Castro

    Abstract: Depth cameras are a prominent perception system for robotics, especially when operating in natural unstructured environments. Industrial applications, however, typically involve reflective objects under harsh lighting conditions, a challenging scenario for depth cameras, as it induces numerous reflections and deflections, leading to loss of robustness and deteriorated accuracy. Here, we developed…

    Submitted 18 August, 2020; originally announced August 2020.

    Comments: 17 pages, 4 figures

  43. Image-level Harmonization of Multi-Site Data using Image-and-Spatial Transformer Networks

    Authors: R. Robinson, Q. Dou, D. C. Castro, K. Kamnitsas, M. de Groot, R. M. Summers, D. Rueckert, B. Glocker

    Abstract: We investigate the use of image-and-spatial transformer networks (ISTNs) to tackle domain shift in multi-site medical imaging data. Commonly, domain adaptation (DA) is performed with little regard for explainability of the inter-domain transformation and is often conducted at the feature-level in the latent space. We employ ISTNs for DA at the image-level which constrains transformations to explai…

    Submitted 30 June, 2020; originally announced June 2020.

    Comments: Accepted at MICCAI 2020

    Journal ref: Medical Image Computing and Computer-Assisted Intervention (2020), pp. 710-719, LNCS 12267

  44. arXiv:2006.06485  [pdf, other]

    stat.ML cs.LG

    Deep Structural Causal Models for Tractable Counterfactual Inference

    Authors: Nick Pawlowski, Daniel C. Castro, Ben Glocker

    Abstract: We formulate a general framework for building structural causal models (SCMs) with deep learning components. The proposed approach employs normalising flows and variational inference to enable tractable inference of exogenous noise variables - a crucial step for counterfactual inference that is missing from existing deep causal learning methods. Our framework is validated on a synthetic dataset bu…

    Submitted 22 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: Accepted to NeurIPS 2020

  45. arXiv:2006.06015  [pdf, other]

    cs.CV cs.LG

    Stochastic Segmentation Networks: Modelling Spatially Correlated Aleatoric Uncertainty

    Authors: Miguel Monteiro, Loïc Le Folgoc, Daniel Coelho de Castro, Nick Pawlowski, Bernardo Marques, Konstantinos Kamnitsas, Mark van der Wilk, Ben Glocker

    Abstract: In image segmentation, there is often more than one plausible solution for a given input. In medical imaging, for example, experts will often disagree about the exact location of object boundaries. Estimating this inherent uncertainty and predicting multiple plausible hypotheses is of great interest in many applications, yet this ability is lacking in most current deep learning methods. In this pa…

    Submitted 22 December, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: Published at NeurIPS 2020. 17 pages, 11 figures, 2 tables

  46. arXiv:2005.10638  [pdf, other]

    cs.LG eess.IV stat.ML

    Recent Developments Combining Ensemble Smoother and Deep Generative Networks for Facies History Matching

    Authors: Smith W. A. Canchumuni, Jose D. B. Castro, Júlia Potratz, Alexandre A. Emerick, Marco Aurelio C. Pacheco

    Abstract: Ensemble smoothers are among the most successful and efficient techniques currently available for history matching. However, because these methods rely on Gaussian assumptions, their performance is severely degraded when the prior geology is described in terms of complex facies distributions. Inspired by the impressive results obtained by deep generative networks in areas such as image and video g…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: 46 pages, 26 figures

  47. arXiv:2005.02732  [pdf, ps, other]

    cs.MS

    Custom-Precision Mathematical Library Explorations for Code Profiling and Optimization

    Authors: David Defour, Pablo de Oliveira Castro, Matei Istoan, Eric Petit

    Abstract: The typical processors used for scientific computing have fixed-width data-paths. This implies that mathematical libraries were specifically developed to target each of these fixed precisions (binary16, binary32, binary64). However, to address the increasing energy consumption and throughput requirements of scientific applications, library and hardware designers are moving beyond this one-size-fit…

    Submitted 6 May, 2020; originally announced May 2020.

  48. arXiv:1912.08142  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Causality matters in medical imaging

    Authors: Daniel C. Castro, Ian Walker, Ben Glocker

    Abstract: This article discusses how the language of causality can shed new light on the major challenges in machine learning for medical imaging: 1) data scarcity, which is the limited availability of high-quality annotations, and 2) data mismatch, whereby a trained algorithm may fail to generalize in clinical practice. Looking at these challenges through the lens of causality allows decisions about data c…

    Submitted 17 December, 2019; originally announced December 2019.

    Comments: 20 pages, 5 figures, 4 tables

    Journal ref: Nature Communications 11 (2020) 3673

  49. arXiv:1910.13580  [pdf, other]

    cs.CV

    Domain Generalization via Model-Agnostic Learning of Semantic Features

    Authors: Qi Dou, Daniel C. Castro, Konstantinos Kamnitsas, Ben Glocker

    Abstract: Generalization capability to unseen domains is crucial for machine learning models when deploying to real-world conditions. We investigate the challenging problem of domain generalization, i.e., training a model on multi-domain source data such that it can directly generalize to target domains with unknown statistics. We adopt a model-agnostic learning paradigm with gradient-based meta-train and m…

    Submitted 29 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019

  50. arXiv:1910.04597  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.NC

    Machine Learning with Multi-Site Imaging Data: An Empirical Study on the Impact of Scanner Effects

    Authors: Ben Glocker, Robert Robinson, Daniel C. Castro, Qi Dou, Ender Konukoglu

    Abstract: This is an empirical study to investigate the impact of scanner effects when using machine learning on multi-site neuroimaging data. We utilize structural T1-weighted brain MRI obtained from two different studies, Cam-CAN and UK Biobank. For the purpose of our investigation, we construct a dataset consisting of brain scans from 592 age- and sex-matched individuals, 296 subjects from each original…

    Submitted 10 October, 2019; originally announced October 2019.

    Comments: Presented at the Medical Imaging meets NeurIPS Workshop 2019