Showing 1–24 of 24 results for author: Fatemi, M

Searching in archive cs.
  1. arXiv:2406.14657  [pdf, other]

    cs.CL cs.AI cs.LG

    OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset

    Authors: Allen Roush, Yusuf Shabazz, Arvind Balaji, Peter Zhang, Stefano Mezza, Markus Zhang, Sanjay Basu, Sriram Vishwanath, Mehdi Fatemi, Ravid Schwartz-Ziv

    Abstract: We introduce OpenDebateEvidence, a comprehensive dataset for argument mining and summarization sourced from the American Competitive Debate community. This dataset includes over 3.5 million documents with rich metadata, making it one of the most extensive collections of debate evidence. OpenDebateEvidence captures the complexity of arguments in high school and college debates, providing valuable r…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted for Publication to ARGMIN 2024 at ACL2024

  2. arXiv:2403.16092  [pdf, other]

    cs.CV cs.RO

    Are NeRFs ready for autonomous driving? Towards closing the real-to-simulation gap

    Authors: Carl Lindström, Georg Hess, Adam Lilja, Maryam Fatemi, Lars Hammarstrand, Christoffer Petersson, Lennart Svensson

    Abstract: Neural Radiance Fields (NeRFs) have emerged as promising tools for advancing autonomous driving (AD) research, offering scalable closed-loop simulation and data augmentation capabilities. However, to trust the results achieved in simulation, one needs to ensure that AD systems perceive real and rendered data in the same way. Although the performance of rendering methods is increasing, many scenari…

    Submitted 15 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: Accepted at Workshop on Autonomous Driving, CVPR 2024

  3. arXiv:2402.10240  [pdf, other]

    cs.LG cs.AI eess.SY

    A Dynamical View of the Question of Why

    Authors: Mehdi Fatemi, Sindhu Gowda

    Abstract: We address causal reasoning in multivariate time series data generated by stochastic processes. Existing approaches are largely restricted to static settings, ignoring the continuity and emission of variations across time. In contrast, we propose a learning paradigm that directly establishes causation between events in the course of time. We present two key lemmas to compute causal contributions a…

    Submitted 27 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted at the Twelfth International Conference on Learning Representations (ICLR'24)

  4. arXiv:2402.06552  [pdf, other]

    cs.LG

    Deceptive Path Planning via Reinforcement Learning with Graph Neural Networks

    Authors: Michael Y. Fatemi, Wesley A. Suttle, Brian M. Sadler

    Abstract: Deceptive path planning (DPP) is the problem of designing a path that hides its true goal from an outside observer. Existing methods for DPP rely on unrealistic assumptions, such as global state observability and perfect model knowledge, and are typically problem-specific, meaning that even minor changes to a previously solved problem can force expensive computation of an entirely new solution. Gi…

    Submitted 9 February, 2024; originally announced February 2024.

    Comments: 11 pages, 14 figures

    MSC Class: 68T05

  5. arXiv:2311.04921  [pdf, other]

    cs.CL cs.AI

    Successor Features for Efficient Multisubject Controlled Text Generation

    Authors: Meng Cao, Mehdi Fatemi, Jackie Chi Kit Cheung, Samira Shabanian

    Abstract: While large language models (LLMs) have achieved impressive performance in generating fluent and realistic text, controlling the generated text so that it exhibits properties such as safety, factuality, and non-toxicity remains challenging. Existing decoding-based methods are static in terms of the dimension of control; if the target subject is changed,…

    Submitted 2 November, 2023; originally announced November 2023.

  6. arXiv:2302.14003  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Systematic Rectification of Language Models via Dead-end Analysis

    Authors: Meng Cao, Mehdi Fatemi, Jackie Chi Kit Cheung, Samira Shabanian

    Abstract: With adversarial or otherwise normal prompts, existing large language models (LLM) can be pushed to generate toxic discourses. One way to reduce the risk of LLMs generating undesired discourses is to alter the training of the LLM. This can be very restrictive due to demanding computation requirements. Other methods rely on rule-based or prompt-based token elimination, which are limited as they dis…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: The Eleventh International Conference on Learning Representations, ICLR'23

    Journal ref: ICLR 2023

  7. arXiv:2204.01844  [pdf]

    cs.LG

    Deep Q-learning of global optimizer of multiply model parameters for viscoelastic imaging

    Authors: Hongmei Zhang, Kai Wang, Yan Zhou, Shadab Momin, Xiaofeng Yang, Mostafa Fatemi, Michael F. Insana

    Abstract: Objective: Estimation of the global optima of multiple model parameters is valuable in imaging to form a reliable diagnostic image. Given non convexity of the objective function, it is challenging to avoid from different local minima. Methods: We first formulate the global searching of multiply parameters to be a k-D move in the parametric space, and convert parameters updating to be state-action…

    Submitted 31 March, 2022; originally announced April 2022.

  8. arXiv:2203.09365  [pdf, other]

    cs.LG

    Semi-Markov Offline Reinforcement Learning for Healthcare

    Authors: Mehdi Fatemi, Mary Wu, Jeremy Petch, Walter Nelson, Stuart J. Connolly, Alexander Benz, Anthony Carnicelli, Marzyeh Ghassemi

    Abstract: Reinforcement learning (RL) tasks are typically framed as Markov Decision Processes (MDPs), assuming that decisions are made at fixed time intervals. However, many applications of great importance, including healthcare, do not satisfy this assumption, yet they are commonly modelled as MDPs after an artificial reshaping of the data. In addition, most healthcare (and similar) problems are offline by…

    Submitted 20 March, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Published at Conference on Health, Inference, and Learning (CHIL) 2022

  9. arXiv:2203.07171  [pdf, other]

    cs.LG cs.AI

    Orchestrated Value Mapping for Reinforcement Learning

    Authors: Mehdi Fatemi, Arash Tavakoli

    Abstract: We present a general convergent class of reinforcement learning algorithms that is founded on two distinct principles: (1) mapping value estimates to a different space using arbitrary functions from a broad class, and (2) linearly decomposing the reward signal into multiple channels. The first principle enables incorporating specific properties into the value estimator that can enhance learning. T…

    Submitted 16 March, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: Published at ICLR 2022

  10. arXiv:2110.04186  [pdf, other]

    cs.LG cs.AI

    Medical Dead-ends and Learning to Identify High-risk States and Treatments

    Authors: Mehdi Fatemi, Taylor W. Killian, Jayakumar Subramanian, Marzyeh Ghassemi

    Abstract: Machine learning has successfully framed many sequential decision making problems as either supervised prediction, or optimal decision-making policy identification via reinforcement learning. In data-constrained offline settings, both approaches may fail as they assume fully optimal behavior or rely on exploring alternatives that may not exist. We introduce an inherently different approach that id…

    Submitted 17 February, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

  11. arXiv:2107.06405  [pdf, other]

    cs.LG cs.AI cs.RO

    Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks

    Authors: Sungryull Sohn, Sungtae Lee, Jongwook Choi, Harm van Seijen, Mehdi Fatemi, Honglak Lee

    Abstract: We propose the k-Shortest-Path (k-SP) constraint: a novel constraint on the agent's trajectory that improves the sample efficiency in sparse-reward MDPs. We show that any optimal policy necessarily satisfies the k-SP constraint. Notably, the k-SP constraint prevents the policy from exploring state-action pairs along the non-k-SP trajectories (e.g., going back and forth). However, in practice, excl…

    Submitted 13 July, 2021; originally announced July 2021.

    Comments: In proceedings of ICML 2021

  12. arXiv:2011.11235  [pdf, other]

    cs.LG

    An Empirical Study of Representation Learning for Reinforcement Learning in Healthcare

    Authors: Taylor W. Killian, Haoran Zhang, Jayakumar Subramanian, Mehdi Fatemi, Marzyeh Ghassemi

    Abstract: Reinforcement Learning (RL) has recently been applied to sequential estimation and prediction problems identifying and developing hypothetical treatment strategies for septic patients, with a particular focus on offline learning with observational data. In practice, successful RL relies on informative latent states derived from sequential observations to develop optimal treatment strategies. To da…

    Submitted 23 November, 2020; originally announced November 2020.

    Comments: To appear in proceedings of the 2020 Machine Learning for Health workshop at NeurIPS

  13. arXiv:2010.14680  [pdf, other]

    cs.LG stat.ML

    Learning to Represent Action Values as a Hypergraph on the Action Vertices

    Authors: Arash Tavakoli, Mehdi Fatemi, Petar Kormushev

    Abstract: Action-value estimation is a critical component of many reinforcement learning (RL) methods whereby sample complexity relies heavily on how fast a good estimator for action value can be learned. By viewing this problem through the lens of representation learning, good representations of both state and action can facilitate action-value estimation. While advances in deep learning have seamlessly dr…

    Submitted 20 June, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: ICLR 2021, code: https://github.com/atavakol/action-hypergraph-networks

  14. arXiv:1906.00572  [pdf, other]

    cs.LG stat.ML

    Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning

    Authors: Harm van Seijen, Mehdi Fatemi, Arash Tavakoli

    Abstract: In an effort to better understand the different ways in which the discount factor affects the optimization process in reinforcement learning, we designed a set of experiments to study each effect in isolation. Our analysis reveals that the common perception that poor performance of low discount factors is caused by (too) small action-gaps requires revision. We propose an alternative hypothesis tha…

    Submitted 23 December, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: NeurIPS 2019, code: https://github.com/microsoft/logrl

  15. Poisson Multi-Bernoulli Mapping Using Gibbs Sampling

    Authors: Maryam Fatemi, Karl Granström, Lennart Svensson, Francisco J. R. Ruiz, Lars Hammarstrand

    Abstract: This paper addresses the mapping problem. Using a conjugate prior form, we derive the exact theoretical batch multi-object posterior density of the map given a set of measurements. The landmarks in the map are modeled as extended objects, and the measurements are described as a Poisson process, conditioned on the map. We use a Poisson process prior on the map and prove that the posterior distribut…

    Submitted 7 November, 2018; originally announced November 2018.

    Comments: 14 pages, 6 figures

    Journal ref: IEEE Transactions on Signal Processing, Vol. 65, Issue 11, June 2017

  16. Joint Sentiment/Topic Modeling on Text Data Using Boosted Restricted Boltzmann Machine

    Authors: Masoud Fatemi, Mehran Safayani

    Abstract: Recently by the development of the Internet and the Web, different types of social media such as web blogs become an immense source of text data. Through the processing of these data, it is possible to discover practical information about different topics, individuals opinions and a thorough understanding of the society. Therefore, applying models which can automatically extract the subjective inf…

    Submitted 10 November, 2017; originally announced November 2017.

  17. arXiv:1706.04208  [pdf, other]

    cs.LG

    Hybrid Reward Architecture for Reinforcement Learning

    Authors: Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche, Tavian Barnes, Jeffrey Tsang

    Abstract: One of the main challenges in reinforcement learning (RL) is generalisation. In typical deep RL methods this is achieved by approximating the optimal value function with a low-dimensional representation using a deep network. While this approach works well in many domains, in domains where the optimal value function cannot easily be reduced to a low-dimensional representation, learning can be very…

    Submitted 27 November, 2017; v1 submitted 13 June, 2017; originally announced June 2017.

  18. arXiv:1704.00756  [pdf, other]

    cs.LG cs.AI stat.ML

    Multi-Advisor Reinforcement Learning

    Authors: Romain Laroche, Mehdi Fatemi, Joshua Romoff, Harm van Seijen

    Abstract: We consider tackling a single-agent RL problem by distributing it to $n$ learners. These learners, called advisors, endeavour to solve the problem from a different focus. Their advice, taking the form of action values, is then communicated to an aggregator, which is in control of the system. We show that the local planning method for the advisors is critical and that none of the ones found in the…

    Submitted 14 November, 2017; v1 submitted 3 April, 2017; originally announced April 2017.

    Comments: Submitted at ICLR2018

  19. arXiv:1612.05159  [pdf, other]

    cs.LG cs.AI

    Separation of Concerns in Reinforcement Learning

    Authors: Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche

    Abstract: In this paper, we propose a framework for solving a single-agent task by using multiple agents, each focusing on different aspects of the task. This approach has two main advantages: 1) it allows for training specialized agents on different parts of the task, and 2) it provides a new way to transfer knowledge, by transferring trained agents. Our framework generalizes the traditional hierarchical d…

    Submitted 28 March, 2017; v1 submitted 15 December, 2016; originally announced December 2016.

  20. arXiv:1606.03152  [pdf, other]

    cs.CL cs.AI

    Policy Networks with Two-Stage Training for Dialogue Systems

    Authors: Mehdi Fatemi, Layla El Asri, Hannes Schulz, Jing He, Kaheer Suleman

    Abstract: In this paper, we propose to use deep policy networks which are trained with an advantage actor-critic method for statistically optimised dialogue systems. First, we show that, on summary state and action spaces, deep Reinforcement Learning (RL) outperforms Gaussian Processes methods. Summary state and action spaces lead to good performance but require pre-engineering effort, RL knowledge, and dom…

    Submitted 12 September, 2016; v1 submitted 9 June, 2016; originally announced June 2016.

    Comments: SIGDial 2016 (Submitted: May 2016; Accepted: Jun 30, 2016)

    Journal ref: Proceedings of the SIGDIAL 2016 Conference, pages 101–110, Los Angeles, USA, 13-15 September 2016. Association for Computational Linguistics

  21. arXiv:1605.06311  [pdf, other]

    stat.CO cs.CV eess.SY

    Poisson multi-Bernoulli conjugate prior for multiple extended object filtering

    Authors: Karl Granstrom, Maryam Fatemi, Lennart Svensson

    Abstract: This paper presents a Poisson multi-Bernoulli mixture (PMBM) conjugate prior for multiple extended object filtering. A Poisson point process is used to describe the existence of yet undetected targets, while a multi-Bernoulli mixture describes the distribution of the targets that have been detected. The prediction and update equations are presented for the standard transition density and measureme…

    Submitted 6 December, 2019; v1 submitted 20 May, 2016; originally announced May 2016.

  22. Sampling and Reconstruction of Shapes with Algebraic Boundaries

    Authors: Mitra Fatemi, Arash Amini, Martin Vetterli

    Abstract: We present a sampling theory for a class of binary images with finite rate of innovation (FRI). Every image in our model is the restriction of $\mathds{1}_{\{p\leq0\}}$ to the image plane, where $\mathds{1}$ denotes the indicator function and $p$ is some real bivariate polynomial. This particularly means that the boundaries in the image form a subset of an algebraic curve with the implicit polynom…

    Submitted 14 December, 2015; originally announced December 2015.

    Comments: 12 pages, 14 figures

  23. Shapes From Pixels

    Authors: Mitra Fatemi, Arash Amini, Loic Baboulaz, Martin Vetterli

    Abstract: Continuous-domain visual signals are usually captured as discrete (digital) images. This operation is not invertible in general, in the sense that the continuous-domain signal cannot be exactly reconstructed based on the discrete image, unless it satisfies certain constraints (e.g., bandlimitedness). In this paper, we study the problem of recovering shape images with smooth boundaries from…

    Submitted 24 August, 2015; originally announced August 2015.

    Comments: 13 pages, 14 figures

  24. Deconvolution of vibroacoustic images using a simulation model based on a three dimensional point spread function

    Authors: Talita Perciano, Matthew Urban, Nelson D. A. Mascarenhas, Mostafa Fatemi, Alejandro C. Frery, Glauber T. Silva

    Abstract: Vibro-acoustography (VA) is a medical imaging method based on the difference-frequency generation produced by the mixture of two focused ultrasound beams. VA has been applied to different problems in medical imaging such as imaging bones, microcalcifications in the breast, mass lesions, and calcified arteries. The obtained images may have a resolution of 0.7–0.8 mm. Current VA systems based on co…

    Submitted 13 July, 2012; originally announced July 2012.

    Comments: Accepted for publication in Ultrasonics