CLadder: Assessing Causal Reasoning in Language Models

Authors: Zhijing Jin, Yuen Chen, Felix Leeb, Luigi Gresele, Ojasv Kamal, Zhiheng Lyu, Kevin Blin, Fernando Gonzalez Adauto, Max Kleiman-Weiner, Mrinmaya Sachan, Bernhard Schölkopf

Abstract: The ability to perform causal reasoning is widely considered a core feature of intelligence. In this work, we investigate whether large language models (LLMs) can coherently reason about causality. Much of the existing work in natural language processing (NLP) focuses on evaluating commonsense causal reasoning in LLMs, thus failing to assess whether a model can perform causal inference in accordance with a set of well-defined formal rules. To address this, we propose a new NLP task, causal inference in natural language, inspired by the "causal inference engine" postulated by Judea Pearl et al. We compose a large dataset, CLadder, with 10K samples: based on a collection of causal graphs and queries (associational, interventional, and counterfactual), we obtain symbolic questions and ground-truth answers, through an oracle causal inference engine. These are then translated into natural language. We evaluate multiple LLMs on our dataset, and we introduce and evaluate a bespoke chain-of-thought prompting strategy, CausalCoT. We show that our task is highly challenging for LLMs, and we conduct an in-depth analysis to gain deeper insights into the causal reasoning abilities of LLMs. Our data is open-sourced at https://huggingface.co/datasets/causalNLP/cladder, and our code can be found at https://github.com/causalNLP/cladder.

Submitted 17 January, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

Comments: NeurIPS 2023; updated with CLadder dataset v1.5

arXiv:2303.10747 [pdf]

Linear Analysis of Boundary-Layer Instabilities on a Finned-Cone at Mach 6

Authors: Daniel B. Araya, Neal P. Bitter, Bradley M. Wheaton, Omar Kamal, Tim Colonius, Anthony Knutson, Heath Johnson, Joseph Nichols, Graham V. Candler, Vincenzo Russo, Christoph Brehm

Abstract: Boundary-layer instabilities for a finned cone at Mach=6, $Re=8.4 \times 10^6$ [m$^{-1}$], and zero incidence angle are examined using linear stability methods of varying fidelity and maturity, following earlier analysis presented in [doi.org/10.2514/6.2022-3247]. The geometry and laminar flow conditions correspond to experiments conducted at the Boeing Air Force Mach 6 Quiet Tunnel (BAM6QT) at Purdue University. Where possible, a common mean flow is utilized among the stability computations, and comparisons are made along the acreage of the cone where transition is first observed in the experiment. Stability results utilizing Linear Stability Theory (LST), planar Parabolized Stability Equations (planar-PSE), One-Way Navier Stokes (OWNS), forced direct numerical simulation (DNS), and Adaptive Mesh Refinement Wavepacket Tracking (AMR-WPT) are presented. A dominant three-dimensional vortex instability occurring at $\approx$ 250 kHz is identified that correlates well with experimental measurements of transition onset. With the exception of LST, all of the higher-fidelity linear methods considered in this work were consistent in predicting the initial growth and general structure of the vortex instability as it evolved downstream. Some of the challenges, opportunities, and development needs of the stability methods considered are discussed.

Submitted 19 March, 2023; originally announced March 2023.

arXiv:2301.11757 [pdf, other]

Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion

Authors: Flavio Schneider, Ojasv Kamal, Zhijing Jin, Bernhard Schölkopf

Abstract: Recent years have seen the rapid development of large generative models for text; however, much less research has explored the connection between text and another "language" of communication -- music. Music, much like text, can convey emotions, stories, and ideas, and has its own unique structure and syntax. In our work, we bridge text and music via a text-to-music generation model that is highly efficient, expressive, and can handle long-term structure. Specifically, we develop Moûsai, a cascading two-stage latent diffusion model that can generate multiple minutes of high-quality stereo music at 48kHz from textual descriptions. Moreover, our model features high efficiency, which enables real-time inference on a single consumer GPU with a reasonable speed. Through experiments and property analyses, we show our model's competence over a variety of criteria compared with existing music generation models. Lastly, to promote the open-source culture, we provide a collection of open-source libraries with the hope of facilitating future work in the field. We open-source the following: Codes: https://github.com/archinetai/audio-diffusion-pytorch; music samples for this paper: http://bit.ly/44ozWDH; all music samples for all models: https://bit.ly/audio-diffusion.

Submitted 23 October, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

arXiv:2211.10466 [pdf, ps, other]

doi 10.1017/jfm.2023.48

Global receptivity analysis: physically realizable input-output analysis

Authors: Omar Kamal, Matthew T. Lakebrink, Tim Colonius

Abstract: In the context of transition analysis, linear input-output analysis determines worst-case disturbances to a laminar base flow based on a generic right-hand-side volumetric/boundary forcing term. The worst-case forcing is not physically realizable, and, to our knowledge, a generic framework for posing physically-realizable worst-case disturbance problems is lacking. In natural receptivity analysis, disturbances are forced by matching (typically local) solutions within the boundary layer to outer solutions consisting of free-stream vortical, entropic, and acoustic disturbances. We pose a scattering formalism to restrict the input forcing to a set of realizable disturbances associated with plane-wave solutions of the outer problem. The formulation is validated by comparing with direct numerical simulations (DNS) for a Mach 4.5 flat-plate boundary layer. We show that the method provides insight into transition mechanisms by identifying those linear combinations of plane-wave disturbances that maximize energy amplification over a range of frequencies. We also discuss how the framework can be extended to accommodate scattering from shocks and in shock layers for supersonic flow.

Submitted 18 November, 2022; originally announced November 2022.

arXiv:2210.01478 [pdf, other]

When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

Authors: Zhijing Jin, Sydney Levine, Fernando Gonzalez, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, Josh Tenenbaum, Bernhard Schölkopf

Abstract: AI systems are becoming increasingly intertwined with human life. In order to effectively collaborate with humans and ensure safety, AI systems need to be able to understand, interpret and predict human moral judgments and decisions. Human moral judgments are often guided by rules, but not always. A central challenge for AI safety is capturing the flexibility of the human moral mind -- the ability to determine when a rule should be broken, especially in novel or unusual situations. In this paper, we present a novel challenge set consisting of rule-breaking question answering (RBQA) of cases that involve potentially permissible rule-breaking -- inspired by recent moral psychology studies. Using a state-of-the-art large language model (LLM) as a basis, we propose a novel moral chain of thought (MORALCOT) prompting strategy that combines the strengths of LLMs with theories of moral reasoning developed in cognitive science to predict human moral judgments. MORALCOT outperforms seven existing LLMs by 6.2% F1, suggesting that modeling human reasoning might be necessary to capture the flexibility of the human moral mind. We also conduct a detailed error analysis to suggest directions for future work to improve AI safety using RBQA. Our data is open-sourced at https://huggingface.co/datasets/feradauto/MoralExceptQA and code at https://github.com/feradauto/MoralCoT

Submitted 27 October, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

Comments: NeurIPS 2022 Oral

arXiv:2111.09273 [pdf, other]

Efficient global resolvent analysis via the one-way Navier-Stokes equations. Part 2. Optimal response

Authors: Georgios Rigas, Omar Kamal, Aaron Towne, Tim Colonius

Abstract: In this study, we develop an efficient approach for approximating resolvent modes via spatial marching. Building on the methodology from Part 1, we leverage the ability of the projection-based formulation of the one-way Navier-Stokes equations (OWNS) to efficiently and accurately approximate the downstream response of the linearized Navier-Stokes equations to forcing for problems containing a slowly varying direction. Using an adjoint-based optimization framework, forcings that optimally excite a response in the flow are computed by marching the forward and adjoint OWNS equations in the downstream and upstream directions, respectively. This avoids the need to solve direct and adjoint globally-discretized equations, therefore bypassing the main computational bottleneck of a typical global resolvent calculation. The method is demonstrated for a supersonic turbulent jet at Mach 1.5 and a transitional zero-pressure-gradient flat-plate boundary layer flow at Mach 4.5, and the optimal OWNS results are validated against corresponding global calculations.

Submitted 17 November, 2021; originally announced November 2021.

arXiv:2101.05494 [pdf, ps, other]

Hostility Detection in Hindi leveraging Pre-Trained Language Models

Authors: Ojasv Kamal, Adarsh Kumar, Tejas Vaidhya

Abstract: Hostile content on social platforms is ever increasing. This has led to the need for proper detection of hostile posts so that appropriate action can be taken to tackle them. Though a lot of work has been done recently in the English Language to solve the problem of hostile content online, similar works in Indian Languages are quite hard to find. This paper presents a transfer learning based approach to classify social media (i.e Twitter, Facebook, etc.) posts in Hindi Devanagari script as Hostile or Non-Hostile. Hostile posts are further analyzed to determine if they are Hateful, Fake, Defamation, and Offensive. This paper harnesses attention based pre-trained models fine-tuned on Hindi data with Hostile-Non hostile task as Auxiliary and fusing its features for further sub-tasks classification. Through this approach, we establish a robust and consistent model without any ensembling or complex pre-processing. We have presented the results from our approach in CONSTRAINT-2021 Shared Task on hostile post detection where our model performs extremely well with 3rd runner up in terms of Weighted Fine-Grained F1 Score.

Submitted 14 January, 2021; originally announced January 2021.

Showing 1–7 of 7 results for author: Kamal, O