-
A diverse Multilingual News Headlines Dataset from around the World
Authors:
Felix Leeb,
Bernhard Schölkopf
Abstract:
Babel Briefings is a novel dataset featuring 4.7 million news headlines from August 2020 to November 2021, across 30 languages and 54 locations worldwide, with English translations of all articles included. Designed for natural language processing and media studies, it serves as a high-quality dataset for training or evaluating language models as well as offering a simple, accessible collection of articles, for example, to analyze global news coverage and cultural narratives. As a simple demonstration of the analyses facilitated by this dataset, we apply a basic procedure based on a TF-IDF weighted similarity metric to group articles into clusters about the same event. We then visualize the event signature of each event, which shows the languages of the articles that appear over time, revealing intuitive features based on the proximity and unexpectedness of the event. The dataset is available on Kaggle (https://www.kaggle.com/datasets/felixludos/babel-briefings) and HuggingFace (https://huggingface.co/datasets/felixludos/babel-briefings), with accompanying code on GitHub (https://github.com/felixludos/babel-briefings).
Submitted 28 March, 2024;
originally announced March 2024.
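The TF-IDF based grouping mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' procedure: the whitespace tokenizer, the smoothed IDF formula, the greedy seed-based clustering, the `threshold` value, and all function names are assumptions chosen only to show the general idea of grouping headlines by TF-IDF weighted cosine similarity.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per headline."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of headlines that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption): rarer terms receive larger weights.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_headlines(docs, threshold=0.3):
    """Greedy single pass: join the first cluster whose seed headline is similar enough."""
    vectors = tfidf_vectors(docs)
    clusters = []  # each cluster is a list of indices; the first index is the seed
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster matched, so this headline starts a new one.
            clusters.append([i])
    return clusters


# Hypothetical headlines, not taken from the dataset.
headlines = [
    "earthquake strikes coastal city overnight",
    "strong earthquake strikes city on the coast",
    "central bank raises interest rates again",
    "interest rates raised by central bank",
]
clusters = cluster_headlines(headlines)
print(clusters)
```

In this toy setup the two earthquake headlines and the two interest-rate headlines end up in separate groups. A real pipeline would need proper tokenization and would likely operate on the English translations so that headlines from different languages become comparable.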
-
CLadder: Assessing Causal Reasoning in Language Models
Authors:
Zhijing Jin,
Yuen Chen,
Felix Leeb,
Luigi Gresele,
Ojasv Kamal,
Zhiheng Lyu,
Kevin Blin,
Fernando Gonzalez Adauto,
Max Kleiman-Weiner,
Mrinmaya Sachan,
Bernhard Schölkopf
Abstract:
The ability to perform causal reasoning is widely considered a core feature of intelligence. In this work, we investigate whether large language models (LLMs) can coherently reason about causality. Much of the existing work in natural language processing (NLP) focuses on evaluating commonsense causal reasoning in LLMs, thus failing to assess whether a model can perform causal inference in accordance with a set of well-defined formal rules. To address this, we propose a new NLP task, causal inference in natural language, inspired by the "causal inference engine" postulated by Judea Pearl et al. We compose a large dataset, CLadder, with 10K samples: based on a collection of causal graphs and queries (associational, interventional, and counterfactual), we obtain symbolic questions and ground-truth answers, through an oracle causal inference engine. These are then translated into natural language. We evaluate multiple LLMs on our dataset, and we introduce and evaluate a bespoke chain-of-thought prompting strategy, CausalCoT. We show that our task is highly challenging for LLMs, and we conduct an in-depth analysis to gain deeper insights into the causal reasoning abilities of LLMs. Our data is open-sourced at https://huggingface.co/datasets/causalNLP/cladder, and our code can be found at https://github.com/causalNLP/cladder.
Submitted 17 January, 2024; v1 submitted 7 December, 2023;
originally announced December 2023.
-
Exploring the Latent Space of Autoencoders with Interventional Assays
Authors:
Felix Leeb,
Stefan Bauer,
Michel Besserve,
Bernhard Schölkopf
Abstract:
Autoencoders exhibit impressive abilities to embed the data manifold into a low-dimensional latent space, making them a staple of representation learning methods. However, without explicit supervision, which is often unavailable, the representation is usually uninterpretable, making analysis and principled progress challenging. We propose a framework, called latent responses, which exploits the locally contractive behavior exhibited by variational autoencoders to explore the learned manifold. More specifically, we develop tools to probe the representation using interventions in the latent space to quantify the relationships between latent variables. We extend the notion of disentanglement to take the learned generative process into account and consequently avoid the limitations of existing metrics that may rely on spurious correlations. Our analyses underscore the importance of studying the causal structure of the representation to improve performance on downstream tasks such as generation, interpolation, and inference of the factors of variation.
Submitted 11 January, 2023; v1 submitted 30 June, 2021;
originally announced June 2021.
-
Structure by Architecture: Structured Representations without Regularization
Authors:
Felix Leeb,
Guilia Lanzillotta,
Yashas Annadani,
Michel Besserve,
Stefan Bauer,
Bernhard Schölkopf
Abstract:
We study the problem of self-supervised structured representation learning using autoencoders for downstream tasks such as generative modeling. Unlike most methods which rely on matching an arbitrary, relatively unstructured, prior distribution for sampling, we propose a sampling technique that relies solely on the independence of latent variables, thereby avoiding the trade-off between reconstruction quality and generative performance typically observed in VAEs. We design a novel autoencoder architecture capable of learning a structured representation without the need for aggressive regularization. Our structural decoders learn a hierarchy of latent variables, thereby ordering the information without any additional regularization or supervision. We demonstrate how these models learn a representation that improves results in a variety of downstream tasks including generation, disentanglement, and extrapolation using several challenging and natural image datasets.
Submitted 15 February, 2024; v1 submitted 14 June, 2020;
originally announced June 2020.
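One way to read "a sampling technique that relies solely on the independence of latent variables" in the abstract above is to draw each latent dimension separately from the values observed when encoding data, instead of sampling from a fixed prior. The sketch below illustrates that reading only; the function name, the toy latent codes, and the resampling scheme are assumptions, and the paper's actual procedure may differ.

```python
import random


def sample_from_marginals(latents, n_samples, seed=0):
    """Build new latent vectors by choosing every dimension independently
    from the corresponding dimension of previously encoded examples."""
    rng = random.Random(seed)
    n_dims = len(latents[0])
    samples = []
    for _ in range(n_samples):
        # Each coordinate may come from a different encoded example, so only
        # the per-dimension marginals of the encoded data are preserved.
        samples.append([rng.choice(latents)[d] for d in range(n_dims)])
    return samples


# Hypothetical latent codes, standing in for encoder outputs.
encoded = [
    [0.0, 10.0, 100.0],
    [1.0, 11.0, 101.0],
    [2.0, 12.0, 102.0],
]
new_latents = sample_from_marginals(encoded, n_samples=5)
print(new_latents)
```

The resulting vectors would then be passed to the decoder. Mixing coordinates across examples only yields plausible samples if the latent dimensions are close to independent, which is the property the abstract says the technique relies on.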
-
Motion-Nets: 6D Tracking of Unknown Objects in Unseen Environments using RGB
Authors:
Felix Leeb,
Arunkumar Byravan,
Dieter Fox
Abstract:
In this work, we bridge the gap between recent pose estimation and tracking work to develop a powerful method for robots to track objects in their surroundings. Motion-Nets use a segmentation model to segment the scene, and separate translation and rotation models to identify the relative 6D motion of an object between two consecutive frames. We train our method with generated data of floating objects, and then test on several prediction tasks, including one with a real PR2 robot, and a toy control task with a simulated PR2 robot never seen during training. Motion-Nets are able to track the pose of objects with some quantitative accuracy for about 30-60 frames including occlusions and distractors. Additionally, the single step prediction errors remain low even after 100 frames. We also investigate an iterative correction procedure to improve performance for control tasks.
Submitted 30 October, 2019;
originally announced October 2019.
-
Spatially Resolving the Condensing Effect of Cholesterol in Lipid Bilayers
Authors:
Felix Leeb,
Lutz Maibaum
Abstract:
We study the effect of cholesterol on the structure of dipalmitoylphosphatidylcholine (DPPC) phospholipid bilayers. Using extensive molecular dynamics computer simulations at atomistic resolution we observe and quantify several structural changes upon increasing cholesterol content that are collectively known as the condensing effect: a thickening of the bilayer, an increase in lipid tail order, and a decrease in lateral area. We also observe a change in leaflet interdigitation and a lack thereof in the distributions of DPPC head group orientations. These results, obtained over a wide range of cholesterol mole fractions, are then used to calibrate the analysis of phospholipid properties in bilayers containing a single cholesterol molecule per leaflet, which we perform in a spatially resolved way. We find that a single cholesterol molecule affects phospholipids in its first and second solvation shells, which puts the range of this interaction to be on the order of one to two nanometers. We also observe a tendency of phospholipids to orient their polar head groups toward the cholesterol, which provides additional support for the umbrella model of bilayer organization.
Submitted 6 June, 2018;
originally announced June 2018.
-
SE3-Pose-Nets: Structured Deep Dynamics Models for Visuomotor Planning and Control
Authors:
Arunkumar Byravan,
Felix Leeb,
Franziska Meier,
Dieter Fox
Abstract:
In this work, we present an approach to deep visuomotor control using structured deep dynamics models. Our deep dynamics model, a variant of SE3-Nets, learns a low-dimensional pose embedding for visuomotor control via an encoder-decoder structure. Unlike prior work, our dynamics model is structured: given an input scene, our network explicitly learns to segment salient parts and predict their pose-embedding along with their motion modeled as a change in the pose space due to the applied actions. We train our model using a pair of point clouds separated by an action and show that given supervision only in the form of point-wise data associations between the frames our network is able to learn a meaningful segmentation of the scene along with consistent poses. We further show that our model can be used for closed-loop control directly in the learned low-dimensional pose space, where the actions are computed by minimizing error in the pose space using gradient-based methods, similar to traditional model-based control. We present results on controlling a Baxter robot from raw depth data in simulation and in the real world and compare against two baseline deep networks. Our method runs in real-time, achieves good prediction of scene dynamics and outperforms the baseline methods on multiple control runs. Video results can be found at: https://rse-lab.cs.washington.edu/se3-structured-deep-ctrl/
Submitted 2 October, 2017;
originally announced October 2017.