-
The Era of Semantic Decoding
Authors:
Maxime Peyrard,
Martin Josifoski,
Robert West
Abstract:
Recent work demonstrated great promise in the idea of orchestrating collaborations between LLMs, human input, and various tools to address the inherent limitations of LLMs. We propose a novel perspective called semantic decoding, which frames these collaborative processes as optimization procedures in semantic space. Specifically, we conceptualize LLMs as semantic processors that manipulate meaningful pieces of information that we call semantic tokens (known thoughts). LLMs are among a large pool of other semantic processors, including humans and tools, such as search engines or code executors. Collectively, semantic processors engage in dynamic exchanges of semantic tokens to progressively construct high-utility outputs. We refer to these orchestrated interactions among semantic processors, optimizing and searching in semantic space, as semantic decoding algorithms. This concept draws a direct parallel to the well-studied problem of syntactic decoding, which involves crafting algorithms to best exploit auto-regressive language models for extracting high-utility sequences of syntactic tokens. By focusing on the semantic level and disregarding syntactic details, we gain a fresh perspective on the engineering of AI systems, enabling us to imagine systems with much greater complexity and capabilities. In this position paper, we formalize the transition from syntactic to semantic tokens as well as the analogy between syntactic and semantic decoding. Subsequently, we explore the possibilities of optimizing within the space of semantic tokens via semantic decoding algorithms. We conclude with a list of research opportunities and questions arising from this fresh perspective. The semantic decoding perspective offers a powerful abstraction for search and optimization directly in the space of meaningful concepts, with semantic tokens as the fundamental units of a new type of computation.
Submitted 21 March, 2024;
originally announced March 2024.
-
Symbolic Autoencoding for Self-Supervised Sequence Learning
Authors:
Mohammad Hossein Amani,
Nicolas Mario Baldwin,
Amin Mansouri,
Martin Josifoski,
Maxime Peyrard,
Robert West
Abstract:
Traditional language models, adept at next-token prediction in text sequences, often struggle with transduction tasks between distinct symbolic systems, particularly when parallel data is scarce. Addressing this issue, we introduce symbolic autoencoding (ΣAE), a self-supervised framework that harnesses the power of abundant unparallel data alongside limited parallel data. ΣAE connects two generative models via a discrete bottleneck layer and is optimized end-to-end by minimizing reconstruction loss (simultaneously with supervised loss for the parallel data), such that the sequence generated by the discrete bottleneck can be read out as the transduced input sequence. We also develop gradient-based methods allowing for efficient self-supervised sequence learning despite the discreteness of the bottleneck. Our results demonstrate that ΣAE significantly enhances performance on transduction tasks, even with minimal parallel data, offering a promising solution for weakly supervised learning scenarios.
Submitted 16 February, 2024;
originally announced February 2024.
-
Evaluating Language Model Agency through Negotiations
Authors:
Tim R. Davidson,
Veniamin Veselovsky,
Martin Josifoski,
Maxime Peyrard,
Antoine Bosselut,
Michal Kosinski,
Robert West
Abstract:
We introduce an approach to evaluate language model (LM) agency using negotiation games. This approach better reflects real-world use cases and addresses some of the shortcomings of alternative LM benchmarks. Negotiation games enable us to study multi-turn and cross-model interactions, modulate complexity, and side-step accidental evaluation data leakage. We use our approach to test six widely used and publicly accessible LMs, evaluating performance and alignment in both self-play and cross-play settings. Noteworthy findings include: (i) only closed-source models tested here were able to complete these tasks; (ii) cooperative bargaining games proved to be most challenging to the models; and (iii) even the most powerful models sometimes "lose" to weaker opponents.
Submitted 16 March, 2024; v1 submitted 9 January, 2024;
originally announced January 2024.
-
A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia
Authors:
Giovanni Monea,
Maxime Peyrard,
Martin Josifoski,
Vishrav Chaudhary,
Jason Eisner,
Emre Kıcıman,
Hamid Palangi,
Barun Patra,
Robert West
Abstract:
Large language models (LLMs) have an impressive ability to draw on novel information supplied in their context. Yet the mechanisms underlying this contextual grounding remain unknown, especially in situations where contextual information contradicts factual knowledge stored in the parameters, which LLMs also excel at recalling. Favoring the contextual information is critical for retrieval-augmented generation methods, which enrich the context with up-to-date information, hoping that grounding can rectify outdated or noisy stored knowledge. We present a novel method to study grounding abilities using Fakepedia, a dataset of counterfactual texts constructed so that the contextual information clashes with a model's internal parametric knowledge. We benchmark various LLMs with Fakepedia and conduct a causal mediation analysis of LLM components when answering Fakepedia queries, based on our Masked Grouped Causal Tracing (MGCT) method. Through this analysis, we identify distinct computational patterns between grounded and ungrounded responses. We finally demonstrate that distinguishing grounded from ungrounded responses is achievable through computational analysis alone. Our results, together with existing findings about factual recall mechanisms, provide a coherent narrative of how grounding and factual recall mechanisms interact within LLMs.
Submitted 10 June, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
What can we learn from the dynamics of the Covid-19 epidemic?
Authors:
Michel Peyrard
Abstract:
We investigate the mechanisms behind the quasi-periodic outbursts of the Covid-19 epidemic. Data for France and Germany show that the pattern of outbursts exhibits a qualitative change in early 2022, which appears as a change in the average period and is confirmed by time-frequency analysis. This provides a signal that can be used to discriminate among several mechanisms. Two main ideas have been proposed to explain periodicity in epidemics: one involves memory effects, and the other considers exchanges between epidemic clusters and a reservoir of population. We test these two approaches in the particular case of the Covid-19 epidemic and show that the "cluster model" is the only one that appears able to explain the observed pattern with realistic parameters. A final section discusses our results in the context of early studies of epidemics, and we stress the importance of working with models that have a limited number of parameters, which moreover can be estimated sufficiently well, in order to draw conclusions about the general mechanisms behind the observations.
Submitted 3 October, 2023; v1 submitted 27 August, 2023;
originally announced August 2023.
-
Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling
Authors:
Marija Šakota,
Maxime Peyrard,
Robert West
Abstract:
Generative language models (LMs) have become omnipresent across data science. For a wide variety of tasks, inputs can be phrased as natural language prompts for an LM, from whose output the solution can then be extracted. LM performance has consistently been increasing with model size - but so has the monetary cost of querying the ever larger models. Importantly, however, not all inputs are equally hard: some require larger LMs for obtaining a satisfactory solution, whereas for others smaller LMs suffice. Based on this fact, we design a framework for cost-effective language model choice, called "Fly-swat or cannon" (FORC). Given a set of inputs and a set of candidate LMs, FORC judiciously assigns each input to an LM predicted to do well on the input according to a so-called meta-model, aiming to achieve high overall performance at low cost. The cost-performance tradeoff can be flexibly tuned by the user. Options include, among others, maximizing total expected performance (or the number of processed inputs) while staying within a given cost budget, or minimizing total cost while processing all inputs. We evaluate FORC on 14 datasets covering five natural language tasks, using four candidate LMs of vastly different size and cost. With FORC, we match the performance of the largest available LM while achieving a cost reduction of 63%. Via our publicly available library, researchers as well as practitioners can thus save large amounts of money without sacrificing performance.
Submitted 18 December, 2023; v1 submitted 11 August, 2023;
originally announced August 2023.
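The budgeted assignment described in the abstract above can be illustrated with a small sketch. It assumes that a meta-model has already produced a predicted score for every (input, LM) pair; the greedy "best gain per extra cost" heuristic, the function name `assign_within_budget`, and the data layout are all illustrative assumptions, not the FORC library's API or its actual assignment strategy.

```python
from typing import Dict, List


def assign_within_budget(
    predicted: List[Dict[str, float]],  # per input: hypothetical meta-model score for each LM
    cost: Dict[str, float],             # per-query cost of each LM
    budget: float,
) -> List[str]:
    """Greedy heuristic (an assumption, not FORC's method): put every input on
    the cheapest LM, then repeatedly buy the upgrade with the best predicted
    gain per unit of extra cost while the budget allows."""
    cheapest = min(cost, key=cost.get)
    assignment = [cheapest] * len(predicted)
    spent = cost[cheapest] * len(predicted)
    if spent > budget:
        raise ValueError("budget cannot cover the cheapest LM for all inputs")

    while True:
        best = None  # (gain_per_cost, input_index, lm_name, extra_cost)
        for i, scores in enumerate(predicted):
            current = assignment[i]
            for lm, score in scores.items():
                extra = cost[lm] - cost[current]
                gain = score - scores[current]
                if extra > 0 and gain > 0 and spent + extra <= budget:
                    ratio = gain / extra
                    if best is None or ratio > best[0]:
                        best = (ratio, i, lm, extra)
        if best is None:  # no affordable upgrade improves the prediction
            return assignment
        _, i, lm, extra = best
        assignment[i] = lm
        spent += extra
```

With two inputs, one easy and one hard, and a budget that covers a single upgrade, this sketch sends only the hard input to the larger model, which is the intuition behind the title: not every input needs the cannon.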
-
Flows: Building Blocks of Reasoning and Collaborating AI
Authors:
Martin Josifoski,
Lars Klein,
Maxime Peyrard,
Nicolas Baldwin,
Yifei Li,
Saibo Geng,
Julian Paul Schnitzler,
Yuxing Yao,
Jiheng Wei,
Debjit Paul,
Robert West
Abstract:
Recent advances in artificial intelligence (AI) have produced highly capable and controllable systems. This creates unprecedented opportunities for structured reasoning as well as collaboration among multiple AI systems and humans. To fully realize this potential, it is essential to develop a principled way of designing and studying such structured interactions. For this purpose, we introduce the conceptual framework Flows. Flows are self-contained building blocks of computation, with an isolated state, communicating through a standardized message-based interface. This modular design simplifies the process of creating Flows by allowing them to be recursively composed into arbitrarily nested interactions and is inherently concurrency-friendly. Crucially, any interaction can be implemented using this framework, including prior work on AI-AI and human-AI interactions, prompt engineering schemes, and tool augmentation. We demonstrate the potential of Flows on competitive coding, a challenging task on which even GPT-4 struggles. Our results suggest that structured reasoning and collaboration substantially improve generalization, with AI-only Flows adding +21 and human-AI Flows adding +54 absolute points in terms of solve rate. To support rapid and rigorous research, we introduce the aiFlows library embodying Flows. The aiFlows library is available at https://github.com/epfl-dlab/aiflows. Data and Flows for reproducing our experiments are available at https://github.com/epfl-dlab/cc_flows.
Submitted 7 February, 2024; v1 submitted 2 August, 2023;
originally announced August 2023.
-
Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning
Authors:
Saibo Geng,
Martin Josifoski,
Maxime Peyrard,
Robert West
Abstract:
Despite their impressive performance, large language models (LMs) still struggle with reliably generating complex output structures when not finetuned to follow the required output format exactly. To address this issue, grammar-constrained decoding (GCD) can be used to control the generation of LMs, guaranteeing that the output follows a given structure. Most existing GCD methods are, however, limited to specific tasks, such as parsing or code generation. In this work, we demonstrate that formal grammars can describe the output space for a much wider range of tasks and argue that GCD can serve as a unified framework for structured NLP tasks in general. For increased flexibility, we introduce input-dependent grammars, which allow the grammar to depend on the input and thus enable the generation of different output structures for different inputs. We then empirically demonstrate the power and flexibility of GCD-enhanced LMs on (1) information extraction, (2) entity disambiguation, and (3) constituency parsing. Our results indicate that grammar-constrained LMs substantially outperform unconstrained LMs or even beat task-specific finetuned models. Grammar constraints thus hold great promise for harnessing off-the-shelf LMs for a wide range of structured NLP tasks, especially where training data is scarce or finetuning is expensive. Code and data: https://github.com/epfl-dlab/GCD.
Submitted 18 January, 2024; v1 submitted 23 May, 2023;
originally announced May 2023.
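The core idea of grammar-constrained decoding in the abstract above, restricting each decoding step to tokens the grammar permits, can be sketched in a few lines. This is a toy under stated assumptions: a hand-written finite-state grammar stands in for a general formal grammar, `toy_score` stands in for LM logits, and decoding is greedy. None of these names come from the GCD repository.

```python
from typing import Callable, Dict, List

# Toy finite-state grammar (an assumption for illustration):
# state -> {allowed token: next state}. It accepts "( subj ; rel ; obj )".
GRAMMAR: Dict[str, Dict[str, str]] = {
    "start": {"(": "subj"},
    "subj": {"Paris": "sep1", "France": "sep1"},
    "sep1": {";": "rel"},
    "rel": {"capital_of": "sep2", "located_in": "sep2"},
    "sep2": {";": "obj"},
    "obj": {"Paris": "close", "France": "close"},
    "close": {")": "done"},
}


def toy_score(prefix: List[str], token: str) -> float:
    """Stand-in for LM logits. Unconstrained, this scorer would prefer the
    free-text token "The", which the grammar never allows."""
    preferences = {"The": 3.0, "Paris": 2.0, "capital_of": 1.5, "France": 1.0}
    penalty = 5.0 if token in prefix else 0.0  # discourage repeating a token
    return preferences.get(token, 0.0) - penalty


def constrained_greedy_decode(
    score: Callable[[List[str], str], float], max_steps: int = 20
) -> List[str]:
    """At each step, consider only grammar-permitted tokens and pick the
    highest-scoring one, so the output is guaranteed to follow the grammar."""
    state, output = "start", []
    for _ in range(max_steps):
        if state == "done":
            break
        allowed = GRAMMAR[state]
        token = max(allowed, key=lambda t: score(output, t))
        output.append(token)
        state = allowed[token]
    return output
```

Calling `constrained_greedy_decode(toy_score)` yields a well-formed triplet even though the scorer's favourite token is outside the grammar. An input-dependent grammar, in this toy setting, would amount to building `GRAMMAR` per input, for example from the entities mentioned in it.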
-
REFINER: Reasoning Feedback on Intermediate Representations
Authors:
Debjit Paul,
Mete Ismayilzada,
Maxime Peyrard,
Beatriz Borges,
Antoine Bosselut,
Robert West,
Boi Faltings
Abstract:
Language models (LMs) have recently shown remarkable performance on reasoning tasks by explicitly generating intermediate inferences, e.g., chain-of-thought prompting. However, these intermediate inference steps may be inappropriate deductions from the initial context and lead to incorrect final predictions. Here we introduce REFINER, a framework for finetuning LMs to explicitly generate intermediate reasoning steps while interacting with a critic model that provides automated feedback on the reasoning. Specifically, the critic provides structured feedback that the reasoning LM uses to iteratively improve its intermediate arguments. Empirical evaluations of REFINER on three diverse reasoning tasks show significant improvements over baseline LMs of comparable scale. Furthermore, when using GPT-3.5 or ChatGPT as the reasoner, the trained critic significantly improves reasoning without finetuning the reasoner. Finally, our critic model is trained without expensive human-in-the-loop data but can be substituted with humans at inference time.
Submitted 4 February, 2024; v1 submitted 4 April, 2023;
originally announced April 2023.
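The inference-time interaction described in the abstract above, a reasoner that revises its intermediate step in response to a critic's feedback, can be sketched as a simple loop. The function `refine`, the two toy callables, and the feedback strings are hypothetical placeholders for illustration; they are not the REFINER models or code.

```python
from typing import Callable, List, Optional


def refine(
    generate: Callable[[str, Optional[str]], str],
    critique: Callable[[str, str], Optional[str]],
    problem: str,
    max_rounds: int = 3,
) -> List[str]:
    """Alternate between a reasoner and a critic. The critic returns
    feedback text, or None when it has no objection. Returns every
    intermediate step that was produced, in order."""
    feedback: Optional[str] = None
    history: List[str] = []
    for _ in range(max_rounds):
        step = generate(problem, feedback)
        history.append(step)
        feedback = critique(problem, step)
        if feedback is None:
            break
    return history


def toy_generate(problem: str, feedback: Optional[str]) -> str:
    # Hypothetical reasoner: the first attempt uses the wrong operator and
    # is corrected once the critic points that out.
    return "3 + 4" if feedback == "wrong operator" else "3 * 4"


def toy_critique(problem: str, step: str) -> Optional[str]:
    # Hypothetical critic: flags a multiplication when the problem says "plus".
    return "wrong operator" if "plus" in problem and "*" in step else None
```

Here `refine(toy_generate, toy_critique, "3 apples plus 4 apples")` produces a first incorrect step and then a corrected one. Because the critic is just a callable, a human could supply the feedback instead, mirroring the substitution mentioned in the abstract.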
-
Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction
Authors:
Martin Josifoski,
Marija Sakota,
Maxime Peyrard,
Robert West
Abstract:
Large language models (LLMs) have great potential for synthetic data generation. This work shows that useful data can be synthetically generated even for tasks that cannot be solved directly by LLMs: for problems with structured outputs, it is possible to prompt an LLM to perform the task in the reverse direction, by generating plausible input text for a target output structure. Leveraging this asymmetry in task difficulty makes it possible to produce large-scale, high-quality data for complex tasks. We demonstrate the effectiveness of this approach on closed information extraction, where collecting ground-truth data is challenging, and no satisfactory dataset exists to date. We synthetically generate a dataset of 1.8M data points, establish its superior quality compared to existing datasets in a human evaluation, and use it to finetune small models (220M and 770M parameters), termed SynthIE, that outperform the prior state of the art (with equal model size) by a substantial margin of 57 absolute points in micro-F1 and 79 points in macro-F1. Code, data, and models are available at https://github.com/epfl-dlab/SynthIE.
Submitted 29 October, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
Language Model Decoding as Likelihood-Utility Alignment
Authors:
Martin Josifoski,
Maxime Peyrard,
Frano Rajic,
Jiheng Wei,
Debjit Paul,
Valentin Hartmann,
Barun Patra,
Vishrav Chaudhary,
Emre Kıcıman,
Boi Faltings,
Robert West
Abstract:
A critical component of a successful language generation pipeline is the decoding algorithm. However, the general principles that should guide the choice of a decoding algorithm remain unclear. Previous works only compare decoding algorithms in narrow scenarios, and their findings do not generalize across tasks. We argue that the misalignment between the model's likelihood and the task-specific notion of utility is the key factor to understanding the effectiveness of decoding algorithms. To structure the discussion, we introduce a taxonomy of misalignment mitigation strategies (MMSs), providing a unifying view of decoding as a tool for alignment. The MMS taxonomy groups decoding algorithms based on their implicit assumptions about likelihood-utility misalignment, yielding general statements about their applicability across tasks. Specifically, by analyzing the correlation between the likelihood and the utility of predictions across a diverse set of tasks, we provide empirical evidence supporting the proposed taxonomy and a set of principles to structure reasoning when choosing a decoding algorithm. Crucially, our analysis is the first to relate likelihood-based decoding algorithms with algorithms that rely on external information, such as value-guided methods and prompting, and covers the most diverse set of tasks to date. Code, data, and models are available at https://github.com/epfl-dlab/understanding-decoding.
Submitted 16 March, 2023; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Distribution inference risks: Identifying and mitigating sources of leakage
Authors:
Valentin Hartmann,
Léo Meynent,
Maxime Peyrard,
Dimitrios Dimitriadis,
Shruti Tople,
Robert West
Abstract:
A large body of work shows that machine learning (ML) models can leak sensitive or confidential information about their training data. Recently, leakage due to distribution inference (or property inference) attacks has been gaining attention. In this attack, the goal of an adversary is to infer distributional information about the training data. So far, research on distribution inference has focused on demonstrating successful attacks, with little attention given to identifying the potential causes of the leakage and to proposing mitigations. To bridge this gap, as our main contribution, we theoretically and empirically analyze the sources of information leakage that allow an adversary to perpetrate distribution inference attacks. We identify three sources of leakage: (1) memorizing specific information about the $\mathbb{E}[Y|X]$ (expected label given the feature values) of interest to the adversary, (2) wrong inductive bias of the model, and (3) finiteness of the training data. Next, based on our analysis, we propose principled mitigation techniques against distribution inference attacks. Specifically, we demonstrate that causal learning techniques are more resilient to a particular type of distribution inference risk termed distributional membership inference than associative learning methods. Lastly, we present a formalization of distribution inference that allows for reasoning about more general adversaries than was previously possible.
Submitted 18 September, 2022;
originally announced September 2022.
-
The Glass Ceiling of Automatic Evaluation in Natural Language Generation
Authors:
Pierre Colombo,
Maxime Peyrard,
Nathan Noiry,
Robert West,
Pablo Piantanida
Abstract:
Automatic evaluation metrics capable of replacing human judgments are critical to allowing fast development of new methods. Thus, numerous research efforts have focused on crafting such metrics. In this work, we take a step back and analyze recent progress by comparing the body of existing automatic metrics and human metrics together. As metrics are used based on how they rank systems, we compare metrics in the space of system rankings. Our extensive statistical analysis reveals surprising findings: automatic metrics, old and new, are much more similar to each other than to humans. Automatic metrics are not complementary and rank systems similarly. Strikingly, human metrics predict each other much better than the combination of all automatic metrics used to predict a human metric. This is surprising because human metrics are often designed to be independent, to capture different aspects of quality, e.g., content fidelity or readability. We provide a discussion of these findings and recommendations for future work in the field of evaluation.
Submitted 7 October, 2022; v1 submitted 30 August, 2022;
originally announced August 2022.
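Comparing metrics "in the space of system rankings", as the abstract above puts it, can be illustrated with a rank correlation between the scores two metrics assign to the same systems. The sketch below uses Kendall's tau-a without tie handling; the choice of this statistic and the function name `kendall_tau` are assumptions for illustration, and the paper's own analysis may rely on different tools.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(a: Sequence[float], b: Sequence[float]) -> float:
    """Kendall's tau-a between two score lists over the same systems.
    Tied pairs count as neither concordant nor discordant."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long lists with at least 2 systems")
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        # Positive product: both metrics order systems i and j the same way.
        product = (a[i] - a[j]) * (b[i] - b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs
```

A value near 1 means two metrics rank the systems almost identically and near -1 means they rank them in opposite order. The abstract's finding corresponds to automatic metrics having high values among themselves and lower values against human judgments.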
-
Predicting is not Understanding: Recognizing and Addressing Underspecification in Machine Learning
Authors:
Damien Teney,
Maxime Peyrard,
Ehsan Abbasnejad
Abstract:
Machine learning (ML) models are typically optimized for their accuracy on a given dataset. However, this predictive criterion rarely captures all desirable properties of a model, in particular how well it matches a domain expert's understanding of a task. Underspecification refers to the existence of multiple models that are indistinguishable in their in-domain accuracy, even though they differ in other desirable properties such as out-of-distribution (OOD) performance. Identifying these situations is critical for assessing the reliability of ML models.
We formalize the concept of underspecification and propose a method to identify and partially address it. We train multiple models with an independence constraint that forces them to implement different functions. They discover predictive features that are otherwise ignored by standard empirical risk minimization (ERM), which we then distill into a global model with superior OOD performance. Importantly, we constrain the models to align with the data manifold to ensure that they discover meaningful features. We demonstrate the method on multiple datasets in computer vision (collages, WILDS-Camelyon17, GQA) and discuss general implications of underspecification. Most notably, in-domain performance cannot serve for OOD model selection without additional assumptions.
Submitted 6 July, 2022;
originally announced July 2022.
-
Descartes: Generating Short Descriptions of Wikipedia Articles
Authors:
Marija Sakota,
Maxime Peyrard,
Robert West
Abstract:
Wikipedia is one of the richest knowledge sources on the Web today. In order to facilitate navigating, searching, and maintaining its content, Wikipedia's guidelines state that all articles should be annotated with a so-called short description indicating the article's topic (e.g., the short description of beer is "Alcoholic drink made from fermented cereal grains"). Nonetheless, a large fraction of articles (ranging from 10.2% in Dutch to 99.7% in Kazakh) have no short description yet, with detrimental effects for millions of Wikipedia users. Motivated by this problem, we introduce the novel task of automatically generating short descriptions for Wikipedia articles and propose Descartes, a multilingual model for tackling it. Descartes integrates three sources of information to generate an article description in a target language: the text of the article in all its language versions, the already-existing descriptions (if any) of the article in other languages, and semantic type information obtained from a knowledge graph. We evaluate a Descartes model trained for handling 25 languages simultaneously, showing that it beats baselines (including a strong translation-based baseline) and performs on par with monolingual models tailored for specific languages. A human evaluation on three languages further shows that the quality of Descartes's descriptions is largely indistinguishable from that of human-written descriptions; e.g., 91.3% of our English descriptions (vs. 92.1% of human-written descriptions) pass the bar for inclusion in Wikipedia, suggesting that Descartes is ready for production, with the potential to support human editors in filling a major gap in today's Wikipedia across languages.
Submitted 17 February, 2023; v1 submitted 20 May, 2022;
originally announced May 2022.
-
Understanding temperature modulated calorimetry through studies of a model system
Authors:
Jean-Luc Garden,
Michel Peyrard
Abstract:
Temperature-modulated calorimetry is widely used but still raises some fundamental questions. In this paper we study a model system as a test sample to address some of them. The model has a nontrivial spectrum of relaxation times. We investigate temperature-modulated calorimetry at constant average temperature to clarify the meaning of the frequency-dependent heat capacity, its relation to entropy production, and how such measurements can observe the aging of a glassy sample, leading to a time-dependent heat capacity. The study of the Kovacs effect for an out-of-equilibrium system shows how temperature-modulated calorimetry could contribute to the understanding of this memory effect. Then we compare measurements of standard scanning calorimetry and temperature-modulated calorimetry and show how the two methods are complementary because they do not observe the same features. While it can probe the time scales of energy transfers in a system, temperature-modulated calorimetry, even in the limit of low frequency, does not probe some relaxation phenomena that can be measured by scanning calorimetry, as suggested by experiments with glasses.
Submitted 1 April, 2022;
originally announced April 2022.
-
On the Context-Free Ambiguity of Emoji
Authors:
Justyna Czestochowska,
Kristina Gligoric,
Maxime Peyrard,
Yann Mentha,
Michal Bien,
Andrea Grutter,
Anita Auer,
Aris Xanthos,
Robert West
Abstract:
Emojis come with prepacked semantics, making them great candidates to create new forms of more accessible communication. Yet, little is known about how much of these emoji semantics is agreed upon by humans, outside of textual contexts. Thus, we collected a crowdsourced dataset of one-word emoji descriptions for 1,289 emojis presented to participants with no surrounding text. The emojis and their interpretations were then examined for ambiguity. We find that with 30 annotations per emoji, 16 emojis (1.2%) are completely unambiguous, whereas 55 emojis (4.3%) are so ambiguous that their descriptions are indistinguishable from randomly chosen descriptions. Most of the studied emojis are spread out between the two extremes. Furthermore, investigating the ambiguity of different types of emojis, we find that an important factor is the extent to which an emoji has an embedded symbolical meaning drawn from an established code-book of symbols. We conclude by discussing design implications.
Submitted 5 April, 2022; v1 submitted 17 January, 2022;
originally announced January 2022.
-
GenIE: Generative Information Extraction
Authors:
Martin Josifoski,
Nicola De Cao,
Maxime Peyrard,
Fabio Petroni,
Robert West
Abstract:
Structured and grounded representation of text is typically formalized by closed information extraction, the problem of extracting an exhaustive set of (subject, relation, object) triplets that are consistent with a predefined set of entities and relations from a knowledge base schema. Most existing works are pipelines prone to error accumulation, and all approaches are only applicable to unrealistically small numbers of entities and relations. We introduce GenIE (generative information extraction), the first end-to-end autoregressive formulation of closed information extraction. GenIE naturally exploits the language knowledge from the pre-trained transformer by autoregressively generating relations and entities in textual form. Thanks to a new bi-level constrained generation strategy, only triplets consistent with the predefined knowledge base schema are produced. Our experiments show that GenIE is state-of-the-art on closed information extraction, generalizes from fewer training data points than baselines, and scales to a previously unmanageable number of entities and relations. With this work, closed information extraction becomes practical in realistic scenarios, providing new opportunities for downstream tasks. Finally, this work paves the way towards a unified end-to-end approach to the core tasks of information extraction. Code, data and models available at https://github.com/epfl-dlab/GenIE.
Submitted 13 April, 2022; v1 submitted 15 December, 2021;
originally announced December 2021.
-
Better than Average: Paired Evaluation of NLP Systems
Authors:
Maxime Peyrard,
Wei Zhao,
Steffen Eger,
Robert West
Abstract:
Evaluation in NLP is usually done by comparing the scores of competing systems independently averaged over a common set of test instances. In this work, we question the use of averages for aggregating evaluation scores into a final number used to decide which system is best, since the average, as well as alternatives such as the median, ignores the pairing arising from the fact that systems are evaluated on the same test instances. We illustrate the importance of taking the instance-level pairing of evaluation scores into account and demonstrate, both theoretically and empirically, the advantages of aggregation methods based on pairwise comparisons, such as the Bradley-Terry (BT) model, a mechanism based on the estimated probability that a given system scores better than another on the test set. By re-evaluating 296 real NLP evaluation setups across four tasks and 18 evaluation metrics, we show that the choice of aggregation mechanism matters and yields different conclusions as to which systems are state of the art in about 30% of the setups. To facilitate the adoption of pairwise evaluation, we release a practical tool for performing the full analysis of evaluation scores with the mean, median, BT, and two variants of BT (Elo and TrueSkill), alongside functionality for appropriate statistical testing.
Submitted 20 October, 2021;
originally announced October 2021.
-
Invariant Language Modeling
Authors:
Maxime Peyrard,
Sarvjeet Singh Ghotra,
Martin Josifoski,
Vidhan Agarwal,
Barun Patra,
Dean Carignan,
Emre Kiciman,
Robert West
Abstract:
Large pretrained language models are critical components of modern NLP pipelines. Yet, they suffer from spurious correlations, poor out-of-domain generalization, and biases. Inspired by recent progress in causal machine learning, in particular the invariant risk minimization (IRM) paradigm, we propose invariant language modeling, a framework for learning invariant representations that generalize better across multiple environments. In particular, we adapt a game-theoretic formulation of IRM (IRM-games) to language models, where the invariance emerges from a specific training schedule in which all the environments compete to optimize their own environment-specific loss by updating subsets of the model in a round-robin fashion. We focus on controlled experiments to precisely demonstrate the ability of our method to (i) remove structured noise, (ii) ignore specific spurious correlations without affecting global performance, and (iii) achieve better out-of-domain generalization. These benefits come with a negligible computational overhead compared to standard training, do not require changing the local loss, and can be applied to any language model. We believe this framework is promising to help mitigate spurious correlations and biases in language models.
Submitted 14 November, 2022; v1 submitted 15 October, 2021;
originally announced October 2021.
-
Laughing Heads: Can Transformers Detect What Makes a Sentence Funny?
Authors:
Maxime Peyrard,
Beatriz Borges,
Kristina Gligorić,
Robert West
Abstract:
The automatic detection of humor poses a grand challenge for natural language processing. Transformer-based systems have recently achieved remarkable results on this task, but they usually (1) were evaluated in setups where serious vs humorous texts came from entirely different sources, and (2) focused on benchmarking performance without providing insights into how the models work. We make progress in both respects by training and analyzing transformer-based humor recognition models on a recently introduced dataset consisting of minimal pairs of aligned sentences, one serious, the other humorous. We find that, although our aligned dataset is much harder than previous datasets, transformer-based models recognize the humorous sentence in an aligned pair with high accuracy (78%). In a careful error analysis, we characterize easy vs hard instances. Finally, by analyzing attention weights, we obtain important insights into the mechanisms by which transformers recognize humor. Most remarkably, we find clear evidence that one single attention head learns to recognize the words that make a test sentence humorous, even without access to this information at training time.
Submitted 25 August, 2021; v1 submitted 19 May, 2021;
originally announced May 2021.
-
Memory effects in glasses: insights into the thermodynamics of out of equilibrium systems revealed by a simple model of the Kovacs effect
Authors:
Michel Peyrard,
Jean-Luc Garden
Abstract:
This paper is an extended version of an article accepted for publication in Physical Review E. Besides its fundamental interest, the model that we investigate in this article is simple enough to be used as a basis for courses or tutorials on the thermodynamics of out of equilibrium systems. It allows simple numerical calculations and analytical analysis which highlight important concepts with an easily workable example. This version includes studies of fast cooling and heating, exhibiting cases with negative heat capacity, and further discussions on the entropy which are not presented in the Physical Review E version.
Glasses are interesting materials because they allow us to explore the puzzling properties of out-of-equilibrium systems. One of them is the Kovacs effect in which a glass, brought to an out-of-equilibrium state in which all its thermodynamic variables are identical to those of an equilibrium state, nevertheless evolves, showing a hump in some global variable before the thermodynamic variables come back to their starting point. We show that a simple three-state system is sufficient to study this phenomenon using numerical integrations and exact analytical calculations. It also sheds some light on the concept of fictive temperature, often used to extend standard thermodynamics to the out-of-equilibrium properties of glasses. We confirm that the concept of a unique fictive temperature is not valid, and show it can be extended to make a connection with the various relaxation processes in the system. The model also brings further insights on the thermodynamics of out-of-equilibrium systems. Moreover, we show that the three-state model is able to describe various effects observed in glasses such as the asymmetric relaxation to equilibrium discussed by Kovacs, or the reverse crossover measured on $B_2O_3$.
Submitted 5 November, 2020;
originally announced November 2020.
-
KLearn: Background Knowledge Inference from Summarization Data
Authors:
Maxime Peyrard,
Robert West
Abstract:
The goal of text summarization is to compress documents to the relevant information while excluding background information already known to the receiver. So far, summarization researchers have given considerably more attention to relevance than to background knowledge. In contrast, this work puts background knowledge in the foreground. Building on the realization that the choices made by human summarizers and annotators contain implicit information about their background knowledge, we develop and compare techniques for inferring background knowledge from summarization data. Based on this framework, we define summary scoring functions that explicitly model background knowledge, and show that these scoring functions fit human judgments significantly better than baselines. We illustrate some of the many potential applications of our framework. First, we provide insights into human information importance priors. Second, we demonstrate that averaging the background knowledge of multiple, potentially biased annotators or corpora greatly improves summary-scoring performance. Finally, we discuss potential applications of our framework beyond summarization.
Submitted 13 October, 2020;
originally announced October 2020.
-
Experts and authorities receive disproportionate attention on Twitter during the COVID-19 crisis
Authors:
Kristina Gligorić,
Manoel Horta Ribeiro,
Martin Müller,
Olesia Altunina,
Maxime Peyrard,
Marcel Salathé,
Giovanni Colavizza,
Robert West
Abstract:
Timely access to accurate information is crucial during the COVID-19 pandemic. Prompted by key stakeholders' cautioning against an "infodemic", we study information sharing on Twitter from January through May 2020. We observe an overall surge in the volume of general as well as COVID-19-related tweets around peak lockdown in March/April 2020. With respect to engagement (retweets and likes), accounts related to healthcare, science, government and politics received by far the largest boosts, whereas accounts related to religion and sports saw a relative decrease in engagement. While the threat of an "infodemic" remains, our results show that social media also provide a platform for experts and public authorities to be widely heard during a global crisis.
Submitted 19 August, 2020;
originally announced August 2020.
-
How is information transmitted in a nerve?
Authors:
Michel Peyrard
Abstract:
In the last fifteen years a debate emerged about the validity of the famous Hodgkin-Huxley model for nerve impulse. Mechanical models have been proposed. This note reviews the experimental properties of the nerve impulse and discusses the proposed alternatives. The experimental data, which rule out some of the alternative suggestions, show that, while the Hodgkin-Huxley model may not be complete, it nevertheless includes essential features that should not be overlooked in the attempts made to improve, or supersede, it.
Submitted 10 September, 2020; v1 submitted 6 August, 2020;
originally announced August 2020.
-
Sudden Attention Shifts on Wikipedia During the COVID-19 Crisis
Authors:
Manoel Horta Ribeiro,
Kristina Gligorić,
Maxime Peyrard,
Florian Lemmerich,
Markus Strohmaier,
Robert West
Abstract:
We study how the COVID-19 pandemic, alongside the severe mobility restrictions that ensued, has impacted information access on Wikipedia, the world's largest online encyclopedia. A longitudinal analysis that combines pageview statistics for 12 Wikipedia language editions with mobility reports published by Apple and Google reveals massive shifts in the volume and nature of information seeking patterns during the pandemic. Interestingly, while we observe a transient increase in Wikipedia's pageview volume following mobility restrictions, the nature of information sought was impacted more permanently. These changes are most pronounced for language editions associated with countries where the most severe mobility restrictions were implemented. We also find that articles belonging to different topics behaved differently; e.g., attention towards entertainment-related topics is lingering and even increasing, while the interest in health- and biology-related topics was either small or transient. Our results highlight the utility of Wikipedia for studying how the pandemic is affecting people's needs, interests, and concerns.
Submitted 19 April, 2021; v1 submitted 18 May, 2020;
originally announced May 2020.
-
A Ladder of Causal Distances
Authors:
Maxime Peyrard,
Robert West
Abstract:
Causal discovery, the task of automatically constructing a causal model from data, is of major significance across the sciences. Evaluating the performance of causal discovery algorithms should ideally involve comparing the inferred models to ground-truth models available for benchmark datasets, which in turn requires a notion of distance between causal models. While such distances have been proposed previously, they are limited by focusing on graphical properties of the causal models being compared. Here, we overcome this limitation by defining distances derived from the causal distributions induced by the models, rather than exclusively from their graphical structure. Pearl and Mackenzie (2018) have arranged the properties of causal models in a hierarchy called the "ladder of causation" spanning three rungs: observational, interventional, and counterfactual. Following this organization, we introduce a hierarchy of three distances, one for each rung of the ladder. Our definitions are intuitively appealing as well as efficient to compute approximately. We put our causal distances to use by benchmarking standard causal discovery systems on both synthetic and real-world datasets for which ground-truth causal models are available. Finally, we highlight the usefulness of our causal distances by briefly discussing further applications beyond the evaluation of causal discovery techniques.
Submitted 25 August, 2021; v1 submitted 5 May, 2020;
originally announced May 2020.
-
On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation
Authors:
Wei Zhao,
Goran Glavaš,
Maxime Peyrard,
Yang Gao,
Robert West,
Steffen Eger
Abstract:
Evaluation of cross-lingual encoders is usually performed either via zero-shot cross-lingual transfer in supervised downstream tasks or via unsupervised cross-lingual textual similarity. In this paper, we concern ourselves with reference-free machine translation (MT) evaluation where we directly compare source texts to (sometimes low-quality) system translations, which represents a natural adversarial setup for multilingual encoders. Reference-free evaluation holds the promise of web-scale comparison of MT systems. We systematically investigate a range of metrics based on state-of-the-art cross-lingual semantic representations obtained with pretrained M-BERT and LASER. We find that they perform poorly as semantic encoders for reference-free MT evaluation and identify their two key limitations, namely, (a) a semantic mismatch between representations of mutual translations and, more prominently, (b) the inability to punish "translationese", i.e., low-quality literal translations. We propose two partial remedies: (1) post-hoc re-alignment of the vector spaces and (2) coupling of semantic-similarity based metrics with target-side language modeling. In segment-level MT evaluation, our best metric surpasses reference-based BLEU by 5.7 correlation points.
Submitted 8 June, 2020; v1 submitted 3 May, 2020;
originally announced May 2020.
-
Onset of sliding of elastomer multicontacts: failure of a model of independent asperities to match experiments
Authors:
Julien Scheibert,
Riad Sahli,
Michel Peyrard
Abstract:
Modelling of rough frictional interfaces is often based on asperity models, in which the individual behaviour of individual microjunctions is assumed. In the absence of local measurements at the microjunction scale, quantitative comparison of such models with experiments is usually based only on macroscopic quantities, like the total tangential load resisted by the interface. Recently, however, a new experimental dataset was presented on the onset of sliding of rough elastomeric interfaces, which includes local measurements of the contact area of the individual microjunctions. Here, we use this more comprehensive dataset to test the possibility of quantitatively matching the measurements with a model of independent asperities, enriched with experimental information about the area of microjunctions and its evolution under shear. We show that, despite using parameter values and behaviour laws constrained and inspired by experiments, our model does not quantitatively match the macroscopic measurements. We discuss the possible origins of this failure.
Submitted 6 May, 2020; v1 submitted 15 November, 2019;
originally announced November 2019.
-
MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
Authors:
Wei Zhao,
Maxime Peyrard,
Fei Liu,
Yang Gao,
Christian M. Meyer,
Steffen Eger
Abstract:
A robust evaluation metric has a profound impact on the development of text generation systems. A desirable metric compares system output against references based on their semantics rather than surface forms. In this paper we investigate strategies to encode system and reference texts to devise a metric that shows a high correlation with human judgment of text quality. We validate our new metric, namely MoverScore, on a number of text generation tasks including summarization, machine translation, image captioning, and data-to-text generation, where the outputs are produced by a variety of neural and non-neural systems. Our findings suggest that metrics combining contextualized representations with a distance measure perform the best. Such metrics also demonstrate strong generalization capability across tasks. For ease of use, we make our metrics available as a web service.
Submitted 26 September, 2019; v1 submitted 5 September, 2019;
originally announced September 2019.
-
Kinky DNA in solution: Small angle scattering study of a nucleosome positioning sequence
Authors:
Torben Schindler,
Adrián González,
Ramachandran Boopathi,
Marta Marty Roda,
Lorena Romero-Santacreu,
Andrew Wildes,
Lionel Porcar,
Anne Martel,
Nikos Theodorakopoulos,
Santiago Cuesta-López,
Dimitar Angelov,
Tobias Unruh,
Michel Peyrard
Abstract:
DNA is a flexible molecule, but the degree of its flexibility is subject to debate. The commonly-accepted persistence length of $l_p \approx 500\,$Å is inconsistent with recent studies on short-chain DNA that show much greater flexibility but do not probe its origin. We have performed X-ray and neutron small-angle scattering on a short DNA sequence containing a strong nucleosome positioning element, and analyzed the results using a modified Kratky-Porod model to determine possible conformations. Our results support a hypothesis from Crick and Klug in 1975 that some DNA sequences in solution can have sharp kinks, potentially resolving the discrepancy. Our conclusions are supported by measurements on a radiation-damaged sample, where single-strand breaks lead to increased flexibility and by an analysis of data from another sequence, which does not have kinks, but where our method can detect a locally enhanced flexibility due to an $AT$-domain.
Submitted 25 September, 2018;
originally announced September 2018.
-
Concatenated Power Mean Word Embeddings as Universal Cross-Lingual Sentence Representations
Authors:
Andreas Rücklé,
Steffen Eger,
Maxime Peyrard,
Iryna Gurevych
Abstract:
Average word embeddings are a common baseline for more sophisticated sentence embedding techniques. However, they typically fall short of the performances of more complex models such as InferSent. Here, we generalize the concept of average word embeddings to power mean word embeddings. We show that the concatenation of different types of power mean word embeddings considerably closes the gap to state-of-the-art methods monolingually and substantially outperforms these more complex techniques cross-lingually. In addition, our proposed method outperforms different recently proposed baselines such as SIF and Sent2Vec by a solid margin, thus constituting a much harder-to-beat monolingual baseline. Our data and code are publicly available.
Submitted 12 September, 2018; v1 submitted 4 March, 2018;
originally announced March 2018.
-
Live Blog Corpus for Summarization
Authors:
Avinesh P. V. S.,
Maxime Peyrard,
Christian M. Meyer
Abstract:
Live blogs are an increasingly popular news format to cover breaking news and live events in online journalism. Online news websites around the world are using this medium to give their readers a minute-by-minute update on an event. Good summaries enhance the value of the live blogs for a reader but are often not available. In this paper, we study a way of collecting corpora for automatic live blog summarization. In an empirical evaluation using well-known state-of-the-art summarization systems, we show that the live blog corpus poses new challenges in the field of summarization. We make our tools for reconstructing the corpus publicly available to encourage the research community to replicate our results.
Submitted 27 February, 2018;
originally announced February 2018.
-
A Simple Theoretical Model of Importance for Summarization
Authors:
Maxime Peyrard
Abstract:
Research on summarization has mainly been driven by empirical approaches, crafting systems to perform well on standard datasets with the notion of information Importance remaining latent. We argue that establishing theoretical models of Importance will advance our understanding of the task and help to further improve summarization systems. To this end, we propose simple but rigorous definitions of several concepts that were previously used only intuitively in summarization: Redundancy, Relevance, and Informativeness. Importance arises as a single quantity naturally unifying these concepts. Additionally, we provide intuitions to interpret the proposed quantities and experiments to demonstrate the potential of the framework to inform and guide subsequent works.
Submitted 6 August, 2019; v1 submitted 26 January, 2018;
originally announced January 2018.
-
From thermal rectifiers to thermoelectric devices
Authors:
Giuliano Benenti,
Giulio Casati,
Carlos Mejia-Monasterio,
Michel Peyrard
Abstract:
We discuss thermal rectification and thermoelectric energy conversion from the perspective of nonequilibrium statistical mechanics and dynamical systems theory. After preliminary considerations on the dynamical foundations of the phenomenological Fourier law in classical and quantum mechanics, we illustrate ways to control the phononic heat flow and design thermal diodes. Finally, we consider the coupled transport of heat and charge and discuss several general mechanisms for optimizing the figure of merit of thermoelectric efficiency.
Submitted 21 December, 2015;
originally announced December 2015.
-
Can we model DNA at the mesoscale? Comment on: Fluctuations in the DNA double helix: A critical review
Authors:
Michel Peyrard,
Thierry Dauxois
Abstract:
Comment on "Fluctuations in the DNA double helix: A critical review" by Frank-Kamenetskii and Prakash
Submitted 23 May, 2015;
originally announced May 2015.
-
Characterization of the low temperature properties of a simplified protein model
Authors:
Johannes-Geert Hagmann,
Naoko Nakagawa,
Michel Peyrard
Abstract:
Prompted by results that showed that a simple protein model, the frustrated Gō model, appears to exhibit a transition reminiscent of the protein dynamical transition, we examine the validity of this model to describe the low-temperature properties of proteins. First, we examine equilibrium fluctuations. We calculate its incoherent neutron-scattering structure factor and show that it can be well described by a theory using the one-phonon approximation. By performing an inherent structure analysis, we assess the transitions among energy states at low temperatures. Then, we examine non-equilibrium fluctuations after a sudden cooling of the protein. We investigate the violation of the fluctuation-dissipation theorem in order to analyze the protein glass transition. We find that the effective temperature of the quenched protein deviates from the temperature of the thermostat; however, it relaxes towards the actual temperature with an Arrhenius behavior as the waiting time increases. These results of the equilibrium and non-equilibrium studies converge to the conclusion that the apparent dynamical transition of this coarse-grained model cannot be attributed to a glassy behavior.
Submitted 23 January, 2014;
originally announced January 2014.
-
The dynamics of the DNA denaturation transition
Authors:
Titus S. van Erp,
Michel Peyrard
Abstract:
The dynamics of DNA denaturation is studied using the Peyrard-Bishop-Dauxois model. The denaturation rate of double-stranded polymers decreases exponentially as a function of length below the denaturation temperature. Above Tc, the rate shows a minimum, but then increases as a function of length. We also examine the influence of sequence and solvent friction. Molecules having the same number of weak and strong base-pairs can have significantly different opening rates depending on the order of base-pairs.
Submitted 25 April, 2012;
originally announced April 2012.
-
Base Pair Openings and Temperature Dependence of DNA Flexibility
Authors:
Nikos Theodorakopoulos,
Michel Peyrard
Abstract:
The relationship of base pair openings to DNA flexibility is examined. Published experimental data on the temperature dependence of the persistence length by two different groups are well described in terms of an inhomogeneous Kratky-Porod model with soft and hard joints, corresponding to open and closed base pairs, and sequence-dependent statistical information about the state of each pair provided by a Peyrard-Bishop-Dauxois (PBD) model calculation with no freely adjustable parameters.
Submitted 17 February, 2012; v1 submitted 31 January, 2012;
originally announced January 2012.
-
Collective effects at frictional interfaces
Authors:
Oleg Braun,
Michel Peyrard,
D. V. Stryzheus,
Erio Tosatti
Abstract:
We discuss the role of the long-range elastic interaction between the contacts inside an inhomogeneous frictional interface. The interaction produces a characteristic elastic correlation length $\lambda_c = a^2 E / k_c$ (where $a$ is the distance between the contacts, $k_c$ is the elastic constant of a contact, and $E$ is the Young's modulus of the sliding body), below which the slider may be considered as a rigid body. The strong inter-contact interaction leads to a narrowing of the effective threshold distribution for contact breaking and enhances the chances for an elastic instability to appear. Above the correlation length, $r > \lambda_c$, the interaction leads to screening of local perturbations in the interface, or to the appearance of collective modes: frictional cracks propagating as solitary waves.
Submitted 24 January, 2012;
originally announced January 2012.
-
Structural correlations and melting of B-DNA fibres
Authors:
Andrew Wildes,
Nikos Theodorakopoulos,
Jessica Valle Orero,
Santiago Cuesta-Lopez,
Jean-Luc Garden,
Michel Peyrard
Abstract:
Despite numerous attempts, the understanding of the thermal denaturation of DNA is still a challenge due to the lack of structural data at the transition, since standard experimental approaches to DNA melting are made in solution and do not provide spatial information. We report a measurement using neutron scattering from oriented DNA fibres to determine the size of the regions that stay in the double-helix conformation as the melting temperature is approached from below. A Bragg peak from the B-form of DNA has been observed as a function of temperature and its width and integrated intensity have been measured. These results, complemented by a differential calorimetry study of the melting of B-DNA fibres as well as electrophoresis and optical observation data, are analysed in terms of a one-dimensional mesoscopic model of DNA.
Submitted 14 June, 2011;
originally announced June 2011.
-
Dependence of kinetic friction on velocity: Master equation approach
Authors:
Oleg Braun,
Michel Peyrard
Abstract:
We investigate the velocity dependence of kinetic friction with a model that makes minimal assumptions about the actual mechanism of friction, so that it can be applied at many scales, provided the system involves multi-contact friction. Using a recently developed master equation approach, we investigate the influence of two concurrent processes. First, at a nonzero temperature, thermal fluctuations allow an activated breaking of contacts that are still below the threshold. As a result, the friction force monotonically increases with velocity. Second, the aging of contacts leads to a decrease of the friction force with velocity. Aging effects include two aspects: the delay in contact formation and the aging of a contact itself, i.e., the change of its characteristics with the duration of stationary contact. All these processes are considered simultaneously with the master equation approach, giving a complete dependence of the kinetic friction force on the driving velocity and system temperature, provided the interface parameters are known.
Submitted 29 April, 2011;
originally announced April 2011.
-
The thermal denaturation of DNA studied with neutron scattering
Authors:
Andrew Wildes,
Nikos Theodorakopoulos,
Jessica Valle Orero,
Santiago Cuesta-Lopez,
Jean-Luc Garden,
Michel Peyrard
Abstract:
The melting transition of deoxyribonucleic acid (DNA), whereby the strands of the double helix structure completely separate at a certain temperature, has been characterized using neutron scattering. A Bragg peak from B-form fibre DNA has been measured as a function of temperature, and its widths and integrated intensities have been interpreted using the Peyrard-Bishop-Dauxois (PBD) model with only one free parameter. The experiment is unique, as it gives spatial correlation along the molecule through the melting transition where other techniques cannot.
Submitted 10 January, 2011;
originally announced January 2011.
-
Master equation approach to friction at the mesoscale
Authors:
Oleg Braun,
Michel Peyrard
Abstract:
At the mesoscale, friction occurs through the breaking and formation of local contacts. This is often described by the earthquake-like model, which requires numerical studies. We show that this phenomenon can also be described by a master equation, which can be solved analytically in some cases and provides an efficient numerical solution for more general cases. We examine the effect of temperature and aging of the contacts and discuss the statistical properties of the contacts for different situations of friction and their implications, particularly regarding the existence of stick-slip.
Submitted 15 September, 2010;
originally announced September 2010.
-
Critical examination of the inherent-structure-landscape analysis of two-state folding proteins
Authors:
Johannes-Geert Hagmann,
Naoko Nakagawa,
Michel Peyrard
Abstract:
Recent studies have drawn attention to the inherent structure landscape (ISL) approach as a reduced description of proteins that allows their full thermodynamic properties to be mapped. However, the analysis has so far been limited to a single topology of a two-state folding protein, and the simplifying assumptions of the method have not been examined. In this work, we construct the thermodynamics of four two-state folding proteins of different sizes and secondary structure by MD simulations using the ISL method, and critically examine possible limitations of the method. Our results show that the ISL approach correctly describes thermodynamic functions, such as the specific heat, on a qualitative level. Using both analytical and numerical methods, we show that some quantitative limitations cannot be overcome with enhanced sampling or the inclusion of harmonic corrections.
Submitted 22 December, 2009; v1 submitted 25 October, 2009;
originally announced October 2009.
-
Adding a New Dimension to DNA Melting Curves
Authors:
Santiago Cuesta-Lopez,
Dimitar Angelov,
Michel Peyrard
Abstract:
Standard DNA melting curves record the separation of the two strands versus temperature, but they do not provide any information on the location of the opening. We introduce an experimental method which adds a new dimension to the melting curves of short DNA sequences by allowing us to record the degree of opening in several positions along the molecule all at once. This adds the spatial dimension to the melting curves and allows a precise investigation of the role of the base-pair sequence on the fluctuations and denaturation of the DNA double helix. We illustrate the power of the method by investigating the influence of an AT rich region on the fluctuations of neighboring domains.
Submitted 19 August, 2009;
originally announced August 2009.
-
On 4-point correlation functions in simple polymer models
Authors:
Johannes-Geert Hagmann,
Karol K. Kozlowski,
Nikos Theodorakopoulos,
Michel Peyrard
Abstract:
We derive an exact formula for the covariance of Cartesian distances in two simple polymer models, the freely-jointed chain and a discrete flexible model with nearest-neighbor interaction. We show that even in the interaction-free case, correlations exist as long as the two distances at least partially share the same segments. For the interacting case, we demonstrate that the naive expectation of increasing correlations with increasing interaction strength only holds in a finite range of values. Some suggestions for future single-molecule experiments are made.
Submitted 17 April, 2009; v1 submitted 27 March, 2009;
originally announced March 2009.
-
Comment on "A generalized Langevin formalism of complete DNA melting transition"
Authors:
Titus S. van Erp,
Santiago Cuesta-Lopez,
Johannes-Geert Hagmann,
Michel Peyrard
Abstract:
We show that the calculated DNA denaturation curves for finite Peyrard-Bishop-Dauxois (PBD) chains are intrinsically undefined.
Submitted 26 February, 2009;
originally announced February 2009.
-
Modelling DNA at the mesoscale: a challenge for nonlinear science?
Authors:
Michel Peyrard,
Santiago Cuesta-Lopez,
Guillaume James
Abstract:
When it is viewed at the scale of a base pair, DNA appears as a nonlinear lattice. Modelling its properties is a fascinating goal. The detailed experiments that can be performed on this system impose constraints on the models and can be used as a guide to improve them. There are nevertheless many open problems, particularly to describe DNA at the scale of a few tens of base pairs, which is relevant for many biological phenomena.
Submitted 6 August, 2008;
originally announced August 2008.
-
Experimental and theoretical studies of sequence effects on the fluctuation and melting of short DNA molecules
Authors:
Michel Peyrard,
Santiago Cuesta-Lopez,
Dimitar Angelov
Abstract:
Understanding the melting of short DNA sequences probes DNA at the scale of the genetic code and raises questions which are very different from those posed by very long sequences, which have been extensively studied. We investigate this problem by combining experiments and theory. A new experimental method allows us to make a mapping of the opening of the guanines along the sequence as a function of temperature. The results indicate that non-local effects may be important in DNA because an AT-rich region is able to influence the opening of a base pair which is about 10 base pairs away. An earlier mesoscopic model of DNA is modified to correctly describe the time scales associated with the opening of individual base pairs well below melting, and to properly take into account the sequence. Using this model to analyze some characteristic sequences for which detailed experimental data on the melting are available [Montrichok et al. 2003 Europhys. Lett. 62, 452], we show that we have to introduce non-local effects of AT-rich regions to get acceptable results. This brings a second indication that the influence of these highly fluctuating regions of DNA on their neighborhood can extend to some distance.
Submitted 6 August, 2008;
originally announced August 2008.