Showing 1–19 of 19 results for author: Rothe, S

  1. LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning

    Authors: Amirhossein Abaskohi, Sascha Rothe, Yadollah Yaghoobzadeh

    Abstract: In recent years, there has been significant progress in developing pre-trained language models for NLP. However, these models often struggle when fine-tuned on small datasets. To address this issue, researchers have proposed various adaptation approaches. Prompt-based tuning is arguably the most common way, especially for larger models. Previous research shows that adding contrastive learning to p…

    Submitted 5 July, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: 10 pages, 1 figure, 8 tables, 1 algorithm. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics

    Journal ref: https://aclanthology.org/2023.acl-short

  2. arXiv:2209.15469

    cs.CL

    Zero-Shot Retrieval with Search Agents and Hybrid Environments

    Authors: Michelle Chen Huebscher, Christian Buck, Massimiliano Ciaramita, Sascha Rothe

    Abstract: Learning to search is the task of building artificial agents that learn to autonomously use a search box to find information. So far, it has been shown that current language models can learn symbolic query reformulation policies, in combination with traditional term-based retrieval, but fall short of outperforming neural retrievers. We extend the previous learning to search setup to a hybrid envir…

    Submitted 29 March, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

  3. arXiv:2204.12553

    cs.HC cs.GR

    Generative 3D Animation Pipelines: Automating Facial Retargeting Workflows

    Authors: Julius Girbig, Changkun Ou, Sylvia Rothe

    Abstract: Design tools in the 3D industry, while powerful, are still tedious and labor-intensive when it comes to bringing a creative idea for a visual effect to life. In this position paper, we discussed how an infamous generative synthetic media, deepfakes, could be of use and embedded into common sophisticated 3D workflows to reduce user workloads in areas such as 3D model editing, material design, and c…

    Submitted 26 April, 2022; originally announced April 2022.

    Comments: 4 pages, 1 figure

    ACM Class: I.3.7

  4. arXiv:2203.02064

    physics.app-ph cs.CR cs.IT physics.optics quant-ph

    Securing Data in Multimode Fibers by Exploiting Mode-Dependent Light Propagation Effects

    Authors: Stefan Rothe, Karl-Ludwig Besser, David Krause, Robert Kuschmierz, Nektarios Koukourakis, Eduard Jorswieck, Jürgen W. Czarske

    Abstract: Multimode fibers hold great promise to advance data rates in optical communications but come with the challenge to compensate for modal crosstalk and mode-dependent losses, resulting in strong distortions. The holographic measurement of the transmission matrix enables not only correcting distortions but also harnessing these effects for creating a confidential data connection between legitimate co…

    Submitted 10 March, 2023; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: 14 pages, 8 figures

    Journal ref: Research, vol. 6: 0065, Jan. 2023

  5. arXiv:2109.00527

    cs.CL cs.AI cs.IR cs.LG

    Boosting Search Engines with Interactive Agents

    Authors: Leonard Adolphs, Benjamin Boerschinger, Christian Buck, Michelle Chen Huebscher, Massimiliano Ciaramita, Lasse Espeholt, Thomas Hofmann, Yannic Kilcher, Sascha Rothe, Pier Giuseppe Sessa, Lierni Sestorain Saralegui

    Abstract: This paper presents first successful steps in designing search agents that learn meta-strategies for iterative query refinement in information-seeking tasks. Our approach uses machine reading to guide the selection of refinement terms from aggregated search results. Agents are then empowered with simple but effective search operators to exert fine-grained and transparent control over queries and s…

    Submitted 7 June, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

    Comments: Published in Transactions on Machine Learning Research (06/2022)

  6. arXiv:2106.03830

    cs.CL

    A Simple Recipe for Multilingual Grammatical Error Correction

    Authors: Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn

    Abstract: This paper presents a simple recipe to train state-of-the-art multilingual Grammatical Error Correction (GEC) models. We achieve this by first proposing a language-agnostic method to generate a large number of synthetic examples. The second ingredient is to use large-scale multilingual language models (up to 11B parameters). Once fine-tuned on language-specific supervised sets we surpass the previ…

    Submitted 9 August, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  7. arXiv:2105.11921

    cs.CL

    Focus Attention: Promoting Faithfulness and Diversity in Summarization

    Authors: Rahul Aralikatte, Shashi Narayan, Joshua Maynez, Sascha Rothe, Ryan McDonald

    Abstract: Professional summaries are written with document-level information, such as the theme of the document, in mind. This is in contrast with most seq2seq decoders which simultaneously learn to focus on salient content, while deciding what to generate, at each decoding step. With the motivation to narrow this gap, we introduce Focus Attention Mechanism, a simple yet effective method to encourage decode…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: ACL 2021

  8. Achievable Physical-Layer Secrecy in Multi-Mode Fiber Channels using Artificial Noise

    Authors: Eduard Jorswieck, Andrew Lonnstrom, Karl-Ludwig Besser, Stefan Rothe, Juergen W. Czarske

    Abstract: Reliable and secure communication is an important aspect of modern fiber optic communication. In this work we consider a multi-mode fiber (MMF) channel wiretapped by an eavesdropper. We assume the transmitter knows the legitimate channel, but statistical knowledge of the eavesdropper's channel only. We propose a transmission scheme with artificial noise (AN) for such a channel. In particular, we f…

    Submitted 7 May, 2021; originally announced May 2021.

    Comments: 5 pages, 2 figures

  9. arXiv:2010.01054

    cs.CL

    Unsupervised Text Style Transfer with Padded Masked Language Models

    Authors: Eric Malmi, Aliaksei Severyn, Sascha Rothe

    Abstract: We propose Masker, an unsupervised text-editing method for style transfer. To tackle cases when no parallel source-target pairs are available, we train masked language models (MLMs) for both the source and the target domain. Then we find the text spans where the two models disagree the most in terms of likelihood. This allows us to identify the source tokens to delete to transform the source text…

    Submitted 2 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  10. arXiv:2005.11216

    cs.CL

    A Generative Approach to Titling and Clustering Wikipedia Sections

    Authors: Anjalie Field, Sascha Rothe, Simon Baumgartner, Cong Yu, Abe Ittycheriah

    Abstract: We evaluate the performance of transformer encoders with various decoders for information organization through a new task: generation of section headings for Wikipedia articles. Our analysis shows that decoders containing attention mechanisms over the encoder output achieve high-scoring results by generating extractive text. In contrast, a decoder without attention better facilitates semantic enco…

    Submitted 22 May, 2020; originally announced May 2020.

    Comments: Accepted to WNGT Workshop at ACL 2020

  11. arXiv:1909.08535

    cs.CR physics.optics

    Physical Layer Security in Multimode Fiber Optical Networks

    Authors: Stefan Rothe, Nektarios Koukourakis, Hannes Radner, Andrew Lonnstrom, Eduard Jorswieck, Jürgen W. Czarske

    Abstract: Inverse precoding algorithms in multimode fiber based communication networks are used to exploit mode dependent losses on the physical layer. This provides an asymmetry between legitimate (Bob) and unlegitimate (Eve) receiver of messages resulting in a significant SNR advantage for Bob. In combination with dynamic mode channel changes, Eve has no chance to reconstruct a sent message even in a wors…

    Submitted 12 September, 2019; originally announced September 2019.

  12. arXiv:1909.01187

    cs.CL

    Encode, Tag, Realize: High-Precision Text Editing

    Authors: Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn

    Abstract: We propose LaserTagger - a sequence tagging approach that casts text generation as a text editing task. Target texts are reconstructed from the inputs using three main edit operations: keeping a token, deleting it, and adding a phrase before the token. To predict the edit operations, we propose a novel model, which combines a BERT encoder with an autoregressive Transformer decoder. This approach i…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  13. Leveraging Pre-trained Checkpoints for Sequence Generation Tasks

    Authors: Sascha Rothe, Shashi Narayan, Aliaksei Severyn

    Abstract: Unsupervised pre-training of large neural models has recently revolutionized Natural Language Processing. By warm-starting from the publicly released checkpoints, NLP practitioners have pushed the state-of-the-art on multiple benchmarks while saving significant amounts of compute time. So far the focus has been mainly on the Natural Language Understanding tasks. In this paper, we demonstrate the e…

    Submitted 16 April, 2020; v1 submitted 29 July, 2019; originally announced July 2019.

    Comments: To be published in Transactions of the Association for Computational Linguistics (TACL)

  14. arXiv:1809.08731

    cs.CL

    Sentence-Level Fluency Evaluation: References Help, But Can Be Spared!

    Authors: Katharina Kann, Sascha Rothe, Katja Filippova

    Abstract: Motivated by recent findings on the probabilistic modeling of acceptability judgments, we propose syntactic log-odds ratio (SLOR), a normalized language model score, as a metric for referenceless fluency evaluation of natural language generation output at the sentence level. We further introduce WPSLOR, a novel WordPiece-based version, which harnesses a more compact language model. Even though wor…

    Submitted 23 September, 2018; originally announced September 2018.

    Comments: Accepted to CoNLL 2018

  15. arXiv:1711.11383

    stat.ML cs.AI cs.CL cs.LG

    Learning to Learn from Weak Supervision by Full Supervision

    Authors: Mostafa Dehghani, Aliaksei Severyn, Sascha Rothe, Jaap Kamps

    Abstract: In this paper, we propose a method for training neural networks when we have a large set of data with weak labels and a small amount of data with true labels. In our proposed model, we train two neural networks: a target network, the learner and a confidence network, the meta-learner. The target network is optimized to perform a given task and is trained using a large set of unlabeled data that ar…

    Submitted 30 November, 2017; originally announced November 2017.

    Comments: Accepted at NIPS Workshop on Meta-Learning (MetaLearn 2017), Long Beach, CA, USA

  16. arXiv:1711.00313

    cs.LG cs.CL cs.NE stat.ML

    Avoiding Your Teacher's Mistakes: Training Neural Networks with Controlled Weak Supervision

    Authors: Mostafa Dehghani, Aliaksei Severyn, Sascha Rothe, Jaap Kamps

    Abstract: Training deep neural networks requires massive amounts of training data, but for many tasks only limited labeled data is available. This makes weak supervision attractive, using weak or noisy signals like the output of heuristic methods or user click-through data for training. In a semi-supervised setting, we can use a large set of data with weak labels to pretrain a neural network and then fine-t…

    Submitted 7 December, 2017; v1 submitted 1 November, 2017; originally announced November 2017.

  17. arXiv:1708.03418

    cs.IR cs.AI cs.CL

    Learning to Attend, Copy, and Generate for Session-Based Query Suggestion

    Authors: Mostafa Dehghani, Sascha Rothe, Enrique Alfonseca, Pascal Fleury

    Abstract: Users try to articulate their complex information needs during search sessions by reformulating their queries. To make this process more effective, search engines provide related queries to help users in specifying the information need in their search process. In this paper, we propose a customized sequence-to-sequence model for session-based query suggestion. In our model, we employ a query-aware…

    Submitted 13 November, 2017; v1 submitted 10 August, 2017; originally announced August 2017.

    Comments: Accepted to be published at The 26th ACM International Conference on Information and Knowledge Management (CIKM2017)

  18. Ultradense Word Embeddings by Orthogonal Transformation

    Authors: Sascha Rothe, Sebastian Ebert, Hinrich Schütze

    Abstract: Embeddings are generic representations that are useful for many NLP tasks. In this paper, we introduce DENSIFIER, a method that learns an orthogonal transformation of the embedding space that focuses the information relevant for a task in an ultradense subspace of a dimensionality that is smaller by a factor of 100 than the original space. We show that ultradense embeddings generated by DENSIFIER…

    Submitted 8 May, 2016; v1 submitted 24 February, 2016; originally announced February 2016.

  19. AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes

    Authors: Sascha Rothe, Hinrich Schütze

    Abstract: We present AutoExtend, a system to learn embeddings for synsets and lexemes. It is flexible in that it can take any word embeddings as input and does not need an additional training corpus. The synset/lexeme embeddings obtained live in the same vector space as the word embeddings. A sparse tensor formalization guarantees efficiency and parallelizability. We use WordNet as a lexical resour…

    Submitted 4 July, 2015; originally announced July 2015.