-
Towards Robust Domain Generation Algorithm Classification
Authors:
Arthur Drichel,
Marc Meyer,
Ulrike Meyer
Abstract:
In this work, we conduct a comprehensive study on the robustness of domain generation algorithm (DGA) classifiers. We implement 32 white-box attacks, 19 of which are very effective and induce a false-negative rate (FNR) of $\approx$ 100\% on unhardened classifiers. To defend the classifiers, we evaluate different hardening approaches and propose a novel training scheme that leverages adversarial latent space vectors and discretized adversarial domains to significantly improve robustness. We highlight a pitfall to avoid when hardening classifiers and uncover training biases that attackers can easily exploit to bypass detection, but which can be mitigated by adversarial training (AT). We do not observe any trade-off between robustness and performance; on the contrary, hardening improves a classifier's detection performance for known and unknown DGAs. We implement all attacks and defenses discussed in this paper as a standalone library, which we make publicly available to facilitate hardening of DGA classifiers: https://gitlab.com/rwth-itsec/robust-dga-detection
Submitted 9 April, 2024;
originally announced April 2024.
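For readers unfamiliar with the metric, the false-negative rate quoted in the abstract above is the share of malicious (DGA) domains that a classifier labels as benign. A minimal sketch in Python, assuming binary labels where 1 marks a DGA domain and 0 a benign one; the function name and label encoding are illustrative assumptions and are not taken from the paper or its library:

```python
from typing import Sequence


def false_negative_rate(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Return FN / (FN + TP) for binary labels (1 = DGA, 0 = benign)."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    # Malicious domains the classifier missed.
    false_negatives = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    # Malicious domains the classifier caught.
    true_positives = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    positives = false_negatives + true_positives
    if positives == 0:
        raise ValueError("FNR is undefined without any positive (DGA) samples")
    return false_negatives / positives


# Three DGA domains, two of which evade detection, plus one benign domain.
example_fnr = false_negative_rate([1, 1, 1, 0], [0, 0, 1, 0])
```

An FNR close to 1.0 means that nearly every adversarial domain slips past the classifier.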
-
Optimizing Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL
Authors:
Marius Meyer,
Tobias Kenter,
Lucian Petrica,
Kenneth O'Brien,
Michaela Blott,
Christian Plessl
Abstract:
Most FPGA boards in the HPC domain are well-suited for parallel scaling because of the direct integration of versatile and high-throughput network ports. However, the utilization of their network capabilities is often challenging and error-prone because the whole network stack and communication patterns have to be implemented and managed on the FPGAs. Also, this approach conceptually involves a trade-off between the performance potential of improved communication and the impact of resource consumption for communication infrastructure, since the utilized resources on the FPGAs could otherwise be used for computations. In this work, we investigate this trade-off, firstly, by using synthetic benchmarks to evaluate the different configuration options of the communication framework ACCL and their impact on communication latency and throughput. Finally, we use our findings to implement a shallow water simulation whose scalability heavily depends on low-latency communication. With a suitable configuration of ACCL, good scaling behavior can be shown to all 48 FPGAs installed in the system. Overall, the results show that the availability of inter-FPGA communication frameworks as well as the configurability of framework and network stack are crucial to achieve the best application performance with low latency communication.
Submitted 7 April, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
Image-based Deep Learning for the time-dependent prediction of fresh concrete properties
Authors:
Max Meyer,
Amadeus Langer,
Max Mehltretter,
Dries Beyer,
Max Coenen,
Tobias Schack,
Michael Haist,
Christian Heipke
Abstract:
Increasing the degree of digitisation and automation in the concrete production process can play a crucial role in reducing the CO$_2$ emissions that are associated with the production of concrete. In this paper, a method is presented that makes it possible to predict the properties of fresh concrete during the mixing process based on stereoscopic image sequences of the concrete's flow behaviour. A Convolutional Neural Network (CNN) is used for the prediction, which receives the images supported by information on the mix design as input. In addition, the network receives temporal information in the form of the time difference between the time at which the images are taken and the time at which the reference values of the concretes are measured. With this temporal information, the network implicitly learns the time-dependent behaviour of the concrete's properties. The network predicts the slump flow diameter, the yield stress and the plastic viscosity. The time-dependent prediction potentially opens up the pathway to determine the temporal development of the fresh concrete properties already during mixing. This provides a huge advantage for the concrete industry. As a result, countermeasures can be taken in a timely manner. It is shown that an approach based on depth and optical flow images, supported by information on the mix design, achieves the best results.
Submitted 15 April, 2024; v1 submitted 9 February, 2024;
originally announced February 2024.
-
SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI
Authors:
Hanxue Gu,
Roy Colglazier,
Haoyu Dong,
Jikai Zhang,
Yaqian Chen,
Zafer Yildiz,
Yuwen Chen,
Lin Li,
Jichen Yang,
Jay Willhite,
Alex M. Meyer,
Brian Guo,
Yashvi Atul Shah,
Emily Luo,
Shipra Rajput,
Sally Kuehn,
Clark Bulleit,
Kevin A. Wu,
Jisoo Lee,
Brandon Ramirez,
Darui Lu,
Jay M. Levin,
Maciej A. Mazurowski
Abstract:
Magnetic Resonance Imaging (MRI) is pivotal in radiology, offering non-invasive and high-quality insights into the human body. Precise segmentation of MRIs into different organs and tissues would be highly beneficial since it would allow for a higher level of understanding of the image content and enable important measurements, which are essential for accurate diagnosis and effective treatment planning. Specifically, segmenting bones in MRI would allow for more quantitative assessments of musculoskeletal conditions, while such assessments are largely absent in current radiological practice. The difficulty of bone MRI segmentation is illustrated by the fact that limited algorithms are publicly available for use, and those contained in the literature typically address a specific anatomic area. In our study, we propose a versatile, publicly available deep-learning model for bone segmentation in MRI across multiple standard MRI locations. The proposed model can operate in two modes: fully automated segmentation and prompt-based segmentation. Our contributions include (1) collecting and annotating a new MRI dataset across various MRI protocols, encompassing over 300 annotated volumes and 8485 annotated slices across diverse anatomic regions; (2) investigating several standard network architectures and strategies for automated segmentation; (3) introducing SegmentAnyBone, an innovative foundational model-based approach that extends Segment Anything Model (SAM); (4) comparative analysis of our algorithm and previous approaches; and (5) generalization analysis of our algorithm across different anatomical locations and MRI sequences, as well as an external dataset. We publicly release our model at https://github.com/mazurowski-lab/SegmentAnyBone.
Submitted 23 January, 2024;
originally announced January 2024.
-
Targeted Attacks: Redefining Spear Phishing and Business Email Compromise
Authors:
Sarah Wassermann,
Maxime Meyer,
Sébastien Goutal,
Damien Riquet
Abstract:
In today's digital world, cybercrime is responsible for significant damage to organizations, including financial losses, operational disruptions, or intellectual property theft. Cyberattacks often start with an email, the major means of corporate communication. Some rare, severely damaging email threats - known as spear phishing or Business Email Compromise - have emerged. However, the literature disagrees on their definition, impeding security vendors and researchers from mitigating targeted attacks. Therefore, we introduce targeted attacks. We describe targeted-attack-detection techniques as well as social-engineering methods used by fraudsters. Additionally, we present text-based attacks - with textual content as malicious payload - and compare non-targeted and targeted variants.
Submitted 25 September, 2023;
originally announced September 2023.
-
Multi-GPU Approach for Training of Graph ML Models on large CFD Meshes
Authors:
Sebastian Strönisch,
Maximilian Sander,
Andreas Knüpfer,
Marcus Meyer
Abstract:
Mesh-based numerical solvers are an important part in many design tool chains. However, accurate simulations like computational fluid dynamics are time and resource consuming which is why surrogate models are employed to speed-up the solution process. Machine Learning based surrogate models on the other hand are fast in predicting approximate solutions but often lack accuracy. Thus, the development of the predictor in a predictor-corrector approach is the focus here, where the surrogate model predicts a flow field and the numerical solver corrects it. This paper scales a state-of-the-art surrogate model from the domain of graph-based machine learning to industry-relevant mesh sizes of a numerical flow simulation. The approach partitions and distributes the flow domain to multiple GPUs and provides halo exchange between these partitions during training. The utilized graph neural network operates directly on the numerical mesh and is able to preserve complex geometries as well as all other properties of the mesh. The proposed surrogate model is evaluated with an application on a three dimensional turbomachinery setup and compared to a traditionally trained distributed model. The results show that the traditional approach produces superior predictions and outperforms the proposed surrogate model. Possible explanations, improvements and future directions are outlined.
Submitted 25 July, 2023;
originally announced July 2023.
-
Data Bricks Space Mission: Teaching Kids about Data with Physicalization
Authors:
Lorenzo Ambrosini,
Miriah Meyer
Abstract:
The Data Bricks Space Mission is a prototype activity based on data physicalization for teaching kids about data. The design of the activity is based on a literature review and interviews with elementary school teachers, and targets kids aged 10-12. Using Lego bricks and a fictional space adventure story, teachers can use the Data Bricks Space Mission activity to empower kids to produce data, communicate their findings, and gain a better understanding of the relationship between data and the world around them.
Submitted 22 September, 2022;
originally announced October 2022.
-
TotalSegmentator: robust segmentation of 104 anatomical structures in CT images
Authors:
Jakob Wasserthal,
Hanns-Christian Breit,
Manfred T. Meyer,
Maurice Pradella,
Daniel Hinck,
Alexander W. Sauter,
Tobias Heye,
Daniel Boll,
Joshy Cyriac,
Shan Yang,
Michael Bach,
Martin Segeroth
Abstract:
We present a deep learning segmentation model that can automatically and robustly segment all major anatomical structures in body CT images. In this retrospective study, 1204 CT examinations (from the years 2012, 2016, and 2020) were used to segment 104 anatomical structures (27 organs, 59 bones, 10 muscles, 8 vessels) relevant for use cases such as organ volumetry, disease characterization, and surgical or radiotherapy planning. The CT images were randomly sampled from routine clinical studies and thus represent a real-world dataset (different ages, pathologies, scanners, body parts, sequences, and sites). The authors trained an nnU-Net segmentation algorithm on this dataset and calculated Dice similarity coefficients (Dice) to evaluate the model's performance. The trained algorithm was applied to a second dataset of 4004 whole-body CT examinations to investigate age dependent volume and attenuation changes. The proposed model showed a high Dice score (0.943) on the test set, which included a wide range of clinical data with major pathologies. The model significantly outperformed another publicly available segmentation model on a separate dataset (Dice score, 0.932 versus 0.871, respectively). The aging study demonstrated significant correlations between age and volume and mean attenuation for a variety of organ groups (e.g., age and aortic volume; age and mean attenuation of the autochthonous dorsal musculature). The developed model enables robust and accurate segmentation of 104 anatomical structures. The annotated dataset (https://doi.org/10.5281/zenodo.6802613) and toolkit (https://www.github.com/wasserth/TotalSegmentator) are publicly available.
Submitted 16 June, 2023; v1 submitted 11 August, 2022;
originally announced August 2022.
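The Dice similarity coefficient reported in the abstract above measures the overlap between a predicted mask A and a reference mask B as 2|A ∩ B| / (|A| + |B|). A minimal sketch in Python, assuming flat binary masks of equal length; the function name and the convention of returning 1.0 for two empty masks are illustrative assumptions, not details from the paper or the TotalSegmentator toolkit:

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Prediction marks two voxels, reference marks one of them: Dice = 2*1 / (2+1).
example_dice = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

A score of 1.0 indicates identical masks and 0.0 indicates no overlap.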
-
ISLES 2022: A multi-center magnetic resonance imaging stroke lesion segmentation dataset
Authors:
Moritz Roman Hernandez Petzsche,
Ezequiel de la Rosa,
Uta Hanning,
Roland Wiest,
Waldo Enrique Valenzuela Pinilla,
Mauricio Reyes,
Maria Ines Meyer,
Sook-Lei Liew,
Florian Kofler,
Ivan Ezhov,
David Robben,
Alexander Hutton,
Tassilo Friedrich,
Teresa Zarth,
Johannes Bürkle,
The Anh Baran,
Bjoern Menze,
Gabriel Broocks,
Lukas Meyer,
Claus Zimmer,
Tobias Boeckh-Behrens,
Maria Berndt,
Benno Ikenberg,
Benedikt Wiestler,
Jan S. Kirschke
Abstract:
Magnetic resonance imaging (MRI) is a central modality for stroke imaging. It is used upon patient admission to make treatment decisions such as selecting patients for intravenous thrombolysis or endovascular therapy. MRI is later used during the hospital stay to predict outcome by visualizing infarct core size and location. Furthermore, it may be used to characterize stroke etiology, e.g. differentiation between (cardio)-embolic and non-embolic stroke. Computer-based automated medical image processing is increasingly finding its way into clinical routine. Previous iterations of the Ischemic Stroke Lesion Segmentation (ISLES) challenge have aided in identifying benchmark methods for acute and sub-acute ischemic stroke lesion segmentation. Here we introduce an expert-annotated, multicenter MRI dataset for segmentation of acute to subacute stroke lesions. This dataset comprises 400 multi-vendor MRI cases with high variability in stroke lesion size, quantity and location. It is split into a training dataset of n=250 and a test dataset of n=150. All training data will be made publicly available. The test dataset will be used for model validation only and will not be released to the public. This dataset serves as the foundation of the ISLES 2022 challenge with the goal of finding algorithmic methods to enable the development and benchmarking of robust and accurate segmentation algorithms for ischemic stroke.
Submitted 14 June, 2022;
originally announced June 2022.
-
Flipping the Script on Criminal Justice Risk Assessment: An actuarial model for assessing the risk the federal sentencing system poses to defendants
Authors:
Mikaela Meyer,
Aaron Horowitz,
Erica Marshall,
Kristian Lum
Abstract:
In the criminal justice system, algorithmic risk assessment instruments are used to predict the risk a defendant poses to society; examples include the risk of recidivating or the risk of failing to appear at future court dates. However, defendants are also at risk of harm from the criminal justice system. To date, there exists no risk assessment instrument that considers the risk the system poses to the individual. We develop a risk assessment instrument that "flips the script." Using data about U.S. federal sentencing decisions, we build a risk assessment instrument that predicts the likelihood an individual will receive an especially lengthy sentence given factors that should be legally irrelevant to the sentencing decision. To do this, we develop a two-stage modeling approach. Our first-stage model is used to determine which sentences were "especially lengthy." We then use a second-stage model to predict the defendant's risk of receiving a sentence that is flagged as especially lengthy given factors that should be legally irrelevant. The factors that should be legally irrelevant include, for example, race, court location, and other socio-demographic information about the defendant. Our instrument achieves comparable predictive accuracy to risk assessment instruments used in pretrial and parole contexts. We discuss the limitations of our modeling approach and use the opportunity to highlight how traditional risk assessment instruments in various criminal justice settings also suffer from many of the same limitations and embedded value systems of their creators.
Submitted 13 July, 2022; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Self-Supervised Learning to Guide Scientifically Relevant Categorization of Martian Terrain Images
Authors:
Tejas Panambur,
Deep Chakraborty,
Melissa Meyer,
Ralph Milliken,
Erik Learned-Miller,
Mario Parente
Abstract:
Automatic terrain recognition in Mars rover images is an important problem not just for navigation, but for scientists interested in studying rock types, and by extension, conditions of the ancient Martian paleoclimate and habitability. Existing approaches to label Martian terrain either involve the use of non-expert annotators producing taxonomies of limited granularity (e.g. soil, sand, bedrock, float rock, etc.), or rely on generic class discovery approaches that tend to produce perceptual classes such as rover parts and landscape, which are irrelevant to geologic analysis. Expert-labeled datasets containing granular geological/geomorphological terrain categories are rare or inaccessible to public, and sometimes require the extraction of relevant categorical information from complex annotations. In order to facilitate the creation of a dataset with detailed terrain categories, we present a self-supervised method that can cluster sedimentary textures in images captured from the Mast camera onboard the Curiosity rover (Mars Science Laboratory). We then present a qualitative analysis of these clusters and describe their geologic significance via the creation of a set of granular terrain categories. The precision and geologic validation of these automatically discovered clusters suggest that our methods are promising for the rapid classification of important geologic features and will therefore facilitate our long-term goal of producing a large, granular, and publicly available dataset for Mars terrain recognition.
Submitted 20 April, 2022;
originally announced April 2022.
-
Multi-FPGA Designs and Scaling of HPC Challenge Benchmarks via MPI and Circuit-Switched Inter-FPGA Networks
Authors:
Marius Meyer,
Tobias Kenter,
Christian Plessl
Abstract:
While FPGA accelerator boards and their respective high-level design tools are maturing, there is still a lack of multi-FPGA applications, libraries, and not least, benchmarks and reference implementations towards sustained HPC usage of these devices. As in the early days of GPUs in HPC, for workloads that can reasonably be decoupled into loosely coupled working sets, multi-accelerator support can be achieved by using standard communication interfaces like MPI on the host side. However, for performance and productivity, some applications can profit from a tighter coupling of the accelerators. FPGAs offer unique opportunities here when extending the dataflow characteristics to their communication interfaces. In this work, we extend the HPCC FPGA benchmark suite with multi-FPGA support and three missing benchmarks that particularly characterize or stress inter-device communication: b_eff, PTRANS, and LINPACK. With all benchmarks implemented for current boards with Intel and Xilinx FPGAs, we established a baseline for multi-FPGA performance. Additionally, for the communication-centric benchmarks, we explored the potential of direct FPGA-to-FPGA communication with a circuit-switched inter-FPGA network that is currently only available for one of the boards. The evaluation with parallel execution on up to 26 FPGA boards makes use of one of the largest academic FPGA installations.
Submitted 28 February, 2022;
originally announced February 2022.
-
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
Authors:
Kaustubh D. Dhole,
Varun Gangal,
Sebastian Gehrmann,
Aadesh Gupta,
Zhenhao Li,
Saad Mahamood,
Abinaya Mahendiran,
Simon Mille,
Ashish Shrivastava,
Samson Tan,
Tongshuang Wu,
Jascha Sohl-Dickstein,
Jinho D. Choi,
Eduard Hovy,
Ondrej Dusek,
Sebastian Ruder,
Sajant Anand,
Nagender Aneja,
Rabin Banjade,
Lisa Barthe,
Hanna Behnke,
Ian Berlot-Attwell,
Connor Boyle,
Caroline Brun,
Marco Antonio Sobrevilla Cabezudo, et al. (101 additional authors not shown)
Abstract:
Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards and robustness analysis results are available publicly on the NL-Augmenter repository (https://github.com/GEM-benchmark/NL-Augmenter).
Submitted 11 October, 2022; v1 submitted 5 December, 2021;
originally announced December 2021.
-
Social Media for Emergency Rescue: An Analysis of Rescue Requests on Twitter during Hurricane Harvey
Authors:
Lei Zou,
Danqing Liao,
Nina S. N. Lam,
Michelle Meyer,
Nasir G. Gharaibeh,
Heng Cai,
Bing Zhou,
Dongying Li
Abstract:
Social media plays an increasingly significant role in disaster response, but effectively leveraging social media for rescue is challenging. This study analyzed rescue requests on Twitter during the 2017 Hurricane Harvey, in which many residents resorted to social media to call for help. The objectives include (1) understanding the characteristics of rescue-request messages; (2) revealing the spatial-temporal patterns of rescue requests; (3) determining the social-geographical conditions of communities needing rescue; and (4) identifying the challenges of using social media for rescue and proposing improvement strategies. About half of rescue requests either did not provide sufficient information or neglected to include rescue-related hashtags or accounts. Of the 824 geocoded unique rescue requests, 41% were from FEMA-defined minimal flood risk zones. Communities sending more rescue requests on Twitter were environmentally and socioeconomically more vulnerable. Finally, we derived a framework summarizing the steps and strategies needed to improve social media use for rescue operations.
Submitted 13 November, 2021;
originally announced November 2021.
-
Manifesto for Putting 'Chartjunk' in the Trash 2021!
Authors:
Derya Akbaba,
Jack Wilburn,
Main T. Nance,
Miriah Meyer
Abstract:
In this provocation we ask the visualization research community to join us in removing chartjunk from our research lexicon. We present an etymology of chartjunk, framing its provocative origins as misaligned, and harmful, to the ways the term is currently used by visualization researchers. We call on the community to dissolve chartjunk from the ways we talk about, write about, and think about the graphical devices we design and study. As a step towards this goal we contribute a performance of maintenance through a trio of acts: editing the Wikipedia page on chartjunk, cutting out chartjunk from IEEE papers, and scanning and posting a repository of the pages with chartjunk removed to invite the community to re-imagine how we describe visualizations. This contribution blurs the boundaries between research, activism, and maintenance art, and is intended to inspire the community to join us in taking out the trash.
Submitted 25 October, 2021; v1 submitted 21 September, 2021;
originally announced September 2021.
-
Data Hunches: Incorporating Personal Knowledge into Visualizations
Authors:
Haihan Lin,
Derya Akbaba,
Miriah Meyer,
Alexander Lex
Abstract:
The trouble with data is that it frequently provides only an imperfect representation of a phenomenon of interest. Experts who are familiar with their datasets will often make implicit, mental corrections when analyzing a dataset, or will be cautious not to be over-confident in any findings if caveats are present. However, the implicit knowledge about the caveats of a dataset is typically not collected in a structured way, which is problematic especially when team members who work together have knowledge about different aspects of a dataset. In this work, we define such analysts' knowledge about datasets as data hunches. We discuss the implications of data hunches and propose a set of techniques for recording and communicating data hunches through data visualization. Furthermore, we provide guidelines for designing visualizations that support recording and visualizing data hunches. We envision that data hunches will empower analysts to externalize their knowledge, facilitate collaboration and communication, and support the ability to learn from others' data hunches.
Submitted 10 April, 2022; v1 submitted 14 September, 2021;
originally announced September 2021.
-
Exploring the Personal Informatics Analysis Gap: "There's a Lot of Bacon"
Authors:
Jimmy Moore,
Pascal Goffin,
Jason Wiese,
Miriah Meyer
Abstract:
Personal informatics research helps people track personal data for the purposes of self-reflection and gaining self-knowledge. This field, however, has predominantly focused on the data collection and insight-generation elements of self-tracking, with less attention paid to flexible data analysis. As a result, this inattention has led to inflexible analytic pipelines that do not reflect or support the diverse ways people want to engage with their data. This paper contributes a review of personal informatics and visualization research literature to expose a gap in our knowledge for designing flexible tools that assist people engaging with and analyzing personal data in personal contexts, what we call the personal informatics analysis gap. We explore this gap through a multistage longitudinal study on how asthmatics engage with personal air quality data, and we report how participants: were motivated by broad and diverse goals; exhibited patterns in the way they explored their data; engaged with their data in playful ways; discovered new insights through serendipitous exploration; and were reluctant to use analysis tools on their own. These results present new opportunities for visual analysis research and suggest the need for fundamental shifts in how and what we design when supporting personal data analysis.
Submitted 8 August, 2021;
originally announced August 2021.
-
An interview method for engaging personal data
Authors:
Jimmy Moore,
Pascal Goffin,
Jason Wiese,
Miriah Meyer
Abstract:
Whether investigating research questions or designing systems, many researchers and designers need to engage users with their personal data. However, it is difficult to successfully design user-facing tools for interacting with personal data without first understanding what users want to do with their data. Techniques for raw data exploration, sketching, or physicalization can avoid the perils of tool development, but prevent direct analytical access to users' rich personal data. We present a new method that directly tackles this challenge: the data engagement interview. This interview method incorporates an analyst to provide real-time personal data analysis, granting interview participants the opportunity to directly engage with their data, and interviewers to observe and ask questions throughout this engagement. We describe the method's development through a case study with asthmatic participants, share insights and guidance from our experience, and report a broad set of insights from these interviews.
Submitted 4 November, 2021; v1 submitted 23 July, 2021;
originally announced July 2021.
-
Using system context information to complement weakly labeled data
Authors:
Matthias Meyer,
Michaela Wenner,
Clément Hibert,
Fabian Walter,
Lothar Thiele
Abstract:
Real-world datasets collected with sensor networks often contain incomplete and uncertain labels as well as artefacts arising from the system environment. Complete and reliable labeling is often infeasible for large-scale and long-term sensor network deployments due to the labor and time overhead, limited availability of experts and missing ground truth. In addition, if the machine learning method used for analysis is sensitive to certain features of a deployment, labeling and learning need to be repeated for every new deployment. To address these challenges, we propose to make use of system context information formalized in an information graph and embed it in the learning process via contrastive learning. Based on real-world data we show that this approach leads to increased accuracy in the case of weakly labeled data and to increased robustness and transferability of the classifier to new sensor locations.
Submitted 19 July, 2021;
originally announced July 2021.
-
Plattformen und neue Technologien im Journalismus: Ergebnisse einer Online-Befragung von Journalistinnen und Journalisten in Deutschland
Authors:
Benjamin Rech,
Matthias Meyer
Abstract:
In an online survey in December 2020, 385 journalists in Germany were surveyed about platforms in journalism and about their frequency of use and willingness to adopt emerging technologies. Journalists have a commitment to publish on a journalism platform on a full-time basis. Freelancers have a higher commitment than employed journalists. A platform subscription model is rated more attractive than advertising for a platform. Employed journalists on the other hand consider advertising more attractive than freelance journalists. For German journalists it is important that the platform is developed in Europe or Germany and that it sets high standards on data protection. Multimedia forms and interactive elements are used occasionally, often or always. Stories or Reels are predominantly not used. AI software as well as editorial analytics are rarely or never used. Apart from stories or reels, journalists intend to use multimedia forms and interactive elements more often in the future. They are receptive to software for research process documentation as well as to the analysis of indicators of their own publications. Software as a support for text production, image selection or headline suggestions is mostly rejected.
Submitted 13 May, 2021;
originally announced May 2021.
-
An augmentation strategy to mimic multi-scanner variability in MRI
Authors:
Maria Ines Meyer,
Ezequiel de la Rosa,
Nuno Barros,
Roberto Paolella,
Koen Van Leemput,
Diana M. Sima
Abstract:
Most publicly available brain MRI datasets are very homogeneous in terms of scanner and protocols, and it is difficult for models that learn from such data to generalize to multi-center and multi-scanner data. We propose a novel data augmentation approach with the aim of approximating the variability in terms of intensities and contrasts present in real world clinical data. We use a Gaussian Mixture Model based approach to change tissue intensities individually, producing new contrasts while preserving anatomical information. We train a deep learning model on a single scanner dataset and evaluate it on a multi-center and multi-scanner dataset. The proposed approach improves the generalization capability of the model to other scanners not present in the training data.
Submitted 23 March, 2021;
originally announced March 2021.
-
Photon-Driven Neural Path Guiding
Authors:
Shilin Zhu,
Zexiang Xu,
Tiancheng Sun,
Alexandr Kuznetsov,
Mark Meyer,
Henrik Wann Jensen,
Hao Su,
Ravi Ramamoorthi
Abstract:
Although Monte Carlo path tracing is a simple and effective algorithm to synthesize photo-realistic images, it is often very slow to converge to noise-free results when involving complex global illumination. One of the most successful variance-reduction techniques is path guiding, which can learn better distributions for importance sampling to reduce pixel noise. However, previous methods require a large number of path samples to achieve reliable path guiding. We present a novel neural path guiding approach that can reconstruct high-quality sampling distributions for path guiding from a sparse set of samples, using an offline trained neural network. We leverage photons traced from light sources as the input for sampling density reconstruction, which is highly effective for challenging scenes with strong global illumination. To fully make use of our deep neural network, we partition the scene space into an adaptive hierarchical grid, in which we apply our network to reconstruct high-quality sampling distributions for any local region in the scene. This allows for highly efficient path guiding for any path bounce at any location in path tracing. We demonstrate that our photon-driven neural path guiding method can generalize well on diverse challenging testing scenes that are not seen in training. Our approach achieves significantly better rendering results of testing scenes than previous state-of-the-art path guiding methods.
Submitted 5 October, 2020;
originally announced October 2020.
-
Insights From Experiments With Rigor in an EvoBio Design Study
Authors:
Jen Rogers,
Austin H. Patton,
Luke Harmon,
Alexander Lex,
Miriah Meyer
Abstract:
Design study is an established approach of conducting problem-driven visualization research. The academic visualization community has produced a large body of work for reporting on design studies, informed by a handful of theoretical frameworks, and applied to a broad range of application areas. The result is an abundance of reported insights into visualization design, with an emphasis on novel visualization techniques and systems as the primary contribution of these studies. In recent work we proposed a new, interpretivist perspective on design study and six companion criteria for rigor that highlight the opportunities for researchers to contribute knowledge that extends beyond visualization idioms and software. In this work we conducted a year-long collaboration with evolutionary biologists to develop an interactive tool for visual exploration of multivariate datasets and phylogenetic trees. During this design study we experimented with methods to support three of the rigor criteria: ABUNDANT, REFLEXIVE, and TRANSPARENT. As a result we contribute two novel visualization techniques for the analysis of multivariate phylogenetic datasets, three methodological recommendations for conducting design studies drawn from reflections over our process of experimentation, and two writing devices for reporting interpretivist design study. We offer this work as an example for implementing the rigor criteria to produce a diverse range of knowledge contributions.
Submitted 26 August, 2020;
originally announced August 2020.
-
Text Data Augmentation: Towards better detection of spear-phishing emails
Authors:
Mehdi Regina,
Maxime Meyer,
Sébastien Goutal
Abstract:
Text data augmentation, i.e., the creation of new textual data from an existing text, is challenging. Indeed, augmentation transformations should take into account language complexity while being relevant to the target Natural Language Processing (NLP) task (e.g., Machine Translation, Text Classification). Initially motivated by an application of Business Email Compromise (BEC) detection, we propose a corpus- and task-agnostic augmentation framework used as a service to augment English texts within our company. Our proposal combines different methods, utilizing the BERT language model, multi-step back-translation and heuristics. We show that our augmentation framework improves performance on several text classification tasks using publicly available models and corpora as well as on a BEC detection task. We also provide a comprehensive discussion of the limitations of our augmentation framework.
Submitted 25 March, 2021; v1 submitted 4 July, 2020;
originally announced July 2020.
-
Empowering Active Learning to Jointly Optimize System and User Demands
Authors:
Ji-Ung Lee,
Christian M. Meyer,
Iryna Gurevych
Abstract:
Existing approaches to active learning maximize the system performance by sampling unlabeled instances for annotation that yield the most efficient training. However, when active learning is integrated with an end-user application, this can lead to frustration for participating users, as they spend time labeling instances that they would not otherwise be interested in reading. In this paper, we propose a new active learning approach that jointly optimizes the seemingly counteracting objectives of the active learning system (training efficiently) and the user (receiving useful instances). We study our approach in an educational application, which particularly benefits from this technique as the system needs to rapidly learn to predict the appropriateness of an exercise to a particular user, while the users should receive only exercises that match their skills. We evaluate multiple learning strategies and user types with data from real users and find that our joint approach better satisfies both objectives when alternative methods lead to many unsuitable exercises for end users.
Submitted 11 May, 2020; v1 submitted 9 May, 2020;
originally announced May 2020.
-
Evaluating FPGA Accelerator Performance with a Parameterized OpenCL Adaptation of the HPCChallenge Benchmark Suite
Authors:
Marius Meyer,
Tobias Kenter,
Christian Plessl
Abstract:
FPGAs have found increasing adoption in data center applications since a new generation of high-level tools has become available which noticeably reduce development time for FPGA accelerators and still provide high quality of results. There is, however, no high-level benchmark suite available which specifically enables a comparison of FPGA architectures, programming tools and libraries for HPC applications.
To fill this gap, we have developed an OpenCL-based open source implementation of the HPCC benchmark suite for Xilinx and Intel FPGAs. This benchmark can serve to analyze the current capabilities of FPGA devices, cards and development tool flows, track progress over time and point out specific difficulties for FPGA acceleration in the HPC domain. Additionally, the benchmark documents proven performance optimization patterns. We will continue optimizing and porting the benchmark for new generations of FPGAs and design tools and encourage active participation to create a valuable tool for the community.
Submitted 12 June, 2020; v1 submitted 23 April, 2020;
originally announced April 2020.
-
A low-overhead soft-hard fault-tolerant architecture, design and management scheme for reliable high-performance many-core 3D-NoC systems
Authors:
Khanh N Dang,
Michael Meyer,
Yuichi Okuyama,
Abderazek Ben Abdallah
Abstract:
The Network-on-Chip (NoC) paradigm has been proposed as a favorable solution to handle the strict communication requirements between the increasingly large number of cores on a single chip. However, NoC systems are exposed to the aggressive scaling down of transistors, low operating voltages, and high integration and power densities, making them vulnerable to permanent (hard) faults and transient (soft) errors. A hard fault in a NoC can lead to external blocking, causing congestion across the whole network. A soft error is more challenging because of its silent data corruption, which leads to a large area of erroneous data due to error propagation, packet re-transmission, and deadlock. In this paper, we present the architecture and design of a comprehensive soft error and hard fault-tolerant 3D-NoC system, named 3D-Hard-Fault-Soft-Error-Tolerant-OASIS-NoC (3D-FETO). With the aid of efficient mechanisms and algorithms, 3D-FETO is capable of detecting and recovering from soft errors which occur in the routing pipeline stages and leverages reconfigurable components to handle permanent faults in links, input buffers, and crossbars. In-depth evaluation results show that the 3D-FETO system is able to work around different kinds of hard faults and soft errors, ensuring graceful performance degradation, while minimizing additional hardware complexity and remaining power efficient.
Submitted 21 March, 2020;
originally announced March 2020.
-
Reliability Assessment and Quantitative Evaluation of Soft-Error Resilient 3D Network-on-Chip Systems
Authors:
Khanh N Dang,
Michael Meyer,
Yuichi Okuyama,
Abderazek Ben Abdallah
Abstract:
Three-Dimensional Networks-on-Chips (3D-NoCs) have been proposed as an auspicious solution, merging the high parallelism of the Network-on-Chip (NoC) paradigm with the high-performance and low-power cost of 3D-ICs. However, as technology scales down, the reliability issues are becoming more crucial, especially for complex 3D-NoC which provides the communication requirements of multi and many-core systems-on-chip. Reliability assessment is prominent for early stages of the manufacturing process to prevent costly redesigns of a target system. In this paper, we present an accurate reliability assessment and quantitative evaluation of a soft-error resilient 3D-NoC based on a soft-error resilient mechanism. The system can recover from transient errors occurring in different pipeline stages of the router. Based on this analysis, the effects of failures in the network's principal components are determined.
Submitted 21 March, 2020;
originally announced March 2020.
-
Improved inter-scanner MS lesion segmentation by adversarial training on longitudinal data
Authors:
Mattias Billast,
Maria Ines Meyer,
Diana M. Sima,
David Robben
Abstract:
The evaluation of white matter lesion progression is an important biomarker in the follow-up of MS patients and plays a crucial role when deciding the course of treatment. Current automated lesion segmentation algorithms are susceptible to variability in image characteristics related to MRI scanner or protocol differences. We propose a model that improves the consistency of MS lesion segmentations in inter-scanner studies. First, we train a CNN base model to approximate the performance of icobrain, an FDA-approved clinically available lesion segmentation software. A discriminator model is then trained to predict if two lesion segmentations are based on scans acquired using the same scanner type or not, achieving a 78% accuracy in this task. Finally, the base model and the discriminator are trained adversarially on multi-scanner longitudinal data to improve the inter-scanner consistency of the base model. The performance of the models is evaluated on an unseen dataset containing manual delineations. The inter-scanner variability is evaluated on test-retest data, where the adversarial network produces improved results over the base model and the FDA-approved solution.
Submitted 27 October, 2020; v1 submitted 3 February, 2020;
originally announced February 2020.
-
When is ACL's Deadline? A Scientific Conversational Agent
Authors:
Mohsen Mesgar,
Paul Youssef,
Lin Li,
Dominik Bierwirth,
Yihao Li,
Christian M. Meyer,
Iryna Gurevych
Abstract:
Our conversational agent UKP-ATHENA assists NLP researchers in finding and exploring scientific literature, identifying relevant authors, planning or post-processing conference visits, and preparing paper submissions using a unified interface based on natural language inputs and responses. UKP-ATHENA enables new access paths to our swiftly evolving research area with its massive amounts of scientific information and high turnaround times. UKP-ATHENA's responses connect information from multiple heterogeneous sources which researchers currently have to explore manually one after another. Unlike a search engine, UKP-ATHENA maintains the context of a conversation to allow for efficient information access on papers, researchers, and conferences. Our architecture consists of multiple components with reference implementations that can be easily extended by new skills and domains. Our user-based evaluation shows that UKP-ATHENA already responds to 45% of different formulations of defined intents with a 37% information coverage rate.
Submitted 23 November, 2019;
originally announced November 2019.
-
Relevance Vector Machines for harmonization of MRI brain volumes using image descriptors
Authors:
Maria Ines Meyer,
Ezequiel de la Rosa,
Koen Van Leemput,
Diana M. Sima
Abstract:
With the increased need for multi-center magnetic resonance imaging studies, problems arise related to differences in hardware and software between centers. Namely, current algorithms for brain volume quantification are unreliable for the longitudinal assessment of volume changes in this type of setting. Currently most methods attempt to decrease this issue by regressing the scanner- and/or center-effects from the original data. In this work, we explore a novel approach to harmonize brain volume measurements by using only image descriptors. First, we explore the relationships between volumes and image descriptors. Then, we train a Relevance Vector Machine (RVM) model over a large multi-site dataset of healthy subjects to perform volume harmonization. Finally, we validate the method over two different datasets: i) a subset of unseen healthy controls; and ii) a test-retest dataset of multiple sclerosis (MS) patients. The method decreases scanner and center variability while preserving measurements that did not require correction in MS patient data. We show that image descriptors can be used as input to a machine learning algorithm to improve the reliability of longitudinal volumetric studies.
Submitted 8 November, 2019;
originally announced November 2019.
-
MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
Authors:
Wei Zhao,
Maxime Peyrard,
Fei Liu,
Yang Gao,
Christian M. Meyer,
Steffen Eger
Abstract:
A robust evaluation metric has a profound impact on the development of text generation systems. A desirable metric compares system output against references based on their semantics rather than surface forms. In this paper we investigate strategies to encode system and reference texts to devise a metric that shows a high correlation with human judgment of text quality. We validate our new metric, namely MoverScore, on a number of text generation tasks including summarization, machine translation, image captioning, and data-to-text generation, where the outputs are produced by a variety of neural and non-neural systems. Our findings suggest that metrics combining contextualized representations with a distance measure perform the best. Such metrics also demonstrate strong generalization capability across tasks. For ease of use we make our metrics available as a web service.
Submitted 26 September, 2019; v1 submitted 5 September, 2019;
originally announced September 2019.
-
Better Rewards Yield Better Summaries: Learning to Summarise Without References
Authors:
Florian Böhm,
Yang Gao,
Christian M. Meyer,
Ori Shapira,
Ido Dagan,
Iryna Gurevych
Abstract:
Reinforcement Learning (RL) based document summarisation systems yield state-of-the-art performance in terms of ROUGE scores, because they directly use ROUGE as the rewards during training. However, summaries with high ROUGE scores often receive low human judgement. To find a better reward function that can guide RL to generate human-appealing summaries, we learn a reward function from human ratings on 2,500 summaries. Our reward function only takes the document and system summary as input. Hence, once trained, it can be used to train RL-based summarisation systems without using any reference summaries. We show that our learned rewards have significantly higher correlation with human ratings than previous approaches. Human evaluation experiments show that, compared to the state-of-the-art supervised-learning systems and ROUGE-as-rewards RL summarisation systems, the RL systems using our learned rewards during training generate summaries with higher human ratings. The learned reward function and our source code are available at https://github.com/yg211/summary-reward-no-reference.
Submitted 3 September, 2019;
originally announced September 2019.
-
FAMULUS: Interactive Annotation and Feedback Generation for Teaching Diagnostic Reasoning
Authors:
Jonas Pfeiffer,
Christian M. Meyer,
Claudia Schulz,
Jan Kiesewetter,
Jan Zottmann,
Michael Sailer,
Elisabeth Bauer,
Frank Fischer,
Martin R. Fischer,
Iryna Gurevych
Abstract:
Our proposed system FAMULUS helps students learn to diagnose based on automatic feedback in virtual patient simulations, and it supports instructors in labeling training data.
Diagnosing is an exceptionally difficult skill to obtain but vital for many different professions (e.g., medical doctors, teachers).
Previous case simulation systems are limited to multiple-choice questions and thus cannot give constructive individualized feedback on a student's diagnostic reasoning process.
Given initially only limited data, we leverage a (replaceable) NLP model both to support experts in their further data annotation with automatic suggestions and to provide automatic feedback for students.
We argue that because the central model consistently improves, our interactive approach encourages both students and instructors to recurrently use the tool, and thus accelerates data creation and annotation.
We show results from two user studies on diagnostic reasoning in medicine and teacher education and outline how our system can be extended to further use cases.
Submitted 29 August, 2019;
originally announced August 2019.
-
Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation
Authors:
Yang Gao,
Christian M. Meyer,
Mohsen Mesgar,
Iryna Gurevych
Abstract:
Document summarisation can be formulated as a sequential decision-making problem, which can be solved by Reinforcement Learning (RL) algorithms. The predominant RL paradigm for summarisation learns a cross-input policy, which requires considerable time, data and parameter tuning due to the huge search spaces and the delayed rewards. Learning input-specific RL policies is a more efficient alternative but so far depends on handcrafted rewards, which are difficult to design and yield poor performance. We propose RELIS, a novel RL paradigm that learns a reward function with Learning-to-Rank (L2R) algorithms at training time and uses this reward function to train an input-specific RL policy at test time. We prove that RELIS guarantees to generate near-optimal summaries with appropriate L2R and RL algorithms. Empirically, we evaluate our approach on extractive multi-document summarisation. We show that RELIS reduces the training time by two orders of magnitude compared to the state-of-the-art models while performing on par with them.
Submitted 30 July, 2019;
originally announced July 2019.
-
Criteria for Rigor in Visualization Design Study
Authors:
Miriah Meyer,
Jason Dykes
Abstract:
We develop a new perspective on research conducted through visualization design study that emphasizes design as a method of inquiry and the broad range of knowledge-contributions achieved through it as multiple, subjective, and socially constructed. From this interpretivist position we explore the nature of visualization design study and develop six criteria for rigor. We propose that rigor is established and judged according to the extent to which visualization design study research and its reporting are INFORMED, REFLEXIVE, ABUNDANT, PLAUSIBLE, RESONANT, and TRANSPARENT. This perspective and the criteria were constructed through a four-year engagement with the discourse around rigor and the nature of knowledge in social science, information systems, and design. We suggest methods from cognate disciplines that can support visualization researchers in meeting these criteria during the planning, execution, and reporting of design study. Through a series of deliberately provocative questions, we explore implications of this new perspective for design study research in visualization, concluding that as a discipline, visualization is not yet well positioned to embrace, nurture, and fully benefit from a rigorous, interpretivist approach to design study. The perspective and criteria we present are intended to stimulate dialogue and debate around the nature of visualization design study and the broader underpinnings of the discipline.
Submitted 13 September, 2019; v1 submitted 19 July, 2019;
originally announced July 2019.
-
Manipulating the Difficulty of C-Tests
Authors:
Ji-Ung Lee,
Erik Schwan,
Christian M. Meyer
Abstract:
We propose two novel manipulation strategies for increasing and decreasing the difficulty of C-tests automatically. This is a crucial step towards generating learner-adaptive exercises for self-directed language learning and preparing language assessment tests. To reach the desired difficulty level, we manipulate the size and the distribution of gaps based on absolute and relative gap difficulty predictions. We evaluate our approach in corpus-based experiments and in a user study with 60 participants. We find that both strategies are able to generate C-tests with the desired difficulty level.
Submitted 2 July, 2019; v1 submitted 17 June, 2019;
originally announced June 2019.
-
Preference-based Interactive Multi-Document Summarisation
Authors:
Yang Gao,
Christian M. Meyer,
Iryna Gurevych
Abstract:
Interactive NLP is a promising paradigm to close the gap between automatic NLP systems and the human upper bound. Preference-based interactive learning has been successfully applied, but the existing methods require several thousand interaction rounds even in simulations with perfect user feedback. In this paper, we study preference-based interactive summarisation. To reduce the number of interaction rounds, we propose the Active Preference-based ReInforcement Learning (APRIL) framework. APRIL uses Active Learning to query the user, Preference Learning to learn a summary ranking function from the preferences, and neural Reinforcement Learning to efficiently search for the (near-)optimal summary. Our results show that users can easily provide reliable preferences over summaries and that APRIL outperforms the state-of-the-art preference-based interactive method in both simulation and real-user experiments.
Submitted 7 June, 2019;
originally announced June 2019.
-
Analysis of Automatic Annotation Suggestions for Hard Discourse-Level Tasks in Expert Domains
Authors:
Claudia Schulz,
Christian M. Meyer,
Jan Kiesewetter,
Michael Sailer,
Elisabeth Bauer,
Martin R. Fischer,
Frank Fischer,
Iryna Gurevych
Abstract:
Many complex discourse-level tasks can aid domain experts in their work but require costly expert annotations for data creation. To speed up and ease annotations, we investigate the viability of automatically generated annotation suggestions for such tasks. As an example, we choose a task that is particularly hard for both humans and machines: the segmentation and classification of epistemic activities in diagnostic reasoning texts. We create and publish a new dataset covering two domains and carefully analyse the suggested annotations. We find that suggestions have positive effects on annotation speed and performance, while not introducing noteworthy biases. Envisioning suggestion models that improve with newly annotated texts, we contrast methods for continuous model adjustment and suggest the most effective setup for suggestions in future expert tasks.
Submitted 6 June, 2019;
originally announced June 2019.
-
An Overview of GSMA's M2M Remote Provisioning Specification
Authors:
Maxime Meyer,
Elizabeth A. Quaglia,
Ben Smyth
Abstract:
M2M devices are ubiquitous, and there is a growing tendency to connect such devices to mobile networks. Network operators are investigating new solutions to lower their costs and to address usability issues. Embedded SIM cards with remote provisioning capability are one of the most promising solutions. GSMA, the leading consortium on mobile network standards, has proposed a specification for such an embedded SIM card, called eUICC. The specification describes eUICC architecture and a remote provisioning mechanism. Embodiments of this specification have the potential to disrupt the telecommunications market: eUICCs will be shipped to device manufacturers and then remotely provisioned with a subscription, whereas (currently) SIMs must be provisioned prior to shipping. In this article, we present a comprehensive overview of GSMA's specification and its motivation. In particular, we describe the technology and the protocols involved in remote provisioning.
Submitted 5 June, 2019;
originally announced June 2019.
-
Origraph: Interactive Network Wrangling
Authors:
Alex Bigelow,
Carolina Nobre,
Miriah Meyer,
Alexander Lex
Abstract:
Networks are a natural way of thinking about many datasets. The data on which a network is based, however, is rarely collected in a form that suits the analysis process, making it necessary to create and reshape networks. Data wrangling is widely acknowledged to be a critical part of the data analysis pipeline, yet interactive network wrangling has received little attention in the visualization research community. In this paper, we discuss a set of operations that are important for wrangling network datasets and introduce a visual data wrangling tool, Origraph, that enables analysts to apply these operations to their datasets. Key operations include creating a network from source data such as tables, reshaping a network by introducing new node or edge classes, filtering nodes or edges, and deriving new node or edge attributes. Our tool, Origraph, enables analysts to execute these operations with little to no programming, and to immediately visualize the results. Origraph provides views to investigate the network model, a sample of the network, and node and edge attributes. In addition, we introduce interfaces designed to aid analysts in specifying arguments for sensible network wrangling operations. We demonstrate the usefulness of Origraph in two Use Cases: first, we investigate gender bias in the film industry, and then the influence of money on the political support for the war in Yemen.
Submitted 19 July, 2019; v1 submitted 15 December, 2018;
originally announced December 2018.
-
Challenges in the Automatic Analysis of Students' Diagnostic Reasoning
Authors:
Claudia Schulz,
Christian M. Meyer,
Michael Sailer,
Jan Kiesewetter,
Elisabeth Bauer,
Frank Fischer,
Martin R. Fischer,
Iryna Gurevych
Abstract:
Diagnostic reasoning is a key component of many professions. To improve students' diagnostic reasoning skills, educational psychologists analyse and give feedback on epistemic activities used by these students while diagnosing, in particular, hypothesis generation, evidence generation, evidence evaluation, and drawing conclusions. However, this manual analysis is highly time-consuming. We aim to enable the large-scale adoption of diagnostic reasoning analysis and feedback by automating the epistemic activity identification. We create the first corpus for this task, comprising diagnostic reasoning self-explanations of students from two domains annotated with epistemic activities. Based on insights from the corpus creation and the task's characteristics, we discuss three challenges for the automatic identification of epistemic activities using AI methods: the correct identification of epistemic activity spans, the reliable distinction of similar epistemic activities, and the detection of overlapping epistemic activities. We propose a separate performance metric for each challenge and thus provide an evaluation framework for future research. Indeed, our evaluation of various state-of-the-art recurrent neural network architectures reveals that current techniques fail to address some of these challenges.
Submitted 26 November, 2018;
originally announced November 2018.
-
Event-triggered Natural Hazard Monitoring with Convolutional Neural Networks on the Edge
Authors:
Matthias Meyer,
Timo Farei-Campagna,
Akos Pasztor,
Reto Da Forno,
Tonio Gsell,
Jérome Faillettaz,
Andreas Vieli,
Samuel Weber,
Jan Beutel,
Lothar Thiele
Abstract:
In natural hazard warning systems fast decision making is vital to avoid catastrophes. Decision making at the edge of a wireless sensor network promises fast response times but is limited by the availability of energy, data transfer speed, processing and memory constraints. In this work we present a realization of a wireless sensor network for hazard monitoring based on an array of event-triggered single-channel micro-seismic sensors with advanced signal processing and characterization capabilities based on a novel co-detection technique. On the one hand we leverage an ultra-low power, threshold-triggering circuit paired with on-demand digital signal acquisition capable of extracting relevant information exactly and efficiently at times when it matters most and consequently not wasting precious resources when nothing can be observed. On the other hand we utilize machine-learning-based classification implemented on low-power, off-the-shelf microcontrollers to avoid false positive warnings and to actively identify humans in hazard zones. The sensors' response time and memory requirement are substantially improved by quantizing and pipelining the inference of a convolutional neural network. In this way, convolutional neural networks that would not run unmodified on a memory constrained device can be executed in real-time and at scale on low-power embedded devices. A field study with our system is running on the rockfall scarp of the Matterhorn Hörnligrat at 3500 m a.s.l. since 08/2018.
Submitted 1 March, 2019; v1 submitted 22 October, 2018;
originally announced October 2018.
-
Reflection On Reflection In Design Study
Authors:
Jason Dykes,
Miriah Meyer
Abstract:
Visualization design study research methodologies emphasize the need for reflection to generate knowledge. And yet, there is very little guidance in the literature specifying what reflection in the context of design studies actually involves. We initiated a community discussion on this topic through a panel at the 2017 IEEE VIS Conference - this report documents the panel discussion. We analyze the panel content through the lens of our own reflective experiences and propose several priorities for ongoing thinking on reflection in applied visualization research.
Submitted 25 September, 2018;
originally announced September 2018.
-
APRIL: Interactively Learning to Summarise by Combining Active Preference Learning and Reinforcement Learning
Authors:
Yang Gao,
Christian M. Meyer,
Iryna Gurevych
Abstract:
We propose a method to perform automatic document summarisation without using reference summaries. Instead, our method interactively learns from users' preferences. The merit of preference-based interactive summarisation is that preferences are easier for users to provide than reference summaries. Existing preference-based interactive learning methods suffer from high sample complexity, i.e. they need to interact with the oracle for many rounds in order to converge. In this work, we propose a new objective function, which enables us to leverage active learning, preference learning and reinforcement learning techniques in order to reduce the sample complexity. Both simulation and real-user experiments suggest that our method significantly advances the state of the art. Our source code is freely available at https://github.com/UKPLab/emnlp2018-april.
Submitted 29 August, 2018;
originally announced August 2018.
-
A Framework for Creative-Visualization Opportunities Workshops
Authors:
Ethan Kerzner,
Sarah Goodwin,
Jason Dykes,
Sara Jones,
Miriah Meyer
Abstract:
Applied visualization researchers often work closely with domain collaborators to explore new and useful applications of visualization. The early stages of collaborations are typically time consuming for all stakeholders as researchers piece together an understanding of domain challenges from disparate discussions and meetings. A number of recent projects, however, report on the use of creative visualization-opportunities (CVO) workshops to accelerate the early stages of applied work, eliciting a wealth of requirements in a few days of focused work. Yet, there is no established guidance for how to use such workshops effectively. In this paper, we present the results of a 2-year collaboration in which we analyzed the use of 17 workshops in 10 visualization contexts. Its primary contribution is a framework for CVO workshops that: 1) identifies a process model for using workshops; 2) describes a structure of what happens within effective workshops; 3) recommends 25 actionable guidelines for future workshops; and 4) presents an example workshop and workshop methods. The creation of this framework exemplifies the use of critical reflection to learn about visualization in practice from diverse studies and experience.
Submitted 2 August, 2018;
originally announced August 2018.
-
A Retrospective Analysis of the Fake News Challenge Stance Detection Task
Authors:
Andreas Hanselowski,
Avinesh PVS,
Benjamin Schiller,
Felix Caspelherr,
Debanjan Chaudhuri,
Christian M. Meyer,
Iryna Gurevych
Abstract:
The 2017 Fake News Challenge Stage 1 (FNC-1) shared task addressed a stance classification task as a crucial first step towards detecting fake news. To date, there is no in-depth analysis paper to critically discuss FNC-1's experimental setup, reproduce the results, and draw conclusions for next-generation stance classification methods. In this paper, we provide such an in-depth analysis for the three top-performing systems. We first find that FNC-1's proposed evaluation metric favors the majority class, which can be easily classified, and thus overestimates the true discriminative power of the methods. Therefore, we propose a new F1-based metric yielding a changed system ranking. Next, we compare the features and architectures used, which leads to a novel feature-rich stacked LSTM model that performs on par with the best systems, but is superior in predicting minority classes. To understand the methods' ability to generalize, we derive a new dataset and perform both in-domain and cross-domain experiments. Our qualitative and quantitative study helps interpret the original FNC-1 scores and understand which features help improve performance and why. Our new dataset and all source code used during the reproduction study are publicly available for future research.
Submitted 13 June, 2018;
originally announced June 2018.
-
Live Blog Corpus for Summarization
Authors:
Avinesh P. V. S.,
Maxime Peyrard,
Christian M. Meyer
Abstract:
Live blogs are an increasingly popular news format to cover breaking news and live events in online journalism. Online news websites around the world are using this medium to give their readers a minute by minute update on an event. Good summaries enhance the value of the live blogs for a reader but are often not available. In this paper, we study a way of collecting corpora for automatic live blog summarization. In an empirical evaluation using well-known state-of-the-art summarization systems, we show that the live blog corpus poses new challenges in the field of summarization. We make our tools publicly available to reconstruct the corpus to encourage the research community and replicate our results.
Submitted 27 February, 2018;
originally announced February 2018.
-
Unsupervised Feature Learning for Audio Analysis
Authors:
Matthias Meyer,
Jan Beutel,
Lothar Thiele
Abstract:
Identifying acoustic events from a continuously streaming audio source is of interest for many applications including environmental monitoring for basic research. In this scenario neither different event classes are known nor what distinguishes one class from another. Therefore, an unsupervised feature learning method for exploration of audio data is presented in this paper. It incorporates the two following novel contributions: First, an audio frame predictor based on a Convolutional LSTM autoencoder is demonstrated, which is used for unsupervised feature extraction. Second, a training method for autoencoders is presented, which leads to distinct features by amplifying event similarities. In comparison to standard approaches, the features extracted from the audio frame predictor trained with the novel approach show 13 % better results when used with a classifier and 36 % better results when used for clustering.
Submitted 11 December, 2017;
originally announced December 2017.
-
Efficient Convolutional Neural Network For Audio Event Detection
Authors:
Matthias Meyer,
Lukas Cavigelli,
Lothar Thiele
Abstract:
Wireless distributed systems as used in sensor networks, Internet-of-Things and cyber-physical systems, impose high requirements on resource efficiency. Advanced preprocessing and classification of data at the network edge can help to decrease the communication demand and to reduce the amount of data to be processed centrally. In the area of distributed acoustic sensing, the combination of algorithms with a high classification rate and resource-constrained embedded systems is essential. Unfortunately, algorithms for acoustic event detection have a high memory and computational demand and are not suited for execution at the network edge. This paper addresses these aspects by applying structural optimizations to a convolutional neural network for audio event detection to reduce the memory requirement by a factor of more than 500 and the computational effort by a factor of 2.1 while performing 9.2% better.
Submitted 28 September, 2017;
originally announced September 2017.