-
Standardized Data-Parallel Rendering Using ANARI
Authors:
Ingo Wald,
Stefan Zellmann,
Jefferson Amstutz,
Qi Wu,
Kevin Griffin,
Milan Jaros,
Stefan Wesner
Abstract:
We propose and discuss a paradigm that allows for expressing \emph{data-parallel} rendering with the classically non-parallel ANARI API. We propose this as a new standard for data-parallel sci-vis rendering, describe two different implementations of this paradigm, and use multiple sample integrations into existing apps to show how easy it is to adopt this paradigm, and what can be gained from doin…
▽ More
We propose and discuss a paradigm that allows for expressing \emph{data-parallel} rendering with the classically non-parallel ANARI API. We propose this as a new standard for data-parallel sci-vis rendering, describe two different implementations of this paradigm, and use multiple sample integrations into existing apps to show how easy it is to adopt this paradigm, and what can be gained from doing so.
△ Less
Submitted 28 June, 2024;
originally announced July 2024.
-
Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency
Authors:
Leonidas Gee,
Milan Gritta,
Gerasimos Lampouras,
Ignacio Iacobacci
Abstract:
Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self…
▽ More
Code Language Models have been trained to generate accurate solutions, typically with no regard for runtime. On the other hand, previous works that explored execution optimisation have observed corresponding drops in functional correctness. To that end, we introduce Code-Optimise, a framework that incorporates both correctness (passed, failed) and runtime (quick, slow) as learning signals via self-generated preference data. Our framework is both lightweight and robust as it dynamically selects solutions to reduce overfitting while avoiding a reliance on larger models for learning signals. Code-Optimise achieves significant improvements in pass@k while decreasing the competitive baseline runtimes by an additional 6% for in-domain data and up to 3% for out-of-domain data. As a byproduct, the average length of the generated solutions is reduced by up to 48% on MBPP and 23% on HumanEval, resulting in faster and cheaper inference. The generated data and codebase will be open-sourced at www.open-source.link.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
Open-Source Web Service with Morphological Dictionary-Supplemented Deep Learning for Morphosyntactic Analysis of Czech
Authors:
Milan Straka,
Jana Straková
Abstract:
We present an open-source web service for Czech morphosyntactic analysis. The system combines a deep learning model with rescoring by a high-precision morphological dictionary at inference time. We show that our hybrid method surpasses two competitive baselines: While the deep learning model ensures generalization for out-of-vocabulary words and better disambiguation, an improvement over an existi…
▽ More
We present an open-source web service for Czech morphosyntactic analysis. The system combines a deep learning model with rescoring by a high-precision morphological dictionary at inference time. We show that our hybrid method surpasses two competitive baselines: While the deep learning model ensures generalization for out-of-vocabulary words and better disambiguation, an improvement over an existing morphological analyser MorphoDiTa, at the same time, the deep learning model benefits from inference-time guidance of a manually curated morphological dictionary. We achieve 50% error reduction in lemmatization and 58% error reduction in POS tagging over MorphoDiTa, while also offering dependency parsing. The model is trained on one of the currently largest Czech morphosyntactic corpora, the PDT-C 1.0, with the trained models available at https://hdl.handle.net/11234/1-5293. We provide the tool as a web service deployed at https://lindat.mff.cuni.cz/services/udpipe/. The source code is available at GitHub (https://github.com/ufal/udpipe/tree/udpipe-2), along with a Python client for a simple use. The documentation for the models can be found at https://ufal.mff.cuni.cz/udpipe/2/models#czech_pdtc1.0_model.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
CWRCzech: 100M Query-Document Czech Click Dataset and Its Application to Web Relevance Ranking
Authors:
Josef Vonášek,
Milan Straka,
Rostislav Krč,
Lenka Lasoňová,
Ekaterina Egorova,
Jana Straková,
Jakub Náplava
Abstract:
We present CWRCzech, Click Web Ranking dataset for Czech, a 100M query-document Czech click dataset for relevance ranking with user behavior data collected from search engine logs of Seznam.cz. To the best of our knowledge, CWRCzech is the largest click dataset with raw text published so far. It provides document positions in the search results as well as information about user behavior: 27.6M cli…
▽ More
We present CWRCzech, Click Web Ranking dataset for Czech, a 100M query-document Czech click dataset for relevance ranking with user behavior data collected from search engine logs of Seznam.cz. To the best of our knowledge, CWRCzech is the largest click dataset with raw text published so far. It provides document positions in the search results as well as information about user behavior: 27.6M clicked documents and 10.8M dwell times. In addition, we also publish a manually annotated Czech test for the relevance task, containing nearly 50k query-document pairs, each annotated by at least 2 annotators. Finally, we analyze how the user behavior data improve relevance ranking and show that models trained on data automatically harnessed at sufficient scale can surpass the performance of models trained on human annotated data. CWRCzech is published under an academic non-commercial license and is available to the research community at https://github.com/seznam/CWRCzech.
△ Less
Submitted 31 May, 2024;
originally announced May 2024.
-
Formally Verifying Deep Reinforcement Learning Controllers with Lyapunov Barrier Certificates
Authors:
Udayan Mandal,
Guy Amir,
Haoze Wu,
Ieva Daukantas,
Fletcher Lee Newell,
Umberto J. Ravaioli,
Baoluo Meng,
Michael Durling,
Milan Ganai,
Tobey Shim,
Guy Katz,
Clark Barrett
Abstract:
Deep reinforcement learning (DRL) is a powerful machine learning paradigm for generating agents that control autonomous systems. However, the "black box" nature of DRL agents limits their deployment in real-world safety-critical applications. A promising approach for providing strong guarantees on an agent's behavior is to use Neural Lyapunov Barrier (NLB) certificates, which are learned functions…
▽ More
Deep reinforcement learning (DRL) is a powerful machine learning paradigm for generating agents that control autonomous systems. However, the "black box" nature of DRL agents limits their deployment in real-world safety-critical applications. A promising approach for providing strong guarantees on an agent's behavior is to use Neural Lyapunov Barrier (NLB) certificates, which are learned functions over the system whose properties indirectly imply that an agent behaves as desired. However, NLB-based certificates are typically difficult to learn and even more difficult to verify, especially for complex systems. In this work, we present a novel method for training and verifying NLB-based certificates for discrete-time systems. Specifically, we introduce a technique for certificate composition, which simplifies the verification of highly-complex systems by strategically designing a sequence of certificates. When jointly verified with neural network verification engines, these certificates provide a formal guarantee that a DRL agent both achieves its goals and avoids unsafe behavior. Furthermore, we introduce a technique for certificate filtering, which significantly simplifies the process of producing formally verified certificates. We demonstrate the merits of our approach with a case study on providing safety and liveness guarantees for a DRL-controlled spacecraft.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Tools at the Frontiers of Quantitative Verification
Authors:
Roman Andriushchenko,
Alexander Bork,
Carlos E. Budde,
Milan Češka,
Kush Grover,
Ernst Moritz Hahn,
Arnd Hartmanns,
Bryant Israelsen,
Nils Jansen,
Joshua Jeppson,
Sebastian Junges,
Maximilian A. Köhl,
Bettina Könighofer,
Jan Křetínský,
Tobias Meggendorfer,
David Parker,
Stefan Pranger,
Tim Quatmann,
Enno Ruijters,
Landon Taylor,
Matthias Volk,
Maximilian Weininger,
Zhen Zhang
Abstract:
The analysis of formal models that include quantitative aspects such as timing or probabilistic choices is performed by quantitative verification tools. Broad and mature tool support is available for computing basic properties such as expected rewards on basic models such as Markov chains. Previous editions of QComp, the comparison of tools for the analysis of quantitative formal models, focused o…
▽ More
The analysis of formal models that include quantitative aspects such as timing or probabilistic choices is performed by quantitative verification tools. Broad and mature tool support is available for computing basic properties such as expected rewards on basic models such as Markov chains. Previous editions of QComp, the comparison of tools for the analysis of quantitative formal models, focused on this setting. Many application scenarios, however, require more advanced property types such as LTL and parameter synthesis queries as well as advanced models like stochastic games and partially observable MDPs. For these, tool support is in its infancy today. This paper presents the outcomes of QComp 2023: a survey of the state of the art in quantitative verification tool support for advanced property types and models. With tools ranging from first research prototypes to well-supported integrations into established toolsets, this report highlights today's active areas and tomorrow's challenges in tool-focused research for quantitative verification.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Mitigating Text Toxicity with Counterfactual Generation
Authors:
Milan Bhan,
Jean-Noel Vittaut,
Nina Achache,
Victor Legrand,
Nicolas Chesneau,
Annabelle Blangero,
Juliette Murris,
Marie-Jeanne Lesot
Abstract:
Toxicity mitigation consists in rephrasing text in order to remove offensive or harmful meaning. Neural natural language processing (NLP) models have been widely used to target and mitigate textual toxicity. However, existing methods fail to detoxify text while preserving the initial non-toxic meaning at the same time. In this work, we propose to apply counterfactual generation methods from the eX…
▽ More
Toxicity mitigation consists in rephrasing text in order to remove offensive or harmful meaning. Neural natural language processing (NLP) models have been widely used to target and mitigate textual toxicity. However, existing methods fail to detoxify text while preserving the initial non-toxic meaning at the same time. In this work, we propose to apply counterfactual generation methods from the eXplainable AI (XAI) field to target and mitigate textual toxicity. In particular, we perform text detoxification by applying local feature importance and counterfactual generation methods to a toxicity classifier distinguishing between toxic and non-toxic texts. We carry out text detoxification through counterfactual generation on three datasets and compare our approach to three competitors. Automatic and human evaluations show that recently developed NLP counterfactual generators can mitigate toxicity accurately while better preserving the meaning of the initial text as compared to classical detoxification methods. Finally, we take a step back from using automated detoxification tools, and discuss how to manage the polysemous nature of toxicity and the risk of malicious use of detoxification tools. This work is the first to bridge the gap between counterfactual generation and text detoxification and paves the way towards more practical application of XAI methods.
△ Less
Submitted 16 May, 2024;
originally announced May 2024.
-
HumanRankEval: Automatic Evaluation of LMs as Conversational Assistants
Authors:
Milan Gritta,
Gerasimos Lampouras,
Ignacio Iacobacci
Abstract:
Language models (LMs) as conversational assistants recently became popular tools that help people accomplish a variety of tasks. These typically result from adapting LMs pretrained on general domain text sequences through further instruction-tuning and possibly preference optimisation methods. The evaluation of such LMs would ideally be performed using human judgement, however, this is not scalabl…
▽ More
Language models (LMs) as conversational assistants recently became popular tools that help people accomplish a variety of tasks. These typically result from adapting LMs pretrained on general domain text sequences through further instruction-tuning and possibly preference optimisation methods. The evaluation of such LMs would ideally be performed using human judgement, however, this is not scalable. On the other hand, automatic evaluation featuring auxiliary LMs as judges and/or knowledge-based tasks is scalable but struggles with assessing conversational ability and adherence to instructions. To help accelerate the development of LMs as conversational assistants, we propose a novel automatic evaluation task: HumanRankEval (HRE). It consists of a large-scale, diverse and high-quality set of questions, each with several answers authored and scored by humans. To perform evaluation, HRE ranks these answers based on their log-likelihood under the LM's distribution, and subsequently calculates their correlation with the corresponding human rankings. We support HRE's efficacy by investigating how efficiently it separates pretrained and instruction-tuned LMs of various sizes. We show that HRE correlates well with human judgements and is particularly responsive to model changes following instruction-tuning.
△ Less
Submitted 15 May, 2024;
originally announced May 2024.
-
On Probabilistic and Causal Reasoning with Summation Operators
Authors:
Duligur Ibeling,
Thomas F. Icard,
Milan Mossé
Abstract:
Ibeling et al. (2023). axiomatize increasingly expressive languages of causation and probability, and Mosse et al. (2024) show that reasoning (specifically the satisfiability problem) in each causal language is as difficult, from a computational complexity perspective, as reasoning in its merely probabilistic or "correlational" counterpart. Introducing a summation operator to capture common device…
▽ More
Ibeling et al. (2023). axiomatize increasingly expressive languages of causation and probability, and Mosse et al. (2024) show that reasoning (specifically the satisfiability problem) in each causal language is as difficult, from a computational complexity perspective, as reasoning in its merely probabilistic or "correlational" counterpart. Introducing a summation operator to capture common devices that appear in applications -- such as the $do$-calculus of Pearl (2009) for causal inference, which makes ample use of marginalization -- van der Zander et al. (2023) partially extend these earlier complexity results to causal and probabilistic languages with marginalization. We complete this extension, fully characterizing the complexity of probabilistic and causal reasoning with summation, demonstrating that these again remain equally difficult. Surprisingly, allowing free variables for random variable values results in a system that is undecidable, so long as the ranges of these random variables are unrestricted. We finally axiomatize these languages featuring marginalization (or more generally summation), resolving open questions posed by Ibeling et al. (2023).
△ Less
Submitted 18 May, 2024; v1 submitted 5 May, 2024;
originally announced May 2024.
-
PhilHumans: Benchmarking Machine Learning for Personal Health
Authors:
Vadim Liventsev,
Vivek Kumar,
Allmin Pradhap Singh Susaiyah,
Zixiu Wu,
Ivan Rodin,
Asfand Yaar,
Simone Balloccu,
Marharyta Beraziuk,
Sebastiano Battiato,
Giovanni Maria Farinella,
Aki Härmä,
Rim Helaoui,
Milan Petkovic,
Diego Reforgiato Recupero,
Ehud Reiter,
Daniele Riboni,
Raymond Sterling
Abstract:
The use of machine learning in Healthcare has the potential to improve patient outcomes as well as broaden the reach and affordability of Healthcare. The history of other application areas indicates that strong benchmarks are essential for the development of intelligent systems. We present Personal Health Interfaces Leveraging HUman-MAchine Natural interactions (PhilHumans), a holistic suite of be…
▽ More
The use of machine learning in Healthcare has the potential to improve patient outcomes as well as broaden the reach and affordability of Healthcare. The history of other application areas indicates that strong benchmarks are essential for the development of intelligent systems. We present Personal Health Interfaces Leveraging HUman-MAchine Natural interactions (PhilHumans), a holistic suite of benchmarks for machine learning across different Healthcare settings - talk therapy, diet coaching, emergency care, intensive care, obstetric sonography - as well as different learning settings, such as action anticipation, timeseries modeling, insight mining, language modeling, computer vision, reinforcement learning and program synthesis
△ Less
Submitted 16 May, 2024; v1 submitted 4 May, 2024;
originally announced May 2024.
-
An Adaptive Approach for Infinitely Many-armed Bandits under Generalized Rotting Constraints
Authors:
Jung-hun Kim,
Milan Vojnovic,
Se-Young Yun
Abstract:
In this study, we consider the infinitely many-armed bandit problems in a rested rotting setting, where the mean reward of an arm may decrease with each pull, while otherwise, it remains unchanged. We explore two scenarios regarding the rotting of rewards: one in which the cumulative amount of rotting is bounded by $V_T$, referred to as the slow-rotting case, and the other in which the cumulative…
▽ More
In this study, we consider the infinitely many-armed bandit problems in a rested rotting setting, where the mean reward of an arm may decrease with each pull, while otherwise, it remains unchanged. We explore two scenarios regarding the rotting of rewards: one in which the cumulative amount of rotting is bounded by $V_T$, referred to as the slow-rotting case, and the other in which the cumulative number of rotting instances is bounded by $S_T$, referred to as the abrupt-rotting case. To address the challenge posed by rotting rewards, we introduce an algorithm that utilizes UCB with an adaptive sliding window, designed to manage the bias and variance trade-off arising due to rotting rewards. Our proposed algorithm achieves tight regret bounds for both slow and abrupt rotting scenarios. Lastly, we demonstrate the performance of our algorithm using numerical experiments.
△ Less
Submitted 24 May, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
A multi-robot system for the detection of explosive devices
Authors:
Ken Hasselmann,
Mario Malizia,
Rafael Caballero,
Fabio Polisano,
Shashank Govindaraj,
Jakob Stigler,
Oleksii Ilchenko,
Milan Bajic,
Geert De Cubber
Abstract:
In order to clear the world of the threat posed by landmines and other explosive devices, robotic systems can play an important role. However, the development of such field robots that need to operate in hazardous conditions requires the careful consideration of multiple aspects related to the perception, mobility, and collaboration capabilities of the system. In the framework of a European challe…
▽ More
In order to clear the world of the threat posed by landmines and other explosive devices, robotic systems can play an important role. However, the development of such field robots that need to operate in hazardous conditions requires the careful consideration of multiple aspects related to the perception, mobility, and collaboration capabilities of the system. In the framework of a European challenge, the Artificial Intelligence for Detection of Explosive Devices - eXtended (AIDEDeX) project proposes to design a heterogeneous multi-robot system with advanced sensor fusion algorithms. This system is specifically designed to detect and classify improvised explosive devices, explosive ordnances, and landmines. This project integrates specialised sensors, including electromagnetic induction, ground penetrating radar, X-Ray backscatter imaging, Raman spectrometers, and multimodal cameras, to achieve comprehensive threat identification and localisation. The proposed system comprises a fleet of unmanned ground vehicles and unmanned aerial vehicles. This article details the operational phases of the AIDEDeX system, from rapid terrain exploration using unmanned aerial vehicles to specialised detection and classification by unmanned ground vehicles equipped with a robotic manipulator. Initially focusing on a centralised approach, the project will also explore the potential of a decentralised control architecture, taking inspiration from swarm robotics to provide a robust, adaptable, and scalable solution for explosive detection.
△ Less
Submitted 22 April, 2024;
originally announced April 2024.
-
Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
Authors:
Vincent Conitzer,
Rachel Freedman,
Jobst Heitzig,
Wesley H. Holliday,
Bob M. Jacobs,
Nathan Lambert,
Milan Mossé,
Eric Pacuit,
Stuart Russell,
Hailey Schoelkopf,
Emanuel Tewolde,
William S. Zwicker
Abstract:
Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, such as hel** to commit crimes or producing racist text. One approach to fine-tuning, called reinforcement learning from human feedback, learns from humans' expressed preferences over multiple outputs. Another approach is constitutional AI, in which the input from humans is a list of high-level prin…
▽ More
Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, such as hel** to commit crimes or producing racist text. One approach to fine-tuning, called reinforcement learning from human feedback, learns from humans' expressed preferences over multiple outputs. Another approach is constitutional AI, in which the input from humans is a list of high-level principles. But how do we deal with potentially diverging input from humans? How can we aggregate the input into consistent data about "collective" preferences or otherwise use it to make collective choices about model behavior? In this paper, we argue that the field of social choice is well positioned to address these questions, and we discuss ways forward for this agenda, drawing on discussions in a recent workshop on Social Choice for AI Ethics and Safety held in Berkeley, CA, USA in December 2023.
△ Less
Submitted 4 June, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
ÚFAL LatinPipe at EvaLatin 2024: Morphosyntactic Analysis of Latin
Authors:
Milan Straka,
Jana Straková,
Federica Gamba
Abstract:
We present LatinPipe, the winning submission to the EvaLatin 2024 Dependency Parsing shared task. Our system consists of a fine-tuned concatenation of base and large pre-trained LMs, with a dot-product attention head for parsing and softmax classification heads for morphology to jointly learn both dependency parsing and morphological analysis. It is trained by sampling from seven publicly availabl…
▽ More
We present LatinPipe, the winning submission to the EvaLatin 2024 Dependency Parsing shared task. Our system consists of a fine-tuned concatenation of base and large pre-trained LMs, with a dot-product attention head for parsing and softmax classification heads for morphology to jointly learn both dependency parsing and morphological analysis. It is trained by sampling from seven publicly available Latin corpora, utilizing additional harmonization of annotations to achieve a more unified annotation style. Before fine-tuning, we train the system for a few initial epochs with frozen weights. We also add additional local relative contextualization by stacking the BiLSTM layers on top of the Transformer(s). Finally, we ensemble output probability distributions from seven randomly instantiated networks for the final submission. The code is available at https://github.com/ufal/evalatin2024-latinpipe.
△ Less
Submitted 29 May, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
From News to Summaries: Building a Hungarian Corpus for Extractive and Abstractive Summarization
Authors:
Botond Barta,
Dorina Lakatos,
Attila Nagy,
Milán Konor Nyist,
Judit Ács
Abstract:
Training summarization models requires substantial amounts of training data. However for less resourceful languages like Hungarian, openly available models and datasets are notably scarce. To address this gap our paper introduces HunSum-2 an open-source Hungarian corpus suitable for training abstractive and extractive summarization models. The dataset is assembled from segments of the Common Crawl…
▽ More
Training summarization models requires substantial amounts of training data. However for less resourceful languages like Hungarian, openly available models and datasets are notably scarce. To address this gap our paper introduces HunSum-2 an open-source Hungarian corpus suitable for training abstractive and extractive summarization models. The dataset is assembled from segments of the Common Crawl corpus undergoing thorough cleaning, preprocessing and deduplication. In addition to abstractive summarization we generate sentence-level labels for extractive summarization using sentence similarity. We train baseline models for both extractive and abstractive summarization using the collected dataset. To demonstrate the effectiveness of the trained models, we perform both quantitative and qualitative evaluation. Our dataset, models and code are publicly available, encouraging replication, further research, and real-world applications across various domains.
△ Less
Submitted 12 April, 2024; v1 submitted 4 April, 2024;
originally announced April 2024.
-
A Survey on Error-Bounded Lossy Compression for Scientific Datasets
Authors:
Sheng Di,
**yang Liu,
Kai Zhao,
Xin Liang,
Robert Underwood,
Zhaorui Zhang,
Milan Shah,
Yafan Huang,
Jiajun Huang,
Xiaodong Yu,
Congrong Ren,
Hanqi Guo,
Grant Wilkins,
Dingwen Tao,
Jiannan Tian,
Sian **,
Zizhe Jian,
Daoce Wang,
MD Hasanur Rahman,
Boyuan Zhang,
Jon C. Calhoun,
Guanpeng Li,
Kazutomo Yoshii,
Khalid Ayed Alharthi,
Franck Cappello
Abstract:
Error-bounded lossy compression has been effective in significantly reducing the data storage/transfer burden while preserving the reconstructed data fidelity very well. Many error-bounded lossy compressors have been developed for a wide range of parallel and distributed use cases for years. These lossy compressors are designed with distinct compression models and design principles, such that each…
▽ More
Error-bounded lossy compression has been effective in significantly reducing the data storage/transfer burden while preserving the reconstructed data fidelity very well. Many error-bounded lossy compressors have been developed for a wide range of parallel and distributed use cases for years. These lossy compressors are designed with distinct compression models and design principles, such that each of them features particular pros and cons. In this paper we provide a comprehensive survey of emerging error-bounded lossy compression techniques for different use cases each involving big data to process. The key contribution is fourfold. (1) We summarize an insightful taxonomy of lossy compression into 6 classic compression models. (2) We provide a comprehensive survey of 10+ commonly used compression components/modules used in error-bounded lossy compressors. (3) We provide a comprehensive survey of 10+ state-of-the-art error-bounded lossy compressors as well as how they combine the various compression modules in their designs. (4) We provide a comprehensive survey of the lossy compression for 10+ modern scientific applications and use-cases. We believe this survey is useful to multiple communities including scientific applications, high-performance computing, lossy compression, and big data.
△ Less
Submitted 3 April, 2024;
originally announced April 2024.
-
Adversary-Robust Graph-Based Learning of WSIs
Authors:
Saba Heidari Gheshlaghi,
Milan Aryal,
Nasim Yahyasoltani,
Masoud Ganji
Abstract:
Enhancing the robustness of deep learning models against adversarial attacks is crucial, especially in critical domains like healthcare where significant financial interests heighten the risk of such attacks. Whole slide images (WSIs) are high-resolution, digitized versions of tissue samples mounted on glass slides, scanned using sophisticated imaging equipment. The digital analysis of WSIs presen…
▽ More
Enhancing the robustness of deep learning models against adversarial attacks is crucial, especially in critical domains like healthcare where significant financial interests heighten the risk of such attacks. Whole slide images (WSIs) are high-resolution, digitized versions of tissue samples mounted on glass slides, scanned using sophisticated imaging equipment. The digital analysis of WSIs presents unique challenges due to their gigapixel size and multi-resolution storage format. In this work, we aim at improving the robustness of cancer Gleason grading classification systems against adversarial attacks, addressing challenges at both the image and graph levels. As regards the proposed algorithm, we develop a novel and innovative graph-based model which utilizes GNN to extract features from the graph representation of WSIs. A denoising module, along with a pooling layer is incorporated to manage the impact of adversarial attacks on the WSIs. The process concludes with a transformer module that classifies various grades of prostate cancer based on the processed data. To assess the effectiveness of the proposed method, we conducted a comparative analysis using two scenarios. Initially, we trained and tested the model without the denoiser using WSIs that had not been exposed to any attack. We then introduced a range of attacks at either the image or graph level and processed them through the proposed network. The performance of the model was evaluated in terms of accuracy and kappa scores. The results from this comparison showed a significant improvement in cancer diagnosis accuracy, highlighting the robustness and efficiency of the proposed method in handling adversarial challenges in the context of medical imaging.
△ Less
Submitted 21 March, 2024;
originally announced March 2024.
-
Practical End-to-End Optical Music Recognition for Pianoform Music
Authors:
Jiří Mayer,
Milan Straka,
Jan Hajič jr.,
Pavel Pecina
Abstract:
The majority of recent progress in Optical Music Recognition (OMR) has been achieved with Deep Learning methods, especially models following the end-to-end paradigm, reading input images and producing a linear sequence of tokens. Unfortunately, many music scores, especially piano music, cannot be easily converted to a linear sequence. This has led OMR researchers to use custom linearized encodings…
▽ More
The majority of recent progress in Optical Music Recognition (OMR) has been achieved with Deep Learning methods, especially models following the end-to-end paradigm, reading input images and producing a linear sequence of tokens. Unfortunately, many music scores, especially piano music, cannot be easily converted to a linear sequence. This has led OMR researchers to use custom linearized encodings, instead of broadly accepted structured formats for music notation. Their diversity makes it difficult to compare the performance of OMR systems directly. To bring recent OMR model progress closer to useful results: (a) We define a sequential format called Linearized MusicXML, allowing to train an end-to-end model directly and maintaining close cohesion and compatibility with the industry-standard MusicXML format. (b) We create a dev and test set for benchmarking typeset OMR with MusicXML ground truth based on the OpenScore Lieder corpus. They contain 1,438 and 1,493 pianoform systems, each with an image from IMSLP. (c) We train and fine-tune an end-to-end model to serve as a baseline on the dataset and employ the TEDn metric to evaluate the model. We also test our model against the recently published synthetic pianoform dataset GrandStaff and surpass the state-of-the-art results.
△ Less
Submitted 20 March, 2024;
originally announced March 2024.
-
Fuzzy Fault Trees Formalized
Authors:
Thi Kim Nhung Dang,
Milan Lopuhaä-Zwakenberg,
Mariëlle Stoelinga
Abstract:
Fault tree analysis is a vital method of assessing safety risks. It helps to identify potential causes of accidents, assess their likelihood and severity, and suggest preventive measures. Quantitative analysis of fault trees is often done via the dependability metrics that compute the system's failure behaviour over time. However, the lack of precise data is a major obstacle to quantitative analys…
▽ More
Fault tree analysis is a vital method of assessing safety risks. It helps to identify potential causes of accidents, assess their likelihood and severity, and suggest preventive measures. Quantitative analysis of fault trees is often done via the dependability metrics that compute the system's failure behaviour over time. However, the lack of precise data is a major obstacle to quantitative analysis, and so to reliability analysis. Fuzzy logic is a popular framework for dealing with ambiguous values and has applications in many domains. A number of fuzzy approaches have been proposed to fault tree analysis, but -- to the best of our knowledge -- none of them provide rigorous definitions or algorithms for computing fuzzy unreliability values. In this paper, we define a rigorous framework for fuzzy unreliability values. In addition, we provide a bottom-up algorithm to efficiently calculate fuzzy reliability for a system. The algorithm incorporates the concept of $α$-cuts method. That is, performing binary algebraic operations on intervals on horizontally discretised $α$-cut representations of fuzzy numbers. The method preserves the nonlinearity of fuzzy unreliability. Finally, we illustrate the results obtained from two case studies.
△ Less
Submitted 13 March, 2024;
originally announced March 2024.
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1092 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…
▽ More
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
△ Less
Submitted 14 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Towards Decoding Brain Activity During Passive Listening of Speech
Authors:
Milán András Fodor,
Tamás Gábor Csapó,
Frigyes Viktor Arthur
Abstract:
The aim of the study is to investigate the complex mechanisms of speech perception and ultimately decode the electrical changes in the brain accruing while listening to speech. We attempt to decode heard speech from intracranial electroencephalographic (iEEG) data using deep learning methods. The goal is to aid the advancement of brain-computer interface (BCI) technology for speech synthesis, and,…
▽ More
The aim of the study is to investigate the complex mechanisms of speech perception and ultimately decode the electrical changes in the brain accruing while listening to speech. We attempt to decode heard speech from intracranial electroencephalographic (iEEG) data using deep learning methods. The goal is to aid the advancement of brain-computer interface (BCI) technology for speech synthesis, and, hopefully, to provide an additional perspective on the cognitive processes of speech perception. This approach diverges from the conventional focus on speech production and instead chooses to investigate neural representations of perceived speech. This angle opened up a complex perspective, potentially allowing us to study more sophisticated neural patterns. Leveraging the power of deep learning models, the research aimed to establish a connection between these intricate neural activities and the corresponding speech sounds. Despite the approach not having achieved a breakthrough yet, the research sheds light on the potential of decoding neural activity during speech perception. Our current efforts can serve as a foundation, and we are optimistic about the potential of expanding and improving upon this work to move closer towards more advanced BCIs, better understanding of processes underlying perceived speech and its relation to spoken speech.
△ Less
Submitted 26 February, 2024;
originally announced February 2024.
-
Self-adhesivity in lattices of abstract conditional independence models
Authors:
Tobias Boege,
Janneke H. Bolt,
Milan Studený
Abstract:
We introduce an algebraic concept of the frame for abstract conditional independence (CI) models, together with basic operations with respect to which such a frame should be closed: copying and marginalization. Three standard examples of such frames are (discrete) probabilistic CI structures, semi-graphoids and structural semi-graphoids. We concentrate on those frames which are closed under the op…
▽ More
We introduce an algebraic concept of the frame for abstract conditional independence (CI) models, together with basic operations with respect to which such a frame should be closed: copying and marginalization. Three standard examples of such frames are (discrete) probabilistic CI structures, semi-graphoids and structural semi-graphoids. We concentrate on those frames which are closed under the operation of set-theoretical intersection because, for these, the respective families of CI models are lattices. This allows one to apply the results from lattice theory and formal concept analysis to describe such families in terms of implications among CI statements.
The central concept of this paper is that of self-adhesivity defined in algebraic terms, which is a combinatorial reflection of the self-adhesivity concept studied earlier in context of polymatroids and information theory. The generalization also leads to a self-adhesivity operator defined on the hyper-level of CI frames. We answer some of the questions related to this approach and raise other open questions.
The core of the paper is in computations. The combinatorial approach to computation might overcome some memory and space limitation of software packages based on polyhedral geometry, in particular, if SAT solvers are utilized. We characterize some basic CI families over 4 variables in terms of canonical implications among CI statements. We apply our method in information-theoretical context to the task of entropic region demarcation over 5 variables.
△ Less
Submitted 21 February, 2024;
originally announced February 2024.
-
Self-AMPLIFY: Improving Small Language Models with Self Post Hoc Explanations
Authors:
Milan Bhan,
Jean-Noel Vittaut,
Nicolas Chesneau,
Marie-Jeanne Lesot
Abstract:
Incorporating natural language rationales in the prompt and In-Context Learning (ICL) have led to a significant improvement of Large Language Models (LLMs) performance. However, generating high-quality rationales require human-annotation or the use of auxiliary proxy models. In this work, we propose Self-AMPLIFY to automatically generate rationales from post hoc explanation methods applied to Smal…
▽ More
Incorporating natural language rationales in the prompt and In-Context Learning (ICL) have led to a significant improvement of Large Language Models (LLMs) performance. However, generating high-quality rationales require human-annotation or the use of auxiliary proxy models. In this work, we propose Self-AMPLIFY to automatically generate rationales from post hoc explanation methods applied to Small Language Models (SLMs) to improve their own performance. Self-AMPLIFY is a 3-step method that targets samples, generates rationales and builds a final prompt to leverage ICL. Self-AMPLIFY performance is evaluated on four SLMs and five datasets requiring strong reasoning abilities. Self-AMPLIFY achieves good results against competitors, leading to strong accuracy improvement. Self-AMPLIFY is the first method to apply post hoc explanation methods to autoregressive language models to generate rationales to improve their own performance in a fully automated manner.
△ Less
Submitted 17 June, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
-
Star-Forest Decompositions of Complete Graphs
Authors:
Todor Antić,
Jelena Glišić,
Milan Milivojčević
Abstract:
We deal with the problem of decomposing a complete geometric graph into plane star-forests. In particular, we disprove a recent conjecture by Pach, Saghafian and Schnider by constructing for each $n$ a complete geometric graph on $n$ vertices which can be decomposed into $\frac{n}{2}+1$ plane star-forests. Additionally we prove that for even $n$, every decomposition of complete abstract graph on…
▽ More
We deal with the problem of decomposing a complete geometric graph into plane star-forests. In particular, we disprove a recent conjecture by Pach, Saghafian and Schnider by constructing for each $n$ a complete geometric graph on $n$ vertices which can be decomposed into $\frac{n}{2}+1$ plane star-forests. Additionally we prove that for even $n$, every decomposition of complete abstract graph on $n$ vertices into $\frac{n}{2}+1$ star-forests is composed of a perfect matching and $\frac{n}{2}$ star-forests with two edge-balanced components, which we call broken double stars.
△ Less
Submitted 16 February, 2024;
originally announced February 2024.
-
CABINET: Content Relevance based Noise Reduction for Table Question Answering
Authors:
Sohan Patnaik,
Heril Changwal,
Milan Aggarwal,
Sumit Bhatia,
Yaman Kumar,
Balaji Krishnamurthy
Abstract:
Table understanding capability of Large Language Models (LLMs) has been extensively studied through the task of question-answering (QA) over tables. Typically, only a small part of the whole table is relevant to derive the answer for a given question. The irrelevant parts act as noise and are distracting information, resulting in sub-optimal performance due to the vulnerability of LLMs to noise. T…
▽ More
Table understanding capability of Large Language Models (LLMs) has been extensively studied through the task of question-answering (QA) over tables. Typically, only a small part of the whole table is relevant to derive the answer for a given question. The irrelevant parts act as noise and are distracting information, resulting in sub-optimal performance due to the vulnerability of LLMs to noise. To mitigate this, we propose CABINET (Content RelevAnce-Based NoIse ReductioN for TablE QuesTion-Answering) - a framework to enable LLMs to focus on relevant tabular data by suppressing extraneous information. CABINET comprises an Unsupervised Relevance Scorer (URS), trained differentially with the QA LLM, that weighs the table content based on its relevance to the input question before feeding it to the question-answering LLM (QA LLM). To further aid the relevance scorer, CABINET employs a weakly supervised module that generates a parsing statement describing the criteria of rows and columns relevant to the question and highlights the content of corresponding table cells. CABINET significantly outperforms various tabular LLM baselines, as well as GPT3-based in-context learning methods, is more robust to noise, maintains outperformance on tables of varying sizes, and establishes new SoTA performance on WikiTQ, FeTaQA, and WikiSQL datasets. We release our code and datasets at https://github.com/Sohanpatnaik106/CABINET_QA.
△ Less
Submitted 13 February, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Causal Machine Learning for Cost-Effective Allocation of Development Aid
Authors:
Milan Kuzmanovic,
Dennis Frauen,
Tobias Hatt,
Stefan Feuerriegel
Abstract:
The Sustainable Development Goals (SDGs) of the United Nations provide a blueprint of a better future by 'leaving no one behind', and, to achieve the SDGs by 2030, poor countries require immense volumes of development aid. In this paper, we develop a causal machine learning framework for predicting heterogeneous treatment effects of aid disbursements to inform effective aid allocation. Specificall…
▽ More
The Sustainable Development Goals (SDGs) of the United Nations provide a blueprint of a better future by 'leaving no one behind', and, to achieve the SDGs by 2030, poor countries require immense volumes of development aid. In this paper, we develop a causal machine learning framework for predicting heterogeneous treatment effects of aid disbursements to inform effective aid allocation. Specifically, our framework comprises three components: (i) a balancing autoencoder that uses representation learning to embed high-dimensional country characteristics while addressing treatment selection bias; (ii) a counterfactual generator to compute counterfactual outcomes for varying aid volumes to address small sample-size settings; and (iii) an inference model that is used to predict heterogeneous treatment-response curves. We demonstrate the effectiveness of our framework using data with official development aid earmarked to end HIV/AIDS in 105 countries, amounting to more than USD 5.2 billion. For this, we first show that our framework successfully computes heterogeneous treatment-response curves using semi-synthetic data. Then, we demonstrate our framework using real-world HIV data. Our framework points to large opportunities for a more effective aid allocation, suggesting that the total number of new HIV infections could be reduced by up to 3.3% (~50,000 cases) compared to the current allocation practice.
△ Less
Submitted 15 June, 2024; v1 submitted 30 January, 2024;
originally announced January 2024.
-
Querying Fault and Attack Trees: Property Specification on a Water Network
Authors:
Stefano M. Nicoletti,
Milan Lopuhaä-Zwakenberg,
E. Moritz Hahn,
Mariëlle Stoelinga
Abstract:
We provide an overview of three different query languages whose objective is to specify properties on the highly popular formalisms of fault trees (FTs) and attack trees (ATs). These are BFL, a Boolean Logic for FTs, PFL, a probabilistic extension of BFL and ATM, a logic for security metrics on ATs. We validate the framework composed by these three logics by applying them to the case study of a wa…
▽ More
We provide an overview of three different query languages whose objective is to specify properties on the highly popular formalisms of fault trees (FTs) and attack trees (ATs). These are BFL, a Boolean Logic for FTs, PFL, a probabilistic extension of BFL and ATM, a logic for security metrics on ATs. We validate the framework composed by these three logics by applying them to the case study of a water distribution network. We extend the FT for this network - found in the literature - and we propose to model the system under analysis with the Fault Trees/Attack Trees (FT/ATs) formalism, combining both FTs and ATs in a unique model. Furthermore, we propose a novel combination of the showcased logics to account for queries that jointly consider both the FT and the AT of the model, integrating influences of attacks on failure probabilities of different components. Finally, we extend the domain specific language for PFL with novel constructs to capture the interplay between metrics of attacks - e.g., "cost", success probabilities - and failure probabilities in the system.
△ Less
Submitted 29 January, 2024;
originally announced January 2024.
-
Fuzzy quantitative attack tree analysis
Authors:
Thi Kim Nhung Dang,
Milan Lopuhaä-Zwakenberg,
Mariëlle Stoelinga
Abstract:
Attack trees are important for security, as they help to identify weaknesses and vulnerabilities in a system. Quantitative attack tree analysis supports a number security metrics, which formulate important KPIs such as the shortest, most likely and cheapest attacks.
A key bottleneck in quantitative analysis is that the values are usually not known exactly, due to insufficient data and/or lack of…
▽ More
Attack trees are important for security, as they help to identify weaknesses and vulnerabilities in a system. Quantitative attack tree analysis supports a number security metrics, which formulate important KPIs such as the shortest, most likely and cheapest attacks.
A key bottleneck in quantitative analysis is that the values are usually not known exactly, due to insufficient data and/or lack of knowledge. Fuzzy logic is a prominent framework to handle such uncertain values, with applications in numerous domains. While several studies proposed fuzzy approaches to attack tree analysis, none of them provided a firm definition of fuzzy metric values or generic algorithms for computation of fuzzy metrics.
In this work, we define a generic formulation for fuzzy metric values that applies to most quantitative metrics. The resulting metric value is a fuzzy number obtained by following Zadeh's extension principle, obtained when we equip the basis attack steps, i.e., the leaves of the attack trees, with fuzzy numbers. In addition, we prove a modular decomposition theorem that yields a bottom-up algorithm to efficiently calculate the top fuzzy metric value.
△ Less
Submitted 22 January, 2024;
originally announced January 2024.
-
Automation of Triangle Ruler-and-Compass Constructions Using Constraint Solvers
Authors:
Milan Banković
Abstract:
In this paper, we present an approach to automated solving of triangle ruler-and-compass construction problems using finite-domain constraint solvers. The constraint model is described in the MiniZinc modeling language, and is based on the automated planning. The main benefit of using general constraint solvers for such purpose, instead of develo** dedicated tools, is that we can rely on the…
▽ More
In this paper, we present an approach to automated solving of triangle ruler-and-compass construction problems using finite-domain constraint solvers. The constraint model is described in the MiniZinc modeling language, and is based on the automated planning. The main benefit of using general constraint solvers for such purpose, instead of develo** dedicated tools, is that we can rely on the efficient search that is already implemented within the solver, enabling us to focus on geometric aspects of the problem. We may also use the solver's built-in optimization capabilities to search for the shortest possible constructions. We evaluate our approach on 74 solvable problems from the Wernick's list, and compare it to the dedicated triangle construction solver ArgoTriCS. The results show that our approach is comparable to dedicated tools, while it requires much less effort to implement. Also, our model often finds shorter constructions, thanks to the optimization capabilities offered by the constraint solvers.
△ Less
Submitted 22 January, 2024;
originally announced January 2024.
-
Attack tree metrics are operad algebras
Authors:
Milan Lopuhaä-Zwakenberg
Abstract:
Attack Trees (ATs) are a widely used tool for security analysis. ATs can be employed in quantitative security analysis through metrics, which assign a security value to an AT. Many different AT metrics exist, and there exist multiple general definitions that aim to study a wide variety of AT metrics at once. However, these all have drawbacks: they do not capture all metrics, and they do not easily…
▽ More
Attack Trees (ATs) are a widely used tool for security analysis. ATs can be employed in quantitative security analysis through metrics, which assign a security value to an AT. Many different AT metrics exist, and there exist multiple general definitions that aim to study a wide variety of AT metrics at once. However, these all have drawbacks: they do not capture all metrics, and they do not easily generalize to extensions of ATs. In this paper, we introduce a definition of AT metrics based on category theory, specifically operad algebras. This encompasses all previous definitions of AT metrics, and is easily generalized to extensions of ATs. Furthermore, we show that under easily expressed operad-theoretic conditions, existing metric calculation algorithms can be extended in considerable generality.
△ Less
Submitted 18 January, 2024;
originally announced January 2024.
-
Amplification of Addictive New Media Features in the Metaverse
Authors:
Ljubisa Bojic,
Joerg Matthes,
Milan Cabarkapa
Abstract:
The emergence of the metaverse, envisioned as a hyperreal virtual universe facilitating boundless human interaction, stands to revolutionize our conception of media, with significant impacts on addiction, creativity, relationships, and social polarization. This paper aims to dissect the addictive potential of the metaverse due to its immersive and interactive features, scrutinize the effects of it…
▽ More
The emergence of the metaverse, envisioned as a hyperreal virtual universe facilitating boundless human interaction, stands to revolutionize our conception of media, with significant impacts on addiction, creativity, relationships, and social polarization. This paper aims to dissect the addictive potential of the metaverse due to its immersive and interactive features, scrutinize the effects of its recommender systems on creativity and social polarization, and explore potential consequences stemming from the metaverse development. We employed a literature review methodology, drawing parallels from the research on new media platforms and examining the progression of reality-mimicking features in media from historical perspectives to understand this transformative digital frontier. The findings suggest that these immersive and interactive features could potentially exacerbate media addiction. The designed recommender systems, while aiding personalization and user engagement, might contribute to social polarization and affect the diversity of creative output. However, our conclusions are based primarily on theoretical propositions from studies conducted on existing media platforms and lack empirical support specific to the metaverse. Therefore, this paper identifies a critical gap requiring further research, through empirical studies focused on metaverse use and addiction and exploration of privacy, security, and ethical implications associated with this burgeoning digital universe. As the development of the metaverse accelerates, it is incumbent on scholars, technologists, and policymakers to navigate its multilayered impacts thoughtfully to balance innovation with societal well-being.
△ Less
Submitted 7 January, 2024;
originally announced January 2024.
-
Digital-analog quantum learning on Rydberg atom arrays
Authors:
Jonathan Z. Lu,
Lucy Jiao,
Kristina Wolinski,
Milan Kornjača,
Hong-Ye Hu,
Sergio Cantu,
Fangli Liu,
Susanne F. Yelin,
Sheng-Tao Wang
Abstract:
We propose hybrid digital-analog learning algorithms on Rydberg atom arrays, combining the potentially practical utility and near-term realizability of quantum learning with the rapidly scaling architectures of neutral atoms. Our construction requires only single-qubit operations in the digital setting and global driving according to the Rydberg Hamiltonian in the analog setting. We perform a comp…
▽ More
We propose hybrid digital-analog learning algorithms on Rydberg atom arrays, combining the potentially practical utility and near-term realizability of quantum learning with the rapidly scaling architectures of neutral atoms. Our construction requires only single-qubit operations in the digital setting and global driving according to the Rydberg Hamiltonian in the analog setting. We perform a comprehensive numerical study of our algorithm on both classical and quantum data, given respectively by handwritten digit classification and unsupervised quantum phase boundary learning. We show in the two representative problems that digital-analog learning is not only feasible in the near term, but also requires shorter circuit depths and is more robust to realistic error models as compared to digital learning schemes. Our results suggest that digital-analog learning opens a promising path towards improved variational quantum learning experiments in the near term.
△ Less
Submitted 5 January, 2024;
originally announced January 2024.
-
On the Convergence of Loss and Uncertainty-based Active Learning Algorithms
Authors:
Daniel Haimovich,
Dima Karamshuk,
Fridolin Linder,
Niek Tax,
Milan Vojnovic
Abstract:
We investigate the convergence rates and data sample sizes required for training a machine learning model using a stochastic gradient descent (SGD) algorithm, where data points are sampled based on either their loss value or uncertainty value. These training methods are particularly relevant for active learning and data subset selection problems. For SGD with a constant step size update, we presen…
▽ More
We investigate the convergence rates and data sample sizes required for training a machine learning model using a stochastic gradient descent (SGD) algorithm, where data points are sampled based on either their loss value or uncertainty value. These training methods are particularly relevant for active learning and data subset selection problems. For SGD with a constant step size update, we present convergence results for linear classifiers and linearly separable datasets using squared hinge loss and similar training loss functions. Additionally, we extend our analysis to more general classifiers and datasets, considering a wide range of loss-based sampling strategies and smooth convex training loss functions. We propose a novel algorithm called Adaptive-Weight Sampling (AWS) that utilizes SGD with an adaptive step size that achieves stochastic Polyak's step size in expectation. We establish convergence rate results for AWS for smooth convex training loss functions. Our numerical experiments demonstrate the efficiency of AWS on various datasets by using either exact or estimated loss values.
△ Less
Submitted 11 June, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Surf-CDM: Score-Based Surface Cold-Diffusion Model For Medical Image Segmentation
Authors:
Fahim Ahmed Zaman,
Mathews Jacob,
Amanda Chang,
Kan Liu,
Milan Sonka,
Xiaodong Wu
Abstract:
Diffusion models have shown impressive performance for image generation, often times outperforming other generative models. Since their introduction, researchers have extended the powerful noise-to-image denoising pipeline to discriminative tasks, including image segmentation. In this work we propose a conditional score-based generative modeling framework for medical image segmentation which relie…
▽ More
Diffusion models have shown impressive performance for image generation, often times outperforming other generative models. Since their introduction, researchers have extended the powerful noise-to-image denoising pipeline to discriminative tasks, including image segmentation. In this work we propose a conditional score-based generative modeling framework for medical image segmentation which relies on a parametric surface representation for the segmentation masks. The surface re-parameterization allows the direct application of standard diffusion theory, as opposed to when the mask is represented as a binary mask. Moreover, we adapted an extended variant of the diffusion technique known as the "cold-diffusion" where the diffusion model can be constructed with deterministic perturbations instead of Gaussian noise, which facilitates significantly faster convergence in the reverse diffusion. We evaluated our method on the segmentation of the left ventricle from 65 transthoracic echocardiogram videos (2230 echo image frames) and compared its performance to the most popular and widely used image segmentation models. Our proposed model not only outperformed the compared methods in terms of segmentation accuracy, but also showed potential in estimating segmentation uncertainties for further downstream analyses due to its inherent generative nature.
△ Less
Submitted 19 December, 2023;
originally announced December 2023.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…
▽ More
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
△ Less
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Harnessing the Power of Neural Operators with Automatically Encoded Conservation Laws
Authors:
Ning Liu,
Yiming Fan,
Xianyi Zeng,
Milan Klöwer,
Lu Zhang,
Yue Yu
Abstract:
Neural operators (NOs) have emerged as effective tools for modeling complex physical systems in scientific machine learning. In NOs, a central characteristic is to learn the governing physical laws directly from data. In contrast to other machine learning applications, partial knowledge is often known a priori about the physical system at hand whereby quantities such as mass, energy and momentum a…
▽ More
Neural operators (NOs) have emerged as effective tools for modeling complex physical systems in scientific machine learning. In NOs, a central characteristic is to learn the governing physical laws directly from data. In contrast to other machine learning applications, partial knowledge is often known a priori about the physical system at hand whereby quantities such as mass, energy and momentum are exactly conserved. Currently, NOs have to learn these conservation laws from data and can only approximately satisfy them due to finite training data and random noise. In this work, we introduce conservation law-encoded neural operators (clawNOs), a suite of NOs that endow inference with automatic satisfaction of such conservation laws. ClawNOs are built with a divergence-free prediction of the solution field, with which the continuity equation is automatically guaranteed. As a consequence, clawNOs are compliant with the most fundamental and ubiquitous conservation laws essential for correct physical consistency. As demonstrations, we consider a wide variety of scientific applications ranging from constitutive modeling of material deformation, incompressible fluid dynamics, to atmospheric simulation. ClawNOs significantly outperform the state-of-the-art NOs in learning efficacy, especially in small-data regimes.
△ Less
Submitted 4 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
GPT-4 Surpassing Human Performance in Linguistic Pragmatics
Authors:
Ljubisa Bojic,
Predrag Kovacevic,
Milan Cabarkapa
Abstract:
As Large Language Models (LLMs) become increasingly integrated into everyday life, their capabilities to understand and emulate human cognition are under steady examination. This study investigates the ability of LLMs to comprehend and interpret linguistic pragmatics, an aspect of communication that considers context and implied meanings. Using Grice's communication principles, LLMs and human subj…
▽ More
As Large Language Models (LLMs) become increasingly integrated into everyday life, their capabilities to understand and emulate human cognition are under steady examination. This study investigates the ability of LLMs to comprehend and interpret linguistic pragmatics, an aspect of communication that considers context and implied meanings. Using Grice's communication principles, LLMs and human subjects (N=76) were evaluated based on their responses to various dialogue-based tasks. The findings revealed the superior performance and speed of LLMs, particularly GPT4, over human subjects in interpreting pragmatics. GPT4 also demonstrated accuracy in the pre-testing of human-written samples, indicating its potential in text analysis. In a comparative analysis of LLMs using human individual and average scores, the models exhibited significant chronological improvement. The models were ranked from lowest to highest score, with GPT2 positioned at 78th place, GPT3 ranking at 23rd, Bard at 10th, GPT3.5 placing 5th, Best Human scoring 2nd, and GPT4 achieving the top spot. The findings highlight the remarkable progress made in the development and performance of these LLMs. Future studies should consider diverse subjects, multiple languages, and other cognitive aspects to fully comprehend the capabilities of LLMs. This research holds significant implications for the development and application of AI-based models in communication-centered sectors.
△ Less
Submitted 15 December, 2023;
originally announced December 2023.
-
Performance evaluation of Private and Public Blockchains for multi-cloud service federation
Authors:
Adam Zahir,
Milan Groshev,
Kiril Antevski,
Carlos J. Bernardos,
Constantine Ayimba,
Antonio de la Oliva
Abstract:
The stringent low-latency, high reliability, availability and resilience requirements of 6G use cases will present challenges to cloud providers. Currently, cloud providers lack simple, efficient, and secure implementation of provisioning solutions that meet these challenges. Multi-cloud federation is a promising approach. In this paper, we evaluate the application of private and public blockchain…
▽ More
The stringent low-latency, high reliability, availability and resilience requirements of 6G use cases will present challenges to cloud providers. Currently, cloud providers lack simple, efficient, and secure implementation of provisioning solutions that meet these challenges. Multi-cloud federation is a promising approach. In this paper, we evaluate the application of private and public blockchain networks for multi-cloud federation. We compare the performance of blockchain-based federation in private and public blockchain networks and their integration with a production-ready orchestration solution. Our results show that the public blockchain needs approximately 91 seconds to complete the federation procedure compared to the 48 seconds in the private blockchain scenario.
△ Less
Submitted 13 December, 2023;
originally announced December 2023.
-
Adaptive Differentially Quantized Subspace Perturbation (ADQSP): A Unified Framework for Privacy-Preserving Distributed Average Consensus
Authors:
Qiongxiu Li,
Jaron Skovsted Gundersen,
Milan Lopuhaa-Zwakenberg,
Richard Heusdens
Abstract:
Privacy-preserving distributed average consensus has received significant attention recently due to its wide applicability. Based on the achieved performances, existing approaches can be broadly classified into perfect accuracy-prioritized approaches such as secure multiparty computation (SMPC), and worst-case privacy-prioritized approaches such as differential privacy (DP). Methods of the first c…
▽ More
Privacy-preserving distributed average consensus has received significant attention recently due to its wide applicability. Based on the achieved performances, existing approaches can be broadly classified into perfect accuracy-prioritized approaches such as secure multiparty computation (SMPC), and worst-case privacy-prioritized approaches such as differential privacy (DP). Methods of the first class achieve perfect output accuracy but reveal some private information, while methods from the second class provide privacy against the strongest adversary at the cost of a loss of accuracy. In this paper, we propose a general approach named adaptive differentially quantized subspace perturbation (ADQSP) which combines quantization schemes with so-called subspace perturbation. Although not relying on cryptographic primitives, the proposed approach enjoys the benefits of both accuracy-prioritized and privacy-prioritized methods and is able to unify them. More specifically, we show that by varying a single quantization parameter the proposed method can vary between SMPC-type performances and DP-type performances. Our results show the potential of exploiting traditional distributed signal processing tools for providing cryptographic guarantees. In addition to a comprehensive theoretical analysis, numerical validations are conducted to substantiate our results.
△ Less
Submitted 13 December, 2023;
originally announced December 2023.
-
waveSLAM: Empowering Accurate Indoor Map** Using Off-the-Shelf Millimeter-wave Self-sensing
Authors:
Pablo Picazo,
Milan Groshev,
Alejandro Blanco,
Claudio Fiandrino,
Antonio de la Oliva,
Joerg Widmer
Abstract:
This paper presents the design, implementation and evaluation of waveSLAM, a low-cost mobile robot system that uses the millimetre wave (mmWave) communication devices to enhance the indoor map** process targeting environments with reduced visibility or glass/mirror walls. A unique feature of waveSLAM is that it only leverages existing Commercial-Off-The-Shelf (COTS) hardware (Lidar and mmWave ra…
▽ More
This paper presents the design, implementation and evaluation of waveSLAM, a low-cost mobile robot system that uses the millimetre wave (mmWave) communication devices to enhance the indoor map** process targeting environments with reduced visibility or glass/mirror walls. A unique feature of waveSLAM is that it only leverages existing Commercial-Off-The-Shelf (COTS) hardware (Lidar and mmWave radios) that are mounted on mobile robots to improve the accurate indoor map** achieved with optical sensors. The key intuition behind the waveSLAM design is that while the mobile robots moves freely, the mmWave radios can periodically exchange angle and distance estimates between themselves (self-sensing) by bouncing the signal from the environment, thus enabling accurate estimates of the target object/material surface. Our experiments verify that waveSLAM can archive cm-level accuracy with errors below 22 cm and 20deg in angle orientation which is compatible with Lidar when building indoor maps.
△ Less
Submitted 12 December, 2023;
originally announced December 2023.
-
Fault tree reliability analysis via squarefree polynomials
Authors:
Milan Lopuhaä-Zwakenberg
Abstract:
Fault tree (FT) analysis is a prominent risk assessment method in industrial systems. Unreliability is one of the key safety metrics in quantitative FT analysis. Existing algorithms for unreliability analysis are based on binary decision diagrams, for which it is hard to give time complexity guarantees beyond a worst-case exponential bound. In this paper, we present a novel method to calculate FT…
▽ More
Fault tree (FT) analysis is a prominent risk assessment method in industrial systems. Unreliability is one of the key safety metrics in quantitative FT analysis. Existing algorithms for unreliability analysis are based on binary decision diagrams, for which it is hard to give time complexity guarantees beyond a worst-case exponential bound. In this paper, we present a novel method to calculate FT unreliability based on algebras of squarefree polynomials and prove its validity. We furthermore prove that time complexity is low when the number of multiparent nodes is limited. Experiments show that our method is competitive with the state-of-the-art and outperforms it for FTs with few multiparent nodes.
△ Less
Submitted 10 December, 2023;
originally announced December 2023.
-
U-Net v2: Rethinking the Skip Connections of U-Net for Medical Image Segmentation
Authors:
Yaopeng Peng,
Milan Sonka,
Danny Z. Chen
Abstract:
In this paper, we introduce U-Net v2, a new robust and efficient U-Net variant for medical image segmentation. It aims to augment the infusion of semantic information into low-level features while simultaneously refining high-level features with finer details. For an input image, we begin by extracting multi-level features with a deep neural network encoder. Next, we enhance the feature map of eac…
▽ More
In this paper, we introduce U-Net v2, a new robust and efficient U-Net variant for medical image segmentation. It aims to augment the infusion of semantic information into low-level features while simultaneously refining high-level features with finer details. For an input image, we begin by extracting multi-level features with a deep neural network encoder. Next, we enhance the feature map of each level by infusing semantic information from higher-level features and integrating finer details from lower-level features through Hadamard product. Our novel skip connections empower features of all the levels with enriched semantic characteristics and intricate details. The improved features are subsequently transmitted to the decoder for further processing and segmentation. Our method can be seamlessly integrated into any Encoder-Decoder network. We evaluate our method on several public medical image segmentation datasets for skin lesion segmentation and polyp segmentation, and the experimental results demonstrate the segmentation accuracy of our new method over state-of-the-art methods, while preserving memory and computational efficiency. Code is available at: https://github.com/yaoppeng/U-Net_v2
△ Less
Submitted 30 March, 2024; v1 submitted 29 November, 2023;
originally announced November 2023.
-
PHG-Net: Persistent Homology Guided Medical Image Classification
Authors:
Yaopeng Peng,
Hongxiao Wang,
Milan Sonka,
Danny Z. Chen
Abstract:
Modern deep neural networks have achieved great successes in medical image analysis. However, the features captured by convolutional neural networks (CNNs) or Transformers tend to be optimized for pixel intensities and neglect key anatomical structures such as connected components and loops. In this paper, we propose a persistent homology guided approach (PHG-Net) that explores topological feature…
▽ More
Modern deep neural networks have achieved great successes in medical image analysis. However, the features captured by convolutional neural networks (CNNs) or Transformers tend to be optimized for pixel intensities and neglect key anatomical structures such as connected components and loops. In this paper, we propose a persistent homology guided approach (PHG-Net) that explores topological features of objects for medical image classification. For an input image, we first compute its cubical persistence diagram and extract topological features into a vector representation using a small neural network (called the PH module). The extracted topological features are then incorporated into the feature map generated by CNN or Transformer for feature fusion. The PH module is lightweight and capable of integrating topological features into any CNN or Transformer architectures in an end-to-end fashion. We evaluate our PHG-Net on three public datasets and demonstrate its considerable improvements on the target classification tasks over state-of-the-art methods.
△ Less
Submitted 28 November, 2023;
originally announced November 2023.
-
Onedata4Sci: Life science data management solution based on Onedata
Authors:
Tomáš Svoboda,
Tomáš Raček,
Josef Handl,
Jozef Sabo,
Adrián Rošinec,
Łukasz Opioła,
Wojciech Jesionek,
Milan Ešner,
Markéta Pernisová,
Natallia Madzia Valasevich,
Aleš Křenek,
Radka Svobodová
Abstract:
Life-science experimental methods generate vast and ever-increasing volumes of data, which provide highly valuable research resources. However, management of these data is nontrivial and applicable software solutions are currently subject to intensive development. The solutions mainly fall into one of the two groups: general data management systems (e.g. Onedata, iRODS, B2SHARE, CERNBox) or very s…
▽ More
Life-science experimental methods generate vast and ever-increasing volumes of data, which provide highly valuable research resources. However, management of these data is nontrivial and applicable software solutions are currently subject to intensive development. The solutions mainly fall into one of the two groups: general data management systems (e.g. Onedata, iRODS, B2SHARE, CERNBox) or very specialised data management solutions (e.g. solutions for biomolecular simulation data, biological imaging data, genomic data).
To bridge this gap between them, we provide Onedata4Sci, a prototype data management solution, which is focused on the management of life science data and covers four key steps of the data life cycle, i.e. data acquisition, user access, computational processing and archiving. Onedata4Sci is based on the Onedata data management system. It is written in Python, fully containerised, with the support for processing the stored data in Kubernetes. The applicability of Onedata4Sci is shown in three distinct use cases -- plant imaging data, cellular imaging data, and cryo-electron microscopy data. Despite the use cases covering very different types of data and user patterns, Onedata4Sci demonstrated an ability to successfully handle all these conditions. Complete source codes of Onedata4Sci are available on GitHub (https://github.com/CERIT-SC/onedata4sci), and its documentation and manual for installation are also provided.
△ Less
Submitted 28 November, 2023;
originally announced November 2023.
-
ÚFAL CorPipe at CRAC 2023: Larger Context Improves Multilingual Coreference Resolution
Authors:
Milan Straka
Abstract:
We present CorPipe, the winning entry to the CRAC 2023 Shared Task on Multilingual Coreference Resolution. Our system is an improved version of our earlier multilingual coreference pipeline, and it surpasses other participants by a large margin of 4.5 percent points. CorPipe first performs mention detection, followed by coreference linking via an antecedent-maximization approach on the retrieved s…
▽ More
We present CorPipe, the winning entry to the CRAC 2023 Shared Task on Multilingual Coreference Resolution. Our system is an improved version of our earlier multilingual coreference pipeline, and it surpasses other participants by a large margin of 4.5 percent points. CorPipe first performs mention detection, followed by coreference linking via an antecedent-maximization approach on the retrieved spans. Both tasks are trained jointly on all available corpora using a shared pretrained language model. Our main improvements comprise inputs larger than 512 subwords and changing the mention decoding to support ensembling. The source code is available at https://github.com/ufal/crac2023-corpipe.
△ Less
Submitted 7 December, 2023; v1 submitted 24 November, 2023;
originally announced November 2023.
-
Neural General Circulation Models for Weather and Climate
Authors:
Dmitrii Kochkov,
Janni Yuval,
Ian Langmore,
Peter Norgaard,
Jamie Smith,
Griffin Mooers,
Milan Klöwer,
James Lottes,
Stephan Rasp,
Peter Düben,
Sam Hatfield,
Peter Battaglia,
Alvaro Sanchez-Gonzalez,
Matthew Willson,
Michael P. Brenner,
Stephan Hoyer
Abstract:
General circulation models (GCMs) are the foundation of weather and climate prediction. GCMs are physics-based simulators which combine a numerical solver for large-scale dynamics with tuned representations for small-scale processes such as cloud formation. Recently, machine learning (ML) models trained on reanalysis data achieved comparable or better skill than GCMs for deterministic weather fore…
▽ More
General circulation models (GCMs) are the foundation of weather and climate prediction. GCMs are physics-based simulators which combine a numerical solver for large-scale dynamics with tuned representations for small-scale processes such as cloud formation. Recently, machine learning (ML) models trained on reanalysis data achieved comparable or better skill than GCMs for deterministic weather forecasting. However, these models have not demonstrated improved ensemble forecasts, or shown sufficient stability for long-term weather and climate simulations. Here we present the first GCM that combines a differentiable solver for atmospheric dynamics with ML components, and show that it can generate forecasts of deterministic weather, ensemble weather and climate on par with the best ML and physics-based methods. NeuralGCM is competitive with ML models for 1-10 day forecasts, and with the European Centre for Medium-Range Weather Forecasts ensemble prediction for 1-15 day forecasts. With prescribed sea surface temperature, NeuralGCM can accurately track climate metrics such as global mean temperature for multiple decades, and climate forecasts with 140 km resolution exhibit emergent phenomena such as realistic frequency and trajectories of tropical cyclones. For both weather and climate, our approach offers orders of magnitude computational savings over conventional GCMs. Our results show that end-to-end deep learning is compatible with tasks performed by conventional GCMs, and can enhance the large-scale physical simulations that are essential for understanding and predicting the Earth system.
△ Less
Submitted 7 March, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Artifact-Robust Graph-Based Learning in Digital Pathology
Authors:
Saba Heidari Gheshlaghi,
Milan Aryal,
Nasim Yahyasoltani,
Masoud Ganji
Abstract:
Whole slide images~(WSIs) are digitized images of tissues placed in glass slides using advanced scanners. The digital processing of WSIs is challenging as they are gigapixel images and stored in multi-resolution format. A common challenge with WSIs is that perturbations/artifacts are inevitable during storing the glass slides and digitizing them. These perturbations include motion, which often ari…
▽ More
Whole slide images~(WSIs) are digitized images of tissues placed in glass slides using advanced scanners. The digital processing of WSIs is challenging as they are gigapixel images and stored in multi-resolution format. A common challenge with WSIs is that perturbations/artifacts are inevitable during storing the glass slides and digitizing them. These perturbations include motion, which often arises from slide movement during placement, and changes in hue and brightness due to variations in staining chemicals and the quality of digitizing scanners. In this work, a novel robust learning approach to account for these artifacts is presented. Due to the size and resolution of WSIs and to account for neighborhood information, graph-based methods are called for. We use graph convolutional network~(GCN) to extract features from the graph representing WSI. Through a denoiser {and pooling layer}, the effects of perturbations in WSIs are controlled and the output is followed by a transformer for the classification of different grades of prostate cancer. To compare the efficacy of the proposed approach, the model without denoiser is trained and tested with WSIs without any perturbation and then different perturbations are introduced in WSIs and passed through the network with the denoiser. The accuracy and kappa scores of the proposed model with prostate cancer dataset compared with non-robust algorithms show significant improvement in cancer diagnosis.
△ Less
Submitted 27 October, 2023;
originally announced October 2023.
-
Trust, but Verify: Robust Image Segmentation using Deep Learning
Authors:
Fahim Ahmed Zaman,
Xiaodong Wu,
Weiyu Xu,
Milan Sonka,
Raghuraman Mudumbai
Abstract:
We describe a method for verifying the output of a deep neural network for medical image segmentation that is robust to several classes of random as well as worst-case perturbations i.e. adversarial attacks. This method is based on a general approach recently developed by the authors called "Trust, but Verify" wherein an auxiliary verification network produces predictions about certain masked feat…
▽ More
We describe a method for verifying the output of a deep neural network for medical image segmentation that is robust to several classes of random as well as worst-case perturbations i.e. adversarial attacks. This method is based on a general approach recently developed by the authors called "Trust, but Verify" wherein an auxiliary verification network produces predictions about certain masked features in the input image using the segmentation as an input. A well-designed auxiliary network will produce high-quality predictions when the input segmentations are accurate, but will produce low-quality predictions when the segmentations are incorrect. Checking the predictions of such a network with the original image allows us to detect bad segmentations. However, to ensure the verification method is truly robust, we need a method for checking the quality of the predictions that does not itself rely on a black-box neural network. Indeed, we show that previous methods for segmentation evaluation that do use deep neural regression networks are vulnerable to false negatives i.e. can inaccurately label bad segmentations as good. We describe the design of a verification network that avoids such vulnerability and present results to demonstrate its robustness compared to previous methods.
△ Less
Submitted 19 December, 2023; v1 submitted 25 October, 2023;
originally announced October 2023.
-
A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems
Authors:
Songbo Hu,
Han Zhou,
Moy Yuan,
Milan Gritta,
Guchun Zhang,
Ignacio Iacobacci,
Anna Korhonen,
Ivan Vulić
Abstract:
Achieving robust language technologies that can perform well across the world's many languages is a central goal of multilingual NLP. In this work, we take stock of and empirically analyse task performance disparities that exist between multilingual task-oriented dialogue (ToD) systems. We first define new quantitative measures of absolute and relative equivalence in system performance, capturing…
▽ More
Achieving robust language technologies that can perform well across the world's many languages is a central goal of multilingual NLP. In this work, we take stock of and empirically analyse task performance disparities that exist between multilingual task-oriented dialogue (ToD) systems. We first define new quantitative measures of absolute and relative equivalence in system performance, capturing disparities across languages and within individual languages. Through a series of controlled experiments, we demonstrate that performance disparities depend on a number of factors: the nature of the ToD task at hand, the underlying pretrained language model, the target language, and the amount of ToD annotated data. We empirically prove the existence of the adaptation and intrinsic biases in current ToD systems: e.g., ToD systems trained for Arabic or Turkish using annotated ToD data fully parallel to English ToD data still exhibit diminished ToD task performance. Beyond providing a series of insights into the performance disparities of ToD systems in different languages, our analyses offer practical tips on how to approach ToD data collection and system development for new languages.
△ Less
Submitted 19 October, 2023;
originally announced October 2023.
-
Graphs with three and four distinct eigenvalues based on circulants
Authors:
Milan Bašić
Abstract:
In this paper, we aim to address the open questions raised in various recent papers regarding characterization of circulant graphs with three or four distinct eigenvalues in their spectra. Our focus is on providing characterizations and constructing classes of graphs falling under this specific category. We present a characterization of circulant graphs with prime number order and unitary Cayley g…
▽ More
In this paper, we aim to address the open questions raised in various recent papers regarding characterization of circulant graphs with three or four distinct eigenvalues in their spectra. Our focus is on providing characterizations and constructing classes of graphs falling under this specific category. We present a characterization of circulant graphs with prime number order and unitary Cayley graphs with arbitrary order, both of which possess spectra displaying three or four distinct eigenvalues. Various constructions of circulant graphs with composite orders are provided whose spectra consist of four distinct eigenvalues. These constructions primarily utilize specific subgraphs of circulant graphs that already possess two or three eigenvalues in their spectra, employing graph operations like the tensor product, the union, and the complement. Finally, we characterize the iterated line graphs of unitary Cayley graphs whose spectra contain three or four distinct eigenvalues, and we show their non-circulant nature.
△ Less
Submitted 9 October, 2023;
originally announced October 2023.