-
Scalable Variational Causal Discovery Unconstrained by Acyclicity
Authors:
Nu Hoang,
Bao Duong,
Thin Nguyen
Abstract:
Bayesian causal discovery offers the power to quantify epistemic uncertainties among a broad range of structurally diverse causal theories potentially explaining the data, represented in forms of directed acyclic graphs (DAGs). However, existing methods struggle with efficient DAG sampling due to the complex acyclicity constraint. In this study, we propose a scalable Bayesian approach to effectively learn the posterior distribution over causal graphs given observational data thanks to the ability to generate DAGs without explicitly enforcing acyclicity. Specifically, we introduce a novel differentiable DAG sampling method that can generate a valid acyclic causal graph by mapping an unconstrained distribution of implicit topological orders to a distribution over DAGs. Given this efficient DAG sampling scheme, we are able to model the posterior distribution over causal graphs using a simple variational distribution over a continuous domain, which can be learned via the variational inference framework. Extensive empirical experiments on both simulated and real datasets demonstrate the superior performance of the proposed model compared to several state-of-the-art baselines.
Submitted 6 July, 2024;
originally announced July 2024.
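To illustrate the idea of obtaining a DAG from an unconstrained implicit topological order, here is a minimal sketch. It is not the authors' method: the names `sample_dag` and `is_acyclic`, the Gaussian node scores, and the hard threshold on edge logits are all assumptions made for illustration, and the sketch is not differentiable. It only shows why restricting edges to go from lower-scored to higher-scored nodes yields an acyclic graph without any explicit acyclicity check.

```python
import random


def sample_dag(n_nodes, rng):
    """Draw a random DAG from unconstrained real-valued parameters (illustrative only)."""
    # Hypothetical parameters: one score per node (the implicit topological
    # order) and one logit per ordered node pair.
    scores = [rng.gauss(0.0, 1.0) for _ in range(n_nodes)]
    logits = [[rng.gauss(0.0, 1.0) for _ in range(n_nodes)] for _ in range(n_nodes)]
    # Edge i -> j is kept only if i precedes j in the order and its logit is positive.
    return [
        [1 if scores[i] < scores[j] and logits[i][j] > 0 else 0 for j in range(n_nodes)]
        for i in range(n_nodes)
    ]


def is_acyclic(adj):
    """Check acyclicity with Kahn's algorithm (repeatedly remove zero in-degree nodes)."""
    n = len(adj)
    indegree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    removed = 0
    while ready:
        i = ready.pop()
        removed += 1
        for j in range(n):
            if adj[i][j]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    return removed == n


if __name__ == "__main__":
    adjacency = sample_dag(5, random.Random(0))
    print(is_acyclic(adjacency))
```

Because every kept edge strictly increases the node score, no directed path can return to its starting node, so the check in `is_acyclic` serves only as a sanity test.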
-
Enabling Causal Discovery in Post-Nonlinear Models with Normalizing Flows
Authors:
Nu Hoang,
Bao Duong,
Thin Nguyen
Abstract:
Post-nonlinear (PNL) causal models stand out as a versatile and adaptable framework for modeling intricate causal relationships. However, accurately capturing the invertibility constraint required in PNL models remains challenging in existing studies. To address this problem, we introduce CAF-PoNo (Causal discovery via Normalizing Flows for Post-Nonlinear models), harnessing the power of the normalizing flows architecture to enforce the crucial invertibility constraint in PNL models. Through normalizing flows, our method precisely reconstructs the hidden noise, which plays a vital role in cause-effect identification through statistical independence testing. Furthermore, the proposed approach exhibits remarkable extensibility, as it can be seamlessly expanded to facilitate multivariate causal discovery via causal order identification, empowering us to efficiently unravel complex causal relationships. Extensive experimental evaluations on both simulated and real datasets consistently demonstrate that the proposed method outperforms several state-of-the-art approaches in both bivariate and multivariate causal discovery tasks.
Submitted 6 July, 2024;
originally announced July 2024.
-
Analytical exciton energies in monolayer transition-metal dichalcogenides
Authors:
Hanh T. Dinh,
Ngoc-Hung Phan,
Duy-Nhat Ly,
Dai-Nam Le,
Ngoc-Tram D. Hoang,
Nhat-Quang Nguyen,
Phuoc-Thien Doan,
Van-Hoang Le
Abstract:
We derive an analytical expression for $s$-state exciton energies in monolayer transition-metal dichalcogenides (TMDCs): $E_{\text{ns}}=-{\text{Ry}}^*\times P_n/{(n-1/2+0.479\, r^*_0/κ)^2}$, $n=1,2,...$, where $r^*_0$ and $κ$ are the dimensionless screening length and dielectric constant of the surrounding medium; $\text{Ry}^*$ is an effective Rydberg energy scaled by the dielectric constant and exciton reduced mass; $P_n(r^*_0/κ)$ is a function of variables $n$ and $r^*_0/κ$. Its values are around 1.0 so we consider it a term that corrects the Rydberg energy. Despite the simple form, the suggested formula gives exciton energies with high precision compared to the exact numerical solutions that accurately describe recent experimental data for a large class of TMDC materials, including WSe$_2$, WS$_2$, MoSe$_2$, MoS$_2$, and MoTe$_2$. To achieve these results, we have developed a so-called regulated perturbation theory by combining the conventional perturbation method with several elements of the Feranchuk-Komarov operator method, including the Levi-Civita transformation, the algebraic calculation technique via the annihilation and creation operators, and the introduction of a free parameter to optimize the convergence rate of the perturbation series. This universal form of exciton energies could be helpful in various physical analyses, including retrieval of the material parameters such as reduced exciton mass and screening length from the available measured exciton energies.
Submitted 6 July, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
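The formula quoted in the abstract above can be evaluated directly. The sketch below is a hedged illustration: the function name `exciton_energy` and its argument names are hypothetical, and the correction factor $P_n$ defaults to 1.0 only because the abstract states that its values are around 1.0 (its actual functional form is not given there). Energies are returned in the same unit as the effective Rydberg energy passed in.

```python
def exciton_energy(n, rydberg_eff, r0_over_kappa, p_n=1.0):
    """Evaluate E_ns = -Ry* x P_n / (n - 1/2 + 0.479 r0*/kappa)^2.

    n             -- principal quantum number (1, 2, ...)
    rydberg_eff   -- effective Rydberg energy Ry* (any energy unit)
    r0_over_kappa -- dimensionless ratio r0*/kappa
    p_n           -- correction factor P_n; 1.0 is an assumed placeholder
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -rydberg_eff * p_n / (n - 0.5 + 0.479 * r0_over_kappa) ** 2


if __name__ == "__main__":
    # Energies in units of Ry* for an illustrative (made-up) screening ratio.
    for n in (1, 2, 3):
        print(n, exciton_energy(n, 1.0, 1.0))
```

With `r0_over_kappa = 0` and `p_n = 1` the expression reduces to the two-dimensional hydrogen series $-\text{Ry}^*/(n-1/2)^2$, which gives a quick sanity check on the implementation.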
-
Hack Me If You Can: Aggregating AutoEncoders for Countering Persistent Access Threats Within Highly Imbalanced Data
Authors:
Sidahmed Benabderrahmane,
Ngoc Hoang,
Petko Valtchev,
James Cheney,
Talal Rahwan
Abstract:
Advanced Persistent Threats (APTs) are sophisticated, targeted cyberattacks designed to gain unauthorized access to systems and remain undetected for extended periods. To evade detection, APT cyberattacks deceive defense layers with breaches and exploits, thereby complicating exposure by traditional anomaly detection-based security methods. The challenge of detecting APTs with machine learning is compounded by the rarity of relevant datasets and the significant imbalance in the data, which makes the detection process highly burdensome. We present AE-APT, a deep learning-based tool for APT detection that features a family of AutoEncoder methods ranging from a basic one to a Transformer-based one. We evaluated our tool on a suite of provenance trace databases produced by the DARPA Transparent Computing program, where APT-like attacks constitute as little as 0.004% of the data. The datasets span multiple operating systems, including Android, Linux, BSD, and Windows, and cover two attack scenarios. The outcomes showed that AE-APT has significantly higher detection rates compared to its competitors, indicating superior performance in detecting and ranking anomalies.
Submitted 27 June, 2024;
originally announced June 2024.
-
TabularFM: An Open Framework For Tabular Foundational Models
Authors:
Quan M. Tran,
Suong N. Hoang,
Lam M. Nguyen,
Dzung Phan,
Hoang Thanh Lam
Abstract:
Foundational models (FMs), pretrained on extensive datasets using self-supervised techniques, are capable of learning generalized patterns from large amounts of data. This reduces the need for extensive labeled datasets for each new task, saving both time and resources by leveraging the broad knowledge base established during pretraining. Most research on FMs has primarily focused on unstructured data, such as text and images, or semi-structured data, like time-series. However, there has been limited attention to structured data, such as tabular data, which, despite its prevalence, remains under-studied due to a lack of clean datasets and insufficient research on the transferability of FMs for various tabular data tasks. In response to this gap, we introduce a framework called TabularFM, which incorporates state-of-the-art methods for developing FMs specifically for tabular data. This includes variations of neural architectures such as GANs, VAEs, and Transformers. We have curated a million tabular datasets and released cleaned versions to facilitate the development of tabular FMs. We pretrained FMs on this curated data, benchmarked various learning methods on these datasets, and released the pretrained models along with leaderboards for future comparative studies. Our fully open-sourced system provides a comprehensive analysis of the transferability of tabular FMs. By releasing these datasets, pretrained models, and leaderboards, we aim to enhance the validity and usability of tabular FMs in the near future.
Submitted 17 June, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
-
Positive Steady-State Varieties of Small Chemical Reaction Networks
Authors:
Maize Curiel,
Elise Farr,
Galileo Fries,
Luis David García Puente,
Julian Hutchins,
Vuong Nguyen Hoang
Abstract:
Chemical reaction network theory is a field of applied mathematics concerned with modeling chemical systems, and can be used in other contexts such as in systems biology to study cellular signaling pathways or epidemiology to study the effect of human interaction on the spread of disease. In this paper, we seek to understand a chemical reaction network's equilibrium points through the lens of algebraic geometry by computing the positive part of the steady-state variety defined by polynomial equations arising from the assumption of mass-action kinetics. We provide a systematic classification of positive steady-state varieties produced by 2-species, 2-reaction networks, grounded in combinatorial and algebraic properties. While some (restricted) techniques exist to fully understand the ideal defining the positive steady-state variety, this computation presents a significant challenge in general. Our classification theorems provide a simplification of previous criteria, and aim to provide a foundation for future analysis of larger networks.
Submitted 13 June, 2024;
originally announced June 2024.
-
Offline Model-Based Optimization via Policy-Guided Gradient Search
Authors:
Yassine Chemingui,
Aryan Deshwal,
Trong Nghia Hoang,
Janardhan Rao Doppa
Abstract:
Offline optimization is an emerging problem in many experimental engineering domains including protein, drug or aircraft design, where online experimentation to collect evaluation data is too expensive or dangerous. To avoid that, one has to optimize an unknown function given only its offline evaluation at a fixed set of inputs. A naive solution to this problem is to learn a surrogate model of the unknown function and optimize this surrogate instead. However, such a naive optimizer is prone to erroneous overestimation of the surrogate (possibly due to over-fitting on a biased sample of function evaluation) on inputs outside the offline dataset. Prior approaches addressing this challenge have primarily focused on learning robust surrogate models. However, their search strategies are derived from the surrogate model rather than the actual offline data. To fill this important gap, we introduce a new learning-to-search perspective for offline optimization by reformulating it as an offline reinforcement learning problem. Our proposed policy-guided gradient search approach explicitly learns the best policy for a given surrogate model created from the offline data. Our empirical results on multiple benchmarks demonstrate that the learned optimization policy can be combined with existing offline surrogates to significantly improve the optimization performance.
Submitted 8 May, 2024;
originally announced May 2024.
-
Delay and Overhead Efficient Transmission Scheduling for Federated Learning in UAV Swarms
Authors:
Duc N. M. Hoang,
Vu Tuan Truong,
Hung Duy Le,
Long Bao Le
Abstract:
This paper studies the wireless scheduling design to coordinate the transmissions of (local) model parameters of federated learning (FL) for a swarm of unmanned aerial vehicles (UAVs). The overall goal of the proposed design is to realize the FL training and aggregation processes with a central aggregator exploiting the sensory data collected by the UAVs but it considers the multi-hop wireless network formed by the UAVs. Such transmissions of model parameters over the UAV-based wireless network potentially cause large transmission delays and overhead. Our proposed framework smartly aggregates local model parameters trained by the UAVs while efficiently transmitting the underlying parameters to the central aggregator in each FL global round. We theoretically show that the proposed scheme achieves minimal delay and communication overhead. Extensive numerical experiments demonstrate the superiority of the proposed scheme compared to other baselines.
Submitted 22 February, 2024;
originally announced May 2024.
-
Interplay between magnetic and lattice excitations and emergent multiple phase transitions in MnPSe3-xSx
Authors:
Deepu Kumar,
Nguyen The Hoang,
Yumin Sim,
Youngsu Choi,
Kalaivanan Raju,
Rajesh Kumar Ulaganathan,
Raman Sankar,
Maeng-Je Seong,
Kwang-Yong Choi
Abstract:
The intricate interplay between spin and lattice degrees of freedom in two-dimensional magnetic materials plays a pivotal role in modifying their magnetic characteristics, engendering hybrid quasiparticles, and implementing functional devices. Herein, we present our comprehensive and in-depth investigations on magnetic and lattice excitations of MnPSe3-xSx (x = 0, 0.5, and 1.5) alloys, utilizing temperature- and polarization-dependent Raman scattering. Our experimental results reveal the occurrence of multiple phase transitions, evidenced by notable changes in phonon self-energy and the appearance or splitting of phonon modes. These emergent phases are tied to the development of long and short-range spin-spin correlations, as well as to spin reorientations or magnetic instabilities. Our analysis of two-magnon excitations as a function of temperature and composition showcases their hybridization with phonons whose degree weakens with increasing x. Moreover, the suppression of spin-dependent phonon intensity in chemically most-disordered MnPSe3-xSx (x = 1.5) suggests that chalcogen substitution offers a control knob of tuning spin and phonon dynamics by modulating concurrently superexchange pathways and a degree of trigonal distortions.
Submitted 17 April, 2024;
originally announced April 2024.
-
Incentives in Private Collaborative Machine Learning
Authors:
Rachael Hwee Ling Sim,
Yehong Zhang,
Trong Nghia Hoang,
Xinyi Xu,
Bryan Kian Hsiang Low,
Patrick Jaillet
Abstract:
Collaborative machine learning involves training models on data from multiple parties but must incentivize their participation. Existing data valuation methods fairly value and reward each party based on shared data or model parameters but neglect the privacy risks involved. To address this, we introduce differential privacy (DP) as an incentive. Each party can select its required DP guarantee and perturb its sufficient statistic (SS) accordingly. The mediator values the perturbed SS by the Bayesian surprise it elicits about the model parameters. As our valuation function enforces a privacy-valuation trade-off, parties are deterred from selecting excessive DP guarantees that reduce the utility of the grand coalition's model. Finally, the mediator rewards each party with different posterior samples of the model parameters. Such rewards still satisfy existing incentives like fairness but additionally preserve DP and a high similarity to the grand coalition's posterior. We empirically demonstrate the effectiveness and practicality of our approach on synthetic and real-world datasets.
Submitted 2 April, 2024;
originally announced April 2024.
-
ToXCL: A Unified Framework for Toxic Speech Detection and Explanation
Authors:
Nhat M. Hoang,
Xuan Long Do,
Duc Anh Do,
Duc Anh Vu,
Luu Anh Tuan
Abstract:
The proliferation of online toxic speech is a pertinent problem posing threats to demographic groups. While explicit toxic speech contains offensive lexical signals, implicit toxic speech consists of coded or indirect language. Therefore, it is crucial for models not only to detect implicit toxic speech but also to explain its toxicity. This creates a need for unified frameworks that can effectively detect and explain implicit toxic speech. Prior works mainly formulated the task of toxic speech detection and explanation as a text generation problem. Nonetheless, models trained using this strategy can be prone to the consequent error propagation problem. Moreover, our experiments reveal that the detection results of such models are much lower than those of models that focus only on the detection task. To bridge these gaps, we introduce ToXCL, a unified framework for the detection and explanation of implicit toxic speech. Our model consists of three modules: (i) a Target Group Generator to generate the targeted demographic group(s) of a given post; (ii) an Encoder-Decoder Model in which the encoder focuses on detecting implicit toxic speech and is boosted by (iii) a Teacher Classifier via knowledge distillation, and the decoder generates the necessary explanation. ToXCL achieves new state-of-the-art effectiveness, and outperforms baselines significantly.
Submitted 20 May, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
An elaborate new proof of Cayley's formula
Authors:
Esther Banaian,
Anh Trong Nam Hoang,
Elizabeth Kelley,
Weston Miller,
Jason Stack,
Carolyn Stephen,
Nathan Williams
Abstract:
We construct a bijection between certain Deodhar components of a braid variety constructed from an affine Kac-Moody group of type $A_{n-1}$ and vertex-labeled trees on $n$ vertices. By an argument of Galashin, Lam, and Williams using Opdam's trace formula in the affine Hecke algebra and an identity due to Haglund, we obtain an elaborate new proof for the enumeration of the number of vertex-labeled trees on $n$ vertices.
Submitted 12 February, 2024;
originally announced February 2024.
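For context on the entry above: Cayley's formula states that the number of vertex-labeled trees on $n$ vertices is $n^{n-2}$. The sketch below is an independent brute-force check for small $n$ and is unrelated to the braid-variety argument in the paper; the function name `count_labeled_trees` is hypothetical. It enumerates every set of $n-1$ edges of the complete graph and counts those that contain no cycle, which is exactly the condition for being a spanning tree.

```python
from itertools import combinations


def _find(parent, x):
    """Return the representative of x's component, compressing the path on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def count_labeled_trees(n):
    """Count trees on vertices 0..n-1 by brute force (practical only for small n)."""
    if n == 1:
        return 1
    all_edges = list(combinations(range(n), 2))
    count = 0
    for subset in combinations(all_edges, n - 1):
        parent = list(range(n))
        acyclic = True
        for u, v in subset:
            root_u, root_v = _find(parent, u), _find(parent, v)
            if root_u == root_v:
                # Adding this edge would close a cycle, so the subset is not a tree.
                acyclic = False
                break
            parent[root_u] = root_v
        if acyclic:
            # n - 1 edges without a cycle on n vertices form a spanning tree.
            count += 1
    return count


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, count_labeled_trees(n), n ** (n - 2))
```

The enumeration grows quickly (there are $\binom{n(n-1)/2}{n-1}$ edge subsets), so this is meant only as a sanity check for roughly $n \le 7$.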
-
Effective Multi-Stage Training Model For Edge Computing Devices In Intrusion Detection
Authors:
Thua Huynh Trong,
Thanh Nguyen Hoang
Abstract:
Intrusion detection poses a significant challenge within expansive and persistently interconnected environments. As malicious code continues to advance and sophisticated attack methodologies proliferate, various advanced deep learning-based detection approaches have been proposed. Nevertheless, the complexity and accuracy of intrusion detection models still need further enhancement to render them more adaptable to diverse system categories, particularly within resource-constrained devices, such as those embedded in edge computing systems. This research introduces a three-stage training paradigm, augmented by an enhanced pruning methodology and model compression techniques. The objective is to elevate the system's effectiveness, concurrently maintaining a high level of accuracy for intrusion detection. Empirical assessments conducted on the UNSW-NB15 dataset evince that this solution notably reduces the model's dimensions, while upholding accuracy levels equivalent to similar proposals.
Submitted 30 January, 2024;
originally announced January 2024.
-
MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation
Authors:
Nhat M. Hoang,
Kehong Gong,
Chuan Guo,
Michael Bi Mi
Abstract:
Controllable generation of 3D human motions becomes an important topic as the world embraces digital transformation. Existing works, though making promising progress with the advent of diffusion models, heavily rely on meticulously captured and annotated (e.g., text) high-quality motion corpus, a resource-intensive endeavor in the real world. This motivates our proposed MotionMix, a simple yet effective weakly-supervised diffusion model that leverages both noisy and unannotated motion sequences. Specifically, we separate the denoising objectives of a diffusion model into two stages: obtaining conditional rough motion approximations in the initial $T-T^*$ steps by learning the noisy annotated motions, followed by the unconditional refinement of these preliminary motions during the last $T^*$ steps using unannotated motions. Notably, though learning from two sources of imperfect data, our model does not compromise motion generation quality compared to fully supervised approaches that access gold data. Extensive experiments on several benchmarks demonstrate that our MotionMix, as a versatile framework, consistently achieves state-of-the-art performances on text-to-motion, action-to-motion, and music-to-dance tasks. Project page: https://nhathoang2002.github.io/MotionMix-page/
Submitted 24 January, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
RFRL Gym: A Reinforcement Learning Testbed for Cognitive Radio Applications
Authors:
Daniel Rosen,
Illa Rochez,
Caleb McIrvin,
Joshua Lee,
Kevin D'Alessandro,
Max Wiecek,
Nhan Hoang,
Ramzy Saffarini,
Sam Philips,
Vanessa Jones,
Will Ivey,
Zavier Harris-Smart,
Zavion Harris-Smart,
Zayden Chin,
Amos Johnson,
Alyse M. Jones,
William C. Headley
Abstract:
Radio Frequency Reinforcement Learning (RFRL) is anticipated to be a widely applicable technology in the next generation of wireless communication systems, particularly 6G and next-gen military communications. Given this, our research is focused on developing a tool to promote the development of RFRL techniques that leverage spectrum sensing. In particular, the tool was designed to address two cognitive radio applications, specifically dynamic spectrum access and jamming. In order to train and test reinforcement learning (RL) algorithms for these applications, a simulation environment is necessary to simulate the conditions that an agent will encounter within the Radio Frequency (RF) spectrum. In this paper, such an environment has been developed, herein referred to as the RFRL Gym. Through the RFRL Gym, users can design their own scenarios to model what an RL agent may encounter within the RF spectrum as well as experiment with different spectrum sensing techniques. Additionally, the RFRL Gym is a subclass of OpenAI gym, enabling the use of third-party ML/RL Libraries. We plan to open-source this codebase to enable other researchers to utilize the RFRL Gym to test their own scenarios and RL algorithms, ultimately leading to the advancement of RL research in the wireless communications domain. This paper describes in further detail the components of the Gym, results from example scenarios, and plans for future additions.
Index Terms: machine learning, reinforcement learning, wireless communications, dynamic spectrum access, OpenAI gym
Submitted 20 December, 2023;
originally announced January 2024.
-
On the Particle Acceleration Mechanisms in a Double Radio Relic Galaxy Cluster, Abell 1240
Authors:
Arnab Sarkar,
Felipe Andrade-Santos,
Reinout J. van Weeren,
Ralph P. Kraft,
Duy N. Hoang,
Timothy W. Shimwell,
Paul Nulsen,
William Forman,
Scott Randall,
Yuanyuan Su,
Priyanka Chakraborty,
Christine Jones,
Eric Miller,
Mark Bautz,
Catherine E. Grant
Abstract:
We present a 368 ks deep Chandra observation of Abell 1240, a binary merging galaxy cluster at a redshift of 0.195 whose two Brightest Cluster Galaxies (BCGs) may have passed each other 0.3 Gyr ago. Building upon previous investigations involving GMRT, VLA, and LOFAR data, our study focuses on two prominent extended radio relics at the north-west (NW) and south-east (SE) of the cluster core. By leveraging the high-resolution Chandra imaging, we have identified two distinct surface brightness edges at $\sim$ 1 Mpc and 1.2 Mpc NW and SE of the cluster center, respectively, coinciding with the outer edges of both relics. Our temperature measurements hint that the edges are shock fronts. The Mach numbers, derived from the gas density jumps, yield $\cal{M}_{\rm SE}$ = 1.49$^{+0.22}_{-0.24}$ for the South Eastern shock and $\cal{M}_{\rm NW}$ = 1.41$^{+0.17}_{-0.19}$ for the North Western shock. Our estimated Mach numbers are markedly smaller than those derived from radio observations ($\cal{M}_{\rm SE}$ = 2.3 and $\cal{M}_{\rm NW}$ = 2.4), highlighting the prevalence of a re-acceleration scenario over direct acceleration of electrons from the thermal pool. Furthermore, we compare the observed temperature profiles across both shocks with predictions from collisional vs. collisionless models. Both shocks favor the Coulomb collisional model, but we could not rule out a purely collisionless model due to pre-shock temperature uncertainties.
Submitted 12 January, 2024; v1 submitted 3 January, 2024;
originally announced January 2024.
-
Prompting Large Language Models for Topic Modeling
Authors:
Han Wang,
Nirmalendu Prakash,
Nguyen Khoi Hoang,
Ming Shan Hee,
Usman Naseem,
Roy Ka-Wei Lee
Abstract:
Topic modeling is a widely used technique for revealing underlying thematic structures within textual data. However, existing models have certain limitations, particularly when dealing with short text datasets that lack co-occurring words. Moreover, these models often neglect sentence-level semantics, focusing primarily on token-level semantics. In this paper, we propose PromptTopic, a novel topic modeling approach that harnesses the advanced language understanding of large language models (LLMs) to address these challenges. It involves extracting topics at the sentence level from individual documents, then aggregating and condensing these topics into a predefined quantity, ultimately providing coherent topics for texts of varying lengths. This approach eliminates the need for manual parameter tuning and improves the quality of extracted topics. We benchmark PromptTopic against the state-of-the-art baselines on three vastly diverse datasets, establishing its proficiency in discovering meaningful topics. Furthermore, qualitative analysis showcases PromptTopic's ability to uncover relevant topics in multiple datasets.
Submitted 15 December, 2023;
originally announced December 2023.
-
MATK: The Meme Analytical Tool Kit
Authors:
Ming Shan Hee,
Aditi Kumaresan,
Nguyen Khoi Hoang,
Nirmalendu Prakash,
Rui Cao,
Roy Ka-Wei Lee
Abstract:
The rise of social media platforms has brought about a new digital culture called memes. Memes, which combine visuals and text, can strongly influence public opinions on social and cultural issues. As a result, people have become interested in categorizing memes, leading to the development of various datasets and multimodal models that show promising results in this field. However, there is currently no single library that allows for the reproduction, evaluation, and comparison of these models using fair benchmarks and settings. To fill this gap, we introduce the Meme Analytical Tool Kit (MATK), an open-source toolkit specifically designed to support existing meme datasets and cutting-edge multimodal models. MATK aims to assist researchers and engineers in training and reproducing these multimodal models for meme classification tasks, while also providing analysis techniques to gain insights into their strengths and weaknesses. To access MATK, please visit https://github.com/Social-AI-Studio/MATK.
Submitted 10 December, 2023;
originally announced December 2023.
-
PromptMTopic: Unsupervised Multimodal Topic Modeling of Memes using Large Language Models
Authors:
Nirmalendu Prakash,
Han Wang,
Nguyen Khoi Hoang,
Ming Shan Hee,
Roy Ka-Wei Lee
Abstract:
The proliferation of social media has given rise to a new form of communication: memes. Memes are multimodal and often contain a combination of text and visual elements that convey meaning, humor, and cultural significance. While meme analysis has been an active area of research, little work has been done on unsupervised multimodal topic modeling of memes, which is important for content moderation, social media analysis, and cultural studies. We propose PromptMTopic, a novel multimodal prompt-based model designed to learn topics from both text and visual modalities by leveraging the language modeling capabilities of large language models. Our model effectively extracts and clusters topics learned from memes, considering the semantic interaction between the text and visual modalities. We evaluate our proposed model through extensive experiments on three real-world meme datasets, which demonstrate its superiority over state-of-the-art topic modeling baselines in learning descriptive topics in memes. Additionally, our qualitative analysis shows that PromptMTopic can identify meaningful and culturally relevant topics from memes. Our work contributes to the understanding of the topics and themes of memes, a crucial form of communication in today's society. Disclaimer: This paper contains sensitive content that may be disturbing to some readers.
Submitted 10 December, 2023;
originally announced December 2023.
-
ChatGPT as a Math Questioner? Evaluating ChatGPT on Generating Pre-university Math Questions
Authors:
Phuoc Pham Van Long,
Duc Anh Vu,
Nhat M. Hoang,
Xuan Long Do,
Anh Tuan Luu
Abstract:
Mathematical questioning is crucial for assessing students' problem-solving skills. Since manually creating such questions requires substantial effort, automatic methods have been explored. Existing state-of-the-art models rely on fine-tuning strategies and struggle to generate questions that heavily involve multiple steps of logical and arithmetic reasoning. Meanwhile, large language models (LLMs) such as ChatGPT have excelled in many NLP tasks involving logical and arithmetic reasoning. Nonetheless, their applications in generating educational questions are underutilized, especially in the field of mathematics. To bridge this gap, we take the first step to conduct an in-depth analysis of ChatGPT in generating pre-university math questions. Our analysis is categorized into two main settings: context-aware and context-unaware. In the context-aware setting, we evaluate ChatGPT on existing math question-answering benchmarks covering elementary, secondary, and tertiary classes. In the context-unaware setting, we evaluate ChatGPT in generating math questions for each lesson from pre-university math curriculums that we crawl. Our crawling results in TopicMath, a comprehensive and novel collection of pre-university math curriculums collected from 121 math topics and 428 lessons from elementary, secondary, and tertiary classes. Through this analysis, we aim to provide insight into the potential of ChatGPT as a math questioner.
Submitted 27 February, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
Pre-trained Recommender Systems: A Causal Debiasing Perspective
Authors:
Ziqian Lin,
Hao Ding,
Nghia Trong Hoang,
Branislav Kveton,
Anoop Deoras,
Hao Wang
Abstract:
Recent studies on pre-trained vision/language models have demonstrated the practical benefit of a new, promising solution-building paradigm in AI where models can be pre-trained on broad data describing a generic task space and then adapted successfully to solve a wide range of downstream tasks, even when training data is severely limited (e.g., in zero- or few-shot learning scenarios). Inspired by such progress, we investigate in this paper the possibilities and challenges of adapting such a paradigm to the context of recommender systems, which has been less investigated from the perspective of pre-trained models. In particular, we propose to develop a generic recommender that captures universal interaction patterns by training on generic user-item interaction data extracted from different domains, which can then be quickly adapted to improve few-shot learning performance in unseen new domains (with limited data).
However, unlike vision/language data which share strong conformity in the semantic space, universal patterns underlying recommendation data collected across different domains (e.g., different countries or different E-commerce platforms) are often occluded by both in-domain and cross-domain biases implicitly imposed by the cultural differences in their user and item bases, as well as their uses of different e-commerce platforms. As shown in our experiments, such heterogeneous biases in the data tend to hinder the effectiveness of the pre-trained model. To address this challenge, we further introduce and formalize a causal debiasing perspective, which is substantiated via a hierarchical Bayesian deep learning model, named PreRec. Our empirical studies on real-world data show that the proposed model could significantly improve the recommendation performance in zero- and few-shot learning settings under both cross-market and cross-platform scenarios.
Submitted 8 January, 2024; v1 submitted 29 October, 2023;
originally announced October 2023.
-
Promoting Robustness of Randomized Smoothing: Two Cost-Effective Approaches
Authors:
Linbo Liu,
Trong Nghia Hoang,
Lam M. Nguyen,
Tsui-Wei Weng
Abstract:
Randomized smoothing has recently attracted attention in the field of adversarial robustness as a way to provide provable robustness guarantees on smoothed neural network classifiers. However, existing works show that vanilla randomized smoothing usually does not provide good robustness performance and often requires (re)training techniques on the base classifier in order to boost the robustness of the resulting smoothed classifier. In this work, we propose two cost-effective approaches to boost the robustness of randomized smoothing while preserving its clean performance. The first approach introduces a new robust training method, AdvMacer, which combines adversarial training and robustness certification maximization for randomized smoothing. We show that AdvMacer can improve the robustness performance of randomized smoothing classifiers compared to SOTA baselines, while being 3x faster to train than the MACER baseline. The second approach introduces a post-processing method, EsbRS, which greatly improves the robustness certificate based on building model ensembles. We explore different aspects of model ensembles that have not been studied by prior works and propose a novel design methodology to further improve robustness of the ensemble based on our theoretical analysis.
Submitted 11 October, 2023;
originally announced October 2023.
-
Do Compressed LLMs Forget Knowledge? An Experimental Study with Practical Implications
Authors:
Duc N. M Hoang,
Minsik Cho,
Thomas Merth,
Mohammad Rastegari,
Zhangyang Wang
Abstract:
Compressing Large Language Models (LLMs) often leads to reduced performance, especially for knowledge-intensive tasks. In this work, we dive into how compression damages LLMs' inherent knowledge and the possible remedies. We start by proposing two conjectures on the nature of the damage: one is certain knowledge being forgotten (or erased) after LLM compression, hence necessitating the compressed model to (re)learn from data with additional parameters; the other presumes that knowledge is internally displaced and hence one requires merely "inference re-direction" with input-side augmentation such as prompting, to recover the knowledge-related performance. Extensive experiments are then designed to (in)validate the two conjectures. We observe the promise of prompting in comparison to model tuning; we further unlock prompting's potential by introducing a variant called Inference-time Dynamic Prompting (IDP), that can effectively increase prompt diversity without incurring any inference overhead. Our experiments consistently suggest that compared to the classical re-training alternatives such as LoRA, prompting with IDP leads to better or comparable post-compression performance recovery, while saving the extra parameter size by 21x and reducing inference latency by 60%. Our experiments hence strongly endorse the conjecture of "knowledge displaced" over "knowledge forgotten", and shed light on a new efficient mechanism to restore compressed LLM performance. We additionally visualize and analyze the different attention and activation patterns between prompted and re-trained models, demonstrating they achieve performance recovery in two different regimes.
Submitted 16 February, 2024; v1 submitted 1 October, 2023;
originally announced October 2023.
-
Federated Few-shot Learning for Cough Classification with Edge Devices
Authors:
Ngan Dao Hoang,
Dat Tran-Anh,
Manh Luong,
Cong Tran,
Cuong Pham
Abstract:
Automatically classifying cough sounds is one of the most critical tasks for the diagnosis and treatment of respiratory diseases. However, collecting a large labeled cough dataset is challenging, mainly due to high labor costs, data scarcity, and privacy concerns. In this work, our aim is to develop a framework that can effectively perform cough classification even when large amounts of cough data are not available, while also addressing privacy concerns. Specifically, we formulate a new problem to tackle these challenges and adopt few-shot learning and federated learning to design a novel framework, termed F2LCough, for solving the newly formulated problem. We illustrate the superiority of our method compared with other approaches on the COVID-19 Thermal Face & Cough dataset, on which F2LCough achieves an average F1-Score of 86%. Our results show the feasibility of few-shot learning combined with federated learning to build a classification model of cough sounds. This new methodology is able to classify cough sounds in data-scarce situations and maintain privacy properties. The outcomes of this work can be a fundamental framework for building support systems for the detection and diagnosis of cough-related diseases.
Submitted 3 September, 2023;
originally announced September 2023.
-
Time-to-Pattern: Information-Theoretic Unsupervised Learning for Scalable Time Series Summarization
Authors:
Alireza Ghods,
Trong Nghia Hoang,
Diane Cook
Abstract:
Data summarization is the process of generating interpretable and representative subsets from a dataset. Existing time series summarization approaches often search for recurring subsequences using a set of manually devised similarity functions to summarize the data. However, such approaches are fraught with limitations stemming from an exhaustive search coupled with a heuristic definition of series similarity. Such approaches affect the diversity and comprehensiveness of the generated data summaries. To mitigate these limitations, we introduce an approach to time series summarization, called Time-to-Pattern (T2P), which aims to find a set of diverse patterns that together encode the most salient information, following the notion of minimum description length. T2P is implemented as a deep generative model that learns informative embeddings of the discrete time series on a latent space specifically designed to be interpretable. Our synthetic and real-world experiments reveal that T2P discovers informative patterns, even in noisy and complex settings. Furthermore, our results also showcase the improved performance of T2P over previous work in pattern diversity and processing scalability, which conclusively demonstrate the algorithm's effectiveness for time series summarization.
Submitted 25 August, 2023;
originally announced August 2023.
-
Generalized Bradley-Terry Models for Score Estimation from Paired Comparisons
Authors:
Julien Fageot,
Sadegh Farhadkhani,
Lê Nguyên Hoang,
Oscar Villemaud
Abstract:
Many applications, e.g. in content recommendation, sports, or recruitment, leverage the comparisons of alternatives to score those alternatives. The classical Bradley-Terry model and its variants have been widely used to do so. The historical model considers binary comparisons (victory or defeat) between alternatives, while more recent developments allow finer comparisons to be taken into account. In this article, we introduce a probabilistic model encompassing a broad variety of paired comparisons that can take discrete or continuous values. We do so by considering a well-behaved subset of the exponential family, which we call the family of generalized Bradley-Terry (GBT) models, as it includes the classical Bradley-Terry model and many of its variants. Remarkably, we prove that all GBT models are guaranteed to yield a strictly convex negative log-likelihood. Moreover, assuming a Gaussian prior on alternatives' scores, we prove that the maximum a posteriori (MAP) of GBT models, whose existence, uniqueness and fast computation are thus guaranteed, varies monotonically with respect to comparisons (the more A beats B, the better the score of A) and is Lipschitz-resilient with respect to each new comparison (a single new comparison can only have a bounded effect on all the estimated scores). These desirable properties make GBT models appealing for practical use. We illustrate some features of GBT models on simulations.
Submitted 21 February, 2024; v1 submitted 16 August, 2023;
originally announced August 2023.
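The generalized Bradley-Terry abstract above states that, with a Gaussian prior on scores, the MAP estimate exists, is unique, and moves monotonically with the comparisons. The sketch below illustrates this for the classical binary Bradley-Terry model only, not the paper's full GBT family. The function name `bt_map_scores`, the plain gradient-descent solver, and the step size are illustrative assumptions.

```python
import math


def bt_map_scores(n_items, wins, prior_var=1.0, lr=0.1, steps=2000):
    """MAP scores for the classical Bradley-Terry model with a Gaussian prior.

    wins is a list of (winner, loser) index pairs. The objective is
    sum(theta_i^2) / (2 * prior_var) - sum(log sigmoid(theta_w - theta_l)),
    which is strictly convex, so gradient descent reaches the unique minimum
    for a small enough step size.
    """
    theta = [0.0] * n_items
    for _ in range(steps):
        # Gradient of the Gaussian prior term.
        grad = [t / prior_var for t in theta]
        # Gradient of the negative log-likelihood of each comparison.
        for winner, loser in wins:
            p_win = 1.0 / (1.0 + math.exp(-(theta[winner] - theta[loser])))
            grad[winner] -= 1.0 - p_win
            grad[loser] += 1.0 - p_win
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    # Item 0 beats item 1 three times and loses once.
    print(bt_map_scores(2, [(0, 1)] * 3 + [(1, 0)]))
```

Adding further wins for item 0 raises its estimated score, which is the monotonicity property described in the abstract; the prior keeps the scores finite even when one item wins every comparison.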
-
LOFAR detection of extended emission around a mini-halo in the galaxy cluster Abell 1413
Authors:
Giulia Lusetti,
Annalisa Bonafede,
Lorenzo Lovisari,
Myriam Gitti,
Stefano Ettori,
Rossella Cassano,
Christopher J. Riseley,
Federica Govoni,
Marcus Brüggen,
Luca Bruno,
Reinout J. van Weeren,
Andrea Botteon,
Duy N. Hoang,
Fabio Gastaldello,
Alessandro Ignesti,
Mariachiara Rossetti,
Timothy W. Shimwell
Abstract:
The relation between giant radio halos and mini-halos in galaxy clusters is not understood. The former are usually associated with merging clusters, the latter are found in relaxed systems. In recent years, the advent of low-frequency radio observations has challenged this dichotomy, finding intermediate objects with a hybrid radio morphology. We aim to investigate the presence of diffuse radio emission in the cluster Abell 1413 and determine its dynamical status. We used LOFAR HBA observations centred at 144 MHz to study the diffuse emission hosted by this cluster. To investigate the dynamical state of the system, we complement our study with newly analysed XMM-Newton archival data. A1413 shows features that are typically present in both relaxed (e.g., peaked X-ray surface brightness distribution and few large-scale inhomogeneities) and disturbed (e.g., flatter temperature and metallicity profiles) clusters. This evidence supports the scenario that A1413 is neither a disturbed nor a fully relaxed object. We argue that it is an intermediate-phase cluster. Using radio observations at 144 MHz, we discover the presence of a wider diffuse component surrounding the previously reported mini-halo at the cluster centre. By fitting the radio surface brightness profile with a double-exponential model, we can disentangle the two components. We find an inner mini-halo with an e-folding radius r_e1 = 28 kpc and the extended component with r_e2 = 290 kpc. We also performed point-to-point correlations between radio and X-ray surface brightness, finding a sub-linear relation for the outer emission and a super-linear relation for the mini-halo. The mini-halo and the diffuse emission extend over different scales and show different features, confirming the double nature of the radio emission and suggesting that the mechanisms responsible for the re-acceleration of the radio-emitting particles might be different.
Submitted 9 January, 2024; v1 submitted 3 August, 2023;
originally announced August 2023.
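The Abell 1413 abstract above separates the mini-halo from the extended emission with a double-exponential surface brightness model. As a rough sketch, the profile can be written as the sum of two exponentials with the quoted e-folding radii (28 kpc and 290 kpc). The central amplitudes are not given in the abstract, so the values used here are purely hypothetical, as are the function names.

```python
import math


def double_exponential(r_kpc, i1, i2, r_e1=28.0, r_e2=290.0):
    """Surface brightness as the sum of an inner and an outer exponential."""
    return i1 * math.exp(-r_kpc / r_e1) + i2 * math.exp(-r_kpc / r_e2)


def crossover_radius(i1, i2, r_e1=28.0, r_e2=290.0):
    """Radius at which the two components contribute equally.

    Solves i1 * exp(-r / r_e1) = i2 * exp(-r / r_e2) for r. The result is
    positive only when the inner component is brighter at the centre.
    """
    return math.log(i1 / i2) / (1.0 / r_e1 - 1.0 / r_e2)


if __name__ == "__main__":
    # Hypothetical amplitudes: inner component ten times brighter at r = 0.
    r_cross = crossover_radius(10.0, 1.0)
    print(r_cross, double_exponential(r_cross, 10.0, 1.0))
```

Inside the crossover radius the compact mini-halo dominates, while beyond it the profile is set by the extended component, which is why a single exponential cannot describe both scales.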
-
Fox-Neuwirth cells, quantum shuffle algebras, and character sums of the resultant
Authors:
Anh Trong Nam Hoang
Abstract:
We give an upper bound on character sums of the resultant over pairs of monic square-free polynomials of given degrees, answering a question of Ellenberg and Shusterman in the quadratic case. Our approach is topological: we compute the homology of braid groups on multi-punctured planes and prove a vanishing range for the homology of mixed braid groups with rank-1 local coefficients associated to characters of finite fields. Our method involves constructing a cellular stratification for configuration spaces of multi-punctured planes and relating their twisted homology with more general exponential coefficients to the cohomology of certain bimodules over quantum shuffle algebras.
Submitted 2 August, 2023;
originally announced August 2023.
-
A search for inter-cluster filaments with LOFAR and eROSITA
Authors:
D. N. Hoang,
M. Brüggen,
T. W. Shimwell,
A. Botteon,
S. P. O'Sullivan,
T. Pasini,
X. Zhang,
A. Bonafede,
A. Liu,
T. Liu,
G. Brunetti,
E. Bulbul,
G. Di Gennaro,
H. J. A. Röttgering,
T. Vernstrom,
R. J. van Weeren
Abstract:
Cosmological simulations predict the presence of warm-hot thermal gas in the cosmic filaments that connect galaxy clusters. This gas is thought to constitute an important part of the missing baryons in the Universe. In addition to the thermal gas, cosmic filaments could contain a population of relativistic particles and magnetic fields. A detection of magnetic fields in filaments can constrain early magnetogenesis in the cosmos. So far, the resulting diffuse synchrotron emission has only been indirectly detected. We present our search for thermal and non-thermal diffuse emission from inter-cluster regions of 106 paired galaxy clusters by stacking the $0.6-2.3$ keV X-ray and 144 MHz radio data obtained with the eROSITA telescope on board the Spectrum-Roentgen-Gamma (SRG) observatory and the LOw Frequency ARray (LOFAR), respectively. The stacked data do not show the presence of X-ray and radio diffuse emission in the inter-cluster regions. This could be due to the sensitivity of the data sets and/or the limited number of cluster pairs used in this study. Assuming a constant radio emissivity in the filaments, we find that the mean radio emissivity is not higher than $1.2\times10^{-44}\,{\rm erg \, s^{-1} \, cm^{-3} \, Hz^{-1}}$. Under equipartition conditions, our upper limit on the mean emissivity translates to an upper limit of $\sim75\,{\rm nG}$ for the mean magnetic field strength in the filaments, depending on the spectral index and the minimum energy cutoff. We discuss the constraint for the magnetic field strength in the context of the models for the formation of magnetic fields in cosmic filaments.
Submitted 6 June, 2023;
originally announced June 2023.
-
Personalized Federated Domain Adaptation for Item-to-Item Recommendation
Authors:
Ziwei Fan,
Hao Ding,
Anoop Deoras,
Trong Nghia Hoang
Abstract:
Item-to-Item (I2I) recommendation is an important function in most recommendation systems, which generates replacement or complement suggestions for a particular item based on its semantic similarities to other cataloged items. Given that subsets of items in a recommendation system might be co-interacted with by the same set of customers, graph-based models, such as graph neural networks (GNNs), provide a natural framework to combine, ingest and extract valuable insights from such high-order relational interactions between cataloged items, as well as their metadata features, as has been shown in many recent studies. However, learning GNNs effectively for I2I requires ingesting a large amount of relational data, which might not always be available, especially in new, emerging market segments. To mitigate this data bottleneck, we postulate that recommendation patterns learned from existing mature market segments (with private data) could be adapted to build effective warm-start models for emerging ones. To achieve this, we propose and investigate a personalized federated modeling framework based on GNNs to summarize, assemble and adapt recommendation patterns across market segments with heterogeneous customer behaviors into effective local models. Our key contribution is a personalized graph adaptation model that bridges the gap between recent literature on federated GNNs and (non-graph) personalized federated learning, which either does not optimize for the adaptability of the federated model or is restricted to local models with homogeneous parameterization, excluding GNNs with heterogeneous local graphs.
Submitted 5 June, 2023;
originally announced June 2023.
-
Federated Learning of Models Pre-Trained on Different Features with Consensus Graphs
Authors:
Tengfei Ma,
Trong Nghia Hoang,
Jie Chen
Abstract:
Learning an effective global model on private and decentralized datasets has become an increasingly important challenge of machine learning when applied in practice. Existing distributed learning paradigms, such as Federated Learning, enable this via model aggregation which enforces a strong form of modeling homogeneity and synchronicity across clients. This is however not suitable to many practical scenarios. For example, in distributed sensing, heterogeneous sensors reading data from different views of the same phenomenon would need to use different models for different data modalities. Local learning therefore happens in isolation but inference requires merging the local models to achieve consensus. To enable consensus among local models, we propose a feature fusion approach that extracts local representations from local models and incorporates them into a global representation that improves the prediction performance. Achieving this requires addressing two non-trivial problems. First, we need to learn an alignment between similar feature components which are arbitrarily arranged across clients to enable representation aggregation. Second, we need to learn a consensus graph that captures the high-order interactions between local feature spaces and how to combine them to achieve a better prediction. This paper presents solutions to these problems and demonstrates them in real-world applications on time series data such as power grids and traffic networks.
Submitted 1 June, 2023;
originally announced June 2023.
-
Correlation between Macroscopic and Microscopic Relaxation Dynamics of Water: Evidence for Two Liquid Forms
Authors:
Nguyen Q. Vinh,
Luan C. Doan,
Ngoc L. H. Hoang,
Jiarong R. Cui,
Ben Sindle
Abstract:
Water is vital for life, and without it biomolecules and cells cannot maintain their structures and functions. The remarkable properties of water originate from its ability to form hydrogen-bonding networks, whose connectivity constantly changes because of the orientational rotation of individual water molecules. Experimental investigation of the dynamics of water, however, has proven challenging due to the strong absorption of water at terahertz frequencies. In response, by employing a high-precision terahertz spectrometer, we have measured and characterized the terahertz dielectric response of water from supercooled liquid to near the boiling point to explore these motions. The response reveals dynamic relaxation processes corresponding to the collective orientation, single-molecule rotation, and structural rearrangements resulting from breaking and reforming hydrogen bonds in water. We have observed the direct relationship between the macroscopic and microscopic relaxation dynamics of water, and the results have provided evidence of two liquid forms in water with different transition temperatures and thermal activation energies. The results reported here thus provide an unprecedented opportunity to directly test microscopic computational models of water dynamics.
Submitted 31 May, 2023;
originally announced May 2023.
-
Timescales of Chaos in the Inner Solar System: Lyapunov Spectrum and Quasi-integrals of Motion
Authors:
Federico Mogavero,
Nam H. Hoang,
Jacques Laskar
Abstract:
Numerical integrations of the Solar System reveal a remarkable stability of the orbits of the inner planets over billions of years, in spite of their chaotic variations characterized by a Lyapunov time of only 5 million years and the lack of integrals of motion able to constrain their dynamics. To open a window on such long-term behavior, we compute the entire Lyapunov spectrum of a forced secular model of the inner planets. We uncover a hierarchy of characteristic exponents that spans two orders of magnitude, manifesting a slow-fast dynamics with a broad separation of timescales. A systematic analysis of the Fourier harmonics of the Hamiltonian, based on computer algebra, reveals three symmetries that characterize the strongest resonances responsible for the orbital chaos. These symmetries are broken only by weak resonances, leading to the existence of quasi-integrals of motion that are shown to relate to the smallest Lyapunov exponents. A principal component analysis of the orbital solutions independently confirms that the quasi-integrals are among the slowest degrees of freedom of the dynamics. Strong evidence emerges that they effectively constrain the chaotic diffusion of the orbits, playing a crucial role in the statistical stability over the Solar System lifetime.
Submitted 2 May, 2023;
originally announced May 2023.
-
TextANIMAR: Text-based 3D Animal Fine-Grained Retrieval
Authors:
Trung-Nghia Le,
Tam V. Nguyen,
Minh-Quan Le,
Trong-Thuan Nguyen,
Viet-Tham Huynh,
Trong-Le Do,
Khanh-Duy Le,
Mai-Khiem Tran,
Nhat Hoang-Xuan,
Thang-Long Nguyen-Ho,
Vinh-Tiep Nguyen,
Tuong-Nghiem Diep,
Khanh-Duy Ho,
Xuan-Hieu Nguyen,
Thien-Phuc Tran,
Tuan-Anh Yang,
Kim-Phat Tran,
Nhu-Vinh Hoang,
Minh-Quang Nguyen,
E-Ro Nguyen,
Minh-Khoi Nguyen-Nhat,
Tuan-An To,
Trung-Truc Huynh-Le,
Nham-Tan Nguyen,
Hoang-Chau Luong
, et al. (8 additional authors not shown)
Abstract:
3D object retrieval is an important yet challenging task that has drawn more and more attention in recent years. While existing approaches have made strides in addressing this issue, they are often limited to restricted settings such as image and sketch queries, which are often unfriendly interactions for common users. In order to overcome these limitations, this paper presents a novel SHREC challenge track focusing on text-based fine-grained retrieval of 3D animal models. Unlike previous SHREC challenge tracks, the proposed task is considerably more challenging, requiring participants to develop innovative approaches to tackle the problem of text-based retrieval. Despite the increased difficulty, we believe this task can potentially drive useful applications in practice and facilitate more intuitive interactions with 3D objects. Five groups participated in our competition, submitting a total of 114 runs. While the results obtained in our competition are satisfactory, we note that the challenges presented by this task are far from fully solved. As such, we provide insights into potential areas for future research and improvements. We believe we can help push the boundaries of 3D object retrieval and facilitate more user-friendly interactions via vision-language technologies. https://aichallenge.hcmus.edu.vn/textanimar
Submitted 9 August, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
SketchANIMAR: Sketch-based 3D Animal Fine-Grained Retrieval
Authors:
Trung-Nghia Le,
Tam V. Nguyen,
Minh-Quan Le,
Trong-Thuan Nguyen,
Viet-Tham Huynh,
Trong-Le Do,
Khanh-Duy Le,
Mai-Khiem Tran,
Nhat Hoang-Xuan,
Thang-Long Nguyen-Ho,
Vinh-Tiep Nguyen,
Nhat-Quynh Le-Pham,
Huu-Phuc Pham,
Trong-Vu Hoang,
Quang-Binh Nguyen,
Trong-Hieu Nguyen-Mau,
Tuan-Luc Huynh,
Thanh-Danh Le,
Ngoc-Linh Nguyen-Ha,
Tuong-Vy Truong-Thuy,
Truong Hoai Phong,
Tuong-Nghiem Diep,
Khanh-Duy Ho,
Xuan-Hieu Nguyen,
Thien-Phuc Tran
, et al. (9 additional authors not shown)
Abstract:
The retrieval of 3D objects has gained significant importance in recent years due to its broad range of applications in computer vision, computer graphics, virtual reality, and augmented reality. However, the retrieval of 3D objects presents significant challenges due to the intricate nature of 3D models, which can vary in shape, size, and texture, and have numerous polygons and vertices. To this end, we introduce a novel SHREC challenge track that focuses on retrieving relevant 3D animal models from a dataset using sketch queries and expedites accessing 3D models through available sketches. Furthermore, a new dataset named ANIMAR was constructed in this study, comprising a collection of 711 unique 3D animal models and 140 corresponding sketch queries. Our contest requires participants to retrieve 3D models based on complex and detailed sketches. We receive satisfactory results from eight teams and 204 runs. Although further improvement is necessary, the proposed task has the potential to incentivize additional research in the domain of 3D object retrieval, potentially yielding benefits for a wide range of applications. We also provide insights into potential areas of future research, such as improving techniques for feature extraction and matching and creating more diverse datasets to evaluate retrieval performance. https://aichallenge.hcmus.edu.vn/sketchanimar
Submitted 9 August, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Measuring and Evading Turkmenistan's Internet Censorship: A Case Study in Large-Scale Measurements of a Low-Penetration Country
Authors:
Sadia Nourin,
Van Tran,
Xi Jiang,
Kevin Bock,
Nick Feamster,
Nguyen Phong Hoang,
Dave Levin
Abstract:
Since 2006, Turkmenistan has been listed as one of the few Internet enemies by Reporters without Borders due to its extensively censored Internet and strictly regulated information control policies. Existing reports of filtering in Turkmenistan rely on a small number of vantage points or test a small number of websites. Yet, the country's poor Internet adoption rates and small population can make more comprehensive measurement challenging. With a population of only six million people and an Internet penetration rate of only 38%, it is challenging to either recruit in-country volunteers or obtain vantage points to conduct remote network measurements at scale.
We present the largest measurement study to date of Turkmenistan's Web censorship. To do so, we developed TMC, which tests the blocking status of millions of domains across the three foundational protocols of the Web (DNS, HTTP, and HTTPS). Importantly, TMC does not require access to vantage points in the country. We apply TMC to 15.5M domains; our results reveal that Turkmenistan censors more than 122K domains, using different blocklists for each protocol. We also reverse-engineer these censored domains, identifying 6K over-blocking rules causing incidental filtering of more than 5.4M domains. Finally, we use Geneva, an open-source censorship evasion tool, to discover five new censorship evasion strategies that can defeat Turkmenistan's censorship at both transport and application layers. We will publicly release both the data collected by TMC and the code for censorship evasion.
Submitted 17 April, 2023; v1 submitted 10 April, 2023;
originally announced April 2023.
-
Retrieval of material properties of monolayer transition-metal dichalcogenides from magnetoexciton energy spectra
Authors:
Duy-Nhat Ly,
Dai-Nam Le,
Duy-Anh P. Nguyen,
Ngoc-Tram D. Hoang,
Ngoc-Hung Phan,
Hoang-Minh L. Nguyen,
Van-Hoang Le
Abstract:
Reduced exciton mass, polarizability, and dielectric constant of the surrounding medium are essential properties for semiconducting materials, and they have been extracted recently from the magnetoexciton energies. However, the acceptable accuracy of the suggested method requires very high magnetic intensity. Therefore, in the present paper, we propose an alternative method of extracting these material properties from recently available experimental magnetoexciton s-state energies in monolayer transition-metal dichalcogenides (TMDCs). The method is based on the high sensitivity of exciton energies to the material parameters in the Rytova-Keldysh model. It allows us to vary the considered material parameters to get the best fit of the theoretical calculation to the experimental exciton energies for the $1s$, $2s$, and $3s$ states. This procedure gives values of the exciton reduced mass and $2D$ polarizability. Then, the experimental magnetoexciton spectra compared to the theoretical calculation also determine the average dielectric constant. Concrete applications are presented only for monolayers WSe$_2$ and WS$_2$ from the recently available experimental data; however, the presented approach is universal and can be applied to other monolayer TMDCs. The mentioned fitting procedure requires a fast and effective method of solving the Schrödinger equation of an exciton in monolayer TMDCs with a magnetic field. Therefore, we also develop such a method in this paper for highly accurate magnetoexciton energies.
Submitted 24 April, 2023; v1 submitted 14 March, 2023;
originally announced March 2023.
-
Bender-Knuth involutions on linear extensions of posets
Authors:
Judy Hsin-Hui Chiang,
Anh Trong Nam Hoang,
Matthew Kendall,
Ryan Lynch,
Son Nguyen,
Benjamin Przybocki,
Janabel Xia
Abstract:
We study the permutation group $\mathcal{BK}_P$ generated by Bender-Knuth moves on linear extensions of a poset $P$, an analog of the Berenstein-Kirillov group on column-strict tableaux. We explore the group relations, with an emphasis on identifying posets $P$ for which the cactus relations hold in $\mathcal{BK}_P$. We also examine $\mathcal{BK}_P$ as a subgroup of the symmetric group $\mathfrak{S}_{\mathcal{L}(P)}$ on the set of linear extensions of $P$ with the focus on analyzing posets $P$ for which $\mathcal{BK}_P = \mathfrak{S}_{\mathcal{L}(P)}$.
Submitted 24 March, 2024; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Augmenting Rule-based DNS Censorship Detection at Scale with Machine Learning
Authors:
Jacob Brown,
Xi Jiang,
Van Tran,
Arjun Nitin Bhagoji,
Nguyen Phong Hoang,
Nick Feamster,
Prateek Mittal,
Vinod Yegneswaran
Abstract:
The proliferation of global censorship has led to the development of a plethora of measurement platforms to monitor and expose it. Censorship of the domain name system (DNS) is a key mechanism used across different countries. It is currently detected by applying heuristics to samples of DNS queries and responses (probes) for specific destinations. These heuristics, however, are both platform-specific and have been found to be brittle when censors change their blocking behavior, necessitating a more reliable automated process for detecting censorship.
In this paper, we explore how machine learning (ML) models can (1) help streamline the detection process, (2) improve the potential of using large-scale datasets for censorship detection, and (3) discover new censorship instances and blocking signatures missed by existing heuristic methods. Our study shows that supervised models, trained using expert-derived labels on instances of known anomalies and possible censorship, can learn the detection heuristics employed by different measurement platforms. More crucially, we find that unsupervised models, trained solely on uncensored instances, can identify new instances and variations of censorship missed by existing heuristics. Moreover, both methods demonstrate the capability to uncover a substantial number of new DNS blocking signatures, i.e., injected fake IP addresses overlooked by existing heuristics. These results are underpinned by an important methodological finding: comparing the outputs of models trained using the same probes but with labels arising from independent processes allows us to more reliably detect cases of censorship in the absence of ground-truth labels of censorship.
Submitted 15 June, 2023; v1 submitted 3 February, 2023;
originally announced February 2023.
-
The Planck clusters in the LOFAR sky. II. LoTSS-DR2: Recovering diffuse extended emission with LOFAR
Authors:
L. Bruno,
G. Brunetti,
A. Botteon,
V. Cuciti,
D. Dallacasa,
R. Cassano,
R. J. van Weeren,
T. Shimwell,
G. Taffoni,
S. A. Russo,
A. Bonafede,
M. Brüggen,
D. N. Hoang,
H. J. A. Röttgering,
C. Tasse
Abstract:
Extended radio sources in the sky require a dense sampling of short baselines to be properly imaged by interferometers. This problem arises in many areas of radio astronomy, such as in the study of galaxy clusters, which may host Mpc-scale diffuse synchrotron sources in the form of radio halos. In clusters where no radio halos are detected, owing to intrinsic absence of emission or extrinsic (instrumental and/or observational) effects, it is possible to determine upper limits. We consider a sample of Planck galaxy clusters from the Second Data Release of the LOFAR Two Meter Sky Survey (LoTSS-DR2) where no radio halos are detected. We use this sample to test the capabilities of LOFAR to recover diffuse extended emission and derive upper limits. Through the injection technique, we simulate radio halos with various surface brightness profiles. We then predict the corresponding visibilities and image them along with the real visibilities. This method allows us to test the fraction of flux density losses owing to inadequate uv-coverage and obtain thresholds at which the mock emission becomes undetectable by visual inspection. The dense uv-coverage of LOFAR at short spacings allows us to recover $\gtrsim90\%$ of the flux density of targets with sizes up to $\sim 15'$. We find a relation that provides upper limits based on the image noise and extent (in terms of number of beams) of the mock halo. This relation can be safely adopted to obtain upper limits without injecting when artifacts introduced by the subtraction of the discrete sources are negligible in the central region of the cluster. Otherwise, the injection process and visual inspection of the images are necessary to determine more reliable limits. Through these methods, we obtain upper limits for 75 clusters to be exploited in ongoing statistical studies.
Submitted 31 January, 2023; v1 submitted 19 January, 2023;
originally announced January 2023.
-
The Planck clusters in the LOFAR sky VI. LoTSS-DR2: Properties of radio relics
Authors:
A. Jones,
F. de Gasperin,
V. Cuciti,
A. Botteon,
X. Zhang,
F. Gastaldello,
T. Shimwell,
A. Simionescu,
M. Rossetti,
R. Cassano,
H. Akamatsu,
A. Bonafede,
M. Brüggen,
G. Brunetti,
L. Camillini,
G. Di Gennaro,
A. Drabent,
D. N. Hoang,
K. Rajpurohit,
R. Natale,
C. Tasse,
R. J. van Weeren
Abstract:
Context. It is well-established that shock waves in the intracluster medium launched by galaxy cluster mergers can produce synchrotron emission, which is visible to us at radio frequencies as radio relics. However, the particle acceleration mechanism producing these relics is still not fully understood. It is also unclear how relics relate to radio halos, which trace merger-induced turbulence in the intracluster medium. Aims. We aim to perform the first statistical analysis of radio relics in a mass-selected sample of galaxy clusters, using homogeneous observations. Methods. We analysed all relics observed by the Low Frequency Array Two Metre Sky Survey Data Release 2 (LoTSS DR2) at 144 MHz, hosted by galaxy clusters in the second Planck catalogue of SZ sources (PSZ2). We measured and compared the relic properties in a uniform, unbiased way. In particular, we developed a method to describe the characteristic downstream width in a statistical manner. Additionally, we searched for differences between radio relic-hosting clusters with and without radio halos. Results. We find that, in our sample, $\sim$ 10% of galaxy clusters host at least one radio relic. We confirm previous findings, at higher frequencies, of a correlation between the relic-cluster centre distance and the longest linear size, as well as the radio relic power and cluster mass. However, our findings suggest that we are still missing a population of low-power relics. We also find that relics are wider than theoretically expected, even with optimistic downstream conditions. Finally, we do not find evidence of a single property that separates relic-hosting clusters with and without radio halos.
Submitted 18 January, 2023;
originally announced January 2023.
-
V-LoTSS: The Circularly-Polarised LOFAR Two-metre Sky Survey
Authors:
J. R. Callingham,
T. W. Shimwell,
H. K. Vedantham,
C. G. Bassa,
S. P. O'Sullivan,
T. W. H. Yiu,
S. Bloot,
P. N. Best,
M. J. Hardcastle,
M. Haverkorn,
R. D. Kavanagh,
L. Lamy,
B. J. S. Pope,
H. J. A. Röttgering,
D. J. Schwarz,
C. Tasse,
R. J. van Weeren,
G. J. White,
P. Zarka,
D. J. Bomans,
A. Bonafede,
M. Bonato,
A. Botteon,
M. Bruggen,
K. T. Chyży
, et al. (22 additional authors not shown)
Abstract:
We present the detection of 68 sources from the most sensitive radio survey in circular polarisation conducted to date. We use the second data release of the 144 MHz LOFAR Two-metre Sky Survey to produce circularly-polarised maps with median 140 $μ$Jy beam$^{-1}$ noise and resolution of 20$''$ for $\approx$27% of the northern sky (5634 deg$^{2}$). The leakage of total intensity into circular polarisation is measured to be $\approx$0.06%, and our survey is complete at flux densities $\geq1$ mJy. A detection is considered reliable when the circularly-polarised fraction exceeds 1%. We find the population of circularly-polarised sources is composed of four distinct classes: stellar systems, pulsars, active galactic nuclei, and sources unidentified in the literature. The stellar systems can be further separated into chromospherically-active stars, M dwarfs, and brown dwarfs. Based on the circularly-polarised fraction and lack of an optical counterpart, we show it is possible to infer whether the unidentified sources are likely unknown pulsars or brown dwarfs. By the completion of this survey of the northern sky, we expect to detect 300$\pm$100 circularly-polarised sources.
Submitted 19 December, 2022;
originally announced December 2022.
-
Tournesol: Permissionless Collaborative Algorithmic Governance with Security Guarantees
Authors:
Lê Nguyên Hoang,
Romain Beylerian,
Bérangère Colbois,
Julien Fageot,
Louis Faucon,
Aidan Jungo,
Alain Le Noac'h,
Adrien Matissart,
Oscar Villemaud
Abstract:
Recommendation algorithms play an increasingly central role in our information ecosystem. Yet, so far, they are mostly designed, parameterized and updated unilaterally by private groups or governmental authorities, based on insecure data from increasingly many fake accounts. In this paper, we present an end-to-end permissionless collaborative algorithmic governance pipeline with security guarantees, which is deployed on the open-source platform https://tournesol.app. Our pipeline has essentially four steps. First, voting rights are assigned to the contributors, based on Sybil-resilient email domains and on a novel secure trust propagation algorithm. Second, a generalized Bradley-Terry model turns contributors' pairwise alternative comparisons into scores. Third, contributors' scores are collaboratively scaled, by an adaptation of the robust sparse voting solution Mehestan. Finally, scaled scores are post-processed and securely aggregated into human-readable global scores, which are used for recommendation and display. We believe that our pipeline lays an appealing foundation for any collaborative, effective, scalable, fair, interpretable and secure algorithmic governance.
Submitted 15 August, 2023; v1 submitted 30 October, 2022;
originally announced November 2022.
-
Robust Collaborative Learning with Linear Gradient Overhead
Authors:
Sadegh Farhadkhani,
Rachid Guerraoui,
Nirupam Gupta,
Lê Nguyên Hoang,
Rafael Pinot,
John Stephan
Abstract:
Collaborative learning algorithms, such as distributed SGD (or D-SGD), are prone to faulty machines that may deviate from their prescribed algorithm because of software or hardware bugs, poisoned data or malicious behaviors. While many solutions have been proposed to enhance the robustness of D-SGD to such machines, previous works either resort to strong assumptions (trusted server, homogeneous data, specific noise model) or impose a gradient computational cost that is several orders of magnitude higher than that of D-SGD. We present MoNNA, a new algorithm that (a) is provably robust under standard assumptions and (b) has a gradient computation overhead that is linear in the fraction of faulty machines, which is conjectured to be tight. Essentially, MoNNA uses Polyak's momentum of local gradients for local updates and nearest-neighbor averaging (NNA) for global mixing, respectively. While MoNNA is rather simple to implement, its analysis has been more challenging and relies on two key elements that may be of independent interest. Specifically, we introduce the mixing criterion of $(α, λ)$-reduction to analyze the non-linear mixing of non-faulty machines, and present a way to control the tension between the momentum and the model drifts. We validate our theory by experiments on image classification and make our code available at https://github.com/LPD-EPFL/robust-collaborative-learning.
Submitted 3 June, 2023; v1 submitted 22 September, 2022;
originally announced September 2022.
-
Fox-Neuwirth cells, quantum shuffle algebras, and the homology of type-B Artin groups
Authors:
Anh Trong Nam Hoang
Abstract:
In this paper, we will develop a family of braid representations of Artin groups of type B from braided vector spaces, and identify the homology of these groups with these coefficients with the cohomology of a specific bimodule over a quantum shuffle algebra. As an application, we give a complete characterization of the homology of type-B Artin groups with coefficients in one-dimensional braid representations over a field of characteristic 0. We will also discuss two different approaches to this computation: the first method extends a computation of the homology of braid groups due to Ellenberg-Tran-Westerland by means of induced representation, while the second method involves constructing a cellular stratification for configuration spaces of the punctured complex plane.
Submitted 18 February, 2024; v1 submitted 25 July, 2022;
originally announced July 2022.
-
Robust Multivariate Time-Series Forecasting: Adversarial Attacks and Defense Mechanisms
Authors:
Linbo Liu,
Youngsuk Park,
Trong Nghia Hoang,
Hilaf Hasson,
Jun Huan
Abstract:
This work studies the threats of adversarial attack on multivariate probabilistic forecasting models and viable defense mechanisms. Our studies discover a new attack pattern that negatively impacts the forecasting of a target time series via making strategic, sparse (imperceptible) modifications to the past observations of a small number of other time series. To mitigate the impact of such attacks, we have developed two defense strategies. First, we extend a previously developed randomized smoothing technique in classification to multivariate forecasting scenarios. Second, we develop an adversarial training algorithm that learns to create adversarial examples and at the same time optimizes the forecasting model to improve its robustness against such adversarial simulation. Extensive experiments on real-world datasets confirm that our attack schemes are powerful and our defense algorithms are more effective compared with baseline defense mechanisms.
Submitted 14 April, 2023; v1 submitted 19 July, 2022;
originally announced July 2022.
-
Generalized explicit pseudo two-step Runge-Kutta-Nyström methods for solving second-order initial value problems
Authors:
Nguyen S. Hoang
Abstract:
A class of explicit pseudo two-step Runge-Kutta-Nyström (GEPTRKN) methods for solving second-order initial value problems $y'' = f(t,y,y')$, $y(t_0) = y_0$, $y'(t_0)=y'_0$ has been studied. This new class of methods can be considered a generalized version of the class of classical explicit pseudo two-step Runge-Kutta-Nyström methods. We proved that an $s$-stage GEPTRKN method has step order of accuracy $p=s$ and stage order of accuracy $r=s$ for any set of distinct collocation parameters $(c_i)_{i=1}^s$. Super-convergence for order of accuracy of these methods can be obtained if the collocation parameters $(c_i)_{i=1}^s$ satisfy some orthogonality conditions. We proved that an $s$-stage GEPTRKN method can attain order of accuracy $p=s+2$. Numerical experiments have shown that the new methods work better than classical methods for solving non-stiff problems even on sequential computing environments. By their structures, the new methods will be much more efficient when implemented on parallel computers.
Submitted 17 July, 2022;
originally announced July 2022.
-
The NuSTAR and Chandra view of CL 0217+70 and Its Tell-Tale Radio Halo
Authors:
Ayşegül Tümer,
Daniel R. Wik,
Xiaoyuan Zhang,
Duy N. Hoang,
Massimo Gaspari,
Reinout J. van Weeren,
Lawrence Rudnick,
Chiara Stuardi,
François Mernier,
Aurora Simionescu,
Randall A. Rojas Bolivar,
Ralph Kraft,
Hiroki Akamatsu,
Jelle de Plaa
Abstract:
Mergers of galaxy clusters are the most energetic events in the universe, driving shock and cold fronts, generating turbulence, and accelerating particles that create radio halos and relics. The galaxy cluster CL 0217+70 is a remarkable late stage merger, with a double peripheral radio relic and a giant radio halo. A Chandra study detects surface brightness edges that correspond to radio features within the halo. In this work, we present a study of this cluster with NuSTAR and Chandra data using spectro-imaging methods. The global temperature is found to be kT = 9.1 keV. We set an upper limit for the IC flux of ~2.7x10^(-12) erg s^(-1) cm^(-2), and a lower limit to the magnetic field of 0.08 microG. Our local IC search revealed a possibility that IC emission may have a significant contribution at the outskirts of a radio halo emission and on/near shock regions within ~0.6 r500 of clusters. We detected a "hot spot" feature in our temperature map coincident with a surface brightness edge, but our investigation on its origin is inconclusive. If the "hot spot" is the downstream of a shock, we set a lower limit of kT > 21 keV to the plasma, that corresponds to M~2. We found three shock fronts within 0.5 r500. Multiple weak shocks within the cluster center hint at an ongoing merger activity and continued feeding of the giant radio halo. CL 0217+70 is the only example hosting these secondary shocks in multiple form.
Submitted 8 November, 2022; v1 submitted 18 June, 2022;
originally announced June 2022.
-
Diffuse radio emission from non-Planck galaxy clusters in the LoTSS-DR2 fields
Authors:
D. N. Hoang,
M. Brüggen,
A. Botteon,
T. W. Shimwell,
X. Zhang,
A. Bonafede,
L. Bruno,
E. Bonnassieux,
R. Cassano,
V. Cuciti,
A. Drabent,
F. de Gasperin,
F. Gastaldello,
G. Di Gennaro,
M. Hoeft,
A. Jones,
G. V. Pignataro,
H. J. A. Röttgering,
A. Simionescu,
R. J. van Weeren
Abstract:
The presence of large-scale magnetic fields and ultra-relativistic electrons in the intra-cluster medium (ICM) is confirmed through the detection of diffuse radio synchrotron sources, so-called radio halos and relics. Due to their steep-spectrum nature, these sources are rarely detected at frequencies above a few GHz, especially in low-mass systems. The aim of this study is to discover and characterise diffuse radio sources in low-mass galaxy clusters in order to understand their origin and their scaling with host cluster properties. We searched for cluster-scale radio emission from low-mass galaxy clusters in the Low Frequency Array (LOFAR) Two-metre Sky Survey - Data Release 2 (LoTSS-DR2) fields. We made use of existing optical (Abell, DESI, WHL) and X-ray (comPRASS, MCXC) catalogues. The LoTSS-DR2 data were processed further to improve the quality of the images that are used to detect and characterise diffuse sources. We have detected diffuse radio emission in 28 galaxy clusters. The numbers of confirmed (candidate) halos and relics are six (seven) and 10 (three), respectively. Among these, 11 halos and 10 relics, including candidates, are newly discovered by LOFAR. Besides these, five diffuse sources are detected in tailed radio galaxies and are probably associated with mergers during the formation of the host clusters. We are unable to classify 13 other diffuse sources. We compare our newly detected diffuse sources to known sources by placing them on the scaling relation between the radio power and the mass of the host clusters.
Submitted 9 June, 2022;
originally announced June 2022.
-
Particle re-acceleration and diffuse radio sources in the galaxy cluster Abell 1550
Authors:
T. Pasini,
H. W. Edler,
M. Brüggen,
F. de Gasperin,
A. Botteon,
K. Rajpurohit,
R. J. van Weeren,
F. Gastaldello,
M. Gaspari,
G. Brunetti,
V. Cuciti,
C. Nanci,
G. di Gennaro,
M. Rossetti,
D. Dallacasa,
D. N. Hoang,
C. J. Riseley
Abstract:
We study diffuse radio emission in the galaxy cluster A1550, with the aim of constraining particle re-acceleration in the intra-cluster medium. We exploit observations at four different frequencies: 54, 144, 400 and 1400 MHz. To complement our analysis, we make use of archival Chandra X-ray data. At all frequencies we detect an ultra-steep spectrum radio halo ($S_ν\propto ν^{-1.6}$) with an extent of 1.2 Mpc at 54 MHz. Its morphology follows the distribution of the thermal intra-cluster medium inferred from the Chandra observation. West of the centrally located head-tail radio galaxy, we detect a radio relic with a projected extent of 500 kpc. From the relic, a 600 kpc long bridge departs and connects it to the halo. Between the relic and the radio galaxy, we observe what is most likely a radio phoenix, given its curved spectrum. The phoenix is connected to the tail of the radio galaxy through two arms, which show a nearly constant spectral index for 300 kpc. The halo could be produced by turbulence induced by a major merger, with its axis lying in the NE-SW direction. This is supported by the position of the relic, whose origin could be attributed to a shock propagating along the merger axis. It is possible that the same shock has also produced the phoenix through adiabatic compression, while the bridge could be generated by electrons that were pre-accelerated by the shock and then re-accelerated by turbulence. Finally, we detect hints of gentle re-energisation in the two arms that depart from the tail of the radio galaxy.
Submitted 24 May, 2022;
originally announced May 2022.