-
The Two Sides of the Coin: Hallucination Generation and Detection with LLMs as Evaluators for LLMs
Authors:
Anh Thu Maria Bui,
Saskia Felizitas Brech,
Natalie Hußfeldt,
Tobias Jennert,
Melanie Ullrich,
Timo Breuer,
Narjes Nikzad Khasmakhi,
Philipp Schaer
Abstract:
Hallucination detection in Large Language Models (LLMs) is crucial for ensuring their reliability. This work presents our participation in the CLEF ELOQUENT HalluciGen shared task, where the goal is to develop evaluators for both generating and detecting hallucinated content. We explored the capabilities of four LLMs: Llama 3, Gemma, GPT-3.5 Turbo, and GPT-4, for this purpose. We also employed ensemble majority voting to incorporate all four models for the detection task. The results provide valuable insights into the strengths and weaknesses of these LLMs in handling hallucination generation and detection tasks.
Submitted 12 July, 2024;
originally announced July 2024.
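The ensemble majority voting mentioned in the abstract above could look roughly like the following. This is a minimal sketch, not the authors' implementation: the label strings, the `majority_vote` helper, and the tie-breaking rule are all assumptions made for illustration (with four voters a 2-2 tie is possible, and the abstract does not say how ties are resolved).

```python
from collections import Counter


def majority_vote(predictions, tie_label="hallucination"):
    """Return the label predicted by most models.

    `tie_label` is a hypothetical fallback used when the top two labels
    receive the same number of votes (e.g. a 2-2 split among four models).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    ranked = Counter(predictions).most_common()
    # A tie exists when the two most frequent labels have equal counts.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


# Hypothetical per-model outputs for a single input; the model names come
# from the abstract, the labels are made up for this example.
votes = {
    "Llama 3": "hallucination",
    "Gemma": "not_hallucination",
    "GPT-3.5 Turbo": "hallucination",
    "GPT-4": "hallucination",
}
final_label = majority_vote(list(votes.values()))
```

Falling back to a fixed label on ties is only one option; deferring to a single designated model would be an equally plausible choice.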
-
Good things come in three: Generating SO Post Titles with Pre-Trained Models, Self Improvement and Post Ranking
Authors:
Duc Anh Le,
Anh M. T. Bui,
Phuong T. Nguyen,
Davide Di Ruscio
Abstract:
Stack Overflow is a prominent Q and A forum, supporting developers in seeking suitable resources on programming-related matters. Having high-quality question titles is an effective means to attract developers' attention. Unfortunately, this is often underestimated, leaving room for improvement. Research has been conducted, predominantly leveraging pre-trained models to generate titles from code snippets and problem descriptions. Yet, getting high-quality titles is still a challenging task, attributed to both the quality of the input data (e.g., containing noise and ambiguity) and inherent constraints in sequence generation models. In this paper, we present FILLER as a solution to generating Stack Overflow post titles using a fine-tuned language model with self-improvement and post ranking. Our study focuses on enhancing pre-trained language models for generating titles for Stack Overflow posts, employing a training and subsequent fine-tuning paradigm for these models. To this end, we integrate the model's predictions into the training process, enabling it to learn from its errors, thereby lessening the effects of exposure bias. Moreover, we apply a post-ranking method to produce a variety of sample candidates, subsequently selecting the most suitable one. To evaluate FILLER, we perform experiments using benchmark datasets, and the empirical findings indicate that our model provides high-quality recommendations. Moreover, it significantly outperforms all the baselines, including Code2Que, SOTitle, CCBERT, M3NSCT5, and GPT3.5-turbo. A user study also shows that FILLER provides more relevant titles, with respect to SOTitle and GPT3.5-turbo.
Submitted 21 June, 2024;
originally announced June 2024.
-
Automating Attendance Management in Human Resources: A Design Science Approach Using Computer Vision and Facial Recognition
Authors:
Bao-Thien Nguyen-Tat,
Minh-Quoc Bui,
Vuong M. Ngo
Abstract:
Haar Cascade is a cost-effective and user-friendly machine learning-based algorithm for detecting objects in images and videos. Unlike Deep Learning algorithms, which typically require significant resources and expensive computing costs, it uses simple image processing techniques like edge detection and Haar features that are easy to comprehend and implement. By combining Haar Cascade with OpenCV2 on an embedded computer like the NVIDIA Jetson Nano, this system can accurately detect and match faces in a database for attendance tracking. This system aims to achieve several specific objectives that set it apart from existing solutions. It leverages Haar Cascade, enriched with carefully selected Haar features, such as Haar-like wavelets, and employs advanced edge detection techniques. These techniques enable precise face detection and matching in both images and videos, contributing to high accuracy and robust performance. By doing so, it minimizes manual intervention and reduces errors, thereby strengthening accountability. Additionally, the integration of OpenCV2 and the NVIDIA Jetson Nano optimizes processing efficiency, making it suitable for resource-constrained environments. This system caters to a diverse range of educational institutions, including schools, colleges, vocational training centers, and various workplace settings such as small businesses, offices, and factories. ... The system's affordability and efficiency democratize attendance management technology, making it accessible to a broader audience. Consequently, it has the potential to transform attendance tracking and management practices, ultimately leading to heightened productivity and accountability. In conclusion, this system represents a groundbreaking approach to attendance tracking and management...
Submitted 21 May, 2024;
originally announced May 2024.
-
The Trade-off between Performance, Efficiency, and Fairness in Adapter Modules for Text Classification
Authors:
Minh Duc Bui,
Katharina von der Wense
Abstract:
Current natural language processing (NLP) research tends to focus on only one or, less frequently, two dimensions - e.g., performance, privacy, fairness, or efficiency - at a time, which may lead to suboptimal conclusions and often overlooks the broader goal of achieving trustworthy NLP. Work on adapter modules (Houlsby et al., 2019; Hu et al., 2021) focuses on improving performance and efficiency, with no investigation of unintended consequences on other aspects such as fairness. To address this gap, we conduct experiments on three text classification datasets by either (1) finetuning all parameters or (2) using adapter modules. Regarding performance and efficiency, we confirm prior findings that the accuracy of adapter-enhanced models is roughly on par with that of fully finetuned models, while training time is substantially reduced. Regarding fairness, we show that adapter modules result in mixed fairness across sensitive groups. Further investigation reveals that, when the standard finetuned model exhibits limited biases, adapter modules typically do not introduce extra bias. On the other hand, when the finetuned model exhibits increased bias, the impact of adapter modules on bias becomes more unpredictable, introducing the risk of significantly magnifying these biases for certain groups. Our findings highlight the need for a case-by-case evaluation rather than a one-size-fits-all judgment.
Submitted 3 May, 2024;
originally announced May 2024.
-
Knowledge Distillation vs. Pretraining from Scratch under a Fixed (Computation) Budget
Authors:
Minh Duc Bui,
Fabian David Schmidt,
Goran Glavaš,
Katharina von der Wense
Abstract:
Compared to standard language model (LM) pretraining (i.e., from scratch), Knowledge Distillation (KD) entails an additional forward pass through a teacher model that is typically substantially larger than the target student model. As such, KD in LM pretraining materially slows down throughput of pretraining instances vis-a-vis pretraining from scratch. Scaling laws of LM pretraining suggest that smaller models can close the gap to larger counterparts if trained on more data (i.e., processing more tokens), and under a fixed computation budget, smaller models are able to process more data than larger models. We thus hypothesize that KD might, in fact, be suboptimal to pretraining from scratch for obtaining smaller LMs, when appropriately accounting for the compute budget. To test this, we compare pretraining from scratch against several KD strategies for masked language modeling (MLM) in a fair experimental setup, with respect to amount of computation as well as pretraining data. Downstream results on GLUE, however, do not confirm our hypothesis: while pretraining from scratch performs comparably to ordinary KD under a fixed computation budget, more sophisticated KD strategies, namely TinyBERT (Jiao et al., 2020) and MiniLM (Wang et al., 2023), outperform it by a notable margin. We further find that KD yields larger gains over pretraining from scratch when the data must be repeated under the fixed computation budget.
Submitted 30 April, 2024;
originally announced April 2024.
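To make the extra teacher forward pass concrete, the sketch below computes a classic soft-target distillation loss from a teacher's and a student's logits. It is an illustrative assumption, not the paper's setup: the abstract does not specify the loss used for "ordinary KD", and TinyBERT and MiniLM use different objectives. The function names, the temperature value, and the example logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The teacher logits are what costs the additional forward pass: they
    must be produced by the (larger) teacher for every training instance.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0  # terms with zero teacher probability contribute nothing
    )
    return temperature ** 2 * kl


# Hypothetical logits over a three-token vocabulary at one masked position.
loss = distillation_loss(student_logits=[0.5, 1.0, 2.0], teacher_logits=[0.1, 0.9, 3.0])
```

In a from-scratch run this term is absent and only the MLM cross-entropy against the true tokens is computed, which is why the same compute budget covers more training instances.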
-
Real-time guidewire tracking and segmentation in intraoperative x-ray
Authors:
Baochang Zhang,
Mai Bui,
Cheng Wang,
Felix Bourier,
Heribert Schunkert,
Nassir Navab
Abstract:
During endovascular interventions, physicians have to perform accurate and immediate operations based on the available real-time information, such as the shape and position of guidewires observed on the fluoroscopic images, haptic information and the patients' physiological signals. For this purpose, real-time and accurate guidewire segmentation and tracking can enhance the visualization of guidewires and provide visual feedback for physicians during the intervention as well as for robot-assisted interventions. Nevertheless, this task often comes with the challenge of elongated deformable structures that present themselves with low contrast in the noisy fluoroscopic image sequences. To address these issues, a two-stage deep learning framework for real-time guidewire segmentation and tracking is proposed. In the first stage, a YOLOv5s detector is trained on the original X-ray images as well as synthetic ones and is employed to output the bounding boxes of possible target guidewires. More importantly, a refinement module based on spatiotemporal constraints is incorporated to robustly localize the guidewire and remove false detections. In the second stage, a novel and efficient network is proposed to segment the guidewire in each detected bounding box. The network contains two major modules, namely a Hessian-based enhancement embedding module and a dual self-attention module. Quantitative and qualitative evaluations on clinical intra-operative images demonstrate that the proposed approach significantly outperforms our baselines as well as the current state of the art and, in comparison, shows higher robustness to low quality images.
Submitted 12 April, 2024;
originally announced April 2024.
-
Solving Combinatorial Pricing Problems using Embedded Dynamic Programming Models
Authors:
Quang Minh Bui,
Margarida Carvalho,
José Neto
Abstract:
The combinatorial pricing problem (CPP) is a bilevel problem in which the leader maximizes their revenue by imposing tolls on certain items that they can control. Based on the tolls set by the leader, the follower selects a subset of items corresponding to an optimal solution of a combinatorial optimization problem. To accomplish the leader's goal, the tolls need to be sufficiently low to discourage the follower from choosing the items offered by the competitors. In this paper, we derive a single-level reformulation for the CPP by rewriting the follower's problem as a longest path problem using a dynamic programming model, and then taking its dual and applying strong duality. We proceed to solve the reformulation in a dynamic fashion with a cutting plane method. We apply this methodology to 2 distinct dynamic programming models, namely, a novel formulation designated as selection diagram and the well-known decision diagram. We also produce numerical results to evaluate their performances across 3 different specializations of the CPP and a closely related problem that is the knapsack interdiction problem. Our results showcase the potential of the 2 proposed reformulations over the natural value function approach, expanding the set of tools to solve combinatorial bilevel programs.
Submitted 19 March, 2024;
originally announced March 2024.
-
LEGION: Harnessing Pre-trained Language Models for GitHub Topic Recommendations with Distribution-Balance Loss
Authors:
Yen-Trang Dang,
Thanh-Le Cong,
Phuc-Thanh Nguyen,
Anh M. T. Bui,
Phuong T. Nguyen,
Bach Le,
Quyet-Thang Huynh
Abstract:
Open-source development has revolutionized the software industry by promoting collaboration, transparency, and community-driven innovation. Today, a vast amount of various kinds of open-source software, which form networks of repositories, is often hosted on GitHub - a popular software development platform. To enhance the discoverability of the repository networks, i.e., groups of similar repositories, GitHub introduced repository topics in 2017 that enable users to more easily explore relevant projects by type, technology, and more. It is thus crucial to accurately assign topics for each GitHub repository. Current methods for automatic topic recommendation rely heavily on TF-IDF for encoding textual data, presenting challenges in understanding semantic nuances. This paper addresses the limitations of existing techniques by proposing Legion, a novel approach that leverages Pre-trained Language Models (PTMs) for recommending topics for GitHub repositories. The key novelty of Legion is three-fold. First, Legion leverages the extensive capabilities of PTMs in language understanding to capture contextual information and semantic meaning in GitHub repositories. Second, Legion overcomes the challenge of long-tailed distribution, which results in a bias toward popular topics in PTMs, by proposing a Distribution-Balanced Loss (DB Loss) to better train the PTMs. Third, Legion employs a filter to eliminate vague recommendations, thereby improving the precision of PTMs. Our empirical evaluation on a benchmark dataset of real-world GitHub repositories shows that Legion can improve vanilla PTMs by up to 26% on recommending GitHub topics. Legion can also suggest GitHub topics more precisely and effectively than the state-of-the-art baseline, with an average improvement of 20% and 5% in terms of Precision and F1-score, respectively.
Submitted 9 March, 2024;
originally announced March 2024.
-
Density-Regression: Efficient and Distance-Aware Deep Regressor for Uncertainty Estimation under Distribution Shifts
Authors:
Ha Manh Bui,
Anqi Liu
Abstract:
Modern deep ensemble techniques achieve strong uncertainty estimation performance by going through multiple forward passes with different models. This comes at the price of high storage requirements and slow inference (test) time. To address this issue, we propose Density-Regression, a method that leverages the density function in uncertainty estimation and achieves fast inference with a single forward pass. We prove it is distance aware on the feature space, which is a necessary condition for a neural network to produce high-quality uncertainty estimation under distribution shifts. Empirically, we conduct experiments on regression tasks with the cubic toy dataset, benchmark UCI, weather forecast with time series, and depth estimation under real-world shifted applications. We show that Density-Regression has competitive uncertainty estimation performance under distribution shifts with modern deep regressors while using a lower model size and a faster inference speed.
Submitted 7 March, 2024;
originally announced March 2024.
-
Resonant Raman scattering of surface phonon polaritons mediated by excitons in WSe$_2$ films
Authors:
L. Zhou,
K. Wirth,
M. N. Bui,
R. Rani,
D. Grützmacher,
T. Taubner,
B. E. Kardynał
Abstract:
Surface phonon polaritons propagating along interfaces of polar dielectrics coexist with excitons in many van der Waals heterostructures, so understanding their mutual interactions is of great interest. Here, we investigate the type I surface phonon polariton of hBN via low-temperature resonant Raman spectroscopy in hBN/WSe$_2$ heterostructures. The resonantly enhanced hBN surface phonon polariton (SPhP) Raman signal, when laser energy is such that the scattered photons have energy close to that of the WSe$_2$ excitons, enables detailed characterization of type I SPhP in hBN even when hBN is one monolayer thick. We find that the measured bandwidth of the SPhP Raman signal depends on the thickness of the hBN layer. We are able to explain the experimental data using transfer matrix method simulations of SPhP dispersions provided that we assume the Raman scattering to be momentum non-conserving, as could be the case if localized WSe$_2$ exciton states participated in the process. We further show that resonant Raman scattering from SiO$_2$ SPhP can also be mediated by WSe$_2$.
Submitted 6 February, 2024;
originally announced February 2024.
-
Landscape of nuclear deformation softness with spherical quasi-particle random phase approximation
Authors:
Le-Anh Nguyen,
Minh-Loc Bui,
Panagiota Papakonstantinou,
Naftali Auerbach
Abstract:
We investigate the stability and softness of nuclei against quadrupole, octupole, and hexadecapole deformation. By applying the spherical Skyrme-force Hartree-Fock Bardeen-Cooper-Schrieffer quasi-particle random phase approximation, we diagnose ground-state deformation when imaginary solutions are obtained, i.e., the spherical ground state "collapses". We also calculate the multipole polarizability in spherical nuclei with no collapse, as a measure of softness. This numerically light and theoretically sound method is found to capture deformation patterns across the nuclide chart. The connection between the intrinsic shape of nuclei and the dynamics of their low-lying collective states is established and the role of shell structure is discussed.
Submitted 11 January, 2024;
originally announced January 2024.
-
DyBluRF: Dynamic Deblurring Neural Radiance Fields for Blurry Monocular Video
Authors:
Minh-Quan Viet Bui,
Jongmin Park,
Jihyong Oh,
Munchurl Kim
Abstract:
Neural Radiance Fields (NeRF), initially developed for static scenes, have inspired many video novel view synthesis techniques. However, the challenge for video view synthesis arises from motion blur, a consequence of object or camera movement during exposure, which hinders the precise synthesis of sharp spatio-temporal views. In response, we propose a novel dynamic deblurring NeRF framework for blurry monocular video, called DyBluRF, consisting of a Base Ray Initialization (BRI) stage and a Motion Decomposition-based Deblurring (MDD) stage. Our DyBluRF is the first to handle novel view synthesis for blurry monocular video with a novel two-stage framework. In the BRI stage, we coarsely reconstruct dynamic 3D scenes and jointly initialize the base ray, which is further used to predict latent sharp rays, using the inaccurate camera pose information from the given blurry frames. In the MDD stage, we introduce a novel Incremental Latent Sharp-rays Prediction (ILSP) approach for the blurry monocular video frames by decomposing the latent sharp rays into global camera motion and local object motion components. We further propose two loss functions for effective geometry regularization and decomposition of static and dynamic scene components without any mask supervision. Experiments show that DyBluRF outperforms the SOTA methods both qualitatively and quantitatively.
Submitted 29 March, 2024; v1 submitted 20 December, 2023;
originally announced December 2023.
-
ProNeRF: Learning Efficient Projection-Aware Ray Sampling for Fine-Grained Implicit Neural Radiance Fields
Authors:
Juan Luis Gonzalez Bello,
Minh-Quan Viet Bui,
Munchurl Kim
Abstract:
Recent advances in neural rendering have shown that, albeit slow, implicit compact models can learn a scene's geometries and view-dependent appearances from multiple views. To maintain such a small memory footprint but achieve faster inference times, recent works have adopted 'sampler' networks that adaptively sample a small subset of points along each ray in the implicit neural radiance fields. Although these methods achieve up to a 10$\times$ reduction in rendering time, they still suffer from considerable quality degradation compared to the vanilla NeRF. In contrast, we propose ProNeRF, which provides an optimal trade-off between memory footprint (similar to NeRF), speed (faster than HyperReel), and quality (better than K-Planes). ProNeRF is equipped with a novel projection-aware sampling (PAS) network together with a new training strategy for ray exploration and exploitation, allowing for efficient fine-grained particle sampling. Our ProNeRF yields state-of-the-art metrics, being 15-23$\times$ faster with 0.65dB higher PSNR than NeRF and yielding 0.95dB higher PSNR than the best published sampler-based method, HyperReel. Our exploration and exploitation training strategy allows ProNeRF to learn the full scenes' color and density distributions while also learning efficient ray sampling focused on the highest-density regions. We provide extensive experimental results that support the effectiveness of our method on the widely adopted forward-facing and 360 datasets, LLFF and Blender, respectively.
Submitted 13 December, 2023;
originally announced December 2023.
-
Mechanical Attributes of Fractal Dragons
Authors:
Huy T. Q. Phan,
Duc M. Bui,
Cong T. Than,
Trung V. Phan
Abstract:
Fractals are ubiquitous natural emergences that have gained increased attention in engineering applications, thanks to recent technological advancements enabling the fabrication of structures spanning across many spatial scales. We show how the geometries of fractals can be exploited to determine their important mechanical properties, such as the first and second moments, which physically correspond to the center of mass and the moment of inertia, using a family of complex fractals known as the dragons.
Submitted 12 November, 2023;
originally announced November 2023.
-
Integral Resolvent and Proximal Mixtures
Authors:
Minh N. Bùi,
Patrick L. Combettes
Abstract:
Using the theory of Hilbert direct integrals, we introduce and study a monotonicity-preserving operation, termed the integral resolvent mixture. It combines arbitrary families of monotone operators acting on different spaces and linear operators. As a special case, we investigate the resolvent expectation, an operation which combines monotone operators in such a way that the resulting resolvent is the Lebesgue expectation of the individual resolvents. Along the same lines, we introduce an operation that mixes arbitrary families of convex functions defined on different spaces and linear operators to create a composite convex function. Such constructs have so far been limited to finite families of operators and functions. The subdifferential of the integral proximal mixture is shown to be the integral resolvent mixture of the individual subdifferentials. Applications to the relaxation of systems of composite monotone inclusions are presented.
Submitted 20 May, 2024; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Hilbert Direct Integrals of Monotone Operators
Authors:
Minh N. Bùi,
Patrick L. Combettes
Abstract:
Finite Cartesian products of operators play a central role in monotone operator theory and its applications. Extending such products to arbitrary families of operators acting on different Hilbert spaces is an open problem, which we address by introducing the Hilbert direct integral of a family of monotone operators. The properties of this construct are studied and conditions under which the direct integral inherits the properties of the factor operators are provided. The question of determining whether the Hilbert direct integral of a family of subdifferentials of convex functions is itself a subdifferential leads us to introducing the Hilbert direct integral of a family of functions. We establish explicit expressions for evaluating the Legendre conjugate, subdifferential, recession function, Moreau envelope, and proximity operator of such integrals. Next, we propose a duality framework for monotone inclusion problems involving integrals of linearly composed monotone operators and show its pertinence towards the development of numerical solution methods. Applications to inclusion and variational problems are discussed.
Submitted 15 May, 2024; v1 submitted 7 November, 2023;
originally announced November 2023.
-
Negative discrete moments of the derivative of the Riemann zeta-function
Authors:
Hung M. Bui,
Alexandra Florea,
Micah B. Milinovich
Abstract:
We obtain conditional upper bounds for negative discrete moments of the derivative of the Riemann zeta-function averaged over a subfamily of zeros of the zeta function which is expected to have full density inside the set of all zeros. For $k\leq 1/2$, our bounds for the $2k$-th moments are expected to be almost optimal. Assuming a conjecture about the maximum size of the argument of the zeta function on the critical line, we obtain upper bounds for these negative moments of the same strength while summing over a larger subfamily of zeta zeros.
Submitted 5 October, 2023;
originally announced October 2023.
-
Multi-Agent Reach-Avoid Games: Two Attackers Versus One Defender and Mixed Integer Programming
Authors:
Hanyang Hu,
Minh Bui,
Mo Chen
Abstract:
We propose a hybrid approach that combines Hamilton-Jacobi (HJ) reachability and mixed-integer optimization for solving a reach-avoid game with multiple attackers and defenders. The reach-avoid game is an important problem with potential applications in air traffic control and multi-agent motion planning; however, solving this game for many attackers and defenders is intractable due to the adversarial nature of the agents and the high problem dimensionality. In this paper, we first propose an HJ reachability-based method for solving the reach-avoid game in which 2 attackers are playing against 1 defender; we derive the numerically convergent optimal winning sets for the two sides in environments with obstacles. Utilizing this result and previous results for the 1 vs. 1 game, we further propose solving the general multi-agent reach-avoid game by determining the defender assignments that can maximize the number of attackers captured via a Mixed Integer Program (MIP). Our method generalizes previous state-of-the-art results and is especially useful when there are fewer defenders than attackers. We validate our theoretical results in numerical simulations.
Submitted 22 September, 2023;
originally announced September 2023.
-
Numerical coupling of aerosol emissions, dry removal, and turbulent mixing in the E3SM Atmosphere Model version 1 (EAMv1), part II: a semi-discrete error analysis framework for assessing coupling schemes
Authors:
Christopher J. Vogl,
Hui Wan,
Carol S. Woodward,
Quan M. Bui
Abstract:
This paper complements the empirical justification of the revised scheme in Part I of this work with a mathematical justification leveraging a semi-discrete analysis framework for assessing the splitting error of process coupling methods. The novelty of the framework is that splitting error is distinguished from the process time integration errors, i.e., the errors caused by discrete time integration of individual processes, leading to expressions that are more easily interpreted utilizing existing physical understanding of the processes that the terms represent. This application of this framework to dust life cycle in EAMv1 showcases such an interpretation, using the leading-order splitting error that results from the framework to confirm (i) that the original EAMv1 scheme artificially strengthens the effect of dry removal processes, and (ii) that the revised splitting reduces that artificial strengthening. While the error analysis framework is presented in the context of the dust life cycle in EAMv1, the framework can be broadly leveraged to evaluate process coupling schemes, both in other physical problems and for any number of processes. This framework will be particularly powerful when the various process implementations support a variety of time integration approaches. Whereas traditional local truncation error approaches require separate consideration of each combination of time integration methods, this framework enables evaluation of coupling schemes independent of particular time integration approaches for each process while still allowing for the incorporation of these specific time integration errors if so desired. The framework also explains how the splitting error terms result from (i) the integration of individual processes in isolation from other processes, and (ii) the choices of input state and timestep size for the isolated integration of processes.
Submitted 20 February, 2024; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Development of a Power Quality Based Digital Energy Meter Educational Platform
Authors:
Mislav Bui,
Marko Jurčević
Abstract:
Phasor Measurement Units (PMUs) are being used extensively for electrical grid monitoring and control. However, their cost prohibits further adoption on the distribution grid and easy access for educational purposes. This paper proposes that simple and fundamental functions of a PMU can be achieved using an energy metering IC and integrated into smart electricity meters, providing a lower cost and more widely available method of monitoring and control of distribution grids, and presents a proof-of-concept platform with the aforementioned functionality. The described platform's construction and flexibility emphasize its educational capabilities for PMUs and electricity meters as well.
Submitted 29 May, 2023;
originally announced May 2023.
-
A Generic Approach to Integrating Time into Spatial-Temporal Forecasting via Conditional Neural Fields
Authors:
Minh-Thanh Bui,
Duc-Thinh Ngo,
Demin Lu,
Zonghua Zhang
Abstract:
Self-awareness is the key capability of autonomous systems, e.g., autonomous driving network, which relies on highly efficient time series forecasting algorithm to enable the system to reason about the future state of the environment, as well as its effect on the system behavior as time progresses. Recently, a large number of forecasting algorithms using either convolutional neural networks or graph neural networks have been developed to exploit the complex temporal and spatial dependencies present in the time series. While these solutions have shown significant advantages over statistical approaches, one open question is to effectively incorporate the global information which represents the seasonality patterns via the time component of time series into the forecasting models to improve their accuracy. This paper presents a general approach to integrating the time component into forecasting models. The main idea is to employ conditional neural fields to represent the auxiliary features extracted from the time component to obtain the global information, which will be effectively combined with the local information extracted from autoregressive neural networks through a layer-wise gated fusion module. Extensive experiments on road traffic and cellular network traffic datasets prove the effectiveness of the proposed approach.
Submitted 17 May, 2023; v1 submitted 11 May, 2023;
originally announced May 2023.
-
Interchange Rules for Integral Functions
Authors:
Minh N. Bùi,
Patrick L. Combettes
Abstract:
We first present an abstract principle for the interchange of infimization and integration over spaces of mappings taking values in topological spaces. New conditions on the underlying space and the integrand are then introduced to convert this principle into concrete scenarios that are shown to capture those of various existing interchange rules. These results are leveraged to improve state-of-the-art interchange rules for evaluating Legendre conjugates, subdifferentials, recessions, Moreau envelopes, and proximity operators of integral functions by bringing the corresponding operations under the integral sign.
Submitted 8 May, 2023;
originally announced May 2023.
-
Optical properties of MoSe$_2$ monolayer implanted with ultra-low energy Cr ions
Authors:
Minh N. Bui,
Stefan Rost,
Manuel Auge,
Lanqing Zhou,
Christoph Friedrich,
Stefan Blügel,
Silvan Kretschmer,
Arkady V. Krasheninnikov,
Kenji Watanabe,
Takashi Taniguchi,
Hans C. Hofsäss,
Detlev Grützmacher,
Beata E. Kardynał
Abstract:
The paper explores the optical properties of an exfoliated MoSe$_2$ monolayer implanted with Cr$^+$ ions, accelerated to 25 eV. Photoluminescence of the implanted MoSe$_2$ reveals an emission line from Cr-related defects that is present only under weak electron doping. Unlike band-to-band transition, the Cr-introduced emission is characterised by non-zero activation energy, long lifetimes, and weak response to the magnetic field. To rationalise the experimental results and get insights into the atomic structure of the defects, we modelled the Cr-ion irradiation process using ab-initio molecular dynamics simulations followed by the electronic structure calculations of the system with defects. The experimental and theoretical results suggest that the recombination of electrons on the acceptors, which could be introduced by the Cr implantation-induced defects, with the valence band holes is the most likely origin of the low energy emission. Our results demonstrate the potential of low-energy ion implantation as a tool to tailor the properties of 2D materials by doping.
Submitted 25 April, 2023; v1 submitted 21 April, 2023;
originally announced April 2023.
-
A note on the zeros of the derivatives of Hardy's function $Z(t)$
Authors:
Hung M. Bui,
R. R. Hall
Abstract:
Using the twisted fourth moment of the Riemann zeta-function we study large gaps between consecutive zeros of the derivatives of Hardy's function $Z(t)$, improving upon previous results of Conrey and Ghosh [J. London Math. Soc. 32 (1985), 193--202], and of the second named author [Acta Arith. 111 (2004), 125--140]. We also exhibit small distances between the zeros of $Z(t)$ and the zeros of $Z^{(2k)}(t)$ for every $k\in\mathbb{N}$, in support of our numerical observation that the zeros of $Z^{(k)}(t)$ and $Z^{(\ell)}(t)$, when $k$ and $\ell$ have the same parity, seem to come in pairs which are very close to each other. The latter result is obtained using the mollified discrete second moment of the Riemann zeta-function.
Submitted 11 April, 2023;
originally announced April 2023.
-
On the derivatives of Hardy's function $Z(t)$
Authors:
Hung M. Bui,
R. R. Hall
Abstract:
Let $Z^{(k)}(t)$ be the $k$-th derivative of Hardy's $Z$-function. The numerics seem to suggest that if $k$ and $\ell$ have the same parity, then the zeros of $Z^{(k)}(t)$ and $Z^{(\ell)}(t)$ come in pairs which are very close to each other. That is to say that $Z^{(k)}(t)Z^{(\ell)}(t)$ has constant sign for the majority, if not almost all, of values $t$. In this paper we show that this is true a positive proportion of times. We also study the sign of the product of four derivatives of Hardy's function, $Z^{(k)}(t)Z^{(\ell)}(t)Z^{(m)}(t)Z^{(n)}(t)$.
Submitted 11 April, 2023;
originally announced April 2023.
-
Origin of octupole deformation softness in atomic nuclei
Authors:
Minh-Loc Bui,
Le-Anh Nguyen,
Panagiota Papakonstantinou,
Naftali Auerbach
Abstract:
Recent high-energy heavy ion collision experiments have revealed that some atomic nuclei exhibit unusual softness and significant shape fluctuations. In this work, we use the fully self-consistent mean-field theory to identify all even-even nuclei that are unstable or soft against octupole deformation. All exceptional cases of enhanced octupole transition strengths in stable even-even nuclei throughout the nuclide chart are resolved and the origin is found in basic shell structure. The presence of atomic nuclei exhibiting significant softness to quadrupole-octupole deformation is suggested. These results represent a significant advance in our understanding of the underlying mechanisms of nuclear octupole deformation and have implications for further experimental and theoretical studies.
Submitted 25 July, 2023; v1 submitted 20 March, 2023;
originally announced March 2023.
-
Proton \textit{s}-resonance states of $^{12}$C and $^{14,15}$O within the Skyrme Hartree-Fock mean-field framework
Authors:
Le-Anh Nguyen,
Young-ho Song,
Minh-Loc Bui
Abstract:
The excitation functions of proton elastic scattering on $^{12}$C and $^{14,15}$O nuclei at the energies near the proton-emission threshold are calculated using the Skyrme Hartree-Fock (SHF) in continuum approach. For each excitation function, the first resonance is identified as the $s$-state resonance of the mean-field theory. For $^{15}$O, whose ground-state spin is nonzero, the $s$-state resonance splits into two resonances via the spin-spin component of the optical potential. With a slight adjustment of the strength of central potential, which is obtained from the SHF in continuum approach, the excitation functions of proton elastic scattering for the three nuclei can be explained with high accuracy. The proposed framework can provide a practical method to explain nuclear scattering at the energies near the proton-emission threshold with minimal experimental input.
Submitted 1 March, 2023;
originally announced March 2023.
-
Negative moments of the Riemann zeta-function
Authors:
Hung M. Bui,
Alexandra Florea
Abstract:
Assuming the Riemann Hypothesis we study negative moments of the Riemann zeta-function and obtain asymptotic formulas in certain ranges of the shift in $ζ(s)$. For example, integrating $|ζ(1/2+α+it)|^{-2k}$ with respect to $t$ from $T$ to $2T$, we obtain an asymptotic formula when the shift $α$ is roughly bigger than $\frac{1}{\log T}$ and $k < 1/2$. We also obtain non-trivial upper bounds for much smaller shifts, as long as $\log\frac{1}α \ll \log \log T$. This provides partial progress towards a conjecture of Gonek on negative moments of the Riemann zeta-function, and settles the conjecture in certain ranges. As an application, we also obtain an upper bound for the average of the generalized Möbius function.
Submitted 14 February, 2023;
originally announced February 2023.
-
Density-Softmax: Efficient Test-time Model for Uncertainty Estimation and Robustness under Distribution Shifts
Authors:
Ha Manh Bui,
Anqi Liu
Abstract:
Sampling-based methods, e.g., Deep Ensembles and Bayesian Neural Nets have become promising approaches to improve the quality of uncertainty estimation and robust generalization. However, they suffer from a large model size and high latency at test-time, which limits the scalability needed for low-resource devices and real-time applications. To resolve these computational issues, we propose Density-Softmax, a sampling-free deterministic framework via combining a density function built on a Lipschitz-constrained feature extractor with the softmax layer. Theoretically, we show that our model is the solution of minimax uncertainty risk and is distance-aware on feature space, thus reducing the over-confidence of the standard softmax under distribution shifts. Empirically, our method enjoys competitive results with state-of-the-art techniques in terms of uncertainty and robustness, while having a lower number of model parameters and a lower latency at test-time.
Submitted 27 May, 2024; v1 submitted 13 February, 2023;
originally announced February 2023.
-
Benchmark for Uncertainty & Robustness in Self-Supervised Learning
Authors:
Ha Manh Bui,
Iliana Maifeld-Carucci
Abstract:
Self-Supervised Learning (SSL) is crucial for real-world applications, especially in data-hungry domains such as healthcare and self-driving cars. In addition to a lack of labeled data, these applications also suffer from distributional shifts. Therefore, an SSL method should provide robust generalization and uncertainty estimation in the test dataset to be considered a reliable model in such high-stakes domains. However, existing approaches often focus on generalization, without evaluating the model's uncertainty. The ability to compare SSL techniques for improving these estimates is therefore critical for research on the reliability of self-supervision models. In this paper, we explore variants of SSL methods, including Jigsaw Puzzles, Context, Rotation, Geometric Transformations Prediction for vision, as well as BERT and GPT for language tasks. We train SSL in auxiliary learning for vision and pre-training for language model, then evaluate the generalization (in-out classification accuracy) and uncertainty (expected calibration error) across different distribution covariate shift datasets, including MNIST-C, CIFAR-10-C, CIFAR-10.1, and MNLI. Our goal is to create a benchmark with outputs from experiments, providing a starting point for new SSL methods in Reliable Machine Learning. All source code to reproduce results is available at https://github.com/hamanhbui/reliable_ssl_baselines.
Submitted 23 December, 2022;
originally announced December 2022.
-
Asymmetry in the Complexity of the Multi-Commodity Network Pricing Problem
Authors:
Quang Minh Bui,
Margarida Carvalho,
José Neto
Abstract:
The network pricing problem (NPP) is a bilevel problem, where the leader optimizes its revenue by deciding on the prices of certain arcs in a graph, while expecting the followers (also known as the commodities) to choose a shortest path based on those prices. In this paper, we investigate the complexity of the NPP with respect to two parameters: the number of tolled arcs, and the number of commodities. We devise a simple algorithm showing that if the number of tolled arcs is fixed, then the problem can be solved in polynomial time with respect to the number of commodities. In contrast, even if there is only one commodity, once the number of tolled arcs is not fixed, the problem becomes NP-hard. We characterize this asymmetry in the complexity with a novel property named strong bilevel feasibility. Finally, we describe an algorithm to generate valid inequalities to the NPP based on this property, accommodated with numerical results to demonstrate its effectiveness in solving the NPP with a high number of commodities.
Submitted 12 January, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
A problem of Erdős-Graham-Granville-Selfridge on integral points on hyperelliptic curves
Authors:
Hung M. Bui,
Kyle Pratt,
Alexandru Zaharescu
Abstract:
Erdős, Graham, and Selfridge considered, for each positive integer $n$, the least value of $t_n$ so that the integers $n+1, n+2, \dots, n+t_n $ contain a subset the product of whose members with $n$ is a square. An open problem posed by Granville concerns the size of $t_n$, under the assumption of the ABC Conjecture. We establish some results on the distribution of $t_n$, and in the process solve Granville's problem unconditionally.
Submitted 22 November, 2022;
originally announced November 2022.
-
An Effective Deep Network for Head Pose Estimation without Keypoints
Authors:
Chien Thai,
Viet Tran,
Minh Bui,
Huong Ninh,
Hai Tran
Abstract:
Human head pose estimation is an essential problem in facial analysis in recent years that has a lot of computer vision applications such as gaze estimation, virtual reality, and driver assistance. Because of the importance of the head pose estimation problem, it is necessary to design a compact model to resolve this task in order to reduce the computational cost when deploying on facial analysis-based applications such as large camera surveillance systems, AI cameras while maintaining accuracy. In this work, we propose a lightweight model that effectively addresses the head pose estimation problem. Our approach has two main steps. 1) We first train many teacher models on the synthesis dataset - 300W-LPA to get the head pose pseudo labels. 2) We design an architecture with the ResNet18 backbone and train our proposed model with the ensemble of these pseudo labels via the knowledge distillation process. To evaluate the effectiveness of our model, we use AFLW-2000 and BIWI - two real-world head pose datasets. Experimental results show that our proposed model significantly improves the accuracy in comparison with the state-of-the-art head pose estimation methods. Furthermore, our model has the real-time speed of $\sim$300 FPS when inferring on Tesla V100.
Submitted 24 October, 2022;
originally announced October 2022.
-
Small gaps and small spacings between zeta zeros
Authors:
Hung M. Bui,
Daniel A. Goldston,
Micah B. Milinovich,
Hugh L. Montgomery
Abstract:
We show assuming RH that phenomena concerning pairs of zeros established $via$ pair correlations occur with positive density (with at most a slight adjustment of the constants). Also, while a double zero is commonly considered to be a close pair, we consider the difference between two $distinct$ zeros.
Submitted 3 August, 2022;
originally announced August 2022.
-
Low-energy $^7$Li($n,γ$)$^8$Li and $^7$Be($p,γ$)$^8$B radiative capture reactions within the Skyrme Hartree-Fock approach
Authors:
Le-Anh Nguyen,
Minh-Loc Bui
Abstract:
The electromagnetic dipole transitions in $^7$Be($p,γ$)$^8$B and $^7$Li($n,γ$)$^8$Li reactions at the keV-energy region were analyzed simultaneously within the Skyrme Hartree-Fock potential model. The Skyrme Hartree-Fock calculation is adopted as a microscopic approach to obtain consistently the single-particle bound and scattering states in the calculation of the radial overlap function within the potential model. All non-resonant and resonant electromagnetic dipole transitions are taken into account. The electric dipole transitions are successfully described with the slightest adjustment. The resonant magnetic dipole transitions at $633$ keV and $2184$ keV of $^7$Be($p,γ$)$^8$B reaction, and the one at $222$ keV of $^7$Li($n,γ$)$^8$Li are also analyzed. The astrophysical $\mathcal{S}_{17}(0)$ factor of $^7$Be($p,γ$)$^8$B reaction is found to be $22.3$ eV\,b.
Submitted 24 June, 2022;
originally announced June 2022.
-
Single-particle properties of the near-threshold proton-emitting resonance in $^{11}$B
Authors:
Le-Anh Nguyen,
Minh-Loc Bui,
Naftali Auerbach,
Vladimir Zelevinsky
Abstract:
The excitation function of proton elastic scattering from $^{10}$Be at keV energy is calculated using the self-consistent Skyrme Hartree-Fock in the continuum method. The calculation successfully reproduces the narrow near-threshold proton-emitting resonance ($E_x = 11.4$ MeV, $Γ= 6$ keV, and quantum number $J^π = 1/2^+$) in $^{11}$B relevant to the $β$-delayed proton emission of $^{11}$Be. This supports the recent experimental result of Y. Ayyad \textit{et al.} at the ReA3 re-accelerator facility of the National Superconducting Cyclotron Laboratory (NSCL) at the Michigan State University. The resonance is interpreted as the $s_{1/2}$ single-proton resonance state in the Skyrme Hartree-Fock mean-field theory.
Submitted 2 June, 2022;
originally announced June 2022.
-
Power savings for counting solutions to polynomial-factorial equations
Authors:
Hung M. Bui,
Kyle Pratt,
Alexandru Zaharescu
Abstract:
Let $P$ be a polynomial with integer coefficients and degree at least two. We prove an upper bound on the number of integer solutions $n\leq N$ to $n! = P(x)$ which yields a power saving over the trivial bound. In particular, this applies to a century-old problem of Brocard and Ramanujan. The previous best result was that the number of solutions is $o(N)$. The proof uses techniques of Diophantine and Padé approximation.
Submitted 18 April, 2022;
originally announced April 2022.
-
OptimizedDP: An Efficient, User-friendly Library For Optimal Control and Dynamic Programming
Authors:
Minh Bui,
George Giovanis,
Mo Chen,
Arrvindh Shriraman
Abstract:
This paper introduces OptimizedDP, a high-performance software library that solves time-dependent Hamilton-Jacobi partial differential equation (PDE), computes backward reachable sets with application in robotics, and contains value iterations algorithm implementation for continuous action-state space Markov Decision Process (MDP) while leveraging user-friendliness of Python for different problem specifications without sacrificing efficiency of the core computation. These algorithms are all based on dynamic programming, and hence can both be challenging to implement and have bad execution runtime due to the large high-dimensional tabular arrays. Although there are existing toolboxes for level set methods that are used to solve the HJ PDE, our toolbox makes solving the PDE at higher dimensions possible as well as having an order of magnitude improvement in execution times compared to other toolboxes while keeping the interface easy to specify different dynamical systems description. Our toolbox is available online at https://github.com/SFU-MARS/optimized_dp.
Submitted 12 April, 2022;
originally announced April 2022.
-
Transformer-based Approaches for Legal Text Processing
Authors:
Ha-Thanh Nguyen,
Minh-Phuong Nguyen,
Thi-Hai-Yen Vuong,
Minh-Quan Bui,
Minh-Chau Nguyen,
Tran-Binh Dang,
Vu Tran,
Le-Minh Nguyen,
Ken Satoh
Abstract:
In this paper, we introduce our approaches using Transformer-based models for different problems of the COLIEE 2021 automatic legal text processing competition. Automated processing of legal documents is a challenging task because of the characteristics of legal documents as well as the limited amount of data. With our detailed experiments, we found that Transformer-based pretrained language models can perform well on automated legal text processing problems with appropriate approaches. We describe in detail the processing steps for each task, such as problem formulation, data processing and augmentation, pretraining, and finetuning. In addition, we introduce to the community two pretrained models that take advantage of parallel translations in the legal domain, NFSP and NMSP. Of these, NFSP achieves the state-of-the-art result in Task 5 of the competition. Although the paper focuses on technical reporting, the novelty of its approaches can also be a useful reference in automated legal document processing using Transformer-based models.
Submitted 13 February, 2022;
originally announced February 2022.
-
Analysis and Numerical Solution of a Modular Convex Nash Equilibrium Problem
Authors:
Minh N. Bùi,
Patrick L. Combettes
Abstract:
We investigate a modular convex Nash equilibrium problem involving nonsmooth functions acting on linear mixtures of strategies, as well as smooth coupling functions. An asynchronous block-iterative decomposition method is proposed to solve it.
Submitted 2 November, 2021;
originally announced November 2021.
-
Personalized breath based biometric authentication with wearable multimodality
Authors:
Manh-Ha Bui,
Viet-Anh Tran,
Cuong Pham
Abstract:
Breath with nose sound features has been shown to be a potential biometric in personal identification and verification. In this paper, we show that information from other modalities, captured by motion sensors on the chest, could further improve the performance when used in addition to audio features. Our work is composed of three main contributions: hardware creation, dataset publication, and proposed multimodal models. To be more specific, we design new hardware which consists of an acoustic sensor to collect audio features from the nose, as well as an accelerometer and gyroscope to collect movement on the chest as a result of an individual's breathing. Using this hardware, we publish a dataset collected over a number of sessions from different volunteers; each session includes three common gestures: normal, deep, and strong breathing. Finally, we experiment with two multimodal models based on Convolutional Long Short Term Memory (CNN-LSTM) and Temporal Convolutional Networks (TCN) architectures. The results demonstrate the suitability of our new hardware for both verification and identification tasks.
Submitted 29 October, 2021;
originally announced October 2021.
-
Exploiting Domain-Specific Features to Enhance Domain Generalization
Authors:
Manh-Ha Bui,
Toan Tran,
Anh Tuan Tran,
Dinh Phung
Abstract:
Domain Generalization (DG) aims to train a model, from multiple observed source domains, in order to perform well on unseen target domains. To obtain the generalization capability, prior DG approaches have focused on extracting domain-invariant information across sources to generalize on target domains, while useful domain-specific information which strongly correlates with labels in individual domains and the generalization to target domains is usually ignored. In this paper, we propose meta-Domain Specific-Domain Invariant (mDSDI) - a novel theoretically sound framework that extends beyond the invariance view to further capture the usefulness of domain-specific information. Our key insight is to disentangle features in the latent space while jointly learning both domain-invariant and domain-specific features in a unified framework. The domain-specific representation is optimized through the meta-learning framework to adapt from source domains, targeting a robust generalization on unseen domains. We empirically show that mDSDI provides competitive results with state-of-the-art techniques in DG. A further ablation study with our generated dataset, Background-Colored-MNIST, confirms the hypothesis that domain-specific is essential, leading to better results when compared with only using domain-invariant.
Submitted 18 October, 2021;
originally announced October 2021.
-
Non-chromatic-adherence of the DP Color Function via Generalized Theta Graphs
Authors:
Manh Vu Bui,
Hemanshu Kaul,
Michael Maxfield,
Jeffrey A. Mudrock,
Paul Shin,
Seth Thomason
Abstract:
DP-coloring (also called correspondence coloring) is a generalization of list coloring that has been widely studied in recent years after its introduction by Dvořák and Postle in 2015. The chromatic polynomial of a graph is an extensively studied notion in combinatorics since its introduction by Birkhoff in 1912; denoted $P(G,m)$, it equals the number of proper $m$-colorings of graph $G$. Counting function analogues of the chromatic polynomial have been introduced and studied for list colorings: $P_{\ell}$, the list color function (1990); DP colorings: $P_{DP}$, the DP color function (2019), and $P^*_{DP}$, the dual DP color function (2021). For any graph $G$ and $m \in \mathbb{N}$, $P_{DP}(G, m) \leq P_\ell(G,m) \leq P(G,m) \leq P_{DP}^*(G,m)$. A function $f$ is chromatic-adherent if for every graph $G$, $f(G,a) = P(G,a)$ for some $a \geq χ(G)$ implies that $f(G,m) = P(G,m)$ for all $m \geq a$. It is not known if the list color function and the DP color function are chromatic-adherent. We show that the DP color function is not chromatic-adherent by studying the DP color function of Generalized Theta graphs. The tools we develop along with the Rearrangement Inequality give a new method for determining the DP color function of all Theta graphs and the dual DP color function of all Generalized Theta graphs.
Submitted 6 October, 2021;
originally announced October 2021.
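The chromatic polynomial value $P(G,m)$ described in the abstract above, the number of proper $m$-colorings of $G$, can be illustrated with a small brute-force count. This is only a sketch of the classical definition, not of the DP color function or the paper's techniques; the function name `count_proper_colorings` and the 4-cycle example are illustrative assumptions.

```python
from itertools import product


def count_proper_colorings(num_vertices, edges, m):
    """Count colorings of vertices 0..num_vertices-1 with m colors
    in which every edge joins two differently colored vertices."""
    return sum(
        all(coloring[u] != coloring[v] for u, v in edges)
        for coloring in product(range(m), repeat=num_vertices)
    )


# A 4-cycle: P(C_4, m) = (m - 1)**4 + (m - 1), so P(C_4, 3) = 18.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(count_proper_colorings(4, cycle4, 3))  # 18
```

The enumeration visits all $m^{|V(G)|}$ colorings, so it is only practical for very small graphs; it is meant to make the definition concrete rather than to compute anything at scale.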
-
Targeted Ads and/as Racial Discrimination: Exploring Trends in New York City Ads for College Scholarships
Authors:
Ho-Chun Herbert Chang,
Matt Bui,
Charlton McIlwain
Abstract:
This paper uses and recycles data from a third-party digital marketing firm to explore how targeted ads contribute to larger systems of racial discrimination. Focusing on a case study of targeted ads for educational searches in New York City, it discusses data visualizations and mappings of trends in the advertisements' targeted populations alongside U.S. census data corresponding to these target zipcodes. We summarize and reflect on the results to consider how internet platforms systemically and differentially target advertising messages to users based on race; the tangible harms and risks that result from an internet traffic system designed to discriminate; and finally, novel approaches and frameworks for further auditing systems amid opaque, black-boxed processes forestalling transparency and accountability.
Submitted 30 September, 2021;
originally announced September 2021.
-
Penalized Likelihood Methods for Modeling Count Data
Authors:
Minh Thu Bui,
Cornelis J. Potgieter,
Akihito Kamata
Abstract:
The paper considers parameter estimation in count data models using penalized likelihood methods. The motivating data consists of multiple independent count variables with a moderate sample size per variable. The data were collected during the assessment of oral reading fluency (ORF) in school-aged children. A sample of fourth-grade students were given one of ten available passages to read with these differing in length and difficulty. The observed number of words read incorrectly (WRI) is used to measure ORF. Three models are considered for WRI scores, namely the binomial, the zero-inflated binomial, and the beta-binomial. We aim to efficiently estimate passage difficulty, a quantity expressed as a function of the underlying model parameters. Two types of penalty functions are considered for penalized likelihood with respective goals of shrinking parameter estimates closer to zero or closer to one another. A simulation study evaluates the efficacy of the shrinkage estimates using Mean Square Error (MSE) as metric. Big reductions in MSE relative to unpenalized maximum likelihood are observed. The paper concludes with an analysis of the motivating ORF data.
Submitted 12 May, 2022; v1 submitted 28 September, 2021;
originally announced September 2021.
-
The Ratios Conjecture and upper bounds for negative moments of $L$-functions over function fields
Authors:
Hung M. Bui,
Alexandra Florea,
Jonathan P. Keating
Abstract:
We prove special cases of the Ratios Conjecture for the family of quadratic Dirichlet $L$--functions over function fields. More specifically, we study the average of $L(1/2+α,χ_D)/L(1/2+β,χ_D)$, when $D$ varies over monic, square-free polynomials of degree $2g+1$ over $\mathbb{F}_q[x]$, as $g \to \infty$, and we obtain an asymptotic formula when $\Re β\gg g^{-1/2+\varepsilon}$. We also study averages of products of $2$ over $2$ and $3$ over $3$ $L$--functions, and obtain asymptotic formulas when the shifts in the denominator have real part bigger than $g^{-1/4+\varepsilon}$ and $g^{-1/6+\varepsilon}$ respectively. The main ingredient in the proof is obtaining upper bounds for negative moments of $L$--functions. The upper bounds we obtain are expected to be almost sharp in the ranges described above. As an application, we recover the asymptotic formula for the one-level density of zeros in the family with the support of the Fourier transform in $(-2,2)$.
Submitted 21 September, 2021;
originally announced September 2021.
-
Weighted central limit theorems for central values of $L$-functions
Authors:
Hung M. Bui,
Natalie Evans,
Stephen Lester,
Kyle Pratt
Abstract:
We establish a central limit theorem for the central values of Dirichlet $L$-functions with respect to a weighted measure on the set of primitive characters modulo $q$ as $q \rightarrow \infty$. Under the Generalized Riemann Hypothesis (GRH), we also prove a weighted central limit theorem for the joint distribution of the central $L$-values corresponding to twists of two distinct primitive Hecke eigenforms. As applications, we obtain (under GRH) positive proportions of twists for which the central $L$-values simultaneously grow or shrink with $q$ as well as a positive proportion of twists for which linear combinations of the central $L$-values are nonzero.
Submitted 29 September, 2021; v1 submitted 14 September, 2021;
originally announced September 2021.
-
Generatively Augmented Neural Network Watchdog for Image Classification Networks
Authors:
Justin M. Bui,
Glauco A. Amigo,
Robert J. Marks II
Abstract:
The identification of out-of-distribution data is vital to the deployment of classification networks. For example, a generic neural network that has been trained to differentiate between images of dogs and cats can only classify an input as either a dog or a cat. If a picture of a car or a kumquat were to be supplied to this classifier, the result would still be either a dog or a cat. In order to mitigate this, techniques such as the neural network watchdog have been developed. The compression of the image input into the latent layer of the autoencoder defines the region of in-distribution in the image space. This in-distribution set of input data has a corresponding boundary in the image space. The watchdog assesses whether inputs are inside or outside this boundary. This paper demonstrates how to sharpen this boundary using generative network training data augmentation, thereby improving the discrimination and overall performance of the watchdog.
Submitted 7 September, 2021;
originally announced September 2021.
-
HYDRA -- Hyper Dependency Representation Attentions
Authors:
Ha-Thanh Nguyen,
Vu Tran,
Tran-Binh Dang,
Minh-Quan Bui,
Minh-Phuong Nguyen,
Le-Minh Nguyen
Abstract:
Attention is all we need as long as we have enough data. Even so, it is sometimes not easy to determine how much data is enough while the models are becoming larger and larger. In this paper, we propose HYDRA heads, lightweight pretrained linguistic self-attention heads to inject knowledge into transformer models without pretraining them again. Our approach is a balanced paradigm between leaving the models to learn unsupervised and forcing them to conform to linguistic knowledge rigidly as suggested in previous studies. Our experiments show that the approach not only boosts the performance of the model but is also lightweight and architecture-friendly. We empirically verify our framework on benchmark datasets to show the contribution of linguistic knowledge to a transformer model. This is a promising result for a new approach to transferring knowledge from linguistic resources into transformer-based models.
Submitted 11 September, 2021;
originally announced September 2021.
-
Potential model with bound-to-continuum approach for low-energy nucleon radiative capture by $^{12}$C and $^{16}$O
Authors:
Le-Anh Nguyen,
Nhut-Huan Phan,
Minh-Loc Bui
Abstract:
The nucleon radiative capture reactions are important in pure and applied nuclear physics, especially in nuclear astrophysics. The keV-nucleon radiative capture reactions are studied with $^{12}$C and $^{16}$O targets using the bound-to-continuum potential model in which both scattering and bound states are treated simultaneously and based on the Skyrme Hartree-Fock approximation. The obtained results are shown to be in good agreement with the available experimental data. Alongside astrophysical aspects, the nuclear structure features were revisited for enlarging the prospect of adopting the nucleon radiative capture processes as a spectroscopic tool.
Submitted 8 September, 2021;
originally announced September 2021.