-
Magnetic properties and field-induced phenomena in the Jeff = 1/2 distorted kagome antiferromagnet
Authors:
A. Yadav,
A. Elghandour,
T. Arh,
D. T. Adroja,
M. D. Le,
G. B. G. Stenning,
M. Aouane,
S. Luther,
F. Hotz,
T. J. Hicken,
H. Luetkens,
A. Zorko,
R. Klingeler,
P. Khuntia
Abstract:
The intertwining between competing degrees of freedom, anisotropy, and frustration-induced quantum fluctuations offers an ideal ground to realize exotic quantum phenomena in the rare-earth-based kagome lattice. The magnetic susceptibility reveals the presence of two energy scales in agreement with the INS results. The higher energy state is dominated by CEF excitations, where the lowest Kramers ground-state doublet is well separated from the excited state, suggesting that the compound realizes a low-energy state at low temperatures. The second energy scale is witnessed via thermodynamic results that reveal an anomaly at 0.3 K typical of a phase transition, which is attributed to the presence of complex magnetic ordering phenomena. The broad maximum in the specific heat well above 0.3 K indicates the presence of short-range spin correlations, as corroborated by the muon spin relaxation rate results. The isothermal magnetization reveals a field-induced 1/3 magnetization plateau at low temperatures. muSR relaxation rate experiments, on the other hand, show neither the signature of a phase transition nor spin freezing down to 34 mK. The ZF muSR relaxation is governed by the Orbach process and reveals the presence of a fluctuating state owing to the depopulation of crystal field levels, reflected as a constant value of the relaxation rate in the temperature range 0.4-10 K. NMR results indicate the presence of fluctuating Nd3+ moments down to 1.8 K, consistent with the muSR experiments. Our comprehensive results reveal that a field-induced quantum critical phenomenon is at play in this frustrated kagome magnet and enable us to construct a phase diagram exemplifying the proximity effect of competing magnetic states. This sets the stage to investigate the broad RE3BWO9 family of rare-earth kagome magnets promising to host exotic quantum states driven by spin-orbit coupling and geometrical frustration.
Submitted 12 July, 2024;
originally announced July 2024.
-
SE(3)-bi-equivariant Transformers for Point Cloud Assembly
Authors:
Ziming Wang,
Rebecka Jörnsten
Abstract:
Given a pair of point clouds, the goal of assembly is to recover a rigid transformation that aligns one point cloud to the other. This task is challenging because the point clouds may be non-overlapped, and they may have arbitrary initial positions. To address these difficulties, we propose a method, called SE(3)-bi-equivariant transformer (BITR), based on the SE(3)-bi-equivariance prior of the task: it guarantees that when the inputs are rigidly perturbed, the output will transform accordingly. Due to its equivariance property, BITR can not only handle non-overlapped PCs, but also guarantee robustness against initial positions. Specifically, BITR first extracts features of the inputs using a novel $SE(3) \times SE(3)$-transformer, and then projects the learned feature to group SE(3) as the output. Moreover, we theoretically show that swap and scale equivariances can be incorporated into BITR, thus it further guarantees stable performance under scaling and swapping of the inputs. We experimentally show the effectiveness of BITR in practical tasks.
Submitted 12 July, 2024;
originally announced July 2024.
-
68-Channel Highly-Integrated Neural Signal Processing PSoC with On-Chip Feature Extraction, Compression, and Hardware Accelerators for Neuroprosthetics in 22nm FDSOI
Authors:
Liyuan Guo,
Annika Weiße,
Seyed Mohammad Ali Zeinolabedin,
Franz Marcus Schüffny,
Marco Stolba,
Qier Ma,
Zhuo Wang,
Stefan Scholze,
Andreas Dixius,
Marc Berthel,
Johannes Partzsch,
Dennis Walter,
Georg Ellguth,
Sebastian Höppner,
Richard George,
Christian Mayr
Abstract:
Multi-channel electrophysiology systems for recording of neuronal activity face significant data throughput limitations, hampering real-time, data-informed experiments. These limitations impact both experimental neurobiology research and next-generation neuroprosthetics. We present a novel solution that leverages the high integration density of 22nm FDSOI CMOS technology to address these challenges. The proposed highly integrated programmable System-on-Chip comprises 68-channel 0.41 µW/Ch recording frontends, spike detectors, 16-channel 0.87-4.39 µW/Ch action potential and 8-channel 0.32 µW/Ch local field potential codecs, as well as a MAC-assisted power-efficient processor operating at 25 MHz (5.19 µW/MHz). The system supports on-chip training processes for compression, training and inference for neural spike sorting. The spike sorting achieves an average accuracy of 91.48% or 94.12% depending on the utilized features. The proposed PSoC is optimized for reduced area (9 mm²) and power. On-chip processing and compression capabilities free up the data bottlenecks in data transmission (up to 91% space saving ratio), and moreover enable a fully autonomous yet flexible processor-driven operation. Combined, these design considerations overcome data-bottlenecks by allowing on-chip feature extraction and subsequent compression.
Submitted 12 July, 2024;
originally announced July 2024.
-
Exploring State Space and Reasoning by Elimination in Tsetlin Machine
Authors:
Ahmed K. Kadhim,
Ole-Christoffer Granmo,
Lei Jiao,
Rishad Shafik
Abstract:
The Tsetlin Machine (TM) has gained significant attention in Machine Learning (ML). By employing logical fundamentals, it facilitates pattern learning and representation, offering an alternative approach for developing comprehensible Artificial Intelligence (AI) with a specific focus on pattern classification in the form of conjunctive clauses. In the domain of Natural Language Processing (NLP), TM is utilised to construct word embeddings and describe target words using clauses. To enhance the descriptive capacity of these clauses, we study the concept of Reasoning by Elimination (RbE) in clauses' formulation, which involves incorporating feature negations to provide a more comprehensive representation. In more detail, this paper employs the Tsetlin Machine Auto-Encoder (TM-AE) architecture to generate dense word vectors, aiming at capturing contextual information by extracting feature-dense vectors for a given vocabulary. Thereafter, the principle of RbE is explored to improve descriptivity and optimise the performance of the TM. Specifically, the specificity parameter s and the voting margin parameter T are leveraged to regulate feature distribution in the state space, resulting in a dense representation of information for each clause. In addition, we investigate the state spaces of TM-AE, especially for the forgotten/excluded features. Empirical investigations on artificially generated data, the IMDB dataset, and the 20 Newsgroups dataset showcase the robustness of the TM, with accuracy reaching 90.62% for the IMDB.
Submitted 12 July, 2024;
originally announced July 2024.
-
High-precision analysis of the critical point in heavy-quark QCD at $N_t=6$
Authors:
Ryo Ashikawa,
Masakiyo Kitazawa,
Shinji Ejiri,
Kazuyuki Kanaya
Abstract:
Binder-cumulant analysis of the critical point in the heavy-quark region of QCD is performed by Monte-Carlo simulations with the hopping-parameter expansion at $N_t=6$. We extend our previous analysis at $N_t=4$ to finer lattices and perform high-precision analyses on large spatial volumes up to the aspect ratio $LT=N_s/N_t=18$. Higher order terms in the hopping-parameter expansion are incorporated effectively up to 14th order. The numerical results show that the violation of the finite-size scaling becomes more prominent on the finer lattice at a given aspect ratio.
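For readers unfamiliar with the Binder-cumulant technique named in this abstract, the following is a minimal sketch, assuming the common fourth-order convention $B_4 = \langle \delta X^4 \rangle / \langle \delta X^2 \rangle^2$ applied to a generic scalar observable $X$. The function name and the sample values are hypothetical; the abstract does not specify the authors' observable or estimator, so this is not their analysis code.

```python
def binder_cumulant(samples):
    """Fourth-order Binder cumulant <dX^4> / <dX^2>^2 of a list of measurements.

    Assumed convention (not taken from the abstract): moments are central,
    i.e. computed about the sample mean.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0:
        raise ValueError("variance is zero; cumulant is undefined")
    return m4 / m2 ** 2


# Hypothetical two-peak data: the observable jumps between two sharp values,
# as it would deep in a first-order (coexistence) region.
two_peak = [-1.0, 1.0, -1.0, 1.0]
b4_two_peak = binder_cumulant(two_peak)
```

With this convention, two equally weighted sharp peaks give a value of 1, while a Gaussian distribution gives 3 in the limit of many samples. Finite-size scaling studies typically track how the cumulant measured at different volumes crosses between these limits.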
Submitted 12 July, 2024;
originally announced July 2024.
-
Simple Linear Loops: Algebraic Invariants and Applications
Authors:
Rida Ait El Manssour,
George Kenison,
Mahsa Shirmohammadi,
Anton Varonka
Abstract:
Automatic generation of loop invariants is a fundamental challenge in software verification. While this task is undecidable in general, it is decidable for certain restricted classes of programs. This work focuses on invariant generation for (branching-free) loops with a single linear update.
Our primary contribution is a polynomial-space algorithm that computes the strongest algebraic invariant for simple linear loops, generating all polynomial equations that hold among program variables across all reachable states. The key to achieving our complexity bounds lies in mitigating the blowup associated with variable elimination and Gröbner basis computation, as seen in prior works. Our procedure runs in polynomial time when the number of program variables is fixed.
We examine various applications of our results on invariant generation, focusing on invariant verification and loop synthesis. The invariant verification problem investigates whether a polynomial ideal defining an algebraic set serves as an invariant for a given linear loop. We show that this problem is coNP-complete and lies in PSPACE when the input ideal is given in dense or sparse representations, respectively. In the context of loop synthesis, we aim to construct a loop with an infinite set of reachable states that upholds a specified algebraic property as an invariant. The strong synthesis variant of this problem requires the construction of loops for which the given property is the strongest invariant. In terms of hardness, synthesising loops over integers (or rationals) is as hard as Hilbert's Tenth problem (or its analogue over the rationals). When loop constants are constrained to bit-bounded rational numbers, we demonstrate that loop synthesis and its strong variant are both decidable in PSPACE, and in NP when the number of program variables is fixed.
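To make the notion of an algebraic invariant of a simple linear loop concrete, here is a small sketch under stated assumptions. The loop, its update matrix, and the helper name `run_linear_loop` are hypothetical illustrations and are not taken from the paper; the affine update is written as a linear one by carrying a constant coordinate `c = 1`. The check below only tests the candidate polynomial equation on finitely many reachable states, so it illustrates the statement "the equation holds across reachable states" rather than the paper's algorithm for computing the strongest invariant.

```python
def run_linear_loop(matrix, state, steps):
    """Iterate state := matrix * state and return every visited state."""
    states = [tuple(state)]
    for _ in range(steps):
        state = [sum(a * s for a, s in zip(row, state)) for row in matrix]
        states.append(tuple(state))
    return states


# Hypothetical loop over variables (x, y, c) with c fixed to 1:
#   x' = x + c
#   y' = 2*x + y + c
#   c' = c
UPDATE = [
    [1, 0, 1],
    [2, 1, 1],
    [0, 0, 1],
]

states = run_linear_loop(UPDATE, [0, 0, 1], steps=10)

# Candidate algebraic invariant: y - x^2 = 0 (together with c - 1 = 0).
invariant_holds = all(y == x * x and c == 1 for x, y, c in states)
```

Here the loop enumerates the squares, so `y = x^2` holds in each visited state; an invariant generator in the sense of the abstract would be expected to output polynomial equations of this kind.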
Submitted 12 July, 2024;
originally announced July 2024.
-
Accuracy is Not All You Need
Authors:
Abhinav Dutta,
Sanjeev Krishnan,
Nipun Kwatra,
Ramachandran Ramjee
Abstract:
When Large Language Models (LLMs) are compressed using techniques such as quantization, the predominant way to demonstrate the validity of such techniques is by measuring the model's accuracy on various benchmarks. If the accuracies of the baseline model and the compressed model are close, it is assumed that there was negligible degradation in quality. However, even when the accuracies of the baseline and compressed models are similar, we observe the phenomenon of flips, wherein answers change from correct to incorrect and vice versa in proportion. We conduct a detailed study of metrics across multiple compression techniques, models and datasets, demonstrating that the behavior of compressed models as visible to end-users is often significantly different from the baseline model, even when accuracy is similar. We further evaluate compressed models qualitatively and quantitatively using MT-Bench and show that compressed models are significantly worse than baseline models in this free-form generative task. Thus, we argue that compression techniques should also be evaluated using distance metrics. We propose two such metrics, KL-Divergence and flips, and show that they are well correlated.
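The two distance metrics named in this abstract can be sketched as follows. This is a minimal illustration under assumptions: "flips" is taken here to mean the fraction of examples whose correctness differs between the baseline and the compressed model, and KL-divergence is computed between two discrete answer distributions. The function names and the toy predictions are hypothetical, and the paper's exact definitions may differ.

```python
import math


def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(p == y for y, p in zip(labels, preds)) / len(labels)


def flip_rate(labels, base_preds, comp_preds):
    """Fraction of examples whose correctness changes between the two models
    (correct -> incorrect or incorrect -> correct). Assumed definition."""
    if not (len(labels) == len(base_preds) == len(comp_preds)):
        raise ValueError("inputs must have equal length")
    flips = sum((b == y) != (c == y)
                for y, b, c in zip(labels, base_preds, comp_preds))
    return flips / len(labels)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


# Toy data (hypothetical): both models are right on half of the examples,
# yet half of the examples change correctness after compression.
labels = [0, 1, 2, 1]
baseline = [0, 1, 0, 0]
compressed = [0, 0, 2, 0]

base_acc = accuracy(labels, baseline)
comp_acc = accuracy(labels, compressed)
flips = flip_rate(labels, baseline, compressed)
```

In this toy case the two accuracies are identical while the flip rate is one half, which is the kind of discrepancy the abstract argues accuracy alone cannot reveal.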
Submitted 12 July, 2024;
originally announced July 2024.
-
FlyEye Ground-Based Telescope: Unveiling New Frontiers in Astronomical Science
Authors:
Carmelo Arcidiacono,
Matteo Simioni,
Roberto Ragazzoni,
Piero Gregori,
Paolo Lorenzi,
Francesco Cerutti,
Roberto Ziano,
Matteo Bisiani,
Roberta Pellegrini,
Andrea Guazzora,
Silvano Pieri,
Marco Dima,
Silvio Di Rosa,
Simone Zaggia,
Jacopo Farinato,
Demetrio Magrin,
Andrea Grazian,
Marco Gullieuszik
Abstract:
The FlyEye design makes its debut in the ESA's NEOSTEL developed by OHB-Italia. This pioneering FlyEye telescope integrates a monolithic 1-meter class primary mirror feeding 16 CCD cameras for discovering Near-Earth Objects (NEOs) and any class of transient phenomena. OHB-Italia is the prime contractor, receiving extended support from the Italian National Institute for Astrophysics (INAF) in the integration and testing of the ESA's NEOSTEL program. The distinctive FlyEye design splits the Field of View into 16 channels, creating a unique multi-telescope system with a panoramic 44 square degree Field of View and a seeing-size pixel scale, enabling NEO detection down to apparent magnitude 21.5 while relying on a 1 m diameter spherical mirror. The scientific products of a similar FlyEye telescope can complement facilities such as Vera Rubin (formerly LSST) and ZTF. The FlyEye's ability to survey two-thirds of the visible sky about three times per night can revolutionize time-domain astronomy, enabling comprehensive studies of transient phenomena and placing FlyEye in a new era of exploration of the dynamic universe. Efforts to develop automated calibration and testing procedures are key to realizing this transformative potential.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (414 additional authors not shown)
Abstract:
We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ $GeV/c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ $GeV/c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions.
Submitted 12 July, 2024;
originally announced July 2024.
-
A Look Into News Avoidance Through AWRS: An Avoidance-Aware Recommender System
Authors:
Igor L. R. Azevedo,
Toyotaro Suzumura,
Yuichiro Yasui
Abstract:
In recent years, journalists have expressed concerns about the increasing trend of news article avoidance, especially within specific domains. This issue has been exacerbated by the rise of recommender systems. Our research indicates that recommender systems should consider avoidance as a fundamental factor. We argue that news articles can be characterized by three principal elements: exposure, relevance, and avoidance, all of which are closely interconnected. To address these challenges, we introduce AWRS, an Avoidance-Aware Recommender System. This framework incorporates avoidance awareness when recommending news, based on the premise that news article avoidance conveys significant information about user preferences. Evaluation results on three news datasets in different languages (English, Norwegian, and Japanese) demonstrate that our method outperforms existing approaches.
Submitted 12 July, 2024;
originally announced July 2024.
-
The MICADO first light imager for the ELT: the PSF Reconstruction Software
Authors:
Andrea Grazian,
Elisa Portaluri,
Matteo Simioni,
Carmelo Arcidiacono,
Marco Gullieuszik,
Johanna Hartke,
Daniel Jodlbauer,
Fernando Pedichini,
Roberto Piazzesi,
Piero Vaccari,
Benedetta Vulcani,
Roland Wagner,
Anita Zanella
Abstract:
MICADO is the first-light camera of the ESO ELT, allowing NIR imaging and long-slit spectroscopy assisted by adaptive optics. MICADO is now entering its construction phase, and the software for data reduction is reaching an adequate maturity level. The PSF Reconstruction (PSF-R) of MICADO is a software tool for the blind derivation of the PSF, only using adaptive optics telemetry data. An update of the status of the PSF-R service is provided here. The PSF-R prototype has been tested on ERIS@VLT data in order to check the reconstruction of on- and off-axis PSFs. The on-axis PSF-R is accurate at a few percent level on Strehl, FWHM, Encircled Energy, and half light radius, while for the off-axis case the match is within 10-15 percent at a distance of half isoplanatic angle. The first version of the workflow for the PSF-R pipeline has been developed and verified using the latest release of the ESO data processing system. A set of simulations has been implemented on the morphological analysis of distant galaxies, showing that the accuracy of the PSF-R matches the goals needed to study their morphology. In summary, the PSF-R team is on the right track towards the ELT first light.
Submitted 12 July, 2024;
originally announced July 2024.
-
On the Problem of Defining Charge Operators for the Dirac Quantum Field
Authors:
Pablo Costa Rico,
Roderich Tumulka
Abstract:
It is well known how to define the operator $Q$ for the total charge (i.e., positron number minus electron number) on the standard Hilbert space of the second-quantized Dirac equation. Here we ask about operators $Q_A$ representing the charge content of a region $A\subseteq \mathbb{R}^3$ in 3d physical space. There is a natural formula for $Q_A$ but, as we explain, there are difficulties about turning it into a mathematically precise definition. First, $Q_A$ can be written as a series but its convergence seems hopeless. Second, we show for some choices of $A$ that if $Q_A$ could be defined then its domain could not contain either the vacuum vector or any vector obtained from the vacuum by applying a polynomial in creation and annihilation operators. Both observations speak against the existence of $Q_A$ for generic $A$.
Submitted 12 July, 2024;
originally announced July 2024.
-
Decentralized multi-agent reinforcement learning algorithm using a cluster-synchronized laser network
Authors:
Shun Kotoku,
Takatomo Mihana,
André Röhm,
Ryoichi Horisaki
Abstract:
Multi-agent reinforcement learning (MARL) studies crucial principles that are applicable to a variety of fields, including wireless networking and autonomous driving. We propose a photonic-based decision-making algorithm to address one of the most fundamental problems in MARL, called the competitive multi-armed bandit (CMAB) problem. Our numerical simulations demonstrate that chaotic oscillations and cluster synchronization of optically coupled lasers, along with our proposed decentralized coupling adjustment, efficiently balance exploration and exploitation while facilitating cooperative decision-making without explicitly sharing information among agents. Our study demonstrates how decentralized reinforcement learning can be achieved by exploiting complex physical processes controlled by simple algorithms.
Submitted 12 July, 2024;
originally announced July 2024.
-
Enhanced quantum state transfer via feedforward cancellation of optical phase noise
Authors:
Benjamin P. Maddox,
Jonathan M. Mortlock,
Tom R. Hepworth,
Adarsh P. Raghuram,
Philip D. Gregory,
Alexander Guttridge,
Simon L. Cornish
Abstract:
Many experimental platforms for quantum science depend on state control via laser fields. Frequently, however, the control fidelity is limited by optical phase noise. This is exacerbated in stabilized laser systems where high-frequency phase noise is an unavoidable consequence of feedback. Here we implement an optical feedforward technique to suppress laser phase noise in the STIRAP state transfer of ultracold RbCs molecules, across 114 THz, from a weakly bound Feshbach state to the rovibrational ground state. By performing over 100 state transfers on single molecules, we measure a significantly enhanced transfer efficiency of 98.7(1)% limited only by available laser intensity.
Submitted 12 July, 2024;
originally announced July 2024.
-
Assessing the Efficacy of IoT-based Forest Fire Detection: a Practical Use Case
Authors:
Belcher Anthony,
Esteva Miguel A.,
Lam Anthea,
Ramadhani Rizki,
Rayhan Achmad,
Xu Wangkun,
Tuncer Daphne
Abstract:
The implementation of early warning mechanisms that can be used to detect forest fires in rural areas is essential to mitigate their deleterious effects, in particular by notifying local fire authorities to mount timely emergency responses. 6G-enabled Internet of Things (IoT) infrastructures are promising technological developments in that direction. However, in practice, the ability to detect forest fires in an effective way using distributed sensor nodes is challenging to achieve. In this short paper, we exemplify this challenge based on a case study that uses real data collected from the Low-Cost Internet of Things Sensor of Haze Air Quality Disasters in Communities in Thailand and Southeast Asia (SEA-HAZEMON) platform. The work is a preliminary step towards assessing the efficacy of a real-life fire detection system based on distributed sensor nodes. More generally, the objective is to develop a set of practical guidelines for the design of a 6G-enabled IoT-based fire detection mechanism.
Submitted 12 July, 2024;
originally announced July 2024.
-
Enhancing Training Efficiency Using Packing with Flash Attention
Authors:
Achintya Kundu,
Rhui Dih Lee,
Laura Wynter,
Raghu Kiran Ganti
Abstract:
Padding is often used in tuning LLMs by adding special tokens to shorter training examples to match the length of the longest sequence in each batch. While this ensures uniformity for batch processing, it introduces inefficiencies by including irrelevant padding tokens in the computation and wastes GPU resources. On the other hand, the Hugging Face SFT trainer offers the option to use packing to combine multiple training examples up to the maximum sequence length. This allows for maximal utilization of GPU resources. However, without proper masking of each packed training example, attention will not be computed correctly when using the SFT trainer. We enable and then analyse packing and Flash Attention with proper attention masking of each example and show the benefits of this training paradigm.
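The masking issue described in this abstract can be illustrated with a small sketch. It assumes causal (left-to-right) attention and builds, in plain Python, the block-diagonal mask that keeps tokens of one packed example from attending to another, along with per-example position ids and cumulative sequence boundaries. All function names are hypothetical; this is neither the Hugging Face SFT trainer API nor the Flash Attention kernel interface, only an illustration of what "proper masking of each packed example" means.

```python
def packed_attention_mask(lengths):
    """mask[i][j] is True when token i may attend to token j.

    Tokens attend only within their own example (block-diagonal) and only
    to earlier or equal positions (causal). Assumed setup, for illustration.
    """
    seq_ids = [ex for ex, n in enumerate(lengths) for _ in range(n)]
    total = len(seq_ids)
    return [[seq_ids[i] == seq_ids[j] and j <= i for j in range(total)]
            for i in range(total)]


def position_ids(lengths):
    """Positions restart at 0 for every packed example."""
    return [p for n in lengths for p in range(n)]


def cumulative_lengths(lengths):
    """Example boundaries in the packed sequence, starting at 0."""
    bounds = [0]
    for n in lengths:
        bounds.append(bounds[-1] + n)
    return bounds


# Two hypothetical training examples of 2 and 3 tokens packed into one row.
lengths = [2, 3]
mask = packed_attention_mask(lengths)
positions = position_ids(lengths)
bounds = cumulative_lengths(lengths)
```

A dense mask like this grows quadratically with the packed length, which is one reason variable-length attention kernels commonly work from the example boundaries instead of a materialized mask.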
Submitted 12 July, 2024;
originally announced July 2024.
-
Retrospective for the Dynamic Sensorium Competition for predicting large-scale mouse primary visual cortex activity from videos
Authors:
Polina Turishcheva,
Paul G. Fahey,
Michaela Vystrčilová,
Laura Hansel,
Rachel Froebe,
Kayla Ponder,
Yongrong Qiu,
Konstantin F. Willeke,
Mohammad Bashiri,
Ruslan Baikulov,
Yu Zhu,
Lei Ma,
Shan Yu,
Tiejun Huang,
Bryan M. Li,
Wolf De Wulf,
Nina Kudryashova,
Matthias H. Hennig,
Nathalie L. Rochefort,
Arno Onken,
Eric Wang,
Zhiwei Ding,
Andreas S. Tolias,
Fabian H. Sinz,
Alexander S Ecker
Abstract:
Understanding how biological visual systems process information is challenging because of the nonlinear relationship between visual input and neuronal responses. Artificial neural networks allow computational neuroscientists to create predictive models that connect biological and machine vision. Machine learning has benefited tremendously from benchmarks that compare different models on the same task under standardized conditions. However, there was no standardized benchmark to identify state-of-the-art dynamic models of the mouse visual system. To address this gap, we established the Sensorium 2023 Benchmark Competition with dynamic input, featuring a new large-scale dataset from the primary visual cortex of ten mice. This dataset includes responses from 78,853 neurons to 2 hours of dynamic stimuli per neuron, together with behavioral measurements such as running speed, pupil dilation, and eye movements. The competition ranked models in two tracks based on predictive performance for neuronal responses on a held-out test set: one focusing on predicting in-domain natural stimuli and another on out-of-distribution (OOD) stimuli to assess model generalization. As part of the NeurIPS 2023 competition track, we received more than 160 model submissions from 22 teams. Several new architectures for predictive models were proposed, and the winning teams improved the previous state-of-the-art model by 50%. Access to the dataset as well as the benchmarking infrastructure will remain online at www.sensorium-competition.net.
Submitted 12 July, 2024;
originally announced July 2024.
-
Diagnosing quantum transport from wave function snapshots
Authors:
Devendra Singh Bhakuni,
Roberto Verdel,
Cristiano Muzzi,
Riccardo Andreoni,
Monika Aidelsburger,
Marcello Dalmonte
Abstract:
We study nonequilibrium quantum dynamics of spin chains by employing principal component analysis (PCA) on data sets of wave function snapshots and examine how information propagates within these data sets. The quantities we employ are derived from the spectrum of the sample second moment matrix, built directly from data sets. Our investigations on several interacting spin chains featuring distinct spin or energy transport reveal that the growth of data information spreading follows the same dynamical exponents as that of the underlying quantum transport of spin or energy. Specifically, our approach enables an easy, data-driven, and importantly interpretable diagnostic to track energy transport with a limited number of samples, which is usually challenging without any assumption on the Hamiltonian form. These observations are obtained at a modest finite size and evolution time, which aligns with experimental and numerical constraints. Our framework directly applies to experimental quantum simulator data sets of dynamics in higher-dimensional systems, where classical simulation methods usually face significant limitations, and applies equally to both near- and far-from-equilibrium quenches.
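As a rough illustration of the central object named in this abstract, the spectrum of a sample second moment matrix can be computed from a snapshot data set in a few lines. This is only a sketch under assumptions that are not taken from the paper: snapshots are stored as rows of a real array (for example spin values of ±1 per site), the matrix is normalized by the number of samples, and the function name `second_moment_spectrum` is hypothetical.

```python
import numpy as np


def second_moment_spectrum(snapshots):
    """Eigenvalues (descending) of the sample second moment matrix.

    Assumption: `snapshots` has shape (n_samples, n_sites), one snapshot per row.
    """
    x = np.asarray(snapshots, dtype=float)
    n_samples = x.shape[0]
    # Uncentered second moment matrix of shape (n_sites, n_sites).
    moment = x.T @ x / n_samples
    # The matrix is symmetric, so eigvalsh applies; it returns ascending order.
    eigvals = np.linalg.eigvalsh(moment)
    return eigvals[::-1]


# Hypothetical data: 500 random +/-1 snapshots on 8 sites.
rng = np.random.default_rng(0)
random_snapshots = rng.choice([-1.0, 1.0], size=(500, 8))
spectrum = second_moment_spectrum(random_snapshots)
print(spectrum / spectrum.sum())  # normalized spectrum
```

For uncorrelated snapshots the normalized spectrum is close to flat, whereas correlated data concentrates weight in a few leading eigenvalues. The specific diagnostics built from this spectrum in the paper are not reproduced here.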
Submitted 12 July, 2024;
originally announced July 2024.
-
The MeerKAT Fornax Survey. III. Ram-pressure stripping of the tidally interacting galaxy NGC 1427A in the Fornax cluster
Authors:
P. Serra,
T. A. Oosterloo,
P. Kamphuis,
G. I. G. Jozsa,
W. J. G. de Blok,
G. L. Bryan,
J. H. van Gorkom,
E. Iodice,
D. Kleiner,
A. Loni,
S. I. Loubser,
F. M. Maccagni,
D. Molnar,
R. Peletier,
D. J. Pisano,
M. Ramatsoku,
M. W. L. Smith,
M. A. W. Verheijen,
N. Zabel
Abstract:
We present MeerKAT Fornax Survey HI observations of NGC 1427A, a blue irregular galaxy with a stellar mass of 2e+9 Msun located near the centre of the Fornax galaxy cluster. Thanks to the excellent resolution (1 to 6 kpc spatially, 1.4 km/s in velocity) and HI column density sensitivity (4e+19/cm^2 to 1e+18/cm^2 depending on resolution), our data deliver new insights on the long-debated interaction of this galaxy with the cluster environment. We confirm the presence of a broad, one-sided, starless HI tail stretching from the outer regions of the stellar body and pointing away from the cluster centre. We find the tail to have 50% more HI (4e+8 Msun) and to be 3 times longer (70 kpc) than in previous observations. In fact, we detect scattered HI clouds out to 300 kpc from the galaxy in the direction of the tail -- possibly the most ancient remnant of the passage of NGC 1427A through the intracluster medium of Fornax. Both the velocity gradient along the HI tail and the peculiar kinematics of HI in the outer region of the stellar body are consistent with the effect of ram pressure given the line-of-sight motion of the galaxy within the cluster. However, several properties cannot be explained solely by ram pressure and suggest an ongoing tidal interaction. This includes: the close match between dense HI and stars within the disturbed stellar body; the abundant kinematically-anomalous HI; and the inversion of the HI velocity gradient near the base of the HI tail. We rule out an interaction with the cluster tidal field, and conclude that NGC 1427A is the result of a high-speed galaxy encounter or of a merger started at least 300 Myr ago, where ram pressure shapes the distribution and kinematics of the HI in the perturbed outer stellar body and in the tidal tails.
Submitted 12 July, 2024;
originally announced July 2024.
-
Open Vocabulary Multi-Label Video Classification
Authors:
Rohit Gupta,
Mamshad Nayeem Rizve,
Jayakrishnan Unnikrishnan,
Ashish Tawari,
Son Tran,
Mubarak Shah,
Benjamin Yao,
Trishul Chilimbi
Abstract:
Pre-trained vision-language models (VLMs) have enabled significant progress in open vocabulary computer vision tasks such as image classification, object detection and image segmentation. Some recent works have focused on extending VLMs to open vocabulary single label action classification in videos. However, previous methods fall short in holistic video understanding, which requires the ability to simultaneously recognize multiple actions and entities (e.g., objects) in the video in an open vocabulary setting. We formulate this problem as open vocabulary multilabel video classification and propose a method to adapt a pre-trained VLM such as CLIP to solve this task. We leverage large language models (LLMs) to provide semantic guidance to the VLM about class labels to improve its open vocabulary performance with two key contributions. First, we propose an end-to-end trainable architecture that learns to prompt an LLM to generate soft attributes for the CLIP text-encoder to enable it to recognize novel classes. Second, we integrate a temporal modeling module into CLIP's vision encoder to effectively model the spatio-temporal dynamics of video concepts, and propose a novel regularized finetuning technique to ensure strong open vocabulary classification performance in the video domain. Our extensive experimentation showcases the efficacy of our approach on multiple benchmark datasets.
Submitted 12 July, 2024;
originally announced July 2024.
-
The natural extension to PDEs of Lie's reduction of order algorithm for ODEs
Authors:
George W. Bluman,
Rafael de la Rosa
Abstract:
In this paper, we further consider the symmetry-based method for seeking nonlocally related systems for partial differential equations. In particular, we show that the symmetry-based method for partial differential equations is the natural extension of Lie's reduction of order algorithm for ordinary differential equations by looking at this algorithm from a different point of view. Many examples exhibit various situations that can arise.
Submitted 12 July, 2024;
originally announced July 2024.
-
KUNPENG: An Embodied Large Model for Intelligent Maritime
Authors:
Naiyao Wang,
Tongbang Jiang,
Ye Wang,
Shaoyang Qiu,
Bo Zhang,
Xinqiang Xie,
Munan Li,
Chunliu Wang,
Yiyang Wang,
Hongxiang Ren,
Ruili Wang,
Hongjun Shan,
Hongbo Liu
Abstract:
Intelligent maritime, as an essential component of smart ocean construction, deeply integrates advanced artificial intelligence technology and data analysis methods, and covers multiple aspects such as smart vessels, route optimization, and safe navigation, aiming to enhance the efficiency of ocean resource utilization and the intelligence of transportation networks. However, the complex and dynamic maritime environment, along with diverse and heterogeneous large-scale data sources, presents challenges for real-time decision-making in intelligent maritime. In this paper, we propose KUNPENG, the first-ever embodied large model for intelligent maritime in smart ocean construction, which consists of six systems. The model perceives multi-source heterogeneous data for the cognition of environmental interaction and makes autonomous decision strategies, which are used by intelligent vessels to perform navigation behaviors under safety and emergency guarantees and to continuously optimize power to achieve embodied intelligence in maritime. In comprehensive maritime task evaluations, KUNPENG has demonstrated excellent performance.
Submitted 12 July, 2024;
originally announced July 2024.
-
Optimization of Long-Haul C+L+S Systems by means of a Closed Form EGN Model
Authors:
Y. Jiang,
J. Sarkis,
A. Nespola,
F. Forghieri,
S. Piciaccia,
A. Tanzi,
M. Ranjbar Zefreh,
P. Poggiolini
Abstract:
We investigate C+L+S long-haul systems using a closed-form GN/EGN non-linearity model. We perform accurate launch power and Raman pump optimization. We show a potential 4x throughput increase over legacy C-band systems in 1000 km links, using moderate S-only Raman amplification. We simultaneously achieve extra-flat GSNR, within +/-0.5 dB across the whole C+L+S spectrum.
Submitted 12 July, 2024;
originally announced July 2024.
-
Overcoming Catastrophic Forgetting in Tabular Data Classification: A Pseudorehearsal-based approach
Authors:
Pablo García-Santaclara,
Bruno Fernández-Castro,
Rebeca P. Díaz-Redondo
Abstract:
Continual learning (CL) poses the important challenge of adapting to evolving data distributions without forgetting previously acquired knowledge while consolidating new knowledge. In this paper, we introduce a new methodology, termed the Tabular-data Rehearsal-based Incremental Lifelong Learning framework (TRIL3), designed to address the phenomenon of catastrophic forgetting in tabular data classification problems. TRIL3 uses the prototype-based incremental generative model XuILVQ to generate synthetic data to preserve old knowledge and the DNDF algorithm, which was modified to run in an incremental way, to learn classification tasks for tabular data, without storing old samples. After different tests to determine the adequate percentage of synthetic data and to compare TRIL3 with other available CL proposals, we conclude that TRIL3 outperforms other options in the literature using only 50% of synthetic data.
Submitted 12 July, 2024;
originally announced July 2024.
-
DRM Revisited: A Complete Error Analysis
Authors:
Yuling Jiao,
Ruoxuan Li,
Peiying Wu,
Jerry Zhijian Yang,
**wen Zhang
Abstract:
In this work, we address a foundational question in the theoretical analysis of the Deep Ritz Method (DRM) under the over-parameterization regime: Given a target precision level, how can one determine the appropriate number of training samples, the key architectural parameters of the neural networks, the step size for the projected gradient descent optimization procedure, and the requisite number of iterations, such that the output of the gradient descent process closely approximates the true solution of the underlying partial differential equation to the specified precision?
Submitted 12 July, 2024;
originally announced July 2024.
-
3M-Health: Multimodal Multi-Teacher Knowledge Distillation for Mental Health Detection
Authors:
Rina Carines Cabral,
Siwen Luo,
Soyeon Caren Han,
Josiah Poon
Abstract:
The significance of mental health classification is paramount in contemporary society, where digital platforms serve as crucial sources for monitoring individuals' well-being. However, existing social media mental health datasets primarily consist of text-only samples, potentially limiting the efficacy of models trained on such data. Recognising that humans utilise cross-modal information to comprehend complex situations or issues, we present a novel approach to address the limitations of current methodologies. In this work, we introduce a Multimodal and Multi-Teacher Knowledge Distillation model for Mental Health Classification, leveraging insights from cross-modal human understanding. Unlike conventional approaches that often rely on simple concatenation to integrate diverse features, our model addresses the challenge of appropriately representing inputs of varying natures (e.g., texts and sounds). To mitigate the computational complexity associated with integrating all features into a single model, we employ a multimodal and multi-teacher architecture. By distributing the learning process across multiple teachers, each specialising in a particular feature extraction aspect, we enhance the overall mental health classification performance. Through experimental validation, we demonstrate the efficacy of our model in achieving improved performance. All relevant codes will be made available upon publication.
Submitted 12 July, 2024;
originally announced July 2024.
-
AI-Driven Guided Response for Security Operation Centers with Microsoft Copilot for Security
Authors:
Scott Freitas,
Jovan Kalajdjieski,
Amir Gharib,
Rob McCann
Abstract:
Security operation centers contend with a constant stream of security incidents, ranging from straightforward to highly complex. To address this, we developed Copilot Guided Response (CGR), an industry-scale ML architecture that guides security analysts across three key tasks -- (1) investigation, providing essential historical context by identifying similar incidents; (2) triaging to ascertain the nature of the incident -- whether it is a true positive, false positive, or benign positive; and (3) remediation, recommending tailored containment actions. CGR is integrated into the Microsoft Defender XDR product and deployed worldwide, generating millions of recommendations across thousands of customers. Our extensive evaluation, incorporating internal evaluation, collaboration with security experts, and customer feedback, demonstrates that CGR delivers high-quality recommendations across all three tasks. We provide a comprehensive overview of the CGR architecture, setting a precedent as the first cybersecurity company to openly discuss these capabilities in such depth. Additionally, we present GUIDE, the largest public collection of real-world security incidents, spanning 13M pieces of evidence across 1M annotated incidents. By enabling researchers and practitioners to conduct research on real-world data, GUIDE advances the state of cybersecurity and supports the development of next-generation machine learning systems.
Submitted 12 July, 2024;
originally announced July 2024.
-
Procedural Content Generation via Generative Artificial Intelligence
Authors:
Xinyu Mao,
Wanli Yu,
Kazunori D Yamada,
Michael R. Zielewski
Abstract:
Attempts to utilize machine learning in PCG have been made in the past. In this survey paper, we investigate how generative artificial intelligence (AI), which saw a significant increase in interest in the mid-2010s, is being used for PCG. We review applications of generative AI for the creation of various types of content, including terrains, items, and even storylines. While generative AI is effective for PCG, one significant issue it faces is that building high-performance generative AI requires vast amounts of training data. Because content is generally highly customized, domain-specific training data is scarce, and straightforward approaches to generative AI models may not work well. For PCG research to advance further, issues related to limited training data must be overcome. Thus, we also give special consideration to research that addresses the challenges posed by limited training data.
Submitted 12 July, 2024;
originally announced July 2024.
-
Chiral orbital texture in nonlinear electrical conduction
Authors:
Suguru Okumura,
Ryutaro Tanaka,
Daichi Hirobe
Abstract:
Nonlinear electrical conduction primarily mediated by an orbital texture is observed in chiral semiconductor Te. We determine the enantiospecific sign of the nonlinear conductance and identify anomalies in its carrier-density dependence. Our findings, combined with the Boltzmann equation, are attributed to a chiral orbital texture, namely a chiral distribution of the orbital magnetic moment in reciprocal space. This study underscores the efficacy of nonlinear transport measurements in probing orbital-related effects, whose differentiation from spin counterparts is often demanding in the linear response regime of electron transport.
Submitted 12 July, 2024;
originally announced July 2024.
-
Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs
Authors:
Aobo Kong,
Shiwan Zhao,
Hao Chen,
Qicheng Li,
Yong Qin,
Ruiqi Sun,
Xin Zhou,
Jiaming Zhou,
Haoqin Sun
Abstract:
Recent advancements in LLMs have showcased their remarkable role-playing capabilities: they can accurately simulate the dialogue styles and cognitive processes of various roles based on different instructions and contexts. Studies indicate that assigning LLMs the roles of experts, a strategy known as role-play prompting, can enhance their performance in the corresponding domains. However, the prompt needs to be manually designed for the given problem, requiring certain expertise and iterative modifications. To this end, we propose self-prompt tuning, making LLMs themselves generate role-play prompts through fine-tuning. Leveraging the LIMA dataset as our foundational corpus, we employ GPT-4 to annotate role-play prompts for each data point, resulting in the creation of the LIMA-Role dataset. We then fine-tune LLMs like Llama-2-7B and Mistral-7B on LIMA-Role. Consequently, the self-prompt tuned LLMs can automatically generate expert role prompts for any given question. We extensively evaluate self-prompt tuned LLMs on widely used NLP benchmarks and an open-ended question test. Our empirical results illustrate that self-prompt tuned LLMs outperform standard instruction tuned baselines across most datasets. This highlights the great potential of utilizing fine-tuning to enable LLMs to self-prompt, thereby automating complex prompting strategies. We release the dataset, models, and code at https://anonymous.4open.science/r/Self-Prompt-Tuning-739E/.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $B\rightarrow\rho\gamma$ decays using Belle and Belle II data
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (385 additional authors not shown)
Abstract:
We present measurements of $B^{+}\rightarrow\rho^{+}\gamma$ and $B^{0}\rightarrow\rho^{0}\gamma$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $\Upsilon(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrow\rho^{+}\gamma$ and $99\pm 12$ $B^{0}\rightarrow\rho^{0}\gamma$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrow\rho^{+}\gamma$ and $B^{0}\rightarrow\rho^{0}\gamma$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrow\rho\gamma)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrow\rho^{+}\gamma)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$.
Submitted 12 July, 2024;
originally announced July 2024.
-
Enabling Elastic Model Serving with MultiWorld
Authors:
Myung** Lee,
Akshay Jajoo,
Ramana Rao Kompella
Abstract:
Machine learning models have been exponentially growing in terms of their parameter size over the past few years. We are now seeing the rise of trillion-parameter models. The large models cannot fit into a single GPU and thus require partitioned deployment across GPUs and even hosts. A high-performance collective communication library (CCL) such as NCCL is essential to fully utilize expensive GPU resources. However, CCL is not a great fit for inference. Unlike training, for which a fixed amount of GPU resources is used for fixed workloads (e.g., input datasets), inference workloads can change dynamically over time. Failures at serving time can also impact individual users' experiences directly. In contrast, workers in a CCL process group share a single fault domain and the process group cannot grow as the workloads increase. The gap between the unique characteristics of model serving and CCL's nature makes it hard to serve large models elastically. To bridge the gap, we propose MultiWorld, which enables fault tolerance and online scaling at the granularity of workers for model serving. Our evaluation shows that enabling these new functionalities incurs small overheads (1.4-4.3% throughput loss) for most of the scenarios we tested.
Submitted 12 July, 2024;
originally announced July 2024.
-
Neutron matter from local chiral EFT interactions at large cutoffs
Authors:
I. Tews,
R. Somasundaram,
D. Lonardoni,
H. Göttling,
R. Seutin,
J. Carlson,
S. Gandolfi,
K. Hebeler,
A. Schwenk
Abstract:
Neutron matter is an important many-body system that provides valuable constraints for the equation of state (EOS) of neutron stars. Neutron-matter calculations employing chiral effective field theory (EFT) interactions have been extensively used for this purpose. Among the various many-body methods, quantum Monte Carlo (QMC) methods stand out due to their nonperturbative nature and the achievable precision. However, QMC methods require local interactions as input, which leads to the appearance of stronger regulator artifacts as compared to non-local interactions. To circumvent this, we employ large-cutoff interactions derived within chiral EFT (400 MeV $\leq \Lambda_c \leq$ 700 MeV) for studies of pure neutron matter. These interactions have been adjusted to nucleon-nucleon scattering phase shifts, the triton binding energy, as well as the triton beta-decay half-life. We find that regulator artifacts significantly decrease with increasing cutoff, leading to a significant reduction of uncertainties in the neutron-matter EOS. We discuss implications for the symmetry energy and demonstrate how our new calculations lead to a reduction in the theoretical uncertainty of predicted neutron-star radii by up to 30% for low-mass stars.
Submitted 12 July, 2024;
originally announced July 2024.
-
Tissue-Contrastive Semi-Masked Autoencoders for Segmentation Pretraining on Chest CT
Authors:
Jie Zheng,
Ru Wen,
Haiqin Hu,
Lina Wei,
Kui Su,
Wei Chen,
Chen Liu,
Jun Wang
Abstract:
Existing Masked Image Modeling (MIM) depends on a spatial patch-based masking-reconstruction strategy to perceive objects' features from unlabeled images, which may face two limitations when applied to chest CT: 1) inefficient feature learning due to complex anatomical details presented in CT images, and 2) suboptimal knowledge transfer owing to input disparity between upstream and downstream models. To address these issues, we propose a new MIM method named Tissue-Contrastive Semi-Masked Autoencoder (TCS-MAE) for modeling chest CT images. Our method has two novel designs: 1) a tissue-based masking-reconstruction strategy to capture more fine-grained anatomical features, and 2) a dual-AE architecture with contrastive learning between the masked and original image views to bridge the gap between the upstream and downstream models. To validate our method, we systematically investigate representative contrastive, generative, and hybrid self-supervised learning methods on tasks involving segmenting pneumonia, mediastinal tumors, and various organs. The results demonstrate that, compared to existing methods, our TCS-MAE more effectively learns tissue-aware representations, thereby significantly enhancing segmentation performance across all tasks.
Submitted 11 July, 2024;
originally announced July 2024.
-
Hot Spot Offset Variability from Magnetohydrodynamical Thermoresistive Instability in Hot Jupiters
Authors:
Raphaël Hardy,
Paul Charbonneau,
Andrew Cumming
Abstract:
Hot Jupiter atmospheres are possibly subject to a thermoresistive instability. Such an instability may develop as the ohmic heating increases the electrical conductivity in a positive feedback loop, which ultimately leads to a runaway of the atmospheric temperature. We extend our previous axisymmetric one-dimensional radial model, by representing the temperature and magnetic diffusivity as a first order Fourier expansion in longitude. This allows us to predict the hot spot offset during the unfolding of the thermoresistive instability and following Alfvénic oscillations. We show a representative simulation undergoing the thermoresistive instability, in which the peak flux offset varies between approximately $\pm 60^{\circ}$ on timescales of a few days with potentially observable brightness variations. Therefore, this thermoresistive instability could be an observable feature of hot Jupiters, given the right timing of observation and transit and the right planetary parameters.
Submitted 11 July, 2024;
originally announced July 2024.
-
High throughput screening, crystal structure prediction, and carrier mobility calculations of organic molecular semiconductors as hole transport layer materials in perovskite solar cells
Authors:
Md Omar Faruque,
Suchona Akter,
Dil K. Limbu,
Kathleen Kilway,
Zhonghua Peng,
Mohammad R. Momeni
Abstract:
Using a representative translational dimer model, high throughput calculations are implemented for fast screening of a total of 74 diacenaphtho-extended heterocycle (DAH) derivatives as hole transport layer (HTL) materials in perovskite solar cells (PVSCs). Different electronic properties, including band structures, band gaps, and band edges compared to methylammonium and formamidinium lead iodide perovskites, along with reorganization energies, electronic couplings, and hole mobilities are calculated in order to decipher the effects of different parameters, including the polarity, steric and pi-conjugation, as well as the presence of explicit hydrogen bond interactions on the computed carrier mobilities of the studied materials. Full crystal structure predictions and hole mobility calculations of the top candidates resulted in some mobilities exceeding 10 cm^2/(V s), further validating the employed translational dimer model as a robust approach for inverse design and fast high throughput screening of new HTL organic semiconductors with superior properties. The studied models and simulations performed in this work are instructive in designing next-generation HTL materials for higher-performance PVSCs.
Submitted 11 July, 2024;
originally announced July 2024.
-
Velocity gradient partitioning in turbulent flows
Authors:
Rahul Arun,
Tim Colonius
Abstract:
The velocity gradient tensor can be decomposed into axial straining, pure shearing, and rigid rotation tensors, each with distinct symmetry and normality properties. We partition the strength of velocity gradient fluctuations based on the relative contributions of these constituents in several turbulent flows. These flows include forced isotropic turbulence, channels and boundary layers, and subsonic and transonic jets. For forced isotropic turbulence, the partitioning is in excellent agreement with previous results. For wall-bounded turbulence, the partitioning collapses onto the isotropic partitioning far from the wall, where the mean shearing is relatively weak. By contrast, the near-wall partitioning is dominated by shearing. Between these two regimes, the partitioning collapses well at sufficiently high friction Reynolds numbers and its variations in the buffer layer and the log-law region can be reasonably modeled as a function of the mean shearing strength. The isotropic partitioning also applies throughout much of the turbulent jets due to the rapid decay of the mean flow shear layer near the nozzle lip. Before reaching the exterior potential flow regime, the relative contribution of rigid rotation around the turbulent/non-turbulent interface is enhanced with respect to the isotropic partitioning. Altogether, our results highlight the broad applicability of the velocity gradient partitioning to turbulence modeling.
Submitted 11 July, 2024;
originally announced July 2024.
-
Bora: Biomedical Generalist Video Generation Model
Authors:
Weixiang Sun,
Xiaocao You,
Ruizhe Zheng,
Zhengqing Yuan,
Xiang Li,
Lifang He,
Quanzheng Li,
Lichao Sun
Abstract:
Generative models hold promise for revolutionizing medical education, robot-assisted surgery, and data augmentation for medical AI development. Diffusion models can now generate realistic images from text prompts, while recent advancements have demonstrated their ability to create diverse, high-quality videos. However, these models often struggle with generating accurate representations of medical procedures and detailed anatomical structures. This paper introduces Bora, the first spatio-temporal diffusion probabilistic model designed for text-guided biomedical video generation. Bora leverages Transformer architecture and is pre-trained on general-purpose video generation tasks. It is fine-tuned through model alignment and instruction tuning using a newly established medical video corpus, which includes paired text-video data from various biomedical fields. To the best of our knowledge, this is the first attempt to establish such a comprehensive annotated biomedical video dataset. Bora is capable of generating high-quality video data across four distinct biomedical domains, adhering to medical expert standards and demonstrating consistency and diversity. This generalist video generative model holds significant potential for enhancing medical consultation and decision-making, particularly in resource-limited settings. Additionally, Bora could pave the way for immersive medical training and procedure planning. Extensive experiments on distinct medical modalities such as endoscopy, ultrasound, MRI, and cell tracking validate the effectiveness of our model in understanding biomedical instructions and its superior performance across subjects compared to state-of-the-art generation models.
Submitted 11 July, 2024;
originally announced July 2024.
-
Social dilemmas, network reciprocity and the small-world property
Authors:
F. B. Pereira,
R. S. Ferreira,
D. S. M. Alencar,
T. F. A. Alves,
G. A. Alves,
F. W. S. Lima,
A. Macedo-Filho
Abstract:
We revisit two evolutionary game theory models, namely the Prisoner's Dilemma and the Snowdrift Game, on top of small-world networks. These dynamics on networked populations (individuals occupying nodes of a graph) mainly concern the competition between cooperating and defecting, allowing some process of strategy revision. Cooperators avoid defectors by forming clusters in a process known as network reciprocity. This defense strategy is based on the fact that any individual interacts only with its nearest neighbors. The minimum cluster, in turn, is formed by a set of three completely connected nodes, and the bulk of these triplets is associated with the transitivity property of a network. In particular, we show that the transitivity increases, eventually assuming a constant behavior, when observed as a function of the number of contacts of an individual. We investigate the influence of network reciprocity, in that regime of increasing transitivity, on the promotion of cooperative behavior. The dynamics on small-world networks are compared with those on random regular and annealed networks, the latter typically studied as the well-mixed approach. We observe that the Snowdrift Game converges to an annealed scenario as randomness and coordination number increase, whereas the Prisoner's Dilemma becomes more severe against cooperative behavior under the regime of increasing network reciprocity.
Submitted 11 July, 2024;
originally announced July 2024.
-
Machine Learning in High Volume Media Manufacturing
Authors:
Siddarth Reddy Karuka,
Abhinav Sunderrajan,
Zheng Zheng,
Yong Woon Tiean,
Ganesh Nagappan,
Allan Luk
Abstract:
Errors or failures in a high-volume manufacturing environment can have a significant impact, resulting in the loss of both time and money. Identifying such failures early has been a top priority for manufacturing industries, and various rule-based algorithms have been developed over the years. However, catching these failures is time-consuming, and such algorithms cannot adapt well to changes in designs or, sometimes, to variations in everyday behavior. More importantly, the number of units to monitor in a high-volume manufacturing environment is too large for manual monitoring or for a simple program. Here we develop a novel program that combines rule-based decisions and machine learning models, which can not only learn and adapt to such day-to-day variations or long-term design changes, but can also be applied at scale to the high number of manufacturing units in use today. Using current state-of-the-art technologies, we then deploy this program at scale to handle the ever-increasing demand from the manufacturing environment.
Submitted 11 July, 2024;
originally announced July 2024.
-
Resource-aware scheduling of multiple quantum circuits on a hardware device
Authors:
Debasmita Bhoumik,
Ritajit Majumdar,
Susmita Sur-Kolay
Abstract:
Recent quantum technologies and quantum error-correcting codes emphasize the requirement for arranging interacting qubits in a nearest-neighbor (NN) configuration while mapping a quantum circuit onto a given hardware device, in order to avoid undesirable noise. It is equally important to minimize the wastage of qubits in a quantum hardware device with m qubits while running circuits of n qubits in total, with n < m. In order to prevent cross-talk between two circuits, a buffer distance between their layouts is needed. Furthermore, not all the qubits and all the two-qubit interactions are at the same noise-level. Scheduling multiple circuits on the same hardware may create a possibility that some circuits are executed on a noisier layout than the others. In this paper, we consider an optimization problem which schedules as many circuits as possible for execution in parallel on the hardware, while maintaining a pre-defined layout quality for each. An integer linear programming formulation to ensure maximum fidelity while preserving the nearest neighbor arrangement among interacting qubits is presented. Our assertion is supported by comprehensive investigations involving various well-known quantum circuit benchmarks. As this scheduling problem is shown to be NP-hard, we also propose a greedy heuristic method which provides 2x and 3x better utilization for 27-qubit and 127-qubit hardware devices respectively in terms of qubits and time.
Submitted 11 July, 2024;
originally announced July 2024.
-
Commissioning of a compact multibend achromat lattice: A new 3 GeV synchrotron radiation facility
Authors:
Shuhei Obara,
Kota Ueshima,
Takao Asaka,
Yuji Hosaka,
Koichi Kan,
Nobuyuki Nishimori,
Toshitaka Aoki,
Hiroyuki Asano,
Koichi Haga,
Yuto Iba,
Akira Ihara,
Katsumasa Ito,
Taiki Iwashita,
Masaya Kadowaki,
Rento Kanahama,
Hajime Kobayashi,
Hideki Kobayashi,
Hideo Nishihara,
Masaaki Nishikawa,
Haruhiko Oikawa,
Ryota Saida,
Keisuke Sakuraba,
Kento Sugimoto,
Masahiro Suzuki,
Kouki Takahashi
, et al. (57 additional authors not shown)
Abstract:
NanoTerasu, a new 3 GeV synchrotron light source in Japan, began user operation in April 2024. It provides high-brilliance soft to tender X-rays and covers a wide spectral range from ultraviolet to tender X-rays. Its compact storage ring with a circumference of 349 m is based on a four-bend achromat lattice to provide two straight sections in each cell for insertion devices with a natural horizontal emittance of 1.14 nm rad, which is small enough for soft X-ray users. The NanoTerasu accelerator incorporates several innovative technologies, including a full-energy injector C-band linear accelerator with a length of 110 m, an in-vacuum off-axis injection system, a four-bend achromat with B-Q combined bending magnets, and a TM020 mode accelerating cavity with built-in higher-order-mode dampers in the storage ring. This paper presents the accelerator machine commissioning over a half-year period and our model-consistent ring optics correction. The first user operation with a stored beam current of 160 mA is also reported. We summarize the storage ring parameters obtained from the commissioning, which are helpful for estimating the effective optical properties of synchrotron radiation at NanoTerasu.
Submitted 11 July, 2024;
originally announced July 2024.
-
Observational bounds on a possible electron-to-proton mass ratio variation and constraints in the lepton-specific 2HDM
Authors:
R. G. Albuquerque,
R. F. L. Holanda,
I. E. T. R. Mendonça,
P. S. Rodrigues da Silva
Abstract:
In this work, we test a possible redshift variation of the electron-to-proton mass ratio, $μ = m_e/m_p$, directly from galaxy cluster gas mass fraction measurements and type Ia Supernovae observations. Our analysis is completely independent of any cosmological model. Our result reveals no variation of $μ$ within the 1$σ$ confidence level. From the point of view of Particle Physics, we can use the precision of these results to constrain the parameter space of models beyond the Standard Model of electroweak interactions. We exemplify this by focusing on a specific Two Higgs Doublet model (2HDM), where the second scalar doublet couples exclusively to leptons. An important parameter in the model concerns the ratio between its vacuum expectation values, defined by $\tan β$. In our approach we can constrain the inverse parameter ($\cot β$) to an optimal value, $(\tan β)^{-1} = 0.02127 \pm 0.0029$, with the largest vacuum expectation value for 2HDM, $v_2$, estimated at around $240.033 \pm 0.21$ GeV. Also, by taking into account the $(g-2)_μ$ discrepancy found between theory and experiment, we can reduce the validity region for this model and establish bounds on the scalar masses, in the light of our findings from galaxy clusters data for $μ$. This study contributes valuable insights to the understanding of the interface between Particle Physics and Astrophysics, establishing a new interplay between data from the large-scale structure of the Universe and subatomic Physics.
Submitted 11 July, 2024;
originally announced July 2024.
-
Redefinition of Digital Twin and its Situation Awareness Framework Designing Towards Fourth Paradigm for Energy Internet of Things
Authors:
Xing He,
Yuezhong Tang,
Shuyan Ma,
Qian Ai,
Fei Tao,
Robert Qiu
Abstract:
Traditional knowledge-based situation awareness (SA) modes struggle to adapt to the escalating complexity of today's Energy Internet of Things (EIoT), necessitating a pivotal paradigm shift. In response, this work introduces a pioneering data-driven SA framework, termed digital twin-based situation awareness (DT-SA), aiming to bridge existing gaps between data and demands, and further to enhance SA capabilities within the complex EIoT landscape. First, we redefine the concept of digital twin (DT) within the EIoT context, aligning it with data-intensive scientific discovery paradigm (the Fourth Paradigm) so as to waken EIoT's sleeping data; this contextual redefinition lays the cornerstone of our DT-SA framework for EIoT. Then, the framework is comprehensively explored through its four fundamental steps: digitalization, simulation, informatization, and intellectualization. These steps initiate a virtual ecosystem conducive to a continuously self-adaptive, self-learning, and self-evolving big model (BM), further contributing to the evolution and effectiveness of DT-SA in engineering. Our framework is characterized by the incorporation of system theory and Fourth Paradigm as guiding ideologies, DT as data engine, and BM as intelligence engine. This unique combination forms the backbone of our approach. This work extends beyond engineering, stepping into the domain of data science -- DT-SA not only enhances management practices for EIoT users/operators, but also propels advancements in pattern analysis and machine intelligence (PAMI) within the intricate fabric of a complex system. Numerous real-world cases validate our DT-SA framework.
Submitted 11 July, 2024;
originally announced July 2024.
-
The saddlepoint approximation for averages of conditionally independent random variables
Authors:
Ziang Niu,
Jyotishka Ray Choudhury,
Eugene Katsevich
Abstract:
Motivated by the application of saddlepoint approximations to resampling-based statistical tests, we prove that a Lugannani-Rice style approximation for conditional tail probabilities of averages of conditionally independent random variables has vanishing relative error. We also provide a general condition on the existence and uniqueness of the solution to the corresponding saddlepoint equation. The results are valid under a broad class of distributions involving no restrictions on the smoothness of the distribution function. The derived saddlepoint approximation formula can be directly applied to resampling-based hypothesis tests, including bootstrap, sign-flipping and conditional randomization tests. Our results extend and connect several classical saddlepoint approximation results. On the way to proving our main results, we prove a new conditional Berry-Esseen inequality for the sum of conditionally independent random variables, which may be of independent interest.
Submitted 11 July, 2024;
originally announced July 2024.
-
Design and characterization of a 60-cm reflective half-wave plate for the CLASS 90 GHz band telescope
Authors:
Rui Shi,
Michael K. Brewer,
Carol Yan Yan Chan,
David T. Chuss,
Jullianna Denes Couto,
Joseph R. Eimer,
John Karakla,
Koji Shukawa,
Deniz A. N. Valle,
John W. Appel,
Charles L. Bennett,
Sumit Dahal,
Thomas Essinger-Hileman,
Tobias A. Marriage,
Matthew A. Petroff,
Karwan Rostem,
Edward J. Wollack
Abstract:
Front-end polarization modulation enables improved polarization measurement stability by modulating the targeted signal above the low-frequency $1/f$ drifts associated with atmospheric and instrumental instabilities and diminishes the impact of instrumental polarization. In this work, we present the design and characterization of a new 60-cm diameter Reflective Half-Wave Plate (RHWP) polarization modulator for the 90 GHz band telescope of the Cosmology Large Angular Scale Surveyor (CLASS) project. The RHWP consists of an array of parallel wires (diameter $50~\mathrm{μm}$, $175~\mathrm{μm}$ pitch) positioned $0.88~\mathrm{mm}$ from an aluminum mirror. In lab tests, it was confirmed that the wire resonance frequency ($f_\mathrm{res}$) profile is consistent with the target, $139~\mathrm{Hz}<f_\mathrm{res}<154~\mathrm{Hz}$ in the optically active region (diameter smaller than $150~\mathrm{mm}$), preventing the wire vibration during operation and reducing the RHWP deformation under the wire tension. The mirror tilt relative to the rotating axis was controlled to be $<15''$, corresponding to an increase in beam width due to beam smearing of $<0.6''$, negligible compared to the beam's full-width half-maximum of $36'$. The median and 16/84th percentile of the wire--mirror separation residual was $0.048^{+0.013}_{-0.014}~\mathrm{mm}$ in the optically active region, achieving a modulation efficiency $ε=96.2_{+0.5}^{-0.4}\%$ with an estimated bandpass of 34 GHz. The angular velocity of the RHWP was maintained to an accuracy of within $0.005\%$ at the nominal rotation frequency ($2.5~\mathrm{Hz}$). The RHWP has been successfully integrated into the CLASS 90 GHz telescope and started taking data in June 2024, replacing the previous modulator that has been in operation since June 2018.
Submitted 11 July, 2024;
originally announced July 2024.
-
Computationally efficient and statistically accurate conditional independence testing with spaCRT
Authors:
Ziang Niu,
Jyotishka Ray Choudhury,
Eugene Katsevich
Abstract:
We introduce the saddlepoint approximation-based conditional randomization test (spaCRT), a novel conditional independence test that effectively balances statistical accuracy and computational efficiency, inspired by applications to single-cell CRISPR screens. Resampling-based methods like the distilled conditional randomization test (dCRT) offer statistical precision but at a high computational cost. The spaCRT leverages a saddlepoint approximation to the resampling distribution of the dCRT test statistic, achieving very similar finite-sample statistical performance with significantly reduced computational demands. We prove that the spaCRT p-value approximates the dCRT p-value with vanishing relative error, and that these two tests are asymptotically equivalent. Through extensive simulations and real data analysis, we demonstrate that the spaCRT controls Type-I error and maintains high power, outperforming other asymptotic and resampling-based tests. Our method is particularly well-suited for large-scale single-cell CRISPR screen analyses, facilitating the efficient and accurate assessment of perturbation-gene associations.
Submitted 11 July, 2024;
originally announced July 2024.
-
PAIL: Performance based Adversarial Imitation Learning Engine for Carbon Neutral Optimization
Authors:
Yuyang Ye,
Lu-An Tang,
Haoyu Wang,
Runlong Yu,
Wenchao Yu,
Erhu He,
Haifeng Chen,
Hui Xiong
Abstract:
Achieving carbon neutrality within industrial operations has become increasingly imperative for sustainable development. It is both a significant challenge and a key opportunity for operational optimization in Industry 4.0. In recent years, Deep Reinforcement Learning (DRL) based methods have offered promising enhancements for sequential optimization processes and can be used for reducing carbon emissions. However, existing DRL methods need a pre-defined reward function to assess the impact of each action on the final sustainable development goals (SDG). In many real applications, such a reward function cannot be given in advance. To address the problem, this study proposes a Performance based Adversarial Imitation Learning (PAIL) engine. It is a novel method to acquire optimal operational policies for carbon neutrality without any pre-defined action rewards. Specifically, PAIL employs a Transformer-based policy generator to encode historical information and predict subsequent actions within a multi-dimensional space. The entire action sequence is iteratively updated by an environmental simulator. Then PAIL uses a discriminator to minimize the discrepancy between generated sequences and real-world samples of high SDG. In parallel, a Q-learning framework based performance estimator is designed to estimate the impact of each action on SDG. Based on these estimations, PAIL refines generated policies with the rewards from both the discriminator and the performance estimator. PAIL is evaluated on multiple real-world application cases and datasets. The experimental results demonstrate the effectiveness of PAIL compared to other state-of-the-art baselines. In addition, PAIL offers meaningful interpretability for the optimization in carbon neutrality.
Submitted 11 July, 2024;
originally announced July 2024.
-
Some classes of minimal surfaces in the $3$-space with $2m$-norm
Authors:
Makoto Sakaki,
Ryota Tanaka
Abstract:
We discuss translation minimal surfaces, homothetical minimal surfaces, and separable minimal surfaces in the $3$-space with $2m$-norm.
Submitted 11 July, 2024;
originally announced July 2024.
-
What do AI/ML practitioners think about AI/ML bias?
Authors:
Aastha Pant,
Rashina Hoda,
Burak Turhan,
Chakkrit Tantithamthavorn
Abstract:
AI leaders and companies have much to offer to AI/ML practitioners to support them in addressing and mitigating biases in the AI/ML systems they develop. AI/ML practitioners need to receive the necessary resources and support from experts to develop unbiased AI/ML systems. However, our studies have revealed a discrepancy between practitioners' understanding of 'AI/ML bias' and the definitions of tech companies and researchers. This indicates a misalignment that needs addressing. Efforts should be made to match practitioners' understanding of AI/ML bias with the definitions developed by tech companies and researchers. These efforts could yield a significant return on investment by aiding AI/ML practitioners in developing unbiased AI/ML systems.
Submitted 11 July, 2024;
originally announced July 2024.