Search | arXiv e-print repository

Light and hyper nuclei formation at $\sqrt{s_{\text{NN}}} =$ 3 GeV Au+Au collisions using Wigner coalescence approach

Authors: L. K. Liu, C. L. Hu, X. H. He, S. S. Shi, G. N. Xie

Abstract: The production of light nuclei and hyper-nuclei in heavy-ion collisions, particularly at high baryon density, is crucial for understanding the dynamical evolution of the collision system and exploring the internal state of nuclear matter in compact stellar objects. The topic remains under debate, and an improved theoretical understanding is needed. In this work, the production of light nuclei ($d$, $t$, $^{3}$He, $^{4}$He) and hyper-nuclei ($^{3}_Λ$H, $^{4}_Λ$H) was investigated using the JAM microscopic transport model combined with an afterburner coalescence process in Au+Au collisions at $\sqrt{s_{\text{NN}}} = 3$ GeV. In the coalescence process, the formation of a specific nucleus is determined by its Wigner function. The calculated $\mathrm{p_T}$ spectra, average $\mathrm{p_T}$, and rapidity distributions were compared with the measurements from the STAR experiment. We investigated the dynamic information carried by light nuclei, and determined the averaged spatial distance $\langle ΔR \rangle$ and momentum difference $\langle ΔP \rangle$ of constituent nucleons ($Λ$) for each nucleus species.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2404.13190 [pdf, other]

doi 10.1103/PhysRevLett.132.206902

Anomalous Long-Distance Coherence in Critically-Driven Cavity Magnonics

Authors: Ying Yang, Jiguang Yao, Yang Xiao, Pak-Tik Fong, Hoi-Kwan Lau, C.-M. Hu

Abstract: Developing quantum networks necessitates coherently connecting distant systems via remote strong coupling. Here, we demonstrate long-distance coherence in cavity magnonics operating in the linear regime. By locally setting the cavity near critical coupling with travelling photons, non-local magnon-photon coherence is established via strong coupling over a 2-meter distance. We observe two anomalies in this long-distance coherence: first, the coupling strength oscillates with twice the period of conventional photon-mediated couplings; second, clear mode splitting is observed within the cavity linewidth. Neither effect can be explained by conventional coupled-mode theory, which reveals the tip of the iceberg of photon-mediated coupling in systems under critical driving. Our work shows the potential of using critical phenomena for harnessing long-distance coherence in distributed systems.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 7 pages, 4 figures

Journal ref: Physical Review Letters 132, 206902 (2024)

arXiv:2404.12769 [pdf]

Towards Accurate and Efficient Sorting of Retired Lithium-ion Batteries: A Data Driven Based Electrode Aging Assessment Approach

Authors: Ruohan Guo, Feng Wang, Cungang Hu, Weixiang Shen

Abstract: Retired batteries (RBs) for second-life applications offer promising economic and environmental benefits. However, accurate and efficient sorting of RBs with discrepant characteristics persists as a pressing challenge. In this study, we introduce a data-driven electrode aging assessment approach to address this concern. To this end, 15 feature points are extracted from battery open circuit voltage (OCV) curves to capture their characteristics at different levels of aging, and a convolutional neural network with an optimized structure and minimized input size is established to relocate the relative positions of these OCV feature points. Next, a rapid estimation algorithm is proposed to identify the three electrode aging parameters (EAPs) which best reconstruct the 15 OCV feature points over the entire usable capacity range. Utilizing the three EAPs as sorting indices, we employ an adaptive affinity propagation algorithm to cluster RBs without the need for pre-determining the clustering number. Unlike conventional sorting methods based solely on battery capacity, the proposed method provides profound insights into electrode aging behaviors, minimizes the need for constant-current charging data, and supports module/pack-level tests for the simultaneous processing of high volumes of RBs.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 40 pages, 25 figures

arXiv:2404.11836 [pdf, other]

AI-Empowered RIS-Assisted Networks: CV-Enabled RIS Selection and DNN-Enabled Transmission

Authors: Conggang Hu, Yang Lu, Hongyang Du, Mi Yang, Bo Ai, Dusit Niyato

Abstract: This paper investigates artificial intelligence (AI) empowered schemes for reconfigurable intelligent surface (RIS) assisted networks from the perspective of fast implementation. We formulate a weighted sum-rate maximization problem for a multi-RIS-assisted network. To avoid the huge channel estimation overhead of activating all RISs, we propose a computer vision (CV) enabled RIS selection scheme based on a single shot multi-box detector. To realize real-time resource allocation, a deep neural network (DNN) enabled transmit design is developed to learn the optimal mapping from channel information to transmit beamformers and phase shift matrix. Numerical results illustrate that the CV module is able to select the RIS with the best propagation condition. The well-trained DNN achieves similar sum-rate performance to the existing alternative optimization method but with much smaller inference time.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.11380 [pdf]

Non-hermitian magnonic knobbing between electromagnetically induced reflection and transparancy

Authors: Youcai Han, Changhao Meng, Zeping Rao, Jie Qian, Yiming Lv, Liping Zhu, CanMing Hu, Zhenghua An

Abstract: Manipulation of wave propagation through open resonant systems has attracted tremendous interest. When accessible to the open system, the system under study is prone to being driven out of equilibrium, and a lack of reciprocity is the rule rather than the exception. Open systems correspond to non-Hermitian Hamiltonians with unique properties such as exceptional points and ideal isolation. Here, we have found a highly sensitive modulation for the intersection of resonant patch antennas with respect to cavity magnonic coupling by means of an open coupling system of three resonant modes. Two types of crossings are implemented in this study: the first type of crossing remotely controls the sharp switching of the transmission line's transmittance, while regulating the repulsive behavior of its zero-reflection states. The second type of crossing corresponds to the modulation of non-reciprocal phase transitions, which enables a more desirable isolation effect. Three different coupling models are realized by a non-Hermitian scattering Hamiltonian, revealing distinct spatial overlaps between modes. This elucidates that dissipative coupling of at least two modes to the environment is crucial for non-reciprocal transport. Our work not only reveals the versatility of cavity magnonic systems but also provides a way to design functional devices for general wave optics using patch antenna crossings.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.09544 [pdf, other]

GNNavigator: Towards Adaptive Training of Graph Neural Networks via Automatic Guideline Exploration

Authors: Tong Qiao, Jianlei Yang, Yingjie Qi, Ao Zhou, Chen Bai, Bei Yu, Weisheng Zhao, Chunming Hu

Abstract: Graph Neural Networks (GNNs) have recently achieved significant success in many applications. However, balancing GNN training runtime cost, memory consumption, and attainable accuracy for various applications is non-trivial. Previous training methodologies suffer from inferior adaptability and lack a unified training optimization solution. To address the problem, this work proposes GNNavigator, an adaptive GNN training configuration optimization framework. GNNavigator meets diverse GNN application requirements due to our unified software-hardware co-abstraction, proposed GNN training performance model, and practical design space exploration solution. Experimental results show that GNNavigator can achieve up to 3.1x speedup and 44.9% peak memory reduction with comparable accuracy to state-of-the-art approaches.

Submitted 15 April, 2024; originally announced April 2024.

Comments: Accepted by DAC'24

arXiv:2404.09185 [pdf, other]

Robust spin order and fragile charge order in Na0.5CoO2 as revealed by time-resolved terahertz spectroscopy

Authors: X. Y. Zhou, S. J. Zhang, D. Wu, H. Wang, B. H. Li, S. F. Wu, Q. M. Liu, T. C. Hu, R. S. Li, J. Y. Yuan, S. X. Xu, Q. Wu, L. Yue, T. Dong, N. L. Wang

Abstract: Near-infrared (NIR) pump-terahertz (THz) probe spectroscopy is used to investigate the charge and spin excitations in a strongly correlated electron compound Na0.5CoO2. This compound exhibits a coexistence of various charge and spin orders arising from intricate interactions among charge, spin, and orbital degrees of freedom. NIR pulses create significantly diverse effects on the charge and spin orders; while the charge order is easily melted, coherent magnon excitations are present in all fluences examined. Furthermore, a novel π phase shift of the coherent magnon oscillations is observed in the pump-induced change of the terahertz electric field between regions of increasing and decreasing field change. These results unequivocally illustrate that ultrashort laser pulses enable the disentanglement of different interactions within complex systems characterized by multiple orders, providing a fresh perspective on the interplay between itinerant and localized electrons within the Co 3d t2g multiplets.

Submitted 14 April, 2024; originally announced April 2024.

arXiv:2404.08943 [pdf, other]

A Novel State-Centric Necessary Condition for Time-Optimal Control of Controllable Linear Systems Based on Augmented Switching Laws

Authors: Yunan Wang, Chuxiong Hu, Yujie Lin, Zeyang Li, Shize Lin, Suqin He

Abstract: Most existing necessary conditions for optimal control based on adjoining methods require both state information and costate information, yet the lack of costates for a given feasible trajectory in practice impedes the determination of optimality. This paper establishes a novel theoretical framework for time-optimal control of controllable linear systems, proposing the augmented switching law that represents the input control and the feasibility in a compact form. Given a feasible trajectory, the disturbed trajectory under the constraints of augmented switching law is guaranteed to be feasible, resulting in a novel state-centric necessary condition without dependence on costate information. A first order necessary condition is proposed that the Jacobian matrix of the augmented switching law is not full row rank, which also results in an approach to optimizing a given feasible trajectory further. The proposed necessary condition is applied to the chain-of-integrators systems with full box constraints, contributing to some conclusions challenging to reason by traditional costate-based necessary conditions.

Submitted 13 April, 2024; originally announced April 2024.

arXiv:2404.08706 [pdf, other]

Game Generation via Large Language Models

Authors: Chengpeng Hu, Yunlong Zhao, Jialin Liu

Abstract: Recently, the emergence of large language models (LLMs) has unlocked new opportunities for procedural content generation. However, recent attempts mainly focus on level generation for specific games with defined game rules such as Super Mario Bros. and Zelda. This paper investigates game generation via LLMs. Based on video game description language, this paper proposes an LLM-based framework to generate game rules and levels simultaneously. Experiments demonstrate how the framework works with prompts considering different combinations of context. Our findings extend the current applications of LLMs and offer new insights for generating new games in the area of procedural content generation.

Submitted 29 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

Comments: 2024 IEEE Conference on Games

arXiv:2404.08382 [pdf, other]

Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think

Authors: Xinpeng Wang, Chengzhi Hu, Bolei Ma, Paul Röttger, Barbara Plank

Abstract: Multiple choice questions (MCQs) are commonly used to evaluate the capabilities of large language models (LLMs). One common way to evaluate the model response is to rank the candidate answers based on the log probability of the first token prediction. An alternative way is to examine the text output. Prior work has shown that first token probabilities lack robustness to changes in MCQ phrasing, and that first token probabilities do not match text answers for instruction-tuned models. Therefore, in this paper, we investigate the robustness of text answers. We show that the text answers are more robust to question perturbations than the first token probabilities, when the first token answers mismatch the text answers. The difference in robustness increases as the mismatch rate becomes greater. As the mismatch reaches over 50\%, the text answer is more robust to option order changes than the debiased first token probabilities using state-of-the-art debiasing methods such as PriDe. Our findings provide further evidence for the benefits of text answer evaluation over first token probability evaluation.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2404.07577 [pdf, other]

Generating Comprehensive Lithium Battery Charging Data with Generative AI

Authors: Lidang Jiang, Changyan Hu, Sibei Ji, Hang Zhao, Junxiong Chen, Ge He

Abstract: In optimizing performance and extending the lifespan of lithium batteries, accurate state prediction is pivotal. Traditional regression and classification methods have achieved some success in battery state prediction. However, the efficacy of these data-driven approaches heavily relies on the availability and quality of public datasets. Additionally, generating electrochemical data predominantly through battery experiments is a lengthy and costly process, making it challenging to acquire high-quality electrochemical data. This difficulty, coupled with data incompleteness, significantly impacts prediction accuracy. Addressing these challenges, this study introduces the End of Life (EOL) and Equivalent Cycle Life (ECL) as conditions for generative AI models. By integrating an embedding layer into the CVAE model, we developed the Refined Conditional Variational Autoencoder (RCVAE). Through preprocessing data into a quasi-video format, our study achieves an integrated synthesis of electrochemical data, including voltage, current, temperature, and charging capacity, which is then processed by the RCVAE model. Coupled with customized training and inference algorithms, this model can generate specific electrochemical data for EOL and ECL under supervised conditions. This method provides users with a comprehensive electrochemical dataset, pioneering a new research domain for the artificial synthesis of lithium battery data. Furthermore, based on the detailed synthetic data, various battery state indicators can be calculated, offering new perspectives and possibilities for lithium battery performance prediction.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.07343 [pdf, other]

Monitoring AGNs with H$β$ Asymmetry. IV. First Reverberation Mapping Results of 14 AGNs

Authors: T. E. Zastrocky, Michael S. Brotherton, Pu Du, Jacob N. McLane, Kianna A. Olson, D. A. Dale, H. A. Kobulnicky, Jaya Maithil, My L. Nguyen, William T. Chick, David H. Kasper, Derek Hand, C. Adelman, Z. Carter, G. Murphree, M. Oeur, T. Roth, S. Schonsberg, M. J. Caradonna, J. Favro, A. J. Ferguson, I. M. Gonzalez, L. M. Hadding, H. D. Hagler, C. J. Rogers, et al. (19 additional authors not shown)

Abstract: We report first-time reverberation mapping results for 14 AGNs from the ongoing Monitoring AGNs with H$β$ Asymmetry campaign (MAHA). These results utilize optical spectra obtained with the Long Slit Spectrograph on the Wyoming Infrared 2.3m Telescope between 2017 November and 2023 May. MAHA combines long-duration monitoring with high cadence. We report results from multiple observing seasons for 9 of the 14 objects. These results include H$β$ time lags, supermassive black hole masses, and velocity-resolved time lags. The velocity-resolved lags allow us to investigate the kinematics of the broad-line region.

Submitted 10 April, 2024; originally announced April 2024.

Comments: 35 pages, 19 figures, accepted for publication in ApJ Supplement

arXiv:2404.07157 [pdf, other]

Local probe of bulk and edge states in a fractional Chern insulator

Authors: Zhurun Ji, Heonjoon Park, Mark E. Barber, Chaowei Hu, Kenji Watanabe, Takashi Taniguchi, Jiun-Haw Chu, Xiaodong Xu, Zhi-xun Shen

Abstract: Fractional quantum Hall effect (FQHE) is a prime example of topological quantum many-body phenomena, arising from the interplay between strong electron correlation, topological order, and time reversal symmetry breaking. Recently, a lattice analog of FQHE at zero magnetic field has been observed, confirming the existence of a zero-field fractional Chern insulator (FCI). Despite this, the bulk-edge correspondence -- a hallmark of FCI featuring an insulating bulk with conductive edges -- has not been directly observed. In fact, this correspondence has not been visualized in any system for fractional states due to experimental challenges. Here we report the imaging of FCI edge states in twisted MoTe2 by employing a newly developed modality of microwave-impedance microscopy. By tuning the carrier density, we observe the system evolving between metallic and FCI states, the latter of which exhibits insulating bulk and conductive edges as expected from bulk-boundary correspondence. We also observe the evolution of edge states across the topological phase transition from an incompressible Chern insulator state to a metal and finally to a putative charge ordered insulating state as a function of interlayer electric field. The local measurement further reveals tantalizing prospects of neighboring domains with different fractional orders. These findings pave the way for research into topologically protected 1D interfaces between various anyonic states at zero magnetic field, such as topological entanglement entropy, Halperin-Laughlin interfaces, and the creation of non-abelian anyons.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.05829 [pdf, other]

SambaLingo: Teaching Large Language Models New Languages

Authors: Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker

Abstract: Despite the widespread availability of LLMs, there remains a substantial gap in their capabilities and availability across diverse languages. One approach to address these issues has been to take an existing pre-trained LLM and continue to train it on new languages. While prior works have experimented with language adaptation, many questions around best practices and methodology have not been covered. In this paper, we present a comprehensive investigation into the adaptation of LLMs to new languages. Our study covers the key components in this process, including vocabulary extension, direct preference optimization and the data scarcity problem for human alignment in low-resource languages. We scale these experiments across 9 languages and 2 parameter scales (7B and 70B). We compare our models against Llama 2, Aya-101, XGLM, BLOOM and existing language experts, outperforming all prior published baselines. Additionally, all evaluation code and checkpoints are made public to facilitate future research.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 23 pages

arXiv:2404.05605 [pdf, other]

Graph Neural Networks Automated Design and Deployment on Device-Edge Co-Inference Systems

Authors: Ao Zhou, Jianlei Yang, Tong Qiao, Yingjie Qi, Zhi Yang, Weisheng Zhao, Chunming Hu

Abstract: The key to device-edge co-inference paradigm is to partition models into computation-friendly and computation-intensive parts across the device and the edge, respectively. However, for Graph Neural Networks (GNNs), we find that simply partitioning without altering their structures can hardly achieve the full potential of the co-inference paradigm due to various computational-communication overheads of GNN operations over heterogeneous devices. We present GCoDE, the first automatic framework for GNN that innovatively Co-designs the architecture search and the mapping of each operation on Device-Edge hierarchies. GCoDE abstracts the device communication process into an explicit operation and fuses the search of architecture and the operations mapping in a unified space for joint-optimization. Also, the performance-awareness approach, utilized in the constraint-based search process of GCoDE, enables effective evaluation of architecture efficiency in diverse heterogeneous systems. We implement the co-inference engine and runtime dispatcher in GCoDE to enhance the deployment efficiency. Experimental results show that GCoDE can achieve up to $44.9\times$ speedup and $98.2\%$ energy reduction compared to existing approaches across various applications and system configurations.

Submitted 8 April, 2024; originally announced April 2024.

Comments: Accepted by DAC'24

arXiv:2404.05158 [pdf, ps, other]

Quantum and Classical Two-photon Interference of Single Photons with Ultralong Coherence Time

Authors: Manman Wang, Yanfeng Li, Hanqing Liu, Haiqiao Ni, Zhichuan Niu, Xiaogang Wei, Renfu Yang, Chengyong Hu

Abstract: Two-photon interference (TPI) is a fundamental phenomenon in quantum optics and plays a crucial role in quantum information science and technology. TPI is commonly considered as quantum interference with an upper bound of $100\%$ for both the TPI visibility and the beat visibility in contrast to its classical counterpart with a maximum visibility of $50\%$. However, this is not always the case. Here we report a simultaneous observation of quantum and classical TPI of single photons with ultralong coherence time which is longer than the photon correlation time by five orders of magnitude. We observe a TPI visibility of $94.3\%\pm 0.2\%$ but a beat visibility of $50\%$. Besides an anti-bunching central dip due to single-photon statistics, we observe two bunching side peaks in cross-correlation curves for indistinguishable photons. Using either classical wave superposition theory or quantum field approach, we derive the same expressions for the cross-correlation functions which reproduce and explain the experiments well. We conclude that quantum TPI with a stream of single photons is equivalent to classical TPI, both of which are the fourth-order interference arising from the second-order interference occurring on the time scale of photon coherence time.

Submitted 7 April, 2024; originally announced April 2024.

Comments: 7 pages, 4 figures, Comments are welcome

arXiv:2404.04801 [pdf, ps, other]

doi 10.1007/s41605-024-00467-8

LHAASO-KM2A detector simulation using Geant4

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (254 additional authors not shown)

Abstract: KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is an important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A based on Geant4 is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overflow. Some simplifications are used to significantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.04429 [pdf, other]

doi 10.1016/j.ensm.2024.103343

Physics-Informed Machine Learning for Battery Degradation Diagnostics: A Comparison of State-of-the-Art Methods

Authors: Sina Navidi, Adam Thelen, Tingkai Li, Chao Hu

Abstract: Monitoring the health of lithium-ion batteries' internal components as they age is crucial for optimizing cell design and usage control strategies. However, quantifying component-level degradation typically involves aging many cells and destructively analyzing them throughout the aging test, limiting the scope of quantifiable degradation to the test conditions and duration. Fortunately, recent advances in physics-informed machine learning (PIML) for modeling and predicting the battery state of health demonstrate the feasibility of building models to predict the long-term degradation of a lithium-ion battery cell's major components using only short-term aging test data by leveraging physics. In this paper, we present four approaches for building physics-informed machine learning models and comprehensively compare them, considering accuracy, complexity, ease-of-implementation, and their ability to extrapolate to untested conditions. We delve into the details of each physics-informed machine learning method, providing insights specific to implementing them on small battery aging datasets. Our study utilizes long-term cycle aging data from 24 implantable-grade lithium-ion cells subjected to varying temperatures and C-rates over four years. This paper aims to facilitate the selection of an appropriate physics-informed machine learning method for predicting long-term degradation in lithium-ion batteries, using short-term aging data while also providing insights about when to choose which method for general predictive purposes.

Submitted 5 April, 2024; originally announced April 2024.

Comments: It's an unformatted version of the paper titled 'Physics-Informed Machine Learning for Battery Degradation Diagnostics: A Comparison of State-of-the-Art Methods,' published in Energy Storage Materials, Volume 68, 103343. This version includes an acknowledgment section, which is not present in the journal-published version. Please cite the journal version when you refer to this study

Journal ref: Energy Storage Materials (2024): 103343

arXiv:2403.19837 [pdf, other]

Concept-based Analysis of Neural Networks via Vision-Language Models

Authors: Ravi Mangal, Nina Narodytska, Divya Gopinath, Boyue Caroline Hu, Anirban Roy, Susmit Jha, Corina Pasareanu

Abstract: The analysis of vision-based deep neural networks (DNNs) is highly desirable but it is very challenging due to the difficulty of expressing formal specifications for vision tasks and the lack of efficient verification procedures. In this paper, we propose to leverage emerging multimodal, vision-language, foundation models (VLMs) as a lens through which we can reason about vision models. VLMs have been trained on a large body of images accompanied by their textual description, and are thus implicitly aware of high-level, human-understandable concepts describing the images. We describe a logical specification language $\texttt{Con}_{\texttt{spec}}$ designed to facilitate writing specifications in terms of these concepts. To define and formally check $\texttt{Con}_{\texttt{spec}}$ specifications, we build a map between the internal representations of a given vision model and a VLM, leading to an efficient verification procedure of natural-language properties for vision models. We demonstrate our techniques on a ResNet-based classifier trained on the RIVAL-10 dataset using CLIP as the multimodal model.

Submitted 10 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.19754 [pdf, other]

GOLD: Generalized Knowledge Distillation via Out-of-Distribution-Guided Language Data Generation

Authors: Mohsen Gholami, Mohammad Akbari, Cindy Hu, Vaden Masrani, Z. Jane Wang, Yong Zhang

Abstract: Knowledge distillation from LLMs is essential for the efficient deployment of language models. Prior works have proposed data generation using LLMs for preparing distilled models. We argue that generating data with LLMs is prone to sampling mainly from the center of the original content distribution. This limitation hinders the distilled model from learning the true underlying data distribution and causes it to forget the tails of the distribution (samples with lower probability). To this end, we propose GOLD, a task-agnostic data generation and knowledge distillation framework, which employs an iterative out-of-distribution-guided feedback mechanism for the LLM. As a result, the generated data improves the generalizability of distilled models. An energy-based OOD evaluation approach is also introduced to deal with noisy generated data. Our extensive experiments on 10 different classification and sequence-to-sequence tasks in NLP show that GOLD outperforms prior art and the LLM by an average of 5% and 14%, respectively. We also show that the proposed method is applicable to less explored and novel tasks. The code is available.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.17675 [pdf, other]

Chattering Phenomena in Time-Optimal Control for High-Order Chain-of-Integrators Systems with Full State Constraints

Authors: Yunan Wang, Chuxiong Hu, Zeyang Li, Yujie Lin, Shize Lin, Suqin He

Abstract: Time-optimal control for high-order chain-of-integrators systems with full state constraints remains an open and challenging problem in the optimal control theory domain. The behaviors of optimal control in high-order problems lack precise characterization; even the existence of the chattering phenomenon has remained unknown and overlooked. This paper establishes a theoretical framework for chattering phenomena in the considered problem, providing novel findings on the uniqueness of state constraints inducing chattering, the upper bound on switching times in an unconstrained arc during chattering, and the convergence of states and costates to the chattering limit point. For the first time, this paper proves the existence of the chattering phenomenon in the considered problem. The chattering optimal control for 4th order problems with velocity constraints is precisely solved, providing an approach to plan strictly time-optimal snap-limited trajectories. Other cases of order $n\leq4$ are proved not to allow chattering. The conclusions correct the longstanding misconception in the industry regarding the time-optimality of S-shaped trajectories with minimal switching times.

Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.
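
For orientation, the time-optimal problem for an order-$n$ chain of integrators with full state constraints is commonly written as in the sketch below. The symbols $x_k$, $u$, $M_k$, $x_0$, and $x_f$ are notational assumptions introduced here for illustration; the abstract does not fix a notation, and the paper's own conventions may differ.

```latex
\begin{aligned}
\min_{u(\cdot)} \quad & t_f \\
\text{s.t.} \quad & \dot{x}_1(t) = u(t), \qquad \dot{x}_k(t) = x_{k-1}(t), \quad k = 2, \dots, n, \\
& |u(t)| \le M_0, \qquad |x_k(t)| \le M_k, \quad k = 1, \dots, n, \\
& x(0) = x_0, \qquad x(t_f) = x_f .
\end{aligned}
```

Under this assumed indexing with $n = 4$, $x_4$ is position, $x_3$ velocity, $x_2$ acceleration, $x_1$ jerk, and $u$ snap, which is why a solution to the 4th order problem yields snap-limited trajectories.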

arXiv:2403.17253 [pdf, ps, other]

Convert laser light into single photons via interference

Authors: Yanfeng Li, Manman Wang, Guoqi Huang, Li Liu, Wenyan Wang, Weijie Ji, Hanqing Liu, Xiangbin Su, Shulun Li, Deyan Dai, Xiangjun Shang, Haiqiao Ni, Zhichuan Niu, Chengyong Hu

Abstract: Laser light possesses perfect coherence, but cannot be attenuated to single photons via linear optics. An elegant route to convert laser light into single photons is based on photon blockade in a cavity with a single atom in the strong coupling regime. However, the single-photon purity achieved by this method remains relatively low. Here we propose an interference-based approach where laser light can be transformed into single photons by destructively interfering with a weak but super-bunched incoherent field emitted from a cavity coupled to a single quantum emitter. We demonstrate this idea by measuring the reflected light of a laser field which drives a double-sided optical microcavity containing a single artificial atom, a quantum dot (QD), in the Purcell regime. The reflected light consists of a superposition of the driving field with the cavity output field. We achieve a second-order autocorrelation of $g^{(2)}(0)=0.030\pm0.002$ and a two-photon interference visibility of $(94.3\pm0.2)$%. By separating the coherent and incoherent fields in the reflected light, we observe that the incoherent field from the cavity exhibits super-bunching with $g^{(2)}(0)=41\pm2$ while the coherent field retains Poissonian statistics. By controlling the relative amplitude of coherent and incoherent fields, we verify that photon statistics of reflected light is tuneable from perfect anti-bunching to super-bunching in agreement with our predictions.
Our results demonstrate photon statistics of light as a quantum interference phenomenon in which a single QD can scatter two photons simultaneously at low driving fields, in contrast to the common picture that a single two-level quantum emitter can only scatter (or absorb and emit) single photons. This work opens the door to tailoring photon statistics of laser light via cavity or waveguide quantum electrodynamics and interference.

Submitted 25 March, 2024; originally announced March 2024.

Comments: Comments are welcome

arXiv:2403.16040 [pdf, ps, other]

General One-loop Generating Function by IBP relations

Authors: Bo Feng, Chang Hu, Jiyuan Shen, Yaobo Zhang

Abstract: In this paper we have studied the most general generating function of reduction for one-loop integrals with arbitrary tensor structure in the numerator and arbitrary power distribution of propagators in the denominator. Using IBP relations, we have established the partial differential equations for these generating functions and solved them analytically. These results provide useful guidance for applying the generating function method to reductions of higher-loop integrals.

Submitted 24 March, 2024; originally announced March 2024.

Comments: 50 pages

arXiv:2403.14301 [pdf, other]

Picotesla-sensitivity microcavity optomechanical magnetometry

Authors: Zhi-Gang Hu, Yi-Meng Gao, Jian-Fei Liu, Hao Yang, Min Wang, Yuechen Lei, Xin Zhou, **cheng Li, Xuening Cao, ****g Liang, Chao-Qun Hu, Zhilin Li, Yong-Chang Lau, Jian-Wang Cai, Bei-Bei Li

Abstract: Cavity optomechanical systems have enabled precision sensing of magnetic fields, by leveraging the optical resonance-enhanced readout and mechanical resonance-enhanced response. Previous studies have successfully achieved scalable and reproducible microcavity optomechanical magnetometry (MCOM) by incorporating Terfenol-D thin films into high-quality ($Q$) factor whispering gallery mode (WGM) microcavities. However, the sensitivity was limited to 585 pT/Hz$^{1/2}$, over 20 times inferior to those using Terfenol-D particles. In this work, we propose and demonstrate a high-sensitivity and scalable MCOM approach by sputtering a FeGaB thin film onto a high-$Q$ SiO$_2$ WGM microdisk. Theoretical studies are conducted to explore the magnetic actuation constant and noise-limited sensitivity by varying the parameters of the FeGaB film and SiO$_2$ microdisk. Multiple magnetometers with different radii are fabricated and characterized. By utilizing a microdisk with a radius of 355 $μ$m and a thickness of 1 $μ$m, along with a FeGaB film with a radius of 330 $μ$m and a thickness of 1.3 $μ$m, we have achieved a remarkable peak sensitivity of 1.68 pT/Hz$^{1/2}$ at 9.52 MHz. This represents a significant improvement of over two orders of magnitude compared with previous studies employing sputtered Terfenol-D film. Notably, the magnetometer operates without a bias magnetic field, thanks to the remarkable soft magnetic properties of the FeGaB film.
Furthermore, as a proof-of-concept, we have demonstrated the real-time measurement of a pulsed magnetic field simulating the corona current in a high-voltage transmission line using our developed magnetometer. These high-sensitivity magnetometers hold great potential for various applications, such as magnetic induction tomography and corona current monitoring.

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.14077 [pdf, other]

Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics

Authors: Shan Jia, Reilin Lyu, Kangran Zhao, Yize Chen, Zhiyuan Yan, Yan Ju, Chuanbo Hu, Xin Li, Baoyuan Wu, Siwei Lyu

Abstract: DeepFakes, which refer to AI-generated media content, have become an increasing concern due to their use as a means for disinformation. Detecting DeepFakes is currently solved with programmed machine learning algorithms. In this work, we investigate the capabilities of multimodal large language models (LLMs) in DeepFake detection. We conducted qualitative and quantitative experiments to demonstrate multimodal LLMs and show that they can expose AI-generated images through careful experimental design and prompt engineering. This is interesting, considering that LLMs are not inherently tailored for media forensic tasks, and the process does not require programming. We discuss the limitations of multimodal LLMs for these tasks and suggest possible improvements.

Submitted 11 June, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.12631 [pdf]

PointGrasp: Point Cloud-based Grasping for Tendon-driven Soft Robotic Glove Applications

Authors: Chen Hu, Shirui Lyu, Eo** Rho, Daekyum Kim, Shan Luo, Letizia Gionfrida

Abstract: Controlling hand exoskeletons to assist individuals with grasping tasks poses a challenge due to the difficulty in understanding user intentions. We propose that most daily grasping tasks during activities of daily living (ADL) can be deduced by analyzing object geometries (simple and complex) from 3D point clouds. The study introduces PointGrasp, a real-time system designed for identifying household scenes semantically, aiming to support and enhance assistance during ADL for tailored end-to-end grasping tasks. The system comprises an RGB-D camera with an inertial measurement unit and a microprocessor integrated into a tendon-driven soft robotic glove. The RGB-D camera processes 3D scenes at a rate exceeding 30 frames per second. The proposed pipeline demonstrates an average RMSE of 0.8 $\pm$ 0.39 cm for simple and 0.11 $\pm$ 0.06 cm for complex geometries. Within each mode, it identifies and pinpoints reachable objects. This system shows promise in end-to-end vision-driven robotic-assisted rehabilitation manual tasks.

Submitted 19 March, 2024; originally announced March 2024.

Comments: 6 pages, 8 figures, conference

ACM Class: I.2; I.4

arXiv:2403.12373 [pdf, other]

RankPrompt: Step-by-Step Comparisons Make Language Models Better Reasoners

Authors: Chi Hu, Yuan Ge, Xiangnan Ma, Hang Cao, Qiang Li, Yonghua Yang, Tong Xiao, **gbo Zhu

Abstract: Large Language Models (LLMs) have achieved impressive performance across various reasoning tasks. However, even state-of-the-art LLMs such as ChatGPT are prone to logical errors during their reasoning processes. Existing solutions, such as deploying task-specific verifiers or voting over multiple reasoning paths, either require extensive human annotations or fail in scenarios with inconsistent responses. To address these challenges, we introduce RankPrompt, a new prompting method that enables LLMs to self-rank their responses without additional resources. RankPrompt breaks down the ranking problem into a series of comparisons among diverse responses, leveraging the inherent capabilities of LLMs to generate chains of comparison as contextual exemplars. Our experiments across 11 arithmetic and commonsense reasoning tasks show that RankPrompt significantly enhances the reasoning performance of ChatGPT and GPT-4, with improvements of up to 13%. Moreover, RankPrompt excels in LLM-based automatic evaluations for open-ended tasks, aligning with human judgments 74% of the time in the AlpacaEval dataset. It also exhibits robustness to variations in response order and consistency. Collectively, our results validate RankPrompt as an effective method for eliciting high-quality feedback from language models.

Submitted 22 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

Comments: LREC-Coling 2024 Long Paper

arXiv:2403.11465 [pdf]

Ultra-Long Homochiral Graphene Nanoribbons Grown Within h-BN Stacks for High-Performance Electronics

Authors: Bosai Lyu, Jiajun Chen, Sen Wang, Shuo Lou, Peiyue Shen, **gxu Xie, Lu Qiu, Izaac Mitchell, Can Li, Cheng Hu, Xianliang Zhou, Kenji Watanabe, Takashi Taniguchi, Xiaoqun Wang, **feng Jia, Qi Liang, Guorui Chen, Tingxin Li, Shiyong Wang, Wengen Ouyang, Oded Hod, Feng Ding, Michael Urbakh, Zhiwen Shi

Abstract: Van der Waals encapsulation of two-dimensional materials within hexagonal boron nitride (h-BN) stacks has proven to be a promising way to create ultrahigh-performance electronic devices. However, contemporary approaches for achieving van der Waals encapsulation, which involve artificial layer stacking using mechanical transfer techniques, are difficult to control, prone to contamination, and unscalable. Here, we report on the transfer-free direct growth of high-quality graphene nanoribbons (GNRs) within h-BN stacks. The as-grown embedded GNRs exhibit highly desirable features being ultralong (up to 0.25 mm), ultranarrow (<5 nm), and homochiral with zigzag edges. Our atomistic simulations reveal that the mechanism underlying the embedded growth involves ultralow GNR friction when sliding between AA'-stacked h-BN layers. Using the grown structures, we demonstrate the transfer-free fabrication of embedded GNR field-effect devices that exhibit excellent performance at room temperature with mobilities of up to 4,600 cm$^{2}$ V$^{-1}$ s$^{-1}$ and on-off ratios of up to $10^{6}$. This paves the way to the bottom-up fabrication of high-performance electronic devices based on embedded layered materials.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.10010 [pdf, other]

doi 10.1103/PhysRevLett.132.131002

Measurements of All-Particle Energy Spectrum and Mean Logarithmic Mass of Cosmic Rays from 0.3 to 30 PeV with LHAASO-KM2A

Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen , et al. (256 additional authors not shown)

Abstract: We present the measurements of all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV using data collected from LHAASO-KM2A between September 2021 and December 2022, which is based on a nearly composition-independent energy reconstruction method, achieving unprecedented accuracy. Our analysis reveals the position of the knee at $3.67 \pm 0.05 \pm 0.15$ PeV. Below the knee, the spectral index is found to be -$2.7413 \pm 0.0004 \pm 0.0050$, while above the knee, it is -$3.128 \pm 0.005 \pm 0.027$, with the sharpness of the transition measured with a statistical error of 2%. The mean logarithmic mass of cosmic rays is almost heavier than helium in the whole measured energy range. It decreases from 1.7 at 0.3 PeV to 1.3 at 3 PeV, representing a 24% decline following a power law with an index of -$0.1200 \pm 0.0003 \pm 0.0341$. This is equivalent to an increase in abundance of light components. Above the knee, the mean logarithmic mass exhibits a power law trend towards heavier components, which is the reverse of the behavior observed in the all-particle energy spectrum. Additionally, the knee position and the change in power-law index are approximately the same. These findings suggest that the knee observed in the all-particle spectrum corresponds to the knee of the light component, rather than the medium-heavy components.

Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: 8 pages, 3 figures

Journal ref: Physical Review Letters 132, 131002 (2024)
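
The figures quoted for the mean logarithmic mass below the knee can be cross-checked with a few lines of arithmetic. The sketch below assumes the power law has the form $\langle \ln A \rangle \propto E^{\gamma}$ in energy, which is an interpretation of the abstract rather than something it states, and it uses the rounded endpoint values 1.7 and 1.3 exactly as quoted. The variable names are illustrative only.

```python
import math

# Values as quoted in the abstract (rounded there, so only rough agreement is expected).
ln_a_low, ln_a_high = 1.7, 1.3   # mean logarithmic mass at 0.3 PeV and 3 PeV
e_low, e_high = 0.3, 3.0         # energies in PeV
index = -0.1200                  # quoted power-law index (central value)

# Fractional drop between the two endpoints.
relative_decline = (ln_a_low - ln_a_high) / ln_a_low

# Assumed form: <ln A>(E) = <ln A>(E_low) * (E / E_low) ** index
predicted_high = ln_a_low * (e_high / e_low) ** index

# Index implied by the two rounded endpoints alone.
implied_index = math.log(ln_a_high / ln_a_low) / math.log(e_high / e_low)

print(f"relative decline: {relative_decline:.1%}")
print(f"power-law prediction at 3 PeV: {predicted_high:.2f}")
print(f"index implied by the endpoints: {implied_index:.3f}")
```

Under these assumptions the decline comes out at about 23.5%, matching the quoted 24% after rounding, and the quoted index predicts roughly 1.29 at 3 PeV, in line with the stated 1.3.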

arXiv:2403.05530 [pdf, other]

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k).
Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.04158 [pdf, other]

DA-Net: A Disentangled and Adaptive Network for Multi-Source Cross-Lingual Transfer Learning

Authors: Ling Ge, Chunming Hu, Guanghui Ma, Jihong Liu, Hong Zhang

Abstract: Multi-Source cross-lingual transfer learning deals with the transfer of task knowledge from multiple labelled source languages to an unlabeled target language under the language shift. Existing methods typically focus on weighting the predictions produced by language-specific classifiers of different sources that follow a shared encoder. However, all source languages share the same encoder, which is updated by all these languages. The extracted representations inevitably contain different source languages' information, which may disturb the learning of the language-specific classifiers. Additionally, due to the language gap, language-specific classifiers trained with source labels are unable to make accurate predictions for the target language. Both facts impair the model's performance. To address these challenges, we propose a Disentangled and Adaptive Network (DA-Net). Firstly, we devise a feedback-guided collaborative disentanglement method that seeks to purify input representations of classifiers, thereby mitigating mutual interference from multiple sources. Secondly, we propose a class-aware parallel adaptation method that aligns class-level distributions for each source-target language pair, thereby alleviating the language pairs' language gap. Experimental results on three different tasks involving 38 languages validate the effectiveness of our approach.

Submitted 6 March, 2024; originally announced March 2024.

Comments: AAAI 2024

arXiv:2403.03954 [pdf, other]

3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations

Authors: Yanjie Ze, Gu Zhang, Kangning Zhang, Chenyuan Hu, Muhan Wang, Huazhe Xu

Abstract: Imitation learning provides an efficient way to teach robots dexterous skills; however, learning complex skills robustly and generalizably usually consumes large amounts of human demonstrations. To tackle this challenging problem, we present 3D Diffusion Policy (DP3), a novel visual imitation learning approach that incorporates the power of 3D visual representations into diffusion policies, a class of conditional action generative models. The core design of DP3 is the utilization of a compact 3D visual representation, extracted from sparse point clouds with an efficient point encoder. In our experiments involving 72 simulation tasks, DP3 successfully handles most tasks with just 10 demonstrations and surpasses baselines with a 24.2% relative improvement. In 4 real robot tasks, DP3 demonstrates precise control with a high success rate of 85%, given only 40 demonstrations of each task, and shows excellent generalization abilities in diverse aspects, including space, viewpoint, appearance, and instance. Interestingly, in real robot experiments, DP3 rarely violates safety requirements, in contrast to baseline methods which frequently do, necessitating human intervention. Our extensive evaluation highlights the critical importance of 3D representations in real-world robot learning. Videos, code, and data are available on https://3d-diffusion-policy.github.io .

Submitted 8 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

Comments: Published at Robotics: Science and Systems (RSS) 2024. Videos, code, and data: https://3d-diffusion-policy.github.io

arXiv:2403.02743 [pdf]

Spectral effects of radiating gases on the ignition in a multiswirl staged model combustor using full-spectrum k distribution method -- A Large Eddy Simulation Investigation

Authors: Hongyuan Di, Chaojun Wang, Chuanlong Hu, Xiao Liu, Lixin Yang

Abstract: Radiative heat transfer has been proven to be important during the ignition process in gas turbines. The radiating gases (CO2, H2O, CO) generated during combustion may display strong spectral, or nongray, behavior, which is difficult to both characterize and calculate. In this work, both the full-spectrum k-distribution (FSK) and weighted-sum-of-gray-gases (WSGG) methods, along with the Dynamic-thickened-flame (DTF) and Large-Eddy-Simulation (LES) methods, are used to analyze how spectral behavior affects the ignition process in an experimental gas turbine. Results show that radiation affects the ignition process by heating the relatively low temperature regions. Consequently, each ignition phase is affected differently by different spectral treatments. During the initial kernel phase, spectral properties have minimal influence on flame structures and the ignition delay time due to the negligible radiation and optically-thin scenario. However, during the flame growth phase, significant differences appear in the flame structure and the flame propagation speed among different spectral treatments. After the flame fills the combustor and during the stable combustion phase, differences in flame structures calculated by different models diminish, but radiation still plays an important role in combustion. Therefore, high-fidelity spectral models are recommended when modelling the ignition process in gas turbines.

Submitted 12 April, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

arXiv:2403.01570 [pdf, other]

SERVAL: Synergy Learning between Vertical Models and LLMs towards Oracle-Level Zero-shot Medical Prediction

Authors: Jiahuan Yan, **tai Chen, Chaowen Hu, Bo Zheng, Yaojun Hu, Jimeng Sun, Jian Wu

Abstract: Recent development of large language models (LLMs) has exhibited impressive zero-shot proficiency on generic and common sense questions. However, LLMs' application on domain-specific vertical questions still lags behind, primarily due to hallucination problems and deficiencies in vertical knowledge. Furthermore, the vertical data annotation process often requires labor-intensive expert involvement, thereby presenting an additional challenge in enhancing the model's vertical capabilities. In this paper, we propose SERVAL, a synergy learning pipeline designed for unsupervised development of vertical capabilities in both LLMs and small models by mutual enhancement. Specifically, SERVAL utilizes the LLM's zero-shot outputs as annotations, leveraging its confidence to teach a robust vertical model from scratch. Conversely, the trained vertical model guides the LLM fine-tuning to enhance its zero-shot capability, progressively improving both models through an iterative process. In the medical domain, known for complex vertical knowledge and costly annotations, comprehensive experiments show that, without access to any gold labels, SERVAL with the synergy learning of OpenAI GPT-3.5 and a simple model attains fully-supervised competitive performance across ten widely used medical datasets.
These datasets represent vertically specialized medical diagnostic scenarios (e.g., diabetes, heart diseases, COVID-19), highlighting the potential of SERVAL in refining the vertical capabilities of LLMs and training vertical models from scratch, all achieved without the need for annotations.

Submitted 16 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

arXiv:2402.19401 [pdf, other]

Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance

Authors: Huakun Shen, Boyue Caroline Hu, Krzysztof Czarnecki, Lina Marsso, Marsha Chechik

Abstract: While Neural Networks (NNs) have surpassed human accuracy in image classification on ImageNet, they often lack robustness against image corruption, i.e., corruption robustness. Yet such robustness is seemingly effortless for human perception. In this paper, we propose visually-continuous corruption robustness (VCR) -- an extension of corruption robustness to allow assessing it over the wide and continuous range of changes that correspond to the human perceptive quality (i.e., from the original image to the full distortion of all perceived visual information), along with two novel human-aware metrics for NN evaluation. To compare VCR of NNs with human perception, we conducted extensive experiments on 14 commonly used image corruptions with 7,718 human participants and state-of-the-art robust NN models with different training objectives (e.g., standard, adversarial, corruption robustness), different architectures (e.g., convolution NNs, vision transformers), and different amounts of training data augmentation. Our study showed that: 1) assessing robustness against continuous corruption can reveal insufficient robustness undetected by existing benchmarks; as a result, 2) the gap between NN and human robustness is larger than previously known; and finally, 3) some image corruptions have a similar impact on human perception, offering opportunities for more cost-effective robustness assessments. Our validation set with 14 image corruptions, human robustness data, and the evaluation code is provided as a toolbox and a benchmark.

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.18191 [pdf, other]

Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation

Authors: Yuan Ge, Yilun Liu, Chi Hu, Weibin Meng, Shimin Tao, Xiaofeng Zhao, Hongxia Ma, Li Zhang, Hao Yang, Tong Xiao

Abstract: With contributions from the open-source community, a vast amount of instruction tuning (IT) data has emerged. Given the significant resource allocation required by training and evaluating models, it is advantageous to have an efficient method for selecting high-quality IT data. However, existing methods for instruction data selection have limitations such as relying on fragile external APIs, being affected by biases in GPT models, or reducing the diversity of the selected instruction dataset. In this paper, we propose an industrial-friendly, expert-aligned and diversity-preserved instruction data selection method: Clustering and Ranking (CaR). CaR consists of two steps. The first step involves ranking instruction pairs using a scoring model that is well aligned with expert preferences (achieving an accuracy of 84.25%). The second step involves preserving dataset diversity through a clustering process. In our experiment, CaR selected a subset containing only 1.96% of Alpaca's IT data, yet the underlying AlpaCaR model trained on this subset outperforms Alpaca by an average of 32.1% in GPT-4 evaluations. Furthermore, our method utilizes small models (355M parameters) and requires only 11.2% of the monetary cost compared to existing methods, making it easily deployable in industrial scenarios.

Submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.14499 [pdf, other]

"My Answer is C": First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models

Authors: Xinpeng Wang, Bolei Ma, Chengzhi Hu, Leon Weber-Genzel, Paul Röttger, Frauke Kreuter, Dirk Hovy, Barbara Plank

Abstract: The open-ended nature of language generation makes the evaluation of autoregressive large language models (LLMs) challenging. One common evaluation approach uses multiple-choice questions (MCQ) to limit the response space. The model is then evaluated by ranking the candidate answers by the log probability of the first token prediction. However, first tokens may not consistently reflect the final response output, due to the model's diverse response styles such as starting with "Sure" or refusing to answer. Consequently, MCQ evaluation is not indicative of model behaviour when interacting with users. But by how much? We evaluate how aligned first-token evaluation is with the text output along several dimensions, namely final option choice, refusal rate, choice distribution and robustness under prompt perturbation. Our results show that the two approaches are severely misaligned on all dimensions, reaching mismatch rates over 60%. Models heavily fine-tuned on conversational or safety data are especially impacted. Crucially, models remain misaligned even when we increasingly constrain prompts, i.e., force them to start with an option letter or example template. Our findings i) underscore the importance of inspecting the text output as well and ii) caution against relying solely on first-token evaluation.
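Illustrative note (not part of the arXiv record): the comparison described in this abstract, first-token ranking versus the option actually stated in the generated text, can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' evaluation code. The function names, the regex-based answer extraction, and the toy examples are all hypothetical; in particular, the regex would misread a response that begins with the article "A".

```python
import re


def first_token_choice(option_logprobs):
    """Pick the option letter whose first-token log probability is highest."""
    return max(option_logprobs, key=option_logprobs.get)


def text_choice(response, options="ABCD"):
    """Return the first standalone option letter in the text, or None.

    None stands in for responses with no recognizable choice (e.g. refusals).
    This extraction rule is a simplifying assumption.
    """
    match = re.search(rf"\b([{options}])\b", response)
    return match.group(1) if match else None


def mismatch_rate(examples):
    """Fraction of examples where the two evaluation styles disagree."""
    mismatches = sum(
        first_token_choice(logprobs) != text_choice(response)
        for logprobs, response in examples
    )
    return mismatches / len(examples)


# Hypothetical toy data: (first-token log probabilities, generated text).
examples = [
    ({"A": -0.2, "B": -1.5, "C": -2.0, "D": -3.0}, "Sure! My answer is C."),
    ({"A": -3.0, "B": -0.1, "C": -2.5, "D": -4.0}, "B"),
    ({"A": -0.5, "B": -1.0, "C": -1.2, "D": -2.0}, "I cannot answer that."),
]
rate = mismatch_rate(examples)
```

In the toy data, the first and third examples disagree (a "Sure" opener and a refusal), so two of the three examples count as mismatches.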

Submitted 4 July, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

Comments: ACL 2024 Findings

arXiv:2402.10987 [pdf, other]

WilKE: Wise-Layer Knowledge Editor for Lifelong Knowledge Editing

Authors: Chenhui Hu, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao

Abstract: Knowledge editing aims to rectify inaccuracies in large language models (LLMs) without costly retraining for outdated or erroneous knowledge. However, current knowledge editing methods primarily focus on single editing, failing to meet the requirements for lifelong editing. This study reveals a performance degradation encountered by knowledge editing in lifelong editing, characterized by toxicity buildup and toxicity flash, with the primary cause identified as pattern unmatch. We introduce a knowledge editing approach named Wise-Layer Knowledge Editor (WilKE), which selects the editing layer based on the pattern matching degree of editing knowledge across different layers in language models. Experimental results demonstrate that, in lifelong editing, WilKE exhibits an average improvement of 46.2% and 67.8% on editing GPT2-XL and GPT-J relative to state-of-the-art knowledge editing methods.

Submitted 5 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: To be published in ACL Findings 2024

arXiv:2402.10476 [pdf, other]

Spike-EVPR: Deep Spiking Residual Network with Cross-Representation Aggregation for Event-Based Visual Place Recognition

Authors: Chenming Hu, Zheng Fang, Kuanxu Hou, Delei Kong, Junjie Jiang, Hao Zhuang, Mingyuan Sun, Xinjie Huang

Abstract: Event cameras have been successfully applied to visual place recognition (VPR) tasks by using deep artificial neural networks (ANNs) in recent years. However, previously proposed deep ANN architectures are often unable to harness the abundant temporal information presented in event streams. In contrast, deep spiking networks exhibit more intricate spatiotemporal dynamics and are inherently well-suited to process sparse asynchronous event streams. Unfortunately, directly inputting temporal-dense event volumes into the spiking network introduces excessive time steps, resulting in prohibitively high training costs for large-scale VPR tasks. To address the aforementioned issues, we propose a novel deep spiking network architecture called Spike-EVPR for event-based VPR tasks. First, we introduce two novel event representations tailored for SNN to fully exploit the spatio-temporal information from the event streams, and reduce the video memory occupation during training as much as possible. Then, to exploit the full potential of these two representations, we construct a Bifurcated Spike Residual Encoder (BSR-Encoder) with powerful representational capabilities to better extract the high-level features from the two event representations. Next, we introduce a Shared & Specific Descriptor Extractor (SSD-Extractor). This module is designed to extract features shared between the two representations and features specific to each. Finally, we propose a Cross-Descriptor Aggregation Module (CDA-Module) that fuses the above three features to generate a refined, robust global descriptor of the scene. Our experimental results indicate the superior performance of our Spike-EVPR compared to several existing EVPR pipelines on Brisbane-Event-VPR and DDD20 datasets, with the average Recall@1 increased by 7.61% on Brisbane and 13.20% on DDD20.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 14 pages, 10 figures

arXiv:2402.09291 [pdf, other]

doi 10.1038/s41586-023-07012-5

Rapid spin changes around a magnetar fast radio burst

Authors: Chin-Ping Hu, Takuto Narita, Teruaki Enoto, George Younes, Zorawar Wadiasingh, Matthew G. Baring, Wynn C. G. Ho, Sebastien Guillot, Paul S. Ray, Tolga Guver, Kaustubh Rajwade, Zaven Arzoumanian, Chryssa Kouveliotou, Alice K. Harding, Keith C. Gendreau

Abstract: Magnetars are neutron stars with extremely high magnetic fields that exhibit various X-ray phenomena such as sporadic sub-second bursts, long-term persistent flux enhancements, and variable rates of rotation period change. In 2020, a fast radio burst (FRB), akin to cosmological millisecond-duration radio bursts, was detected from the Galactic magnetar SGR 1935+2154, confirming the long-suspected association between some FRBs and magnetars. However, the mechanism for FRB generation in magnetars remains unclear. Here we report the X-ray discovery of an unprecedented double glitch in SGR 1935+2154 within a time interval of approximately nine hours, bracketing an FRB that occurred on October 14, 2022. Each glitch involved a significant increase in the magnetar's spin frequency, being among the largest abrupt changes in neutron star rotation ever observed. Between the glitches, the magnetar exhibited a rapid spin-down phase, accompanied by a profound increase and subsequent decline in its persistent X-ray emission and burst rate. We postulate that a strong, ephemeral, magnetospheric wind provides the torque that rapidly slows the star's rotation. The trigger for the first glitch couples the star's crust to its magnetosphere, enhances the various X-ray signals, and spawns the wind that alters magnetospheric conditions that might produce the FRB.

Submitted 14 February, 2024; originally announced February 2024.

Comments: 46 pages, 9 figures, 4 tables, a submitted version of Nature 626, 500 (https://www.nature.com/articles/s41586-023-07012-5)

arXiv:2402.05733 [pdf, other]

TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation

Authors: Yikai Zhang, Siyu Yuan, Caiyu Hu, Kyle Richardson, Yanghua Xiao, Jiangjie Chen

Abstract: Despite remarkable advancements in emulating human-like behavior through Large Language Models (LLMs), current textual simulations do not adequately address the notion of time. To this end, we introduce TimeArena, a novel textual simulated environment that incorporates complex temporal dynamics and constraints that better reflect real-life planning scenarios. In TimeArena, agents are asked to complete multiple tasks as soon as possible, allowing for parallel processing to save time. We implement the dependency between actions, the time duration for each action, and the occupancy of the agent and the objects in the environment. TimeArena is grounded in 30 real-world tasks in cooking, household activities, and laboratory work. We conduct extensive experiments with various state-of-the-art LLMs using TimeArena. Our findings reveal that even the most powerful models, e.g., GPT-4, still lag behind humans in effective multitasking, underscoring the need for enhanced temporal awareness in the development of language agents.

Submitted 8 February, 2024; originally announced February 2024.

Comments: Work in progress

arXiv:2402.00320 [pdf]

DARCS: Memory-Efficient Deep Compressed Sensing Reconstruction for Acceleration of 3D Whole-Heart Coronary MR Angiography

Authors: Zhihao Xue, Fan Yang, Juan Gao, Zhuo Chen, Hao Peng, Chao Zou, Hang Jin, Chenxi Hu

Abstract: Three-dimensional coronary magnetic resonance angiography (CMRA) demands reconstruction algorithms that can significantly suppress the artifacts from a heavily undersampled acquisition. While unrolling-based deep reconstruction methods have achieved state-of-the-art performance on 2D image reconstruction, their application to 3D reconstruction is hindered by the large amount of memory needed to train an unrolled network. In this study, we propose a memory-efficient deep compressed sensing method by employing a sparsifying transform based on a pre-trained artifact estimation network. The motivation is that the artifact image estimated by a well-trained network is sparse when the input image is artifact-free, and less sparse when the input image is artifact-affected. Thus, the artifact-estimation network can be used as an inherent sparsifying transform. The proposed method, named De-Aliasing Regularization based Compressed Sensing (DARCS), was compared with a traditional compressed sensing method, de-aliasing generative adversarial network (DAGAN), model-based deep learning (MoDL), and plug-and-play for accelerations of 3D CMRA. The results demonstrate that the proposed method improved the reconstruction quality relative to the compared methods by a large margin. Furthermore, the proposed method generalized well to different undersampling rates and noise levels. The memory usage of the proposed method was only 63% of that needed by MoDL. In conclusion, the proposed method achieves improved reconstruction quality for 3D CMRA with reduced memory burden.

Submitted 2 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

Comments: 10 pages, 8 figures

arXiv:2401.16758 [pdf, other]

doi 10.1093/mnras/stae965

Similarity to earthquakes again: periodic radio pulses of the magnetar SGR 1935+2154 are accompanied by aftershocks like fast radio bursts

Authors: Yuya Tsuzuki, Tomonori Totani, Chin-Ping Hu, Teruaki Enoto

Abstract: It was recently discovered that the time correlations of repeating fast radio bursts (FRBs) are similar to earthquake aftershocks. Motivated by the association between FRBs and magnetars, here we report correlation function analyses in the time-energy space for the 563 periodic radio pulses and the 579 X-ray short bursts from the magnetar SGR 1935+2154, which is known to have generated FRBs. Although radio pulses are concentrated near the fixed phase of the rotational cycle, we find that when multiple pulses occur within a single cycle, their correlation properties (aftershock production probability, aftershock rate decaying in power of time, and more) are similar to those of extragalactic FRBs and earthquakes. A possible interpretation is that the radio pulses are produced by rupture of the neutron star crust, and the first pulse within one cycle is triggered by external force periodically exerted on the crust. The periodic external force may be from the interaction of the magnetosphere with material ejected in an outburst. For X-ray bursts, we found no significant correlation signal, though correlation on the same time scale as radio pulses may be hidden due to the long event duration. The aftershock similarity between the periodic radio pulsation and FRBs is surprising, given that the two are energetically very different, and therefore the energy sources would be different. This suggests that the essence of FRB-like phenomena is starquakes, regardless of the energy source, and it is important to search for FRB-like bursts from neutron stars with various properties or environments.

Submitted 9 April, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

Comments: 9 pages, 7 figures. Accepted by MNRAS

arXiv:2401.16723 [pdf, other]

Improving Business Insurance Loss Models by Leveraging InsurTech Innovation

Authors: Zhiyu Quan, Changyue Hu, Panyi Dong, Emiliano A. Valdez

Abstract: Recent transformative and disruptive advancements in the insurance industry have embraced various InsurTech innovations. In particular, with the rapid progress in data science and computational capabilities, InsurTech is able to integrate a multitude of emerging data sources, shedding light on opportunities to enhance risk classification and claims management. This paper presents a groundbreaking effort as we combine real-life proprietary insurance claims information together with InsurTech data to enhance the loss model, a fundamental component of insurance companies' risk management. Our study further utilizes various machine learning techniques to quantify the predictive improvement of the InsurTech-enhanced loss model over that of the insurance in-house. The quantification process provides a deeper understanding of the value of the InsurTech innovation and advocates potential risk factors that are unexplored in traditional insurance loss modeling. This study represents a successful undertaking of an academic-industry collaboration, suggesting an inspiring path for future partnerships between industry and academic institutions.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.16102 [pdf, other]

Flexible Parallel Neural Network Architecture Model for Early Prediction of Lithium Battery Life

Authors: Lidang Jiang, Zhuoxiang Li, Changyan Hu, Qingsong Huang, Ge He

Abstract: The early prediction of battery life (EPBL) is vital for enhancing the efficiency and extending the lifespan of lithium batteries. Traditional models with fixed architectures often encounter underfitting or overfitting issues due to the diverse data distributions in different EPBL tasks. An interpretable deep learning model of flexible parallel neural network (FPNN) is proposed, which includes an InceptionBlock, a 3D convolutional neural network (CNN), a 2D CNN, and a dual-stream network. The proposed model effectively extracts electrochemical features from video-like formatted data using the 3D CNN and achieves advanced multi-scale feature abstraction through the InceptionBlock. The FPNN can adaptively adjust the number of InceptionBlocks to flexibly handle tasks of varying complexity in EPBL. The test on the MIT dataset shows that the FPNN model achieves outstanding predictive accuracy in EPBL tasks, with MAPEs of 2.47%, 1.29%, 1.08%, and 0.88% when the input cyclic data volumes are 10, 20, 30, and 40, respectively. The interpretability of the FPNN is mainly reflected in its flexible unit structure and parameter selection: its diverse branching structure enables the model to capture features at different scales, thus allowing the machine to learn informative features. The approach presented herein provides an accurate, adaptable, and comprehensible solution for early life prediction of lithium batteries, opening new possibilities in the field of battery health monitoring.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.15703 [pdf, other]

A Bayesian multivariate extreme value mixture model

Authors: Chenglei Hu, Ben Swallow, Daniela Castro-Camilo

Abstract: Impact assessment of natural hazards requires the consideration of both extreme and non-extreme events. Extensive research has been conducted on the joint modeling of bulk and tail in univariate settings; however, the corresponding body of research in the context of multivariate analysis is comparatively scant. This study extends the univariate joint modeling of bulk and tail to the multivariate framework. Specifically, it pertains to cases where multivariate observations exceed a high threshold in at least one component. We propose a multivariate mixture model that assumes a parametric model to capture the bulk of the distribution, which is in the max-domain of attraction (MDA) of a multivariate extreme value distribution (mGEVD). The tail is described by the multivariate generalized Pareto distribution, which is asymptotically justified to model multivariate threshold exceedances. We show that if all components exceed the threshold, our mixture model is in the MDA of an mGEVD. Bayesian inference based on multivariate random-walk Metropolis-Hastings and the automated factor slice sampler allows us to incorporate uncertainty from the threshold selection easily. Due to computational limitations, simulations and data applications are provided for dimension $d=2$, but a discussion is provided with views toward scalability based on pairwise likelihood.

Submitted 28 January, 2024; originally announced January 2024.

Comments: 34 pages, 7 figures

arXiv:2401.14699 [pdf, other]

Quantum Oscillations Measurement of the Heavy Electron Mass near the van Hove Singularity in a Kagome Metal

Authors: Elliott Rosenberg, Jonathan DeStefano, Yongbin Lee, Chaowei Hu, Yue Shi, David Graf, Shermane M. Benjamin, Liqin Ke, Jiun-Haw Chu

Abstract: Kagome metals with the Fermi energy tuned near the van Hove singularities (vHss) have been shown to host exotic phases including unconventional superconductivity and a chiral flux phase arising from a charge density wave. However, most quantum oscillation studies of the electronic structure of kagome metals focus on compounds which electronically or magnetically order, obscuring the unperturbed vHs. Here we present quantum oscillation measurements of YV$_6$Sn$_6$ which contains a pristine kagome lattice free from long range order. We discovered quantum oscillations corresponding to a large orbit ($\approx$70% of the Brillouin Zone area) with the heaviest mass ever observed in vanadium based kagome metals ($\approx3.3 m_e$), consistent with a Fermi pocket whose Fermi level is near the vHs. Comparing with first principles calculations suggests that the effective mass of this pocket is highly sensitive to the position of Fermi level. Our study establishes the enhanced density of states associated with a vHs in a kagome metal, allowing further insight into a potential driving mechanism for the unconventional electronic orderings in this class of materials.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.11500 [pdf, other]

Integration of Large Language Models in Control of EHD Pumps for Precise Color Synthesis

Authors: Yanhong Peng, Ceng Zhang, Chenlong Hu, Zebing Mao

Abstract: This paper presents an innovative approach to integrating Large Language Models (LLMs) with Arduino-controlled Electrohydrodynamic (EHD) pumps for precise color synthesis in automation systems. We propose a novel framework that employs fine-tuned LLMs to interpret natural language commands and convert them into specific operational instructions for EHD pump control. This approach aims to enhance user interaction with complex hardware systems, making it more intuitive and efficient. The methodology involves four key steps: fine-tuning the language model with a dataset of color specifications and corresponding Arduino code, developing a natural language processing interface, translating user inputs into executable Arduino code, and controlling EHD pumps for accurate color mixing. Conceptual experiment results, based on theoretical assumptions, indicate a high potential for accurate color synthesis, efficient language model interpretation, and reliable EHD pump operation. This research extends the application of LLMs beyond text-based tasks, demonstrating their potential in industrial automation and control systems. While highlighting the limitations and the need for real-world testing, this study opens new avenues for AI applications in physical system control and sets a foundation for future advancements in AI-driven automation technologies.

Submitted 21 January, 2024; originally announced January 2024.

arXiv:2401.11181 [pdf, other]

Inference without Interference: Disaggregate LLM Inference for Mixed Downstream Workloads

Authors: Cunchen Hu, Heyang Huang, Liangliang Xu, Xusheng Chen, Jiang Xu, Shuang Chen, Hao Feng, Chenxi Wang, Sa Wang, Yungang Bao, Ninghui Sun, Yizhou Shan

Abstract: Transformer-based large language model (LLM) inference serving is now the backbone of many cloud services. LLM inference consists of a prefill phase and a decode phase. However, existing LLM deployment practices often overlook the distinct characteristics of these phases, leading to significant interference. To mitigate interference, our insight is to carefully schedule and group inference requests based on their characteristics. We realize this idea in TetriInfer through three pillars. First, it partitions prompts into fixed-size chunks so that the accelerator always runs close to its computation-saturated limit. Second, it disaggregates prefill and decode instances so each can run independently. Finally, it uses a smart two-level scheduling algorithm augmented with predicted resource usage to avoid decode scheduling hotspots. Results show that TetriInfer improves time-to-first-token (TTFT), job completion time (JCT), and inference efficiency in terms of performance per dollar by a large margin, e.g., it uses 38% less resources all the while lowering average TTFT and average JCT by 97% and 47%, respectively.
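Illustrative note (not part of the arXiv record): the first pillar named in this abstract, partitioning prompts into fixed-size chunks, can be sketched as a simple split of a token sequence. This is a minimal sketch of the general idea only; the function name and the chunk size are hypothetical, and nothing here reflects how TetriInfer itself schedules or batches chunks.

```python
def chunk_prompt(token_ids, chunk_size):
    """Split a prompt's token ids into consecutive fixed-size chunks.

    The final chunk may be shorter when the prompt length is not a
    multiple of chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        token_ids[start:start + chunk_size]
        for start in range(0, len(token_ids), chunk_size)
    ]


# Hypothetical 10-token prompt split into chunks of 4 tokens.
chunks = chunk_prompt(list(range(10)), 4)
```

With these inputs the prompt yields two full chunks and one shorter trailing chunk; a real serving system would typically pad or merge such remainders, which this sketch does not attempt.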

Submitted 20 January, 2024; originally announced January 2024.

arXiv:2401.00283 [pdf, other]

Near-Space Communications: the Last Piece of 6G Space-Air-Ground-Sea Integrated Network Puzzle

Authors: Hongshan Liu, Tong Qin, Zhen Gao, Tianqi Mao, Keke Ying, Ziwei Wan, Li Qiao, Rui Na, Zhongxiang Li, Chun Hu, Yikun Mei, Tuan Li, Guanghui Wen, Lei Chen, Zhonghuai Wu, Ruiqi Liu, Gaojie Chen, Shuo Wang, Dezhi Zheng

Abstract: This article presents a comprehensive study on the emerging near-space communications (NS-COM) within the context of the space-air-ground-sea integrated network (SAGSIN). Specifically, we first explore the recent technical developments of NS-COM, followed by a discussion of the motivations behind integrating NS-COM into SAGSIN. To further demonstrate the necessity of NS-COM, a comparative analysis between the NS-COM network and other counterparts in SAGSIN is conducted, covering aspects of deployment, coverage, channel characteristics and unique problems of the NS-COM network. Afterwards, the technical aspects of NS-COM, including channel modeling, random access, channel estimation, array-based beam management and joint network optimization, are examined in detail. Furthermore, we explore the potential applications of NS-COM, such as structural expansion in SAGSIN communication, civil aviation communication, remote and urgent communication, weather monitoring and carbon neutrality. Finally, some promising research avenues are identified, including stratospheric satellite (StratoSat)-to-ground direct links for mobile terminals, reconfigurable multiple-input multiple-output (MIMO) and holographic MIMO, federated learning in NS-COM networks, maritime communication, electromagnetic spectrum sensing and adversarial game, integrated sensing and communications, StratoSat-based radar detection and imaging, NS-COM assisted enhanced global navigation system, NS-COM assisted intelligent unmanned system and free space optical (FSO) communication. Overall, this paper highlights that NS-COM plays an indispensable role in the SAGSIN puzzle, providing substantial performance and coverage enhancement to the traditional SAGSIN architecture.

Submitted 4 March, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

Comments: 28 pages, 8 figures, 2 tables

Showing 51–100 of 1,237 results for author: Hu, C