Search | arXiv e-print repository

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Authors: Pan Zhang, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Rui Qian, Lin Chen, Qipeng Guo, Haodong Duan, Bin Wang, Linke Ouyang, Songyang Zhang, Wenwei Zhang, Yining Li, Yang Gao, Peng Sun, Xinyue Zhang, Wei Li, Jingwen Li, Wenhai Wang, Hang Yan, Conghui He, Xingcheng Zhang, Kai Chen, Jifeng Dai, Yu Qiao, et al. (2 additional authors not shown)

Abstract: We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large-vision language model that supports long-contextual input and output. IXC-2.5 excels in various text-image comprehension and composition applications, achieving GPT-4V level capabilities with merely 7B LLM backend. Trained with 24K interleaved image-text contexts, it can seamlessly extend to 96K long contexts via RoPE extrapolation. This long-context capability allows IXC-2.5 to excel in tasks requiring extensive input and output contexts. Compared to its previous 2.0 version, InternLM-XComposer-2.5 features three major upgrades in vision-language comprehension: (1) Ultra-High Resolution Understanding, (2) Fine-Grained Video Understanding, and (3) Multi-Turn Multi-Image Dialogue. In addition to comprehension, IXC-2.5 extends to two compelling applications using extra LoRA parameters for text-image composition: (1) Crafting Webpages and (2) Composing High-Quality Text-Image Articles. IXC-2.5 has been evaluated on 28 benchmarks, outperforming existing open-source state-of-the-art models on 16 benchmarks. It also surpasses or competes closely with GPT-4V and Gemini Pro on 16 key tasks. The InternLM-XComposer-2.5 is publicly available at https://github.com/InternLM/InternLM-XComposer.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Technical Report. https://github.com/InternLM/InternLM-XComposer

arXiv:2407.03318 [pdf, other]

Fair Division of Indivisible Chores via Earning Restricted Equilibria

Authors: Jugal Garg, Aniket Murhekar, John Qin

Abstract: We study fair division of $m$ indivisible chores among $n$ agents with additive preferences. We consider the desirable fairness notions of envy-freeness up to any chore (EFX) and envy-freeness up to $k$ chores (EF$k$), alongside the efficiency notion of Pareto optimality (PO). We present the first constant approximations of these notions, showing the existence of: - 5-EFX allocations, which improve the best-known factor of $O(n^2)$-EFX. - 3-EFX and PO allocations for the special case of bivalued instances, which improve the best-known factor of $O(n)$-EFX without any efficiency guarantees. - 2-EF2 + PO allocations, which improve the best-known factor of EF$m$ + PO. A notable contribution of our work is the introduction of the novel concept of earning-restricted (ER) competitive equilibrium for fractional allocations, which limits agents' earnings from each chore. Technically, our work addresses two main challenges: proving the existence of an ER equilibrium and designing algorithms that leverage ER equilibria to achieve the above results. To tackle the first challenge, we formulate a linear complementarity problem (LCP) formulation that captures all ER equilibria and show that the classic complementary pivot algorithm on the LCP must terminate at an ER equilibrium. For the second challenge, we carefully set the earning limits and use properties of ER equilibria to design sophisticated procedures that involve swapping and merging bundles to meet the desired fairness and efficiency criteria.
We expect that the concept of ER equilibrium will be instrumental in deriving further results on related problems.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 54 pages

arXiv:2407.03316 [pdf, other]

An Upper Limit on the Photoproduction Cross Section of the Spin-Exotic $π_1(1600)$

Authors: F. Afzal, C. S. Akondi, M. Albrecht, M. Amaryan, S. Arrigo, V. Arroyave, A. Asaturyan, A. Austregesilo, Z. Baldwin, F. Barbosa, J. Barlow, E. Barriga, R. Barsotti, D. Barton, V. Baturin, V. V. Berdnikov, T. Black, W. Boeglin, M. Boer, W. J. Briscoe, T. Britton, S. Cao, E. Chudakov, G. Chung, P. L. Cole, et al. (124 additional authors not shown)

Abstract: The spin-exotic hybrid meson $π_{1}(1600)$ is predicted to have a large decay rate to the $ωππ$ final state. Using 76.6~pb$^{-1}$ of data collected with the GlueX detector, we measure the cross sections for the reactions $γp \to ωπ^+ π^- p$, $γp \to ωπ^0 π^0 p$, and $γp\toωπ^-π^0Δ^{++}$ in the range $E_γ=$ 8-10 GeV. Using isospin conservation, we set the first upper limits on the photoproduction cross sections of the $π^{0}_{1}(1600)$ and $π^{-}_{1}(1600)$. We combine these limits with lattice calculations of decay widths and find that photoproduction of $η'π$ is the most sensitive two-body system to search for the $π_1(1600)$.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 6 pages, 3 figures plus supplemental materials

arXiv:2407.03314 [pdf, other]

BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations

Authors: Zhantao Yang, Ruili Feng, Keyu Yan, Huangji Wang, Zhicai Wang, Shangwen Zhu, Han Zhang, Jie Xiao, **yu Wu, Kai Zhu, Jixuan Chen, Chen-Wei Xie, Chaojie Mao, Yue Yang, Hongyang Zhang, Yu Liu, Fan Cheng

Abstract: This paper presents Bag-of-Concept Graph (BACON) to gift models with limited linguistic abilities to taste the privilege of Vision Language Models (VLMs) and boost downstream tasks such as detection, visual question answering (VQA), and image generation. Since the visual scenes in physical worlds are structured with complex relations between objects, BACON breaks down annotations into basic minimum elements and presents them in a graph structure. Element-wise style enables easy understanding, and structural composition liberates difficult locating. Careful prompt design births the BACON captions with the help of public-available VLMs and segmentation methods. In this way, we gather a dataset with 100K annotated images, which endow VLMs with remarkable capabilities, such as accurately generating BACON, transforming prompts into BACON format, envisioning scenarios in the style of BACONr, and dynamically modifying elements within BACON through interactive dialogue and more. Wide representative experiments, including detection, VQA, and image generation tasks, tell BACON as a lifeline to achieve previous out-of-reach tasks or excel in their current cutting-edge solutions.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03311 [pdf, other]

Value-Penalized Auxiliary Control from Examples for Learning without Rewards or Demonstrations

Authors: Trevor Ablett, Bryan Chan, Jayce Haoran Wang, Jonathan Kelly

Abstract: Learning from examples of success is an appealing approach to reinforcement learning that eliminates many of the disadvantages of using hand-crafted reward functions or full expert-demonstration trajectories, both of which can be difficult to acquire, biased, or suboptimal. However, learning from examples alone dramatically increases the exploration challenge, especially for complex tasks. This work introduces value-penalized auxiliary control from examples (VPACE); we significantly improve exploration in example-based control by adding scheduled auxiliary control and examples of auxiliary tasks. Furthermore, we identify a value-calibration problem, where policy value estimates can exceed their theoretical limits based on successful data. We resolve this problem, which is exacerbated by learning auxiliary tasks, through the addition of an above-success-level value penalty. Across three simulated and one real robotic manipulation environment, and 21 different main tasks, we show that our approach substantially improves learning efficiency. Videos, code, and datasets are available at https://papers.starslab.ca/vpace.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Submitted to the Conference on Robot Learning (CoRL'24), Munich, Germany, Nov. 6-9, 2024

arXiv:2407.03298 [pdf, other]

Eyes on the Game: Deciphering Implicit Human Signals to Infer Human Proficiency, Trust, and Intent

Authors: Nikhil Hulle, Stéphane Aroca-Ouellette, Anthony J. Ries, Jake Brawer, Katharina von der Wense, Alessandro Roncone

Abstract: Effective collaboration between humans and AIs hinges on transparent communication and alignment of mental models. However, explicit, verbal communication is not always feasible. Under such circumstances, human-human teams often depend on implicit, nonverbal cues to glean important information about their teammates such as intent and expertise, thereby bolstering team alignment and adaptability. Among these implicit cues, two of the most salient and fundamental are a human's actions in the environment and their visual attention. In this paper, we present a novel method to combine eye gaze data and behavioral data, and evaluate their respective predictive power for human proficiency, trust, and intent. We first collect a dataset of paired eye gaze and gameplay data in the fast-paced collaborative "Overcooked" environment. We then train models on this dataset to compare how the predictive powers differ between gaze data, gameplay data, and their combination. We additionally compare our method to prior works that aggregate eye gaze data and demonstrate how these aggregation methods can substantially reduce the predictive ability of eye gaze. Our results indicate that, while eye gaze data and gameplay data excel in different situations, a model that integrates both types consistently outperforms all baselines. This work paves the way for developing intuitive and responsive agents that can efficiently adapt to new teammates.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 7 pages, 5 figures, To be published in The 33rd IEEE International Conference on Robot and Human Interactive Communication, IEEE RO-MAN 2024

arXiv:2407.03291 [pdf, other]

VCHAR: Variance-Driven Complex Human Activity Recognition framework with Generative Representation

Authors: Yuan Sun, Navid Salami Pargoo, Taqiya Ehsan, Zhao Zhang, Jorge Ortiz

Abstract: Complex human activity recognition (CHAR) remains a pivotal challenge within ubiquitous computing, especially in the context of smart environments. Existing studies typically require meticulous labeling of both atomic and complex activities, a task that is labor-intensive and prone to errors due to the scarcity and inaccuracies of available datasets. Most prior research has focused on datasets that either precisely label atomic activities or, at minimum, their sequence approaches that are often impractical in real world settings. In response, we introduce VCHAR (Variance-Driven Complex Human Activity Recognition), a novel framework that treats the outputs of atomic activities as a distribution over specified intervals. Leveraging generative methodologies, VCHAR elucidates the reasoning behind complex activity classifications through video-based explanations, accessible to users without prior machine learning expertise. Our evaluation across three publicly available datasets demonstrates that VCHAR enhances the accuracy of complex activity recognition without necessitating precise temporal or sequential labeling of atomic activities. Furthermore, user studies confirm that VCHAR's explanations are more intelligible compared to existing methods, facilitating a broader understanding of complex activity recognition among non-experts.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03290 [pdf, other]

doi 10.1103/PhysRevB.108.094111

Thermal and mechanical properties and the structural phase transition under pressure in $A$In$_2$As$_2$ ($A$ = Ca, Sr, Ba)

Authors: Wen-Ti Guo, Zhigao Huang, Jian-Min Zhang

Abstract: Experimental results that BaIn2As2 and Ca(Sr)In2As2, which are the same class of alkali metal compounds, belong to different structural phases have puzzled the current materials physics community. Here, we investigate the pressure-induced structural phase transition of AIn2As2 and its accompanying improvement in mechanical and thermal properties. Firstly, the structural stability of the materials and their structural phase transitions under pressure are characterized by enthalpy and double checking by phonon dispersion spectrum. We also confirm the structural phase transitions of the hexagonal and monoclinic phases from a group-theoretic point of view, associating their symmetry operations using transformation matrices. In terms of mechanical properties, we propose an effective scheme for pressure modulation of the anisotropy of AIn2As2 materials and to induce the transformation of AIn2As2 from isotropic to anisotropic (hexagonal) and from brittle to ductile (hexagonal and monoclinic). Meanwhile, we find the negative Poisson's ratio phenomenon under compression and tension, which is favorable for a wide range of applications of this series of materials in aerospace, medicine, sensors, etc. In terms of thermal properties, applying pressure will enhance the structural phase transition temperature of AIn2As2 materials to near room temperature. We further give direct evidence of phonon softening based on group velocity calculations and reveal that phonon softening prevents the heat capacity from reaching the Dulong-Petit limit.
Our study provides a theoretical basis for selecting stable structural phases and pioneering thermodynamic property studies of the thermoelectric topological candidate material AIn2As2.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 22 pages, 13 figures

Journal ref: Phys. Rev. B 108, 094111 (2023)

arXiv:2407.03287 [pdf, other]

Generic Complex Polynomial Vector Fields with Real Coefficients

Authors: Jonathan Godin, Christiane Rousseau

Abstract: The paper studies the complex 1-dimensional polynomial vector fields with real coefficients under topological orbital equivalence preserving the separatrices of the pole at infinity. The number of generic strata is determined, and a complete parametrization of these strata is given in terms of a modulus formed by a combinatorial and an analytic part. The bifurcation diagram is described for the degree 4. A realization theorem is proved for any generic modulus.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 24 pages, 9 figures

arXiv:2407.03286 [pdf, other]

Large Language Models for JSON Schema Discovery

Authors: Michael J. Mior

Abstract: Semi-structured data formats such as JSON have proved to be useful data models for applications that require flexibility in the format of data stored. However, JSON data often come without the schemas that are typically available with relational data. This has resulted in a number of tools for discovering schemas from a collection of data. Although such tools can be useful, existing approaches focus on the syntax of documents and ignore semantic information. In this work, we explore the automatic addition of meaningful semantic information to discovered schemas similar to information that is added by human schema authors. We leverage large language models and a corpus of manually authored JSON Schema documents to generate natural language descriptions of schema elements, meaningful names for reusable definitions, and identify which discovered properties are most useful and which can be considered "noise". Our approach performs well on existing metrics for text generation that have been previously shown to correlate well with human judgement.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03284 [pdf, other]

Aligning Planet-Hosting Binaries via Dissipative Precession in Circumstellar Disks

Authors: Konstantin Gerbig, Malena Rice, J. J. Zanazzi, Sam Christian, Andrew Vanderburg

Abstract: Recent observations have demonstrated that some subset of even moderately wide-separation planet-hosting binaries are preferentially configured such that planetary and binary orbits appear to lie within the same plane. In this work, we explore dissipation during the protoplanetary disk phase, induced by disk warping as the system is forced into nodal recession by an inclined binary companion as a possible avenue of achieving orbit-orbit alignment. We analytically model the coupled evolution of the disk angular momentum vector and stellar spin vector under the influence of a distant binary companion. We find that a population of systems with random initial orientations can appear detectably more aligned after undergoing dissipative precession, and that this process can simultaneously produce an obliquity distribution that is consistent with observations. While dissipative precession proceeds efficiently in close binaries, favorable system properties (e.g., $r_{out} \gtrsim 100$ AU, $α\gtrsim 0.05$, and/or $M_b/M_{*} \gtrsim 1$) are required to reproduce observed alignment trends at wider binary separations $a_\mathrm{b} \gtrsim450$ AU. Our framework further predicts that circum-primary planets in systems with high stellar mass ratios should be preferentially less aligned than planets in equal-mass stellar binary systems. We discover tentative evidence for this trend in \textit{Gaia} DR3 and TESS data.
Our findings suggest that dissipative precession may play a significant role in sculpting orbital configurations in a sub-set of moderately-wide planet-hosting binaries, but is likely not solely responsible for their observed population-level alignment.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 18 pages, 9 figures, accepted for publication in ApJ

arXiv:2407.03274 [pdf, other]

Using Photoplethysmography to Detect Real-time Blood Pressure Changes with a Calibration-free Deep Learning Model

Authors: **gyuan Hong, Manasi Nandi, Weiwei **, Jordi Alastruey

Abstract: Blood pressure (BP) changes are linked to individual health status in both clinical and non-clinical settings. This study developed a deep learning model to classify systolic (SBP), diastolic (DBP), and mean (MBP) BP changes using photoplethysmography (PPG) waveforms. Data from the Vital Signs Database (VitalDB) comprising 1,005 ICU patients with synchronized PPG and BP recordings was used. BP changes were categorized into three labels: Spike (increase above a threshold), Stable (change within a plus or minus threshold), and Dip (decrease below a threshold). Four time-series classification models were studied: multi-layer perceptron, convolutional neural network, residual network, and Encoder. A subset of 500 patients was randomly selected for training and validation, ensuring a uniform distribution across BP change labels. Two test datasets were compiled: Test-I (n=500) with a uniform distribution selection process, and Test-II (n=5) without. The study also explored the impact of including second-deviation PPG (sdPPG) waveforms as additional input information. The Encoder model with a Softmax weighting process using both PPG and sdPPG waveforms achieved the highest detection accuracy--exceeding 71.3% and 85.4% in Test-I and Test-II, respectively, with thresholds of 30 mmHg for SBP, 15 mmHg for DBP, and 20 mmHg for MBP. Corresponding F1-scores were over 71.8% and 88.5%. These findings confirm that PPG waveforms are effective for real-time monitoring of BP changes in ICU settings and suggest potential for broader applications.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 8 pages, 5 figures, 7 tables, 1 supplementary material

arXiv:2407.03270 [pdf, other]

Lattices, Gates, and Curves: GKP codes as a Rosetta stone

Authors: Jonathan Conrad, Ansgar G. Burchards, Steven T. Flammia

Abstract: Gottesman-Kitaev-Preskill (GKP) codes are a promising candidate for implementing fault tolerant quantum computation in quantum harmonic oscillator systems such as superconducting resonators, optical photons and trapped ions, and in recent years theoretical and experimental evidence for their utility has steadily grown. It is known that logical Clifford operations on GKP codes can be implemented fault tolerantly using only Gaussian operations, and several theoretical investigations have illuminated their general structure. In this work, we explain how GKP Clifford gates arise as symplectic automorphisms of the corresponding GKP lattice and show how they are identified with the mapping class group of suitable genus $n$ surfaces. This correspondence introduces a topological interpretation of fault tolerance for GKP codes and motivates the connection between GKP codes (lattices), their Clifford gates, and algebraic curves, which we explore in depth. For a single-mode GKP code, we identify the space of all GKP codes with the moduli space of elliptic curves, given by the three sphere with a trefoil knot removed, and explain how logical degrees of freedom arise from the choice of a level structure on the corresponding curves. We discuss how the implementation of Clifford gates corresponds to homotopically nontrivial loops on the space of all GKP codes and show that the modular Rademacher function describes a topological invariant for certain Clifford gates implemented by such loops.
Finally, we construct a universal family of GKP codes and show how it gives rise to an explicit construction of fiber bundle fault tolerance as proposed by Gottesman and Zhang for the GKP code. On our path towards understanding this correspondence, we introduce a general algebraic geometric perspective on GKP codes and their moduli spaces, which uncovers a map towards many possible routes of future research.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 31 pages, 14 figures, comments welcome!

arXiv:2407.03267 [pdf]

Insulator-to-Metal Transition and Isotropic Gigantic Magnetoresistance in Layered Magnetic Semiconductors

Authors: Gokul Acharya, Bimal Neupane, Chia-Hsiu Hsu, Xian P. Yang, David Graf, Eun Sang Choi, Krishna Pandey, Md Rafique Un Nabi, Santosh Karki Chhetri, Rabindra Basnet, Sumaya Rahman, Jian Wang, Zhengxin Hu, Bo Da, Hugh Churchill, Guoqing Chang, M. Zahid Hasan, Yuanxi Wang, ** Hu

Abstract: Magnetotransport, the response of electrical conduction to external magnetic field, acts as an important tool to reveal fundamental concepts behind exotic phenomena and plays a key role in enabling spintronic applications. Magnetotransport is generally sensitive to magnetic field orientations. In contrast, efficient and isotropic modulation of electronic transport, which is useful in technology applications such as omnidirectional sensing, is rarely seen, especially for pristine crystals. Here we propose a strategy to realize extremely strong modulation of electron conduction by magnetic field which is independent of field direction. GdPS, a layered antiferromagnetic semiconductor with resistivity anisotropies, supports a field-driven insulator-to-metal transition with a paradoxically isotropic gigantic negative magnetoresistance insensitive to magnetic field orientations. This isotropic magnetoresistance originates from the combined effects of a near-zero spin-orbit coupling of Gd3+-based half-filling f-electron system and the strong on-site f-d exchange coupling in Gd atoms. Our results not only provide a novel material system with extraordinary magnetotransport that offers a missing block for antiferromagnet-based ultrafast and efficient spintronic devices, but also demonstrate the key ingredients for designing magnetic materials with desired transport properties for advanced functionalities.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 44 pages, 18 figures

arXiv:2407.03266 [pdf, other]

Do Quantum Neural Networks have Simplicity Bias?

Authors: Jessica Pointing

Abstract: One hypothesis for the success of deep neural networks (DNNs) is that they are highly expressive, which enables them to be applied to many problems, and they have a strong inductive bias towards solutions that are simple, known as simplicity bias, which allows them to generalise well on unseen data because most real-world data is structured (i.e. simple). In this work, we explore the inductive bias and expressivity of quantum neural networks (QNNs), which gives us a way to compare their performance to those of DNNs. Our results show that it is possible to have simplicity bias with certain QNNs, but we prove that this type of QNN limits the expressivity of the QNN. We also show that it is possible to have QNNs with high expressivity, but they either have no inductive bias or a poor inductive bias and result in a worse generalisation performance compared to DNNs. We demonstrate that an artificial (restricted) inductive bias can be produced by intentionally restricting the expressivity of a QNN. Our results suggest a bias-expressivity tradeoff. Our conclusion is that the QNNs we studied can not generally offer an advantage over DNNs, because these QNNs either have a poor inductive bias or poor expressivity compared to DNNs.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 9 pages, 42 pages with appendices

arXiv:2407.03249 [pdf, other]

Quantum coarsening and collective dynamics on a programmable quantum simulator

Authors: Tom Manovitz, Sophie H. Li, Sepehr Ebadi, Rhine Samajdar, Alexandra A. Geim, Simon J. Evered, Dolev Bluvstein, Hengyun Zhou, Nazli Uğur Köylüoğlu, Johannes Feldmeier, Pavel E. Dolgirev, Nishad Maskara, Marcin Kalinowski, Subir Sachdev, David A. Huse, Markus Greiner, Vladan Vuletić, Mikhail D. Lukin

Abstract: Understanding the collective quantum dynamics of nonequilibrium many-body systems is an outstanding challenge in quantum science. In particular, dynamics driven by quantum fluctuations are important for the formation of exotic quantum phases of matter \cite{altman2023quantum}, fundamental high-energy processes \cite{bauer2023highenergy}, quantum metrology \cite{degen2017sensing, li2023scrambling}, and quantum algorithms \cite{ebadi2022quantum}. Here, we use a programmable quantum simulator based on Rydberg atom arrays to experimentally study collective dynamics across a (2+1)D Ising quantum phase transition. After crossing the quantum critical point, we observe a gradual growth of correlations through coarsening of antiferromagnetically ordered domains~\cite{Samajdar2024}. By deterministically preparing and following the evolution of ordered domains, we show that the coarsening is driven by the curvature of domain boundaries, and find that the dynamics accelerate with proximity to the quantum critical point. We quantitatively explore these phenomena and further observe long-lived oscillations of the order parameter, corresponding to an amplitude (Higgs) mode \cite{pekker2015amplitude}. These observations offer a unique viewpoint into emergent collective dynamics in strongly correlated quantum systems and nonequilibrium quantum processes.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 25 pages, 14 figures

arXiv:2407.03247 [pdf, other]

Bridging Model Heterogeneity in Federated Learning via Uncertainty-based Asymmetrical Reciprocity Learning

Authors: Jiaqi Wang, Chenxu Zhao, Lingjuan Lyu, Quanzeng You, Mengdi Huai, Fenglong Ma

Abstract: This paper presents FedType, a simple yet pioneering framework designed to fill research gaps in heterogeneous model aggregation within federated learning (FL). FedType introduces small identical proxy models for clients, serving as agents for information exchange, ensuring model security, and achieving efficient communication simultaneously. To transfer knowledge between large private and small proxy models on clients, we propose a novel uncertainty-based asymmetrical reciprocity learning method, eliminating the need for any public data. Comprehensive experiments conducted on benchmark datasets demonstrate the efficacy and generalization ability of FedType across diverse settings. Our approach redefines federated learning paradigms by bridging model heterogeneity, eliminating reliance on public data, prioritizing client privacy, and reducing communication costs.

Submitted 3 July, 2024; originally announced July 2024.

Comments: This paper has been accepted by ICML 2024

arXiv:2407.03245 [pdf, other]

TieBot: Learning to Knot a Tie from Visual Demonstration through a Real-to-Sim-to-Real Approach

Authors: Weikun Peng, Jun Lv, Yuwei Zeng, Haonan Chen, Siheng Zhao, Jicheng Sun, Cewu Lu, Lin Shao

Abstract: The tie-knotting task is highly challenging due to the tie's high deformation and long-horizon manipulation actions. This work presents TieBot, a Real-to-Sim-to-Real learning-from-visual-demonstration system that lets robots learn to knot a tie. We introduce the Hierarchical Feature Matching approach to estimate a sequence of tie meshes from the demonstration video. With these estimated meshes used as subgoals, we first learn a teacher policy using privileged information. Then, we learn a student policy with point cloud observation by imitating the teacher policy. Lastly, our pipeline learns a residual policy when the learned policy is applied to real-world execution, mitigating the Sim2Real gap. We demonstrate the effectiveness of TieBot in simulation and the real world. In the real-world experiment, a dual-arm robot successfully knots a tie, achieving a 50% success rate over 10 trials. Videos can be found on our website: https://tiebots.github.io/.

Submitted 3 July, 2024; originally announced July 2024.

Comments: initial commit

arXiv:2407.03243 [pdf, other]

Visual Grounding with Attention-Driven Constraint Balancing

Authors: Weitai Kang, Luowei Zhou, Junyi Wu, Changchang Sun, Yan Yan

Abstract: Unlike Object Detection, the Visual Grounding task necessitates the detection of an object described by complex free-form language. To simultaneously model such complex semantic and visual representations, recent state-of-the-art studies adopt transformer-based models to fuse features from both modalities, further introducing various modules that modulate visual features to align with the language expressions and eliminate irrelevant, redundant information. However, their loss function, still adopting common Object Detection losses, solely governs the bounding box regression output, failing to fully optimize for the above objectives. To tackle this problem, in this paper, we first analyze the attention mechanisms of transformer-based models. Building upon this, we further propose a novel framework named Attention-Driven Constraint Balancing (AttBalance) to optimize the behavior of visual features within language-relevant regions. Extensive experimental results show that our method brings impressive improvements. Specifically, we achieve constant improvements over five different models evaluated on four different benchmarks. Moreover, we attain a new state-of-the-art performance by integrating our method into QRNet.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03241 [pdf, other]

Terrain Classification Enhanced with Uncertainty for Space Exploration Robots from Proprioceptive Data

Authors: Mariela De Lucas Álvarez, Jichen Guo, Raul Domínguez, Matias Valdenegro-Toro

Abstract: Terrain Classification is an essential task in space exploration, where unpredictable environments are difficult to observe using only exteroceptive sensors such as vision. Implementing Neural Network classifiers can have high performance but can be deemed untrustworthy as they lack transparency, which makes them unreliable for taking high-stakes decisions during mission planning. We address this by proposing Neural Networks with Uncertainty Quantification in Terrain Classification. We enable our Neural Networks with Monte Carlo Dropout, DropConnect, and Flipout in time series-capable architectures using only proprioceptive data as input. We use Bayesian Optimization with Hyperband for efficient hyperparameter optimization to find optimal models for trustworthy terrain classification.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 6 pages, 4 figures. LatinX in AI Workshop @ ICML 2023 Camera Ready

arXiv:2407.03235 [pdf, other]

Programming universal unitary transformations on a general-purpose silicon photonics platform

Authors: Jose Roberto Rausell-Campo, Daniel Pérez-López, José Capmany Francoy

Abstract: General-purpose programmable photonic processors provide a versatile platform for integrating diverse functionalities on a single chip. Leveraging a two-dimensional hexagonal waveguide mesh of Mach-Zehnder interferometers, these systems have demonstrated significant potential in microwave photonics applications. Additionally, they are a promising platform for creating unitary linear transformations, which are key elements in quantum computing and photonic neural networks. However, a general procedure for implementing these transformations on such systems has not been established yet. This work demonstrates the programming of universal unitary transformations on a general-purpose programmable photonic circuit with a hexagonal topology. We detail the steps to split the light on-chip, demonstrate that an equivalent structure to the Mach-Zehnder interferometer with one internal and one external phase shifter can be built in the hexagonal mesh, and program both the triangular and rectangular architectures for matrix multiplication. We recalibrate the system to account for passive phase deviations. Experimental programming of 3x3 and 4x4 random unitary matrices yields fidelities > 98% and bit precisions over 5 bits. To the best of our knowledge, this is the first time that random unitary matrices have been demonstrated on a general-purpose photonic processor, and it paves the way for the implementation of programmable photonic circuits in optical computing and signal processing systems.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03231 [pdf]

doi 10.1021/acs.nanolett.4c01536

Dimensionality Engineering of Magnetic Anisotropy from Anomalous Hall Effect in Synthetic SrRuO3 Crystals

Authors: Seung Gyo Jeong, Seong Won Cho, Sehwan Song, ** Young Oh, Do Gyeom Jeong, Gyeongtak Han, Hu Young Jeong, Ahmed Yousef Mohamed, Woo-suk Noh, Sungkyun Park, Jong Seok Lee, Suyoun Lee, Young-Min Kim, Deok-Yong Cho, Woo Seok Choi

Abstract: Magnetic anisotropy in atomically thin correlated heterostructures is essential for exploring quantum magnetic phases for next-generation spintronics. Whereas previous studies have mostly focused on van der Waals systems, here, we investigate the impact of dimensionality of epitaxially-grown correlated oxides down to the monolayer limit on structural, magnetic, and orbital anisotropies. By designing oxide superlattices with correlated ferromagnetic SrRuO3 and nonmagnetic SrTiO3 layers, we observe modulated ferromagnetic behavior with the change of the SrRuO3 thickness. In particular, for three-unit-cell-thick layers, we observe a significant 1,500% improvement of coercive field in the anomalous Hall effect, which cannot be solely attributed to the dimensional crossover in ferromagnetism. The atomic-scale heterostructures further reveal the systematic modulation of anisotropy for the lattice structure and orbital hybridization, explaining the enhanced magnetic anisotropy. Our findings provide valuable insights into engineering the anisotropic hybridization of synthetic magnetic crystals, offering a tunable spin order for various applications.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 23 pages

Journal ref: published 2024

arXiv:2407.03227 [pdf, other]

Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning

Authors: Zhili Shen, Pavlos Vougiouklis, Chenxin Diao, Kaustubh Vyas, Yuanyi Ji, Jeff Z. Pan

Abstract: We focus on Text-to-SQL semantic parsing from the perspective of Large Language Models. Motivated by challenges related to the size of commercial database schemata and the deployability of business intelligence solutions, we propose an approach that dynamically retrieves input database information and uses abstract syntax trees to select few-shot examples for in-context learning. Furthermore, we investigate the extent to which an in-parallel semantic parser can be leveraged for generating approximated versions of the expected SQL queries, to support our retrieval. We take this approach to the extreme: we adapt a model consisting of fewer than 500M parameters to act as an extremely efficient approximator, enhancing it with the ability to process schemata in a parallelised manner. We apply our approach to monolingual and cross-lingual benchmarks for semantic parsing, showing improvements over state-of-the-art baselines. Comprehensive experiments highlight the contribution of modules involved in this retrieval-augmented generation setting, revealing interesting directions for future work.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03221 [pdf, other]

Modelling the BOSS void-galaxy cross-correlation function using a neural-network emulator

Authors: Tristan S. Fraser, Enrique Paillas, Will J. Percival, Seshadri Nadathur, Slađana Radinović, Hans A. Winther

Abstract: We introduce an emulator-based method to model the cross-correlation between cosmological voids and galaxies. This allows us to model the effect of cosmology on void finding and on the shape of the void-galaxy cross-correlation function, improving on previous template-based methods. We train a neural network using the AbacusSummit simulation suite and fit to data from the Sloan Digital Sky Survey Baryon Oscillation Spectroscopic Survey sample. We recover information on the growth of structure through redshift-space distortions (RSD), and the geometry of the Universe through the Alcock-Paczyński (AP) effect, measuring $\Omega_{\rm m} = 0.330\pm 0.020$ and $\sigma_8 = 0.777^{+0.047}_{-0.062}$ for a $\Lambda{\rm CDM}$ cosmology. Comparing to results from a template-based method, we find that fitting the shape of the void-galaxy cross-correlation function provides more information and leads to an improvement in constraining power. In contrast, we find that errors on the AP measurements were previously underestimated if void centres were assumed to have the same response to the AP effect as galaxies - a common simplification. Overall, we recover a $28\%$ reduction in errors for $\Omega_{\rm m}$ and similar errors on $\sigma_8$ with our new, more comprehensive, method. Given the statistical power of future surveys including DESI and Euclid, we expect the method presented to become the new baseline for the analysis of voids in these data.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 35 pages, 14 figures, 4 tables, submitted to JCAP

arXiv:2407.03218 [pdf, other]

Programmable Photonic Extreme Learning Machines

Authors: Jose Roberto Rausell-Campo, Antonio Hurtado, Daniel Pérez-López, José Capmany Francoy

Abstract: Photonic neural networks offer a promising alternative to traditional electronic systems for machine learning accelerators due to their low latency and energy efficiency. However, the challenge of implementing the backpropagation algorithm during training has limited their development. To address this, alternative machine learning schemes, such as extreme learning machines (ELMs), have been proposed. ELMs use a random hidden layer to increase the feature space dimensionality, requiring only the output layer to be trained through linear regression, thus reducing training complexity. Here, we experimentally demonstrate a programmable photonic extreme learning machine (PPELM) using a hexagonal waveguide mesh, which enables programming the input feature vector and the random hidden layer directly on chip. Our system also permits applying the nonlinearity directly on-chip by using the system's integrated photodetecting elements. Using the PPELM, we successfully solved three different complex classification tasks. Additionally, we propose and demonstrate two techniques to increase the accuracy of the models and reduce their variability using an evolutionary algorithm and a wavelength division multiplexing approach, obtaining excellent performance. Our results show that programmable photonic processors may become a feasible way to train competitive machine learning models on a versatile and compact platform.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03208 [pdf, other]

Randomized Implicitly Restarted Arnoldi method for the non-symmetric eigenvalue problem

Authors: Jean-Guillaume de Damas, Laura Grigori

Abstract: In this paper, we introduce a randomized algorithm for solving the non-symmetric eigenvalue problem, referred to as randomized Implicitly Restarted Arnoldi (rIRA). This method relies on using a sketch-orthogonal basis during the Arnoldi process while maintaining the Arnoldi relation and exploiting a restarting scheme to focus on a specific part of the spectrum. We analyze this method and show that it retains useful properties of the Implicitly Restarted Arnoldi (IRA) method, such as restarting without adding errors to the Ritz pairs and implicitly applying polynomial filtering. Experiments are presented to validate the numerical efficiency of the proposed randomized eigenvalue solver.

Submitted 3 July, 2024; originally announced July 2024.

MSC Class: 65F10; 65F15; 65F25; 15B52

arXiv:2407.03203 [pdf, other]

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

Authors: Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang

Abstract: Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs. Similar methods have shown promising results in code generation. However, most modern LLMs exhibit suboptimal performance due to the scarcity of aligned NL and Formal Language (FL) theorem-proving data. This scarcity results in a paucity of methodologies for training LLMs and techniques to fully utilize their capabilities in composing formal proofs. To address these challenges, this paper proposes TheoremLlama, an end-to-end framework to train a general-purpose LLM to become a Lean4 expert. This framework encompasses NL-FL aligned dataset generation methods, training approaches for the LLM formal theorem prover, and techniques for LLM Lean4 proof writing. Using the dataset generation method, we provide Open Bootstrapped Theorems (OBT), an NL-FL aligned and bootstrapped dataset. A key innovation in this framework is the NL-FL bootstrapping method, where NL proofs are integrated into Lean4 code for training datasets, leveraging the NL reasoning ability of LLMs for formal reasoning. The TheoremLlama framework achieves cumulative accuracies of 36.48% and 33.61% on MiniF2F-Valid and Test datasets respectively, surpassing the GPT-4 baseline of 22.95% and 25.41%. We have also open-sourced our model checkpoints and generated dataset, and will soon make all the code publicly available.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03202 [pdf, other]

Clifford Circuits Augmented Time-Dependent Variational Principle

Authors: Xiangjian Qian, Jiale Huang, Mingpu Qin

Abstract: The recently proposed Clifford Circuits Augmented Matrix Product States (CA-MPS) (arXiv:2405.09217) seamlessly augments Density Matrix Renormalization Group with Clifford circuits. In CA-MPS, the entanglement from stabilizers is transferred to the Clifford circuits which can be easily handled according to the Gottesman-Knill theorem. As a result, MPS needs only to deal with the non-stabilizer enta… process similar to that in DMRG, aiming at reducing the entanglement entropy in the MPS, and the Hamiltonian is transformed accordingly using the chosen Clifford circuits. As in CA-MPS, the Clifford circuits do not increase the number of terms in the Hamiltonian, which makes the overhead very small in the new method. We test this method in both the XXZ chain and the two-dimensional Heisenberg model. The results show that the Clifford circuits augmented TDVP method can reduce the entanglement entropy in the time evolution process and hence makes the simulation reliable for longer times. The Clifford circuits augmented Time-Dependent Variational Principle provides a useful tool for the simulation of the time evolution of many-body systems in the future.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03201 [pdf, other]

doi 10.1038/s44306-024-00035-2

Wideband Coherent Microwave Conversion via Magnon Nonlinearity in Hybrid Quantum System

Authors: Jiahao Wu, Jiacheng Liu, Zheyu Ren, Man Yin Leung, Wai Kuen Leung, Kin On Ho, Xiangrong Wang, Qiming Shao, Sen Yang

Abstract: Frequency conversion is a widely realized physical process in nonlinear systems of optics and electronics. As an emerging nonlinear platform, spintronic devices have the potential to achieve stronger frequency conversion. Here, we demonstrate a microwave frequency conversion method in a hybrid quantum system, integrating nitrogen-vacancy centers in diamond with magnetic thin film CoFeB. We achieve a conversion bandwidth ranging from 0.1 to 12 GHz, presenting up to 25th-order frequency conversion, and further display the application of this method for frequency detection and coherent control of qubits. Distinct from traditional frequency conversion techniques based on nonlinear electric response, our approach employs nonlinear magnetic response in spintronic devices. The nonlinearity, originating from symmetry breaking such as domain walls in magnetic films, indicates that our method can be adapted to hybrid systems of other spintronic devices and spin qubits, expanding the application scope of spintronic devices and providing a promising on-chip platform for coupling quantum systems.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 11 pages, 5 figures

Journal ref: npj Spintronics volume 2, Article number: 30 (2024)

arXiv:2407.03199 [pdf, other]

BOWIE-ALIGN: How formation and migration histories of giant planets impact atmospheric compositions

Authors: Anna B. T. Penzlin, Richard A. Booth, James Kirk, James E. Owen, Eva-Maria Ahrer, Duncan A. Christie, Alastair B. Claringbold, Emma Esparza-Borges, M. López-Morales, N. J. Mayne, Mason McCormack, Annabella Meech, Vatsal Panwar, Diana Powell, Denis E. Sergeev, Jake Taylor, Peter J. Wheatley, Maria Zamyatina

Abstract: Hot Jupiters present a unique opportunity for measuring how planet formation history shapes present-day atmospheric composition. However, due to the myriad pathways influencing composition, a well-constructed sample of planets is needed to determine whether formation history can be accurately traced back from atmospheric composition. To this end, the BOWIE-ALIGN survey will compare the compositions of 8 hot Jupiters around F stars, 4 with orbits aligned with the stellar rotation axis and 4 misaligned. Using the alignment as an indicator for planets that underwent disc migration or high-eccentricity migration, one can determine whether migration history produces notable differences in composition between the two samples of planets. This paper describes the planet formation model that motivates our observing programme. Our model traces the accretion of chemical components from the gas and dust in the disc over a broad parameter space to create a full, unbiased model sample from which we can estimate the range of final atmospheric compositions. For high metallicity atmospheres (O/H > 10 times solar), the C/O ratios of aligned and misaligned planets diverge, with aligned planets having lower C/O (< 0.25) due to the accretion of oxygen-rich silicates from the inner disc. However, silicates may rain out instead of releasing their oxygen into the atmosphere. This would significantly increase the C/O of aligned planets (C/O > 0.6), inverting the trend between the aligned and misaligned planets. Nevertheless, by comparing statistically significant samples of aligned and misaligned planets, we expect atmospheric composition to constrain how planets form.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 11 pages, 10 figures (appendix: 6 pages, 4 figures), submitted to MNRAS

arXiv:2407.03198 [pdf, other]

BOWIE-ALIGN: A JWST comparative survey of aligned vs misaligned hot Jupiters to test the dependence of atmospheric composition on migration history

Authors: James Kirk, Eva-Maria Ahrer, Anna B. T. Penzlin, James E. Owen, Richard A. Booth, Lili Alderson, Duncan A. Christie, Alastair B. Claringbold, Emma Esparza-Borges, Chloe E. Fisher, Mercedes López-Morales, N. J. Mayne, Mason McCormack, Annabella Meech, Vatsal Panwar, Diana Powell, Jake Taylor, Denis E. Sergeev, Daniel Valentine, Hannah R. Wakeford, Peter J. Wheatley, Maria Zamyatina

Abstract: A primary objective of exoplanet atmosphere characterisation is to learn about planet formation and evolution; however, this is challenged by degeneracies. To determine whether differences in atmospheric composition can be reliably traced to differences in evolution, we are undertaking a new survey with JWST to compare the compositions of a sample of hot Jupiters that orbit F stars above the Kraft break with different orbital alignments. Under the assumption that aligned planets migrate through the inner disc, while misaligned planets migrate after disc dispersal, the act of migrating through the inner disc should lead to a measurable difference in the C/O between aligned and misaligned planets. We expect the amplitude and sign of this difference to depend on the amount of planetesimal accretion and whether silicates accreted from the inner disc release their oxygen. Here, we identify all known exoplanets that are suitable for testing this hypothesis, describe our JWST survey, and use noise simulations and atmospheric retrievals to estimate our survey's sensitivity. With the selected sample of four aligned and four misaligned hot Jupiters, we will be sensitive to the predicted differences in C/O between aligned and misaligned hot Jupiters for a wide range of model scenarios.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 13 pages, 8 figures, submitted to RASTI

arXiv:2407.03194 [pdf, ps, other]

Prediction Instability in Machine Learning Ensembles

Authors: Jeremy Kedziora

Abstract: In machine learning ensembles, predictions from multiple models are aggregated. Despite widespread use and strong performance of ensembles in applied problems, little is known about the mathematical properties of aggregating models and associated consequences for safe, explainable use of such models. In this paper we prove a theorem that shows that any ensemble will exhibit at least one of the following forms of prediction instability. It will either ignore agreement among all underlying models, change its mind when none of the underlying models have done so, or be manipulable through inclusion or exclusion of options it would never actually predict. As a consequence, ensemble aggregation procedures will always need to balance the benefits of information use against the risk of these prediction instabilities. This analysis also sheds light on what specific forms of prediction instability to expect from particular ensemble algorithms; for example, popular tree ensembles like random forest or XGBoost will violate basic, intuitive monotonicity and fairness properties.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 15 pages, uses a modified version of ICML2024.sty

ACM Class: I.2.0

arXiv:2407.03192 [pdf, other]

CiteAssist: A System for Automated Preprint Citation and BibTeX Generation

Authors: Lars Benedikt Kaesberg, Terry Ruas, Jan Philip Wahle, Bela Gipp

Abstract: We present CiteAssist, a system to automate the generation of BibTeX entries for preprints, streamlining the process of bibliographic annotation. Our system extracts metadata, such as author names, titles, publication dates, and keywords, to create standardized annotations within the document. CiteAssist automatically attaches the BibTeX citation to the end of a PDF and links it on the first page of the document so other researchers gain immediate access to the correct citation of the article. This method promotes platform flexibility by ensuring that annotations remain accessible regardless of the repository used to publish or access the preprint. The annotations remain available even if the preprint is viewed externally to CiteAssist. Additionally, the system adds relevant related papers based on extracted keywords to the preprint, providing researchers with additional publications besides those in related work for further reading. Researchers can enhance their preprint organization and reference management workflows through a free and publicly available web interface.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Published at SDProc @ ACL 2024

arXiv:2407.03191 [pdf, other]

Controlling Plasmonic Catalysis via Strong Coupling with Electromagnetic Resonators

Authors: Jakub Fojt, Paul Erhart, Christian Schäfer

Abstract: Plasmonic excitations decay within femtoseconds, leaving non-thermal (often referred to as "hot") charge carriers behind that can be injected into molecular structures to trigger chemical reactions that are otherwise out of reach -- a process known as plasmonic catalysis. In this Letter, we demonstrate that strong coupling between resonator structures and plasmonic nanoparticles can be used to control the spectral overlap between the plasmonic excitation energy and the charge injection energy into nearby molecules. Our atomistic description couples real-time density-functional theory self-consistently to Maxwell's equations via the radiation-reaction potential. Control over the resonator provides then an additional knob for non-intrusively enhancing plasmonic catalysis and dynamically reacting to deterioration of the catalyst -- a new facet of modern catalysis.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03189 [pdf, other]

Origin of anomalous magnetotransport in kagome superconductors AV$_{3}$Sb$_{5}$ (A=K,Rb,Cs)

Authors: A. E. Koshelev, R. Chapai, D. Y. Chung, J. F. Mitchell, U. Welp

Abstract: Multiple anomalous features in electronic spectra of metals with kagome lattice structure -- van Hove singularities, Dirac points, and flat bands -- imply that materials containing this structural motif may lie at a nexus of topological and correlated electron physics. Due to the prospects of such exceptional electronic behavior, the recent discovery of superconductivity coexisting with charge-density wave (CDW) order in the layered kagome metals AV$_{3}$Sb$_{5}$ (A=K,Rb,Cs) has attracted considerable attention. Notably, these kagome metals express unconventional magnetotransport behavior, including a linear-in-H diagonal resistivity at low fields, and an even more peculiar, nonmonotonic sign-changing behavior of the Hall resistivity, which has been speculated to arise from a chiral CDW. We argue here that this unusual magnetotransport derives not from such unconventional phenomena, but rather from the unique fermiology of the AV$_{3}$Sb$_{5}$ materials. Specifically, it is caused by a large, concave hexagonal Fermi surface sheet formed in the close proximity to the van Hove singularities, which is backfolded into a small hexagonal sheet and two large triangular sheets in the CDW state. We introduce a model of the electronic structure of these Fermi surface sheets that allows for a full analytical treatment within Boltzmann kinetic theory and that enables semi-quantitative fits of our transport data. Specifically, we find that the anomalous magnetotransport behavior is caused by the confluence of strong reduction of the Fermi velocity near the van Hove singularities located near the vertices of the hexagonal sheet and sharp corners in Fermi surface generated by the CDW reconstruction.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 36 pages, 15 figures, Subm. Phys. Rev. B

arXiv:2407.03188 [pdf, other]

MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation

Authors: Zihao Wang, Haoxuan Liu, Jiaxing Yu, Tao Zhang, Yan Liu, Kejun Zhang

Abstract: Amid the rising intersection of generative AI and human artistic processes, this study probes the critical yet less-explored terrain of alignment in human-centric automatic song composition. We propose a novel task of Colloquial Description-to-Song Generation, which focuses on aligning the generated content with colloquial human expressions. This task is aimed at bridging the gap between colloquial language understanding and auditory expression within an AI model, with the ultimate goal of creating songs that accurately satisfy human auditory expectations and structurally align with musical norms. Current datasets are limited due to their narrow descriptive scope, semantic gaps and inaccuracies. To overcome data scarcity in this domain, we present the Caichong Music Dataset (CaiMD). CaiMD is manually annotated by both professional musicians and amateurs, offering diverse perspectives and a comprehensive understanding of colloquial descriptions. Unlike existing datasets pre-set with expert annotations or auto-generated ones with inherent biases, CaiMD caters more sufficiently to our purpose of aligning AI-generated music with widespread user-desired results. Moreover, we propose an innovative single-stage framework called MuDiT/MuSiT for enabling effective human-machine alignment in song creation. This framework not only achieves cross-modal comprehension between colloquial language and auditory music perceptions but also ensures generated songs align with user-desired results. MuDiT/MuSiT employs one DiT/SiT model for end-to-end generation of musical components like melody, harmony, rhythm, vocals, and instrumentation. The approach ensures harmonious sonic cohesiveness amongst all generated musical components, facilitating better resonance with human auditory expectations.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 19 pages, 5 figures

MSC Class: 68Txx(Primary)14F05; 91Fxx(Secondary) ACM Class: I.2.7; J.5

arXiv:2407.03177 [pdf, other]

EDPNet: An Efficient Dual Prototype Network for Motor Imagery EEG Decoding

Authors: Can Han, Chen Liu, Crystal Cai, Jun Wang, Dahong Qian

Abstract: Motor imagery electroencephalograph (MI-EEG) decoding plays a crucial role in developing motor imagery brain-computer interfaces (MI-BCIs). However, decoding intentions from MI remains challenging due to the inherent complexity of EEG signals relative to the small-sample size. In this paper, we propose an Efficient Dual Prototype Network (EDPNet) to enable accurate and fast MI decoding. EDPNet employs a lightweight adaptive spatial-spectral fusion module, which promotes more efficient information fusion between multiple EEG electrodes. Subsequently, a parameter-free multi-scale variance pooling module extracts more comprehensive temporal features. Furthermore, we introduce dual prototypical learning to optimize the feature space distribution and training process, thereby improving the model's generalization ability on small-sample MI datasets. Our experimental results show that the EDPNet outperforms state-of-the-art models with superior classification accuracy and kappa values (84.11% and 0.7881 for dataset BCI competition IV 2a, 86.65% and 0.7330 for dataset BCI competition IV 2b). Additionally, we use the BCI competition III IVa dataset with fewer training data to further validate the generalization ability of the proposed EDPNet. We also achieve superior performance with 82.03% classification accuracy. Benefiting from the lightweight parameters and superior decoding accuracy, our EDPNet shows great potential for MI-BCI applications. The code is publicly available at https://github.com/hancan16/EDPNet.

Submitted 3 July, 2024; originally announced July 2024.
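
Note on the EDPNet figures: the accuracy and kappa pairs quoted in the abstract above are consistent with the usual chance-corrected relation kappa = (p_o - p_e) / (1 - p_e). The sketch below is only an illustration of that arithmetic, not the authors' evaluation code. It assumes balanced classes, so that chance agreement is p_e = 1 / n_classes, with four classes for BCI competition IV 2a and two for 2b; the function name `kappa_from_accuracy` is hypothetical.

```python
def kappa_from_accuracy(accuracy: float, n_classes: int) -> float:
    """Cohen's kappa from overall accuracy, assuming balanced classes.

    Under that assumption the chance agreement p_e equals 1 / n_classes.
    """
    p_e = 1.0 / n_classes
    return (accuracy - p_e) / (1.0 - p_e)


# Accuracies as quoted in the abstract; class counts are an assumption
# stated in the lead-in (four-class 2a, two-class 2b).
kappa_2a = kappa_from_accuracy(0.8411, n_classes=4)
kappa_2b = kappa_from_accuracy(0.8665, n_classes=2)
print(f"{kappa_2a:.4f} {kappa_2b:.4f}")
```

With unbalanced classes or per-subject averaging the relation would only hold approximately, so the close match here is a sanity check rather than a derivation.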

arXiv:2407.03174 [pdf, other]

$ν_μ$ and $ν_τ$ elastic scattering in Borexino

Authors: Kevin J. Kelly, Nityasa Mishra, Mudit Rai, Louis E. Strigari

Abstract: We perform a detailed study of neutrino-electron elastic scattering using the mono-energetic $^{7}$Be neutrinos in Borexino, with an emphasis on exploring the differences between the contributions of $ν_e$, $ν_μ$, and $ν_τ$. We find that current data are capable of measuring these components such that the contributions from $ν_μ$ and $ν_τ$ cannot be zero, although distinguishing between them is challenging -- the differences stemming from Standard Model radiative corrections are insufficient without significantly more precise measurements. In studying these components, we compare predicted neutrino-electron scattering event rates within the Standard Model (accounting for neutrino oscillations), as well as going beyond the Standard Model in two ways. We allow for non-unitary evolution to modify neutrino oscillations, and find that with a larger exposure (${\sim}30$x), Borexino may provide relevant information for constraining non-unitarity, and that JUNO may be able to accomplish this with its data collection of $^{7}$Be neutrinos. We also consider novel $ν_μ$- and $ν_τ$-electron scattering from a gauged $U(1)_{L_μ- L_τ}$ model, showing consistency with previous analyses of Borexino and this scenario, but also demonstrating the impact of uncertainties on Standard Model mixing parameters on these results.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 12 pages, 5 figures

Report number: MI-HET-836

arXiv:2407.03168 [pdf, other]

LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control

Authors: Jianzhu Guo, Dingyun Zhang, Xiaoqiang Liu, Zhizhou Zhong, Yuan Zhang, Pengfei Wan, Di Zhang

Abstract: Portrait Animation aims to synthesize a lifelike video from a single source image, using it as an appearance reference, with motion (i.e., facial expressions and head pose) derived from a driving video, audio, text, or generation. Instead of following mainstream diffusion-based methods, we explore and extend the potential of the implicit-keypoint-based framework, which effectively balances computational efficiency and controllability. Building upon this, we develop a video-driven portrait animation framework named LivePortrait with a focus on better generalization, controllability, and efficiency for practical usage. To enhance the generation quality and generalization ability, we scale up the training data to about 69 million high-quality frames, adopt a mixed image-video training strategy, upgrade the network architecture, and design better motion transformation and optimization objectives. Additionally, we discover that compact implicit keypoints can effectively represent a kind of blendshapes and meticulously propose a stitching and two retargeting modules, which utilize a small MLP with negligible computational overhead, to enhance the controllability. Experimental results demonstrate the efficacy of our framework even compared to diffusion-based methods. The generation speed remarkably reaches 12.8ms on an RTX 4090 GPU with PyTorch. The inference code and models are available at https://github.com/KwaiVGI/LivePortrait

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03167 [pdf, other]

Tail calibration of probabilistic forecasts

Authors: Sam Allen, Jonathan Koh, Johan Segers, Johanna Ziegel

Abstract: Probabilistic forecasts comprehensively describe the uncertainty in the unknown future outcome, making them essential for decision making and risk management. While several methods have been introduced to evaluate probabilistic forecasts, existing evaluation techniques are ill-suited to the evaluation of tail properties of such forecasts. However, these tail properties are often of particular interest to forecast users due to the severe impacts caused by extreme outcomes. In this work, we introduce a general notion of tail calibration for probabilistic forecasts, which allows forecasters to assess the reliability of their predictions for extreme outcomes. We study the relationships between tail calibration and standard notions of forecast calibration, and discuss connections to peaks-over-threshold models in extreme value theory. Diagnostic tools are introduced and applied in a case study on European precipitation forecasts.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03166 [pdf, other]

Neutral Atomic Hydrogen Surveys: past, present and future

Authors: F. M. Maccagni, W. J. G. de Blok

Abstract: Neutral atomic hydrogen (HI) observations are fundamental to understand the dynamics of galaxies, their assembly, the fuelling of their star formation and environmental interactions. HI studies have so far been limited by the capabilities of single-dish radio telescopes or synthesis arrays to either small samples or low resolution and sensitivities. Now, the Square Kilometer Array precursors and pathfinders are providing a novel view of the HI in and around galaxies allowing wide-field high resolution deep surveys in nearby galaxies. We give an overview of past, current and future HI surveys consistently comparing their HI column density and spatial resolutions highlighting their main scientific key goals and results.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 5 pages, 3 figures, originally presented at 2024 URSI Atlantic Radio Science Meeting (AT-RASC), Gran Canaria, Spain

arXiv:2407.03163 [pdf, other]

Global Context Modeling in YOLOv8 for Pediatric Wrist Fracture Detection

Authors: Rui-Yang Ju, Chun-Tse Chien, Chia-Min Lin, Jen-Shiun Chiang

Abstract: Children often suffer wrist injuries in daily life, and radiologists usually need to analyze and interpret X-ray images of the fracture before surgeons perform surgical treatment. The development of deep learning has enabled neural network models to work as computer-assisted diagnosis (CAD) tools to help doctors and experts in diagnosis. Since the YOLOv8 models have achieved satisfactory success in object detection tasks, they have been applied to fracture detection. The Global Context (GC) block effectively models the global context in a lightweight way, and incorporating it into YOLOv8 can greatly improve the model performance. This paper proposes the YOLOv8+GC model for fracture detection, which is an improved version of the YOLOv8 model with the GC block. Experimental results demonstrate that compared to the original YOLOv8 model, the proposed YOLOv8-GC model increases the mean average precision calculated at intersection over union threshold of 0.5 (mAP 50) from 63.58% to 66.32% on the GRAZPEDWRI-DX dataset, achieving the state-of-the-art (SOTA) level. The implementation code for this work is available on GitHub at https://github.com/RuiyangJu/YOLOv8_Global_Context_Fracture_Detection.

Submitted 3 July, 2024; originally announced July 2024.
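
Note on the mAP 50 metric: the YOLOv8 abstract above reports mean average precision at an intersection-over-union (IoU) threshold of 0.5. As a minimal sketch of the matching criterion only (not the full mAP computation, and not the authors' code), the snippet below computes IoU for two axis-aligned boxes. The `(x1, y1, x2, y2)` corner format and the names `iou` and `is_match` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, truth_box, threshold=0.5):
    """A prediction counts as a hit when its IoU reaches the threshold."""
    return iou(pred_box, truth_box) >= threshold


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))       # overlap 1, union 7
print(is_match((0, 0, 2, 2), (0, 0, 2, 3)))  # IoU 4/6, above 0.5
```

Average precision is then the area under the precision-recall curve built from such hits per class, and mAP averages that over classes.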

arXiv:2407.03162 [pdf, other]

Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning

Authors: Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang

Abstract: Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate manipulations. We introduce Bunny-VisionPro, a real-time bimanual dexterous teleoperation system that leverages a VR headset. Unlike previous vision-based teleoperation systems, we design novel low-cost devices to provide haptic feedback to the operator, enhancing immersion. Our system prioritizes safety by incorporating collision and singularity avoidance while maintaining real-time performance through innovative designs. Bunny-VisionPro outperforms prior systems on a standard task suite, achieving higher success rates and reduced task completion times. Moreover, the high-quality teleoperation demonstrations improve downstream imitation learning performance, leading to better generalizability. Notably, Bunny-VisionPro enables imitation learning with challenging multi-stage, long-horizon dexterous manipulation tasks, which have rarely been addressed in previous work. Our system's ability to handle bimanual manipulations while prioritizing safety and real-time performance makes it a powerful tool for advancing dexterous manipulation and imitation learning.

Submitted 3 July, 2024; originally announced July 2024.

Comments: project page: https://dingry.github.io/projects/bunny_visionpro.html

arXiv:2407.03161 [pdf, other]

Simulating electron-vibron energy transfer with quantum dots and resonators

Authors: Cecilie Hermansen, Mara Caltapanides, Volker Meden, Jens Paaske

Abstract: Gateable semiconductor quantum dots (QDs) provide a versatile platform for analog quantum simulations of electronic many-body systems. In particular, QD arrays offer a natural representation of the interacting $π$-electron system of small hydrocarbons. Here we investigate the prospects for extending QD simulators to encompass also the nuclear degrees of freedom. We represent the molecular vibrational modes by single-mode microwave resonators coupled capacitively to the QDs and study the gate-tunable energy transfer from a voltage-biased triple quantum dot (TQD) system to a single damped resonator mode. We determine the QD population inversions, the corresponding charge and energy currents as well as the resonator photon number, using Lindblad master equations and lowest-order perturbation theory within Keldysh Green function formalism. Along the way, we discuss the merits and shortcomings of the two methods. A central result is the interrelation of a pronounced minimum in the charge current with a maximum in energy transfer, arising from a gate-tunable interference effect in the molecular orbitals of the TQD electron system.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 20 pages, 20 figures

Report number: NBI QDEV CMT 2024

arXiv:2407.03159 [pdf, other]

Protection Degree and Migration in the Stochastic SIRS Model: A Queueing System Perspective

Authors: Yuhan Li, Ziyan Zeng, Minyu Feng, Jürgen Kurths

Abstract: With the prevalence of COVID-19, the modeling of epidemic propagation and its analyses have played a significant role in controlling epidemics. However, individual behaviors, in particular the self-protection and migration, which have a strong influence on epidemic propagation, were always neglected in previous studies. In this paper, we mainly propose two models from the individual and population perspectives. In the first individual model, we introduce the individual protection degree that effectively suppresses the epidemic level as a stochastic variable to the SIRS model. In the alternative population model, an open Markov queueing network is constructed to investigate the individual number of each epidemic state, and we present an evolving population network via the migration of people. Besides, stochastic methods are applied to analyze both models. In various simulations, the infected probability, the number of individuals in each state and its limited distribution are demonstrated.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03157 [pdf, other]

Let the Code LLM Edit Itself When You Edit the Code

Authors: Zhenyu He, Jun Zhang, Shengjie Luo, Jingjing Xu, Zhi Zhang, Di He

Abstract: In this work, we investigate a typical scenario in code generation where a developer edits existing code in real time and requests a code assistant, e.g., a large language model, to re-predict the next token or next line on the fly. Naively, the LLM needs to re-encode the entire KV cache to provide an accurate prediction. However, this process is computationally expensive, especially when the sequence length is long. Simply encoding the edited subsequence and integrating it to the original KV cache meets the temporal confusion problem, leading to significantly worse performance. We address this efficiency and accuracy trade-off by introducing Positional Integrity Encoding (PIE). Building upon the rotary positional encoding, PIE first removes the rotary matrices in the Key cache that introduce temporal confusion and then reapplies the correct rotary matrices. This process ensures that positional relationships between tokens are correct and requires only a single round of matrix multiplication. We validate the effectiveness of PIE through extensive experiments on the RepoBench-C-8k dataset, utilizing DeepSeek-Coder models with 1.3B, 6.7B, and 33B parameters. Our evaluation includes three real-world coding tasks: code insertion, code deletion, and multi-place code editing. Results demonstrate that PIE reduces computational overhead by over 85% compared to the standard full recomputation approach across all model sizes and tasks while well approximating the model performance.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Preprint. Work in Progress
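
Note on the rotary step: the PIE abstract above says that removing the old rotary matrices from cached keys and reapplying the correct ones needs only a single round of matrix multiplication. A plausible reading is that two rotations compose into one rotation by the position offset. The sketch below illustrates just that identity on a toy key, treating each feature pair as a complex number. It is not the authors' implementation; the names `rope` and `reposition`, the base of 10000, and the toy values are assumptions for illustration.

```python
import cmath


def rope(pairs, pos, base=10000.0):
    """Rotate each complex feature pair by an angle proportional to pos.

    Pair j uses frequency base ** (-j / n), the usual rotary schedule
    when the vector holds n pairs (2n real dimensions).
    """
    n = len(pairs)
    return [z * cmath.exp(1j * pos * base ** (-j / n)) for j, z in enumerate(pairs)]


def reposition(rotated, old_pos, new_pos, base=10000.0):
    """Undo the rotation for old_pos and apply the one for new_pos.

    Because rotations compose additively, this is a single rotation
    by the offset new_pos - old_pos.
    """
    return rope(rotated, new_pos - old_pos, base)


key = [complex(1.0, 2.0), complex(-0.5, 0.25)]  # toy un-rotated key
cached = rope(key, pos=7)                # key as stored in the cache
shifted = reposition(cached, 7, 12)      # five tokens inserted before it
fresh = rope(key, pos=12)                # what re-encoding would store
max_err = max(abs(a - b) for a, b in zip(shifted, fresh))
print(max_err < 1e-9)
```

This covers only the positional rotation of a fixed key vector. In a real model the un-rotated keys in deeper layers also depend on the edited context, which is presumably why the abstract speaks of approximating, rather than matching, full recomputation.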

arXiv:2407.03156 [pdf, ps, other]

Function theory in the bfd-norm on an elliptical region

Authors: Jim Agler, Zinaida Lykova, Nicholas Young

Abstract: Let $E$ be the open region in the complex plane bounded by an ellipse. The B. and F. Delyon norm $\|\cdot\|_{\mathrm{bfd}}$ on the space $\mathrm{Hol}(E)$ of holomorphic functions on $E$ is defined by $$ \|f\|_{\mathrm{bfd}} \stackrel{\rm def}{=} \sup_{T\in \mathcal{F}_{\mathrm {bfd}}(E)}\|f(T)\|, $$ where $\mathcal{F}_{\mathrm {bfd}}(E)$ is the class of operators $T$ such that the closure of the numerical range of $T$ is contained in $E$. The name of the norm recognizes a celebrated theorem of the brothers Delyon, which implies that $\|\cdot\|_{\mathrm{bfd}}$ is equivalent to the supremum norm $\|\cdot\|_\infty$ on $\mathrm{Hol}(E)$. The purpose of this paper is to develop the theory of holomorphic functions of bfd-norm less than or equal to one on $E$. To do so we shall employ a remarkable connection between the bfd norm on $\mathrm{Hol}(E)$ and the supremum norm $\|\cdot\|_\infty$ on the space $\mathrm{H}^\infty(G)$ of bounded holomorphic functions on the symmetrized bidisc, the domain $G$ in $\mathbb{C}^2$ defined by \begin{align*} G & \stackrel{\rm def}{=} \{(z+w,zw): |z|<1, |w|<1\}. \end{align*} It transpires that there exists a holomorphic embedding $τ:E \to G$ having the property that, for any bounded holomorphic function $f$ on $E$, \[ \|f\|_{\mathrm{bfd}} = \inf\{\|F\|_\infty: F \in {\mathrm H}^\infty(G), F\circτ=f\}, \] and moreover, the infimum is attained at some $F \in \mathrm{H}^\infty(G)$. This result allows us to derive, for holomorphic functions of bfd-norm at most one on $E$, analogs of the well-known model and realization formulae for Schur-class functions. We also give a second derivation of these models and realizations, which exploits the Zhukovskii mapping from an annulus onto $E$.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 23 pages

MSC Class: 32A10; 47A12; 15A60; 47B99; 47N70

arXiv:2407.03154 [pdf, other]

Reinforcement Learning for Sequence Design Leveraging Protein Language Models

Authors: Jithendaraa Subramanian, Shivakanth Sujit, Niloy Irtisam, Umong Sain, Derek Nowrouzezahrai, Samira Ebrahimi Kahou, Riashat Islam

Abstract: Protein sequence design, determined by amino acid sequences, is essential to protein engineering problems in drug discovery. Prior approaches have resorted to evolutionary strategies or Monte-Carlo methods for protein design, but often fail to exploit the structure of the combinatorial search space, to generalize to unseen sequences. In the context of discrete black box optimization over large search spaces, learning a mutation policy to generate novel sequences with reinforcement learning is appealing. Recent advances in protein language models (PLMs) trained on large corpora of protein sequences offer a potential solution to this problem by scoring proteins according to their biological plausibility (such as the TM-score). In this work, we propose to use PLMs as a reward function to generate new sequences. Yet the PLM can be computationally expensive to query due to its large size. To this end, we propose an alternative paradigm where optimization can be performed on scores from a smaller proxy model that is periodically finetuned, jointly while learning the mutation policy. We perform extensive experiments on various sequence lengths to benchmark RL-based approaches, and provide comprehensive evaluations along biological plausibility and diversity of the protein. Our experimental results include favorable evaluations of the proposed sequences, along with high diversity scores, demonstrating that RL is a strong candidate for biological sequence design. Finally, we provide a modular open source implementation that can be easily integrated in most RL training loops, with support for replacing the reward model with other PLMs, to spur further research in this domain. The code for all experiments is provided in the supplementary material.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 22 pages, 7 figures, 4 tables

arXiv:2407.03151 [pdf, other]

Inverse stochastic resonance in adaptive small-world neural networks

Authors: Marius E. Yamakou, Jinjie Zhu, Erik A. Martens

Abstract: Inverse stochastic resonance (ISR) is a phenomenon where noise reduces rather than increases the firing rate of a neuron, sometimes leading to complete quiescence. ISR was first experimentally verified with cerebellar Purkinje neurons. These experiments showed that ISR enables optimal information transfer between the input and output spike train of neurons. Subsequent studies demonstrated the efficiency of information processing and transfer in neural networks with small-world topology. We conducted a numerical investigation into the impact of adaptivity on ISR in a small-world network of noisy FitzHugh-Nagumo (FHN) neurons, operating in a bistable regime with a stable fixed point and a limit cycle -- a prerequisite for ISR. Our results show that the degree of ISR is highly dependent on the FHN model's timescale separation parameter $ε$. The network structure undergoes dynamic adaptation via mechanisms of either spike-time-dependent plasticity (STDP) with potentiation-/depression-domination parameter $P$, or homeostatic structural plasticity (HSP) with rewiring frequency $F$. We demonstrate that both STDP and HSP amplify ISR when $ε$ lies within the bistability region of FHN neurons. Specifically, at larger values of $ε$ within the bistability regime, higher rewiring frequencies $F$ enhance ISR at intermediate (weak) synaptic noise intensities, while values of $P$ consistent with depression-domination (potentiation-domination) enhance (deteriorate) ISR. Moreover, although STDP and HSP parameters may jointly enhance ISR, $P$ has a greater impact on ISR compared to $F$. Our findings inform future ISR enhancement strategies in noisy artificial neural circuits, aiming to optimize information transfer between input and output spike trains in neuromorphic systems, and prompt venues for experiments in neural networks.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 16 pages, 59 references, 10 figures

arXiv:2407.03149

Finite Germ Extensions

Authors: James Belk, James Hyde, Francesco Matucci

Abstract: We prove finiteness properties for groups of homeomorphisms that have finitely many "singular points", and we describe the normal structure of such groups. As an application, we prove that every countable abelian group can be embedded into a finitely presented simple group, verifying the Boone-Higman conjecture for countable abelian groups. Indeed, we describe a specific 2-generated, $\mathrm{F}_\infty$ simple group $V\mathcal{A}$ of homeomorphisms of the Cantor set that contains every countable abelian group. As a second application, we prove that if $G$ is a bounded automata group then the associated Röver-Nekrashevych groups $V_{d,r}G$ have type $\mathrm{F}_\infty$, verifying a conjecture of Nekrashevych for a large class of contracting self-similar groups. Among others, this result applies to Röver-Nekrashevych groups associated to Gupta-Sidki groups and the basilica group.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 34 pages, no figures

MSC Class: 20F65; 20J05; 20E32; 20F10
