-
Integer programs with nearly totally unimodular matrices: the cographic case
Authors:
Manuel Aprile,
Samuel Fiorini,
Gwenaël Joret,
Stefan Kober,
Michał T. Seweryn,
Stefan Weltge,
Yelena Yuditsky
Abstract:
It is a notorious open question whether integer programs (IPs), with an integer coefficient matrix $M$ whose subdeterminants are all bounded by a constant $Δ$ in absolute value, can be solved in polynomial time. We answer this question in the affirmative if we further require that, by removing a constant number of rows and columns from $M$, one obtains a submatrix $A$ that is the transpose of a network matrix.
Our approach focuses on the case where $A$ arises from $M$ after removing $k$ rows only, where $k$ is a constant. We achieve our result in two main steps, the first related to the theory of IPs and the second related to graph minor theory.
First, we derive a strong proximity result for the case where $A$ is a general totally unimodular matrix: Given an optimal solution of the linear programming relaxation, an optimal solution to the IP can be obtained by finding a constant number of augmentations by circuits of $[A\; I]$.
Second, for the case where $A$ is the transpose of a network matrix, we reformulate the problem as a maximum constrained integer potential problem on a graph $G$. We observe that if $G$ is $2$-connected, then it has no rooted $K_{2,t}$-minor for $t = Ω(k Δ)$. We leverage this to obtain a tree-decomposition of $G$ into highly structured graphs for which we can solve the problem locally. This allows us to solve the global problem via dynamic programming.
Submitted 12 July, 2024;
originally announced July 2024.
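To make the bound $Δ$ in the abstract above concrete, here is a minimal sketch that computes the largest absolute subdeterminant of a small integer matrix by brute force. The function names and both example matrices are illustrative assumptions, not taken from the paper, and the enumeration is exponential, so it only suits tiny inputs.

```python
from itertools import combinations


def det_int(m):
    """Exact integer determinant by Laplace expansion (fine for tiny matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[0][j] == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det_int(minor)
    return total


def max_subdeterminant(m):
    """Largest |det| over all square submatrices of m (exponential time)."""
    rows, cols = len(m), len(m[0])
    best = 0
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[m[i][j] for j in c] for i in r]
                best = max(best, abs(det_int(sub)))
    return best


# Hypothetical example: node-arc incidence matrix of the directed
# triangle with arcs 0->1, 1->2, 0->2, which is totally unimodular.
incidence = [
    [1, 0, 1],
    [-1, 1, 0],
    [0, -1, -1],
]
# Appending one extra row gives a matrix that becomes totally
# unimodular again after removing k = 1 row.
extended = incidence + [[1, 1, 0]]

print(max_subdeterminant(incidence))
print(max_subdeterminant(extended))
```

The second matrix mirrors the setting of the abstract in miniature: it is not totally unimodular itself, but deleting a constant number of rows (here one) leaves a totally unimodular matrix.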
-
Galaxy Mergers in the Epoch of Reionization I: A JWST Study of Pair Fractions, Merger Rates, and Stellar Mass Accretion Rates at $z = 4.5-11.5$
Authors:
Qiao Duan,
Christopher J. Conselice,
Qiong Li,
Duncan Austin,
Thomas Harvey,
Nathan J. Adams,
Kenneth J. Duncan,
James Trussler,
Leonardo Ferreira,
Lewi Westcott,
Honor Harris,
Rogier A. Windhorst,
Benne W. Holwerda,
Thomas J. Broadhurst,
Dan Coe,
Seth H. Cohen,
Simon P. Driver,
Brenda Frye,
Norman A. Grogin,
Nimish P. Hathi,
Rolf A. Jansen,
Anton M. Koekemoer,
Madeline A. Marshall,
Mario Nonino,
Rafael Ortiz III
, et al. (7 additional authors not shown)
Abstract:
We present a full analysis of galaxy major merger pair fractions, merger rates, and mass accretion rates, thus uncovering the role of mergers in galaxy formation at the earliest previously unexplored epoch of $4.5<z<11.5$. We target galaxies with masses $\log_{10}(\mathrm{M}_*/\mathrm{M}_\odot) = 8.0 - 10.0$, utilizing data from eight JWST Cycle-1 fields (CEERS, JADES GOODS-S, NEP-TDF, NGDEEP, GLASS, El-Gordo, SMACS-0723, MACS-0416), covering an unmasked area of 189.36 $\mathrm{arcmin}^2$. We develop a new probabilistic pair-counting methodology that integrates full photometric redshift posteriors and corrects for detection incompleteness to quantify close pairs with physical projected separations between 20 and 50 kpc. Our analysis reveals an increase in pair fractions up to $z = 8$, reaching $0.211 \pm 0.065$, followed by a statistically flat evolution to $z = 11.5$. We find that the galaxy merger rate increases from the local Universe up to $z = 6$ and then stabilizes at a value of $\sim 6$ Gyr$^{-1}$ up to $z = 11.5$. We fit both a power-law and a power-law + exponential model to our pair fraction and merger rate redshift evolution, finding that the latter model describes the trends more accurately, particularly at $z = 8.0 - 11.5$. In addition, we measure that the average galaxy increases its stellar mass due to mergers by a factor of $2.77 \pm 0.99$ from redshift $z = 10.5$ to $z = 5.0$. Lastly, we investigate the impact of mergers on galaxy stellar mass growth, revealing that mergers contribute $71 \pm 25\%$ as much to galaxy stellar mass increases as star formation from gas. This indicates that mergers drive about half of galaxy assembly at high redshift.
Submitted 12 July, 2024;
originally announced July 2024.
-
Learning Coordinated Maneuver in Adversarial Environments
Authors:
Zechen Hu,
Manshi Limbu,
Daigo Shishika,
Xuesu Xiao,
Xuan Wang
Abstract:
This paper aims to solve the coordination of a team of robots traversing a route in the presence of adversaries with random positions. Our goal is to minimize the overall cost of the team, which is determined by (i) the accumulated risk when robots stay in adversary-impacted zones and (ii) the mission completion time. During traversal, robots can reduce their speed and act as a `guard' (the slower, the better), which decreases the risk that a given adversary incurs. This leads to a trade-off between the robots' guarding behaviors and their travel speeds. The formulated problem is highly non-convex and cannot be efficiently solved by existing algorithms. Our approach includes a theoretical analysis of the robots' behaviors for the single-adversary case. As the scale of the problem expands, finding the optimal solution using optimization approaches becomes challenging; therefore, we employ reinforcement learning techniques by developing new encoding and policy-generating methods. Simulations demonstrate that our learning methods can efficiently produce team coordination behaviors. We discuss the reasoning behind these behaviors and explain why they reduce the overall team cost.
Submitted 12 July, 2024;
originally announced July 2024.
-
Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures
Authors:
Sophia Sanborn,
Johan Mathe,
Mathilde Papillon,
Domas Buracas,
Hansen J Lillemark,
Christian Shewmake,
Abby Bertics,
Xavier Pennec,
Nina Miolane
Abstract:
The enduring legacy of Euclidean geometry underpins classical machine learning, which, for decades, has been primarily developed for data lying in Euclidean space. Yet, modern machine learning increasingly encounters richly structured data that is inherently non-Euclidean. This data can exhibit intricate geometric, topological and algebraic structure: from the geometry of the curvature of space-time, to topologically complex interactions between neurons in the brain, to the algebraic transformations describing symmetries of physical systems. Extracting knowledge from such non-Euclidean data necessitates a broader mathematical perspective. Echoing the 19th-century revolutions that gave rise to non-Euclidean geometry, an emerging line of research is redefining modern machine learning with non-Euclidean structures. Its goal: generalizing classical methods to unconventional data types with geometry, topology, and algebra. In this review, we provide an accessible gateway to this fast-growing field and propose a graphical taxonomy that integrates recent advances into an intuitive unified framework. We subsequently extract insights into current challenges and highlight exciting opportunities for future development in this field.
Submitted 12 July, 2024;
originally announced July 2024.
-
TRAVERSE: Traffic-Responsive Autonomous Vehicle Experience & Rare-event Simulation for Enhanced safety
Authors:
Sandeep Thalapanane,
Sandip Sharan Senthil Kumar,
Guru Nandhan Appiya Dilipkumar Peethambari,
Sourang SriHari,
Laura Zheng,
Julio Poveda,
Ming C. Lin
Abstract:
Data for training learning-enabled self-driving cars in the physical world are typically collected in a safe, normal environment. Such data distribution often engenders a strong bias towards safe driving, making self-driving cars unprepared when encountering adversarial scenarios like unexpected accidents. Due to a dearth of such adverse data that is unrealistic for drivers to collect, autonomous vehicles can perform poorly when experiencing such rare events. This work addresses much-needed research by having participants drive a VR vehicle simulator going through simulated traffic with various types of accidental scenarios. It aims to understand human responses and behaviors in simulated accidents, contributing to our understanding of driving dynamics and safety. The simulation framework adopts a robust traffic simulation and is rendered using the Unity Game Engine. Furthermore, the simulation framework is built with portable, light-weight immersive driving simulator hardware, lowering the resource barrier for studies in autonomous driving research.
Keywords: Rare Events, Traffic Simulation, Autonomous Driving, Virtual Reality, User Studies
Submitted 12 July, 2024;
originally announced July 2024.
-
Stochastic proof of the sharp symmetrized Talagrand inequality
Authors:
Thomas A. Courtade,
Max Fathi,
Dan Mikulincer
Abstract:
We give a new proof of the sharp symmetrized form of Talagrand's transport-entropy inequality. Compared to stochastic proofs of other Gaussian functional inequalities, the new idea here is a certain coupling induced by time-reversed martingale representations.
Submitted 12 July, 2024;
originally announced July 2024.
-
Symmetric Second-Harmonic Generation in Sub-wavelength Periodically Poled Thin Film Lithium Niobate
Authors:
Fengyan Yang,
Juanjuan Lu,
Mohan Shen,
Guangcanlan Yang,
Hong X. Tang
Abstract:
Second harmonic generation (SHG) extensively employs periodically poled nonlinear crystals through forward quasi-phase-matching to achieve efficient frequency conversion. As poling periods approach sub-micrometers, backward quasi-phase-matching has also been demonstrated, albeit by utilizing pulsed laser drives. The realization of symmetric second harmonic generation, characterized by counterpropagating pumps, however, has remained elusive despite theoretical predictions. The main challenge lies in achieving strong nonlinear coupling with a poling period below half the wavelength of the second-harmonic light. The recent emergence of high-quality ferroelectric lithium niobate thin films provides an opportunity for achieving precise domain control at submicron dimensions. In this article, we demonstrate reliable control of ferroelectric domains in a thin film lithium niobate waveguide with a poling period down to 370 nm, thereby realizing highly efficient continuous-wave pumped symmetric SHG. This demonstration not only validates the feasibility of achieving subwavelength periodic poling on waveguides but also opens new avenues for leveraging submicron ferroelectric domain structures in integrated photonics and nonlinear optics research.
Submitted 12 July, 2024;
originally announced July 2024.
-
Interactive Coding with Unbounded Noise
Authors:
Eden Fargion,
Ran Gelles,
Meghal Gupta
Abstract:
Interactive coding allows two parties to conduct a distributed computation despite noise corrupting a certain fraction of their communication. Dani et al. (Inf. and Comp., 2018) suggested a novel setting in which the amount of noise is unbounded and can significantly exceed the length of the (noise-free) computation. While no solution is possible in the worst case, under the restriction of oblivious noise, Dani et al. designed a coding scheme that succeeds with a polynomially small failure probability.
We revisit the question of conducting computations under this harsh type of noise and devise a computationally-efficient coding scheme that guarantees the success of the computation, except with an exponentially small probability. This higher degree of correctness matches the case of coding schemes with a bounded fraction of noise.
Our simulation of an $N$-bit noise-free computation in the presence of $T$ corruptions communicates an optimal number of $O(N+T)$ bits and succeeds with probability $1-2^{-Ω(N)}$. We design this coding scheme by introducing an intermediary noise model, where an oblivious adversary can choose the locations of corruptions in a worst-case manner, but the effect of each corruption is random: the noise either flips the transmission with some probability or otherwise erases it. This randomized abstraction turns out to be instrumental in achieving an optimal coding scheme.
Submitted 12 July, 2024;
originally announced July 2024.
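The intermediary noise model described in the abstract above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function name `apply_noise`, the flip probability of 0.5, the example message, and the corrupted positions are all hypothetical, and the paper's actual coding scheme is not reproduced here.

```python
import random


def apply_noise(bits, corrupted_positions, flip_prob, rng):
    """Return the received sequence; None marks an erased transmission."""
    received = list(bits)
    for i in corrupted_positions:
        if rng.random() < flip_prob:
            received[i] = 1 - bits[i]  # the corruption flips the bit
        else:
            received[i] = None         # otherwise the bit is erased
    return received


rng = random.Random(0)
sent = [1, 0, 1, 1, 0, 0, 1, 0]
# The oblivious adversary fixes the locations in advance,
# without seeing the transmitted bits (hypothetical choice).
positions = {1, 4, 6}
received = apply_noise(sent, positions, flip_prob=0.5, rng=rng)
print(received)
```

The adversary controls only where corruptions happen; whether a given corruption is a flip or an erasure is decided by the random draw.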
-
Spin-filter tunneling detection of antiferromagnetic resonance with electrically-tunable damping
Authors:
Thow Min Jerald Cham,
Daniel G. Chica,
Kenji Watanabe,
Takashi Taniguchi,
Xavier Roy,
Yunqiu Kelly Luo,
Daniel C. Ralph
Abstract:
Antiferromagnetic spintronics offers the potential for higher-frequency operations compared to ferromagnetic spintronics and improved insensitivity to magnetic fields. However, previous electrical techniques to detect antiferromagnetic dynamics have required millimeter-scale samples to achieve measurable signals. Here we demonstrate direct electrical detection of antiferromagnetic resonance in devices 1000 times smaller using spin-filter tunneling in micron-scale PtTe$_2$/bilayer CrSBr/graphite junctions in which the tunnel barrier is the van der Waals antiferromagnet CrSBr. This sample geometry allows not only efficient detection, but also electrical control of the antiferromagnetic resonance through spin-orbit torque from the PtTe$_2$ electrode. The ability to efficiently detect and control antiferromagnetic resonance provides the means to make detailed studies of the physics governing these high-frequency dynamics and to pursue applications including radiation sources, modulators, and detectors.
Submitted 12 July, 2024;
originally announced July 2024.
-
A Versatile Side Entry Laser System for Scanning Transmission Electron Microscopy
Authors:
Ondrej Dyck,
Olugbenga Olunloyo,
Kai Xiao,
Benjamin Wolf,
Thomas M. Moore,
Andrew R. Lupini,
Stephen Jesse
Abstract:
We present the design and implementation of a side entry laser system designed for an ultra-high vacuum scanning transmission electron microscope. This system uses a versatile probe design enclosed in a vacuum envelope such that parts can be easily aligned, modified, or exchanged without disturbing the vacuum. The system uses a mirror mounted on the sample holder such that the sample can be illuminated without being tilted. Notably, the mirror can be removed and replaced with an ablation target, and a higher power laser can be used to ablate material directly onto the sample. We argue that these new capabilities hold the potential to transform the electron microscope from an analysis tool towards a more flexible synthesis system, where atomic scale fabrication and atom-by-atom experiments can be performed.
Submitted 12 July, 2024;
originally announced July 2024.
-
GazeRace: Revolutionizing Remote Piloting with Eye-Gaze Control
Authors:
Issatay Tokmurziyev,
Valerii Serpiva,
Alexey Fedoseev,
Miguel Altamirano Cabrera,
Dzmitry Tsetserukou
Abstract:
This paper introduces the GazeRace method for drone navigation, employing a computer vision interface facilitated by eye-tracking technology. This interface is designed to be compatible with a single camera and uses a convolutional neural network to convert eye movements into control commands for the drone. Experimental validation demonstrates that users equipped with the eye-tracking interface achieve comparable performance to a traditional remote control interface when completing a drone racing task.
Ten participants completed flight tests in which they navigated a drone through a racing track in a Gazebo simulation environment. Users reduced drone trajectory length by 18% (73.44 m vs. 89.29 m) using the eye-tracking interface to navigate racing gates effectively. The time taken to complete the route using the eye-tracking method (average of 70.01 seconds) was only 3.5% slower than using the remote control method, indicating the good efficiency of the interface. It is also worth mentioning that four of the participants completed the race with an average time that was 25.9% faster than the other participants. In addition, users rated the performance highly (M = 34.0, SD = 14.2) and the frustration low (M = 30.5, SD = 9.2) with the eye-tracking interface, compared to performance (M = 63.0, SD = 10.1) and frustration (M = 49.0, SD = 11.7) with the baseline remote controller. The hedonic quality (M = 1.65, SD = 0.45) was also rated highly by the users in the UEQ questionnaire.
Submitted 12 July, 2024;
originally announced July 2024.
-
Non-Hermitian Origin of Wannier Localizability and Detachable Topological Boundary States
Authors:
Daichi Nakamura,
Ken Shiozaki,
Kenji Shimomura,
Masatoshi Sato,
Kohei Kawabata
Abstract:
While topology can impose obstructions to exponentially localized Wannier functions, certain topological insulators are exempt from such Wannier obstructions. The absence of the Wannier obstructions can further accompany topological boundary states that are detachable from the bulk bands. Here, we elucidate a close connection between these detachable topological boundary states and non-Hermitian topology. Identifying topological boundary states as non-Hermitian topology, we demonstrate that intrinsic non-Hermitian topology leads to the inevitable spectral flow. By contrast, we show that extrinsic non-Hermitian topology underlies the detachment of topological boundary states and clarify anti-Hermitian topology of the detached boundary states. Based on this connection and $K$-theory, we complete the tenfold classification of Wannier localizability and detachable topological boundary states.
Submitted 12 July, 2024;
originally announced July 2024.
-
How coronal mass ejections are influenced by the morphology and toroidal flux of their source magnetic flux ropes?
Authors:
J. H. Guo,
L. Linan,
S. Poedts,
Y. Guo,
B. Schmieder,
A. Lani,
Y. W. Ni,
M. Brchnelova,
B. Perri,
T. Baratashvili,
S. T. Li,
P. F. Chen
Abstract:
Coronal mass ejections (CMEs) stand as intense eruptions of magnetized plasma from the Sun, playing a pivotal role in driving significant changes of the heliospheric environment. Deducing the properties of CMEs from their progenitors in solar source regions is crucial for space weather forecasting. The primary objective of this paper is to establish a connection between CMEs and their progenitors in solar source regions, enabling us to infer the magnetic structures of CMEs before their full development. To this end, we create a dataset comprising a magnetic flux rope series with varying projection shapes, sizes and toroidal fluxes, using the Regularized Biot-Savart Laws (RBSL). Thereafter, we simulate the propagation of these flux ropes from the solar surface to a distance of 25$R_{\odot}$ with our global coronal MHD model, COCONUT. Our parametric survey reveals significant impacts of source flux ropes on the consequent CMEs. We find that the projection shape can influence the magnetic structures of CMEs at 20$R_{\odot}$, albeit with minimal impacts on the propagation speed. However, these impacts diminish as source flux ropes become fat. In terms of toroidal flux, our simulation results demonstrate a pronounced correlation with the propagation speed of CMEs, as well as with the success of the eruption. This work builds the bridge between the CMEs in the outer corona and their progenitors in solar source regions. Our parametric survey suggests that the projection shape, cross-section radius and toroidal flux of source flux ropes are crucial parameters in predicting magnetic structures and propagation speed of CMEs, providing valuable insights for space weather prediction.
Submitted 12 July, 2024;
originally announced July 2024.
-
Strain coupling of a single exciton to a nano-optomechanical resonator
Authors:
Matteo Lodde,
René P. J. van Veldhoven,
Ewold Verhagen,
Andrea Fiore
Abstract:
We demonstrate coupling of a semiconductor quantum dot (QD) to an optomechanical cavity, mediated by the strain of a nano-mechanical mode. The device comprises an optomechanical photonic crystal nanobeam in GaAs with embedded In(Ga)As QDs. The flexural mechanical mode of the device can be optically driven exploiting the large optomechanical coupling rate of the cavity. The vibrations generate a time-modulated strain field that shifts the quantum dot transition energy. We observe that optical driving of the mechanical mode induces a shift in an excitonic line corresponding to an estimated vacuum strain coupling rate of 214 kHz. Our approach represents an important step towards the use of phonons to couple different on-chip quantum systems.
Submitted 12 July, 2024;
originally announced July 2024.
-
Plasmonic lightning-rod effect
Authors:
Vlastimil Křápek,
Rostislav Řepa,
Michael Foltýn,
Tomáš Šikola,
Michal Horák
Abstract:
The plasmonic lightning-rod effect refers to the formation of a strong electric near field of localized surface plasmons at the sharp features of plasmonic antennas. While this effect is intuitively utilized in the design and optimization of plasmonic antennas, the relation between the magnitude of the electric field and the local curvature of the plasmonic antenna has not yet been rigorously established. Here, we provide such a study. We design sets of plasmonic antennas that allow us to isolate the role of the local curvature from other effects influencing the field. The near electric field is inspected by electron energy loss spectroscopy and electrodynamic simulations. We demonstrate the existence of the plasmonic lightning-rod effect and establish its quantitative description, showing that its strength is comparable to the electrostatic lightning-rod effect. We also provide a simple phenomenological formula for the spatial dependence of the field. Finally, we introduce the effective radius of curvature related to the spatial distribution of induced charge in plasmonic antennas, significantly smaller than their geometrical radius.
Submitted 12 July, 2024;
originally announced July 2024.
-
Weight Block Sparsity: Training, Compilation, and AI Engine Accelerators
Authors:
Paolo D'Alberto,
Taehee Jeong,
Akshai Jain,
Shreyas Manjunath,
Mrinal Sarmah,
Samuel Hsu Yaswanth Raparti,
Nitesh Pipralia
Abstract:
Nowadays, increasingly larger Deep Neural Networks (DNNs) are being developed, trained, and utilized. These networks require significant computational resources, putting a strain on both advanced and limited devices. Our solution is to implement {\em weight block sparsity}, which is a structured sparsity that is friendly to hardware. By zeroing certain sections of the convolution and fully connected layers parameters of pre-trained DNN models, we can efficiently speed up the DNN's inference process. This results in a smaller memory footprint, faster communication, and fewer operations.
Our work presents a vertical system that allows for the training of convolution and matrix multiplication weights to exploit 8x8 block sparsity on a single GPU within a reasonable amount of time. Compilers recognize this sparsity and use it for both data compaction and computation splitting into threads. Blocks like these take full advantage of both spatial and temporal locality, paving the way for fast vector operations and memory reuse. By using this system on a Resnet50 model, we were able to reduce the weight by half with minimal accuracy loss, resulting in a two-times faster inference speed. We will present performance estimates using accurate and complete code generation for AIE2 configuration sets (AMD Versal FPGAs) with Resnet50, Inception V3, and VGG16 to demonstrate the necessary synergy between hardware overlay designs and software stacks for compiling and executing machine learning applications.
Submitted 12 July, 2024;
originally announced July 2024.
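As a rough illustration of the 8x8 weight block sparsity described in the abstract above, the sketch below zeroes whole tiles of a weight matrix. The selection rule used here (dropping the tiles with the smallest L1 norm after the fact) is an assumption chosen for simplicity; the paper trains the sparsity pattern, and that procedure is not reproduced. The function name `block_sparsify` and the toy matrix are hypothetical.

```python
def block_sparsify(weights, block=8, keep_fraction=0.5):
    """Zero the block x block tiles with the smallest L1 norm."""
    rows, cols = len(weights), len(weights[0])
    assert rows % block == 0 and cols % block == 0
    tiles = []
    for br in range(0, rows, block):
        for bc in range(0, cols, block):
            norm = sum(abs(weights[r][c])
                       for r in range(br, br + block)
                       for c in range(bc, bc + block))
            tiles.append((norm, br, bc))
    tiles.sort()  # smallest-norm tiles first
    n_drop = len(tiles) - int(round(len(tiles) * keep_fraction))
    pruned = [row[:] for row in weights]
    for _, br, bc in tiles[:n_drop]:
        for r in range(br, br + block):
            for c in range(bc, bc + block):
                pruned[r][c] = 0.0
    return pruned


# Toy 16x16 matrix made of four 8x8 tiles with constant values 1, 2, 3, 4.
weights = [[float((r // 8) * 2 + (c // 8) + 1) for c in range(16)]
           for r in range(16)]
pruned = block_sparsify(weights, block=8, keep_fraction=0.5)
zeros = sum(1 for row in pruned for v in row if v == 0.0)
print(zeros, "of", 16 * 16, "weights are zero")
```

Because zeros come in aligned 8x8 tiles rather than scattered positions, a compiler can skip entire tiles and store only the surviving ones, which is the hardware-friendly property the abstract refers to.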
-
Beyond the instanton gas approach: dominant thimbles approximation for the Hubbard model
Authors:
Maksim Ulybyshev,
Fakher F. Assaad
Abstract:
To each complex saddle point of an action, one can attach a Lefschetz thimble on which the imaginary part of the action is constant. Cauchy's theorem states that summation over a set of thimbles produces the exact result. This reorganization of the path integral is an appealing starting point for various approximations: In the realm of auxiliary quantum Monte Carlo methods it provides a framework to alleviate the negative sign problem. Here, we suggest constraining the integration to the \textit{dominant} thimbles: the thimbles attached to the saddle points with the largest statistical weight. For the Hubbard model, in a formulation where the Hubbard-Stratonovich field couples to the charge, this provides a \textit{symmetry} consistent approximation to the physics of the Hubbard model: constraining the integration domain does not explicitly break a symmetry. We test this approach for the Hubbard model at half-filling on a bipartite lattice. The paper builds on the previously developed instanton gas approach, where an exhaustive saddle point approximation was constructed. We present results showing that the dominant thimbles approximation is in remarkable agreement with the exact results for various fermionic observables, including spin and charge order parameters and single electron spectral functions. We discuss implications of our results for simulations away from half filling.
Submitted 12 July, 2024;
originally announced July 2024.
-
Human-like Episodic Memory for Infinite Context LLMs
Authors:
Zafeirios Fountas,
Martin A Benfeghoul,
Adnan Oomerjee,
Fenia Christopoulou,
Gerasimos Lampouras,
Haitham Bou-Ammar,
Jun Wang
Abstract:
Large language models (LLMs) have shown remarkable capabilities, but still struggle with processing extensive contexts, limiting their ability to maintain coherence and accuracy over long sequences. In contrast, the human brain excels at organising and retrieving episodic experiences across vast temporal scales, spanning a lifetime. In this work, we introduce EM-LLM, a novel approach that integrates key aspects of human episodic memory and event cognition into LLMs, enabling them to effectively handle practically infinite context lengths while maintaining computational efficiency. EM-LLM organises sequences of tokens into coherent episodic events using a combination of Bayesian surprise and graph-theoretic boundary refinement in an on-line fashion. When needed, these events are retrieved through a two-stage memory process, combining similarity-based and temporally contiguous retrieval for efficient and human-like access to relevant information. Experiments on the LongBench dataset demonstrate EM-LLM's superior performance, outperforming the state-of-the-art InfLLM model with an overall relative improvement of 4.3% across various tasks, including a 33% improvement on the PassageRetrieval task. Furthermore, our analysis reveals strong correlations between EM-LLM's event segmentation and human-perceived events, suggesting a bridge between this artificial system and its biological counterpart. This work not only advances LLM capabilities in processing extended contexts but also provides a computational framework for exploring human memory mechanisms, opening new avenues for interdisciplinary research in AI and cognitive science.
Submitted 12 July, 2024;
originally announced July 2024.
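As a rough illustration of the surprise-based event segmentation described in the abstract above, the sketch below opens a new event wherever a token's surprise (negative log-probability) exceeds a running threshold. The function name `segment_by_surprise`, the threshold rule (window mean plus `gamma` standard deviations), and the input format are assumptions made for illustration. They are not taken from the paper, and the graph-theoretic boundary refinement step is omitted.

```python
import math
from statistics import mean, pstdev


def segment_by_surprise(token_probs, window=4, gamma=1.0):
    """Split a token sequence into events at high-surprise tokens.

    token_probs: probability the model assigned to each observed token.
    A token opens a new event when its surprise exceeds
    mean + gamma * std of the surprises in the preceding window.
    (Hypothetical thresholding rule, for illustration only.)
    """
    surprise = [-math.log(p) for p in token_probs]
    boundaries = [0]  # the first token always opens an event
    # Start at t = 2 so the window holds at least two past values.
    for t in range(2, len(surprise)):
        recent = surprise[max(0, t - window):t]
        threshold = mean(recent) + gamma * pstdev(recent)
        if surprise[t] > threshold:
            boundaries.append(t)
    # Convert boundary indices into half-open (start, end) spans.
    ends = boundaries[1:] + [len(surprise)]
    return list(zip(boundaries, ends))


events = segment_by_surprise(
    [0.9, 0.9, 0.9, 0.05, 0.9, 0.9, 0.9, 0.9, 0.04, 0.9]
)
```

Each returned span is a candidate episodic event. In a retrieval setting, such spans (rather than fixed-size chunks) would be the units stored and looked up.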
-
ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts
Authors:
Amelia F. Hardy,
Houjun Liu,
Bernard Lange,
Mykel J. Kochenderfer
Abstract:
Typical schemes for automated red-teaming large language models (LLMs) focus on discovering prompts that trigger a frozen language model (the defender) to generate toxic text. This often results in the prompting model (the adversary) producing text that is unintelligible and unlikely to arise. Here, we propose a reinforcement learning formulation of the LLM red-teaming task which allows us to discover prompts that both (1) trigger toxic outputs from a frozen defender and (2) have low perplexity as scored by the defender. We argue these cases are most pertinent in a red-teaming setting because of their likelihood to arise during normal use of the defender model. We solve this formulation through a novel online and weakly supervised variant of Identity Preference Optimization (IPO) on GPT-2 and GPT-2 XL defenders. We demonstrate that our policy is capable of generating likely prompts that also trigger toxicity. Finally, we qualitatively analyze learned strategies, trade-offs of likelihood and toxicity, and discuss implications. Source code is available for this project at: https://github.com/sisl/ASTPrompter/.
Submitted 12 July, 2024;
originally announced July 2024.
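The abstract above uses low perplexity under the defender as a proxy for how likely a prompt is to arise in normal use. As a minimal sketch of that quantity, the function below computes perplexity from per-token natural-log probabilities. The name `perplexity` and the assumption that the log-probabilities have already been obtained from a model are illustrative and not taken from the paper's code.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean natural-log probability."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# A sequence whose tokens each received probability 0.25.
ppl = perplexity([math.log(0.25)] * 4)
```

Lower values mean the scoring model finds the text more predictable, which is the sense in which a prompt is "likely".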
-
Intensive broadband reverberation mapping of Fairall 9 with 1.8 years of daily Swift monitoring
Authors:
R. Edelson,
B. M. Peterson,
J. Gelbord,
K. Horne,
M. Goad,
I. McHardy,
S. Vaughan,
M. Vestergaard
Abstract:
We present 1.8 years of near-daily Swift monitoring of the bright, strongly variable Type 1 AGN Fairall 9. Totaling 575 successful visits, this is the largest such campaign reported to date. Variations within the UV/optical are well-correlated, with longer wavelengths lagging shorter wavelengths in the direction predicted by thin disk/lamp-post models. The correlations are improved by detrending; subtracting a second-order polynomial fit to the UV/optical light curves to remove long-term trends that are not of interest to this study. Extensive testing indicates detrending with higher-order polynomials removes too much intrinsic variability signal on reverberation timescales. These data provide the clearest detection to date of interband lags within the UV, indicating that neither emission from a large disk nor diffuse continuum emission from the broad-line region can independently explain the full observed lag spectrum. The observed X-ray flux variations are poorly correlated with those in the UV/optical. Further, subdivision of the data into four ~160 day light curves shows that the UV/optical lag spectrum is highly stable throughout the four periods, but the X-ray to UV lags are unstable, significantly changing magnitude and even direction from one period to the next. This indicates the X-ray to UV relationship is more complex than predicted by the simple reprocessing model often adopted for AGN. A bowl model (lamp-post irradiation and blackbody reprocessing on a disk with a steep rim) fit suggests the disk thickens at a distance (~10 lt-day) and temperature (~8000K) consistent with the inner edge of the BLR.
Submitted 12 July, 2024;
originally announced July 2024.
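The detrending step mentioned in the abstract above (subtracting a second-order polynomial fit from a light curve) can be sketched in plain Python as follows. This is a generic least-squares fit solved through the normal equations. The function names `fit_quadratic` and `detrend` are hypothetical, and nothing here reproduces the authors' actual pipeline, weighting, or error handling.

```python
def fit_quadratic(t, y):
    """Least-squares fit y ~ a + b*t + c*t**2 via the normal equations."""
    # Power sums S[k] = sum(t**k) and moments M[k] = sum(y * t**k).
    S = [sum(ti ** k for ti in t) for k in range(5)]
    M = [sum(yi * ti ** k for ti, yi in zip(t, y)) for k in range(3)]
    # Augmented 3x3 system for the coefficients (a, b, c).
    A = [[S[i + j] for j in range(3)] + [M[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            A[r] = [x - f * p for x, p in zip(A[r], A[col])]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = sum(A[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (A[i][3] - s) / A[i][i]
    return coef


def detrend(t, y):
    """Subtract the fitted second-order polynomial from the light curve."""
    a, b, c = fit_quadratic(t, y)
    return [yi - (a + b * ti + c * ti * ti) for ti, yi in zip(t, y)]


times = [float(i) for i in range(10)]
flux = [2.0 + 0.5 * ti - 0.1 * ti * ti for ti in times]
residuals = detrend(times, flux)  # close to zero for a purely quadratic trend
```

For long time baselines, centring the times around their mean before fitting improves the conditioning of the normal equations.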
-
Addressing Confounding and Continuous Exposure Measurement Error Using Corrected Score Functions
Authors:
Brian D. Richardson,
Bryan S. Blette,
Peter B. Gilbert,
Michael G. Hudgens
Abstract:
Confounding and exposure measurement error can introduce bias when drawing inference about the marginal effect of an exposure on an outcome of interest. While there are broad methodologies for addressing each source of bias individually, confounding and exposure measurement error frequently co-occur and there is a need for methods that address them simultaneously. In this paper, corrected score methods are derived under classical additive measurement error to draw inference about marginal exposure effects using only measured variables. Three estimators are proposed based on g-formula, inverse probability weighting, and doubly-robust estimation techniques. The estimators are shown to be consistent and asymptotically normal, and the doubly-robust estimator is shown to exhibit its namesake property. The methods, which are implemented in the R package mismex, perform well in finite samples under both confounding and measurement error as demonstrated by simulation studies. The proposed doubly-robust estimator is applied to study the effects of two biomarkers on HIV-1 infection using data from the HVTN 505 preventative vaccine trial.
Submitted 12 July, 2024;
originally announced July 2024.
-
The $μ\mathcal{G}$ Language for Programming Graph Neural Networks
Authors:
Matteo Belenchia,
Flavio Corradini,
Michela Quadrini,
Michele Loreti
Abstract:
Graph neural networks form a class of deep learning architectures specifically designed to work with graph-structured data. As such, they share the inherent limitations and problems of deep learning, especially regarding the issues of explainability and trustworthiness. We propose $μ\mathcal{G}$, an original domain-specific language for the specification of graph neural networks that aims to overcome these issues. The language's syntax is introduced, and its meaning is rigorously defined by a denotational semantics. An equivalent characterization in the form of an operational semantics is also provided and, together with a type system, is used to prove the type soundness of $μ\mathcal{G}$. We show how $μ\mathcal{G}$ programs can be represented in a more user-friendly graphical visualization, and provide examples of its generality by showing how it can be used to define some of the most popular graph neural network models, or to develop any custom graph processing application.
Submitted 12 July, 2024;
originally announced July 2024.
-
Self-organized multiscale structures in thermally relativistic electron-positron-ion plasmas
Authors:
Usman Shazad,
M. Iqbal,
Shafa Ullah
Abstract:
The self-organization of a thermally relativistic magnetized plasma comprising electrons, positrons and static ions is investigated. The self-organized state is found to be the superposition of three distinct Beltrami fields, known as the triple Beltrami (TB) state. In general, the eigenvalues associated with the multiscale self-organized vortices may be a pair of complex conjugates and one real value. It is shown that all the eigenvalues become real when thermal energy increases or the positron density decreases. The impact of relativistic temperature and positron density on the formation of self-organized structures is investigated. The self-organized field and flow vortices may vary simultaneously on vastly different length scales. The disparate variation of self-organized vortices is important in the context of dynamo theory. The present work is useful to study the formation of multiscale vortices and dynamo mechanisms in multi-species thermally relativistic plasmas.
Submitted 12 July, 2024;
originally announced July 2024.
-
Neuroevolution of Decentralized Decision-Making in N-Bead Swimmers Leads to Scalable and Robust Collective Locomotion
Authors:
Benedikt Hartl,
Michael Levin,
Andreas Zöttl
Abstract:
Many microorganisms are capable of swimming through viscous fluids such as water in order to search for nutrients, swim toward oxygen or light, or to escape from predators. To navigate their environment they often perform large nonreciprocal periodic deformations of their shape, by waving appendages such as cilia or flagella, or by deforming their entire body. Even unicellular organisms are fundamentally made of parts, which need to be cooperatively utilized to allow these creatures to navigate their environment, without using a centralized control mechanism. Here, we investigate the physical implications of decentralized decision-making of the actuators of a generalized N-bead Najafi-Golestanian microswimmer, self-propelling via coordinated non-reciprocal swimming strokes. We treat each bead as an artificial neural network-based agent that perceives information about its neighbors and whose actions induce strokes of its adjacent arms. With neuroevolution techniques, we evolve optimal policies for the single-bead decision centers such that the N-bead collective efficiently self-propels as an individual, allowing us to investigate optimal locomotion policies for increasingly large microswimmer bodies. We demonstrate that such decentralized policies are robust and tolerant concerning morphological changes or defects and facilitate cargo transport or drug delivery applications "out of the box", without further optimization. Our approach allows us to train large swimmers ($N=100$ and more), and we show that long-wavelength solutions lead to surprisingly efficient swimming gaits. Our work is relevant to understanding robust locomotion of biological microswimmers, to developing robust artificial microswimmer navigation strategies, and, in a broader conceptual context, to Artificial Life and, in general, emergent levels of individuality.
Submitted 12 July, 2024;
originally announced July 2024.
-
Let Me DeCode You: Decoder Conditioning with Tabular Data
Authors:
Tomasz Szczepański,
Michal K. Grzeszczyk,
Szymon Płotka,
Arleta Adamowicz,
Piotr Fudalej,
Przemysław Korzeniowski,
Tomasz Trzciński,
Arkadiusz Sitek
Abstract:
Training deep neural networks for 3D segmentation tasks can be challenging, often requiring efficient and effective strategies to improve model performance. In this study, we introduce a novel approach, DeCode, that utilizes label-derived features for model conditioning to support the decoder in the reconstruction process dynamically, aiming to enhance the efficiency of the training process. DeCode focuses on improving 3D segmentation performance through the incorporation of conditioning embedding with learned numerical representation of 3D-label shape features. Specifically, we develop an approach where conditioning is applied during the training phase to guide the network toward robust segmentation. When labels are not available during inference, our model infers the necessary conditioning embedding directly from the input data, thanks to a feed-forward network learned during the training phase. This approach is tested using synthetic data and cone-beam computed tomography (CBCT) images of teeth. For CBCT, three datasets are used: one publicly available and two in-house. Our results show that DeCode significantly outperforms traditional, unconditioned models in terms of generalization to unseen data, achieving higher accuracy at a reduced computational cost. This work represents the first of its kind to explore conditioning strategies in 3D data segmentation, offering a novel and more efficient method for leveraging annotated data. Our code and pre-trained models are publicly available at https://github.com/SanoScience/DeCode .
Submitted 12 July, 2024;
originally announced July 2024.
-
A novel direct Helmholtz solver in inhomogeneous media based on the operator Fourier transform functional calculus
Authors:
Max Cubillos,
Edwin Jimenez
Abstract:
This article presents novel numerical algorithms based on pseudodifferential operators ($Ψ$DO) for fast, direct solution of the Helmholtz equation in one-, two- and three-dimensional inhomogeneous unbounded media. The proposed approach relies on an Operator Fourier Transform (OFT) representation of $Ψ$DO which frames the problem of computing the inverse Helmholtz operator, with a spatially-dependent wave speed, in terms of two sequential applications of an inverse square root $Ψ$DO. The OFT representation of the action of the square root $Ψ$DO, in turn, can be effected as a superposition of solutions of a pseudo-temporal initial-boundary-value problem for a paraxial equation. The OFT framework offers several advantages over traditional direct and iterative approaches for the solution of the Helmholtz equation. The operator integral transform is amenable to standard quadrature methods and the required pseudo-temporal paraxial equation solutions can be obtained using any suitable numerical method. A specialized quadrature is derived to evaluate the OFT efficiently and an alternating direction implicit method, used in conjunction with standard finite differences, is used to solve the requisite component paraxial equation problems. Numerical studies, in 1, 2, and 3 spatial dimensions, are presented to confirm the expected OFT-based Helmholtz solver convergence rate. In addition, the efficiency and versatility of our proposed approach are demonstrated by tackling nontrivial wave propagation problems, including 2D plane wave scattering from a geometrically complex inhomogeneity, 3D scattering from turbulent channel flow and plane wave transmission through a spherically-symmetric gradient-index acoustic lens. All computations, even the latter lens problem which involves solving the Helmholtz equation with more than one billion complex unknowns, are performed on a single workstation.
Submitted 12 July, 2024;
originally announced July 2024.
-
MUSCLE: A Model Update Strategy for Compatible LLM Evolution
Authors:
Jessica Echterhoff,
Fartash Faghri,
Raviteja Vemulapalli,
Ting-Yao Hu,
Chun-Liang Li,
Oncel Tuzel,
Hadi Pouransari
Abstract:
Large Language Models (LLMs) are frequently updated due to data or architecture changes to improve their performance. When updating models, developers often focus on increasing overall performance metrics with less emphasis on being compatible with previous model versions. However, users often build a mental model of the functionality and capabilities of a particular machine learning model they are interacting with. They have to adapt their mental model with every update -- a draining task that can lead to user dissatisfaction. In practice, fine-tuned downstream task adapters rely on pretrained LLM base models. When these base models are updated, these user-facing downstream task models experience instance regression or negative flips -- previously correct instances are now predicted incorrectly. This happens even when the downstream task training procedures remain identical. Our work aims to provide seamless model updates to a user in two ways. First, we provide evaluation metrics for a notion of compatibility to prior model versions, specifically for generative tasks but also applicable for discriminative tasks. We observe regression and inconsistencies between different model versions on a diverse set of tasks and model updates. Second, we propose a training strategy to minimize the number of inconsistencies in model updates, involving training of a compatibility model that can enhance task fine-tuned language models. We reduce negative flips -- instances where a prior model version was correct, but a new model incorrect -- by up to 40% from Llama 1 to Llama 2.
Submitted 12 July, 2024;
originally announced July 2024.
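The notion of a negative flip used in the abstract above (an instance the prior model predicted correctly and the updated model predicts incorrectly) can be made concrete with a short sketch. The function name `negative_flip_rate` and the choice to normalise by the total number of instances are assumptions for illustration. The paper's own compatibility metrics, particularly for generative tasks, may be defined differently.

```python
def negative_flip_rate(labels, old_preds, new_preds):
    """Fraction of instances the old model got right and the new model gets wrong."""
    if not labels:
        raise ValueError("need at least one instance")
    if not (len(labels) == len(old_preds) == len(new_preds)):
        raise ValueError("inputs must have equal length")
    flips = sum(
        1
        for y, old, new in zip(labels, old_preds, new_preds)
        if old == y and new != y
    )
    return flips / len(labels)


# Instances 1 and 3 regress; instance 2 improves and is not counted.
rate = negative_flip_rate(
    labels=[1, 0, 1, 1],
    old_preds=[1, 0, 0, 1],
    new_preds=[1, 1, 1, 0],
)
```

Two models can share the same overall accuracy, as in this toy example, while still disagreeing on which instances they get right. That gap is what a compatibility metric is meant to expose.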
-
A Perspective on Foundation Models for the Electric Power Grid
Authors:
Hendrik F. Hamann,
Thomas Brunschwiler,
Blazhe Gjorgiev,
Leonardo S. A. Martins,
Alban Puech,
Anna Varbella,
Jonas Weiss,
Juan Bernabe-Moreno,
Alexandre Blondin Massé,
Seong Choi,
Ian Foster,
Bri-Mathias Hodge,
Rishabh Jain,
Kibaek Kim,
Vincent Mai,
François Mirallès,
Martin De Montigny,
Octavio Ramos-Leaños,
Hussein Suprême,
Le Xie,
El-Nasser S. Youssef,
Arnaud Zinflou,
Alexander J. Belvi,
Ricardo J. Bessa,
Bishnu Prasad Bhattari, et al. (2 additional authors not shown)
Abstract:
Foundation models (FMs) currently dominate news headlines. They employ advanced deep learning architectures to extract structural information autonomously from vast datasets through self-supervision. The resulting rich representations of complex systems and dynamics can be applied to many downstream applications. Therefore, FMs can find uses in electric power grids, challenged by the energy transition and climate change. In this paper, we call for the development of, and state why we believe in, the potential of FMs for electric grids. We highlight their strengths and weaknesses amidst the challenges of a changing grid. We argue that an FM learning from diverse grid data and topologies could unlock transformative capabilities, pioneering a new approach in leveraging AI to redefine how we manage complexity and uncertainty in the electric grid. Finally, we discuss a power grid FM concept, namely GridFM, based on graph neural networks and show how different downstream tasks benefit.
Submitted 12 July, 2024;
originally announced July 2024.
-
International Astrophysical Consortium for High-energy Calibration: Summary of the 15th IACHEC Workshop
Authors:
K. K. Madsen,
V. Burwitz,
K. Forster,
C. E. Grant,
M. Guainazzi,
V. Kashyap,
H. L. Marshall,
E. D. Miller,
L. Natalucci,
P. P. Plucinsky,
Y. Terada
Abstract:
In this report, we summarize the activities of the International Astronomical Consortium for High Energy Calibration (IACHEC) from the 15th IACHEC Workshop in Pelham, Germany. Sixty scientists directly involved in the calibration of operational and future high-energy missions gathered for 3.5 days to discuss the status of the cross-calibration between the current international complement of X-ray observatories and the possibilities to improve it. This summary consists of reports from the Working Groups with topics ranging across the identification and characterization of standard calibration sources, multi-observatory cross-calibration campaigns, appropriate and new statistical techniques, calibration of instruments and characterization of background, preservation of knowledge, and results for the benefit of the astronomical community.
Submitted 12 July, 2024;
originally announced July 2024.
-
Rethinking temporal self-similarity for repetitive action counting
Authors:
Yanan Luo,
Jinhui Yi,
Yazan Abu Farha,
Moritz Wolter,
Juergen Gall
Abstract:
Counting repetitive actions in long untrimmed videos is a challenging task that has many applications such as rehabilitation. State-of-the-art methods predict action counts by first generating a temporal self-similarity matrix (TSM) from the sampled frames and then feeding the matrix to a predictor network. The self-similarity matrix, however, is not an optimal input to a network since it discards too much information from the frame-wise embeddings. We thus rethink how a TSM can be utilized for counting repetitive actions and propose a framework that learns embeddings and predicts action start probabilities at full temporal resolution. The number of repeated actions is then inferred from the action start probabilities. In contrast to current approaches that have the TSM as an intermediate representation, we propose a novel loss based on a generated reference TSM, which enforces that the self-similarity of the learned frame-wise embeddings is consistent with the self-similarity of repeated actions. The proposed framework achieves state-of-the-art results on three datasets, i.e., RepCount, UCFRep, and Countix.
Submitted 12 July, 2024;
originally announced July 2024.
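For readers unfamiliar with the temporal self-similarity matrix (TSM) referred to in the abstract above, the following sketch builds one from per-frame embedding vectors. Cosine similarity is used here as an assumed similarity measure, and the function name `self_similarity_matrix` is hypothetical. The paper and earlier TSM-based methods may use a different measure, such as negative squared distances followed by a softmax.

```python
import math


def self_similarity_matrix(embeddings):
    """Pairwise cosine similarity between per-frame embedding vectors.

    Assumes every embedding has a non-zero norm.
    """
    norms = [math.sqrt(sum(x * x for x in e)) for e in embeddings]
    n = len(embeddings)
    tsm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            tsm[i][j] = dot / (norms[i] * norms[j])
    return tsm


# Toy "video" whose frames alternate between two poses (period 2).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
tsm = self_similarity_matrix(frames)
```

For a periodic action, high-similarity entries line up along diagonals offset by the repetition period, which is the pattern a counting network or a reference-TSM loss can exploit.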
-
Phenomenological emergent dark energy in the light of DESI Data Release 1
Authors:
A. Hernández-Almada,
M. L. Mendoza-Martínez,
Miguel A. García-Aspeitia,
V. Motta
Abstract:
This manuscript revisits the phenomenological emergent dark energy (PEDE) model by confronting it with recent cosmological data from early and late times. In particular, we analyze the PEDE model using the baryon acoustic oscillation (BAO) measurements from both the Dark Energy Spectroscopic Instrument (DESI) data release 1 and the Sloan Digital Sky Survey (SDSS). Additionally, the measurements from cosmic chronometers, supernovae type Ia (Pantheon+), quasars, hydrogen II galaxies and cosmic background radiation distance priors are considered. By performing a Bayesian analysis based on Markov Chain Monte Carlo, we find consistent results on the constraints when SDSS and DESI are considered. However, we find higher values of the Hubble constant than the Supernova $H_0$ for the Equation of State (SH0ES) project does, although the value is still in agreement, within the $1σ$ confidence level, when BAO measurements are added. Furthermore, we estimate an age of the Universe $\sim3\%$ younger than the one predicted by the standard cosmology. Additionally, we report values of $q_0 = -0.771^{+0.007}_{-0.007}$ and $z_T = 0.764^{+0.011}_{-0.011}$ for the deceleration parameter today and the deceleration-acceleration transition redshift, respectively.
Submitted 12 July, 2024;
originally announced July 2024.
-
Open (Clinical) LLMs are Sensitive to Instruction Phrasings
Authors:
Alberto Mario Ceballos Arroyo,
Monica Munnangi,
Jiuding Sun,
Karen Y. C. Zhang,
Denis Jered McInerney,
Byron C. Wallace,
Silvio Amir
Abstract:
Instruction-tuned Large Language Models (LLMs) can perform a wide range of tasks given natural language instructions to do so, but they are sensitive to how such instructions are phrased. This issue is especially concerning in healthcare, as clinicians are unlikely to be experienced prompt engineers and the potential consequences of inaccurate outputs are heightened in this domain.
This raises a practical question: How robust are instruction-tuned LLMs to natural variations in the instructions provided for clinical NLP tasks? We collect prompts from medical doctors across a range of tasks and quantify the sensitivity of seven LLMs -- some general, others specialized -- to natural (i.e., non-adversarial) instruction phrasings. We find that performance varies substantially across all models, and that -- perhaps surprisingly -- domain-specific models explicitly trained on clinical data are especially brittle, compared to their general domain counterparts. Further, arbitrary phrasing differences can affect fairness, e.g., valid but distinct instructions for mortality prediction yield a range both in overall performance, and in terms of differences between demographic groups.
Submitted 12 July, 2024;
originally announced July 2024.
-
Flow-Based Generative Emulation of Grids of Stellar Evolutionary Models
Authors:
Marc Hon,
Yaguang Li,
Joel Ong
Abstract:
We present a flow-based generative approach to emulate grids of stellar evolutionary models. By interpreting the input parameters and output properties of these models as multi-dimensional probability distributions, we train conditional normalizing flows to learn and predict the complex relationships between grid inputs and outputs in the form of conditional joint distributions. Leveraging the expressive power and versatility of these flows, we showcase their ability to emulate a variety of evolutionary tracks and isochrones across a continuous range of input parameters. In addition, we describe a simple Bayesian approach for estimating stellar parameters using these flows and demonstrate its application to asteroseismic datasets of red giants observed by the Kepler mission. By applying this approach to red giants in open clusters NGC 6791 and NGC 6819, we illustrate how large age uncertainties can arise when fitting only to global asteroseismic and spectroscopic parameters without prior information on initial helium abundances and mixing length parameter values. We also conduct inference using the flow at a large scale by determining revised estimates of masses and radii for 15,388 field red giants. These estimates show improved agreement with results from existing grid-based modelling, reveal distinct population-level features in the red clump, and suggest that the masses of Kepler red giants previously determined using the corrected asteroseismic scaling relations have been overestimated by 5-10%.
Submitted 12 July, 2024;
originally announced July 2024.
-
Infinitesimal conformal restriction and unitarizing measures for Virasoro algebra
Authors:
Maria Gordina,
Wei Qian,
Yilin Wang
Abstract:
We use the SLE$_κ$ loop measure to construct a natural representation of the Virasoro algebra of central charge $c = c(κ) \le 1$. In particular, we introduce a non-degenerate bilinear Hermitian form (not positive-definite) using the SLE loop measure and show that the representation is (indefinite) unitary. Our proof relies on the infinitesimal conformal restriction property of the SLE loop measure.
Submitted 12 July, 2024;
originally announced July 2024.
-
TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models
Authors:
Hang Zou,
Qiyang Zhao,
Yu Tian,
Lina Bariah,
Faouzi Bader,
Thierry Lestable,
Merouane Debbah
Abstract:
Large Language Models (LLMs) have the potential to revolutionize Sixth Generation (6G) communication networks. However, current mainstream LLMs generally lack specialized knowledge in the telecom domain. In this paper, for the first time, we propose a pipeline to adapt any general-purpose LLM into a telecom-specific LLM. We collect and build a telecom-specific pre-training dataset, instruction dataset, and preference dataset to perform continual pre-training, instruction tuning, and alignment tuning, respectively. In addition, due to the lack of widely accepted evaluation benchmarks in the telecom domain, we extend existing evaluation benchmarks and propose three new benchmarks, namely Telecom Math Modeling, Telecom Open QnA, and Telecom Code Tasks. These new benchmarks provide a holistic evaluation of the capabilities of LLMs in the telecom domain, including math modeling, open-ended question answering, and code generation, infilling, summarization, and analysis. Our fine-tuned LLM, TelecomGPT, significantly outperforms state-of-the-art (SOTA) LLMs, including GPT-4, Llama-3, and Mistral, on the Telecom Math Modeling benchmark and achieves comparable performance on various evaluation benchmarks such as TeleQnA, 3GPP technical document classification, and telecom code summarization, generation, and infilling.
Submitted 12 July, 2024;
originally announced July 2024.
-
Ultradifferentiable functions via the Laguerre operator
Authors:
Smiljana Jakšić,
Stevan Pilipović,
Nenad Teofanov,
Đorđe Vučković
Abstract:
We introduce spaces of test functions $\mathbf{G}^α_α(\mathbb R ^d_+)$ and $\mathbf{g}^α_α(\mathbb R ^d_+)$, $α> 0$, by using the iterates of the Laguerre operator, and show that these spaces represent a natural counterpart of global Pilipović spaces on $\mathbb R ^d$. Moreover, we show that $\mathbf{G}^α_α(\mathbb R^d_+)$ and $\mathbf{g}^α_α(\mathbb R ^d_+)$ coincide with $G$-type spaces when $α\geq 1$, and $α> 1$, respectively. We also consider the associated spaces of ultradistributions. The main tool in our study is the use of sequence spaces $\ell_α(\mathbb N_0 ^d)$ and $\ell_{0,α}(\mathbb N_0 ^d)$ arising from the corresponding Laguerre expansions. In addition, by considering sequence spaces $ \ell_{\flat_σ}(\mathbb N^d_0) $ and $\ell_{0,\flat_σ}(\mathbb N^d_0)$, we introduce and study flat Pilipović spaces on orthants.
Submitted 12 July, 2024;
originally announced July 2024.
-
Efficient energy-stable parametric finite element methods for surface diffusion flow and applications in solid-state dewetting
Authors:
Meng Li,
Yihang Guo,
**gjiang Bi
Abstract:
Currently existing energy-stable parametric finite element methods for surface diffusion flow and other flows are usually limited to first-order accuracy in time. Designing a high-order algorithm for geometric flows that can also be theoretically proven to be energy-stable poses a significant challenge. Motivated by the new scalar auxiliary variable approach [F. Huang, J. Shen, Z. Yang, SIAM J. Sci. Comput., 42 (2020), pp. A2514-A2536], we propose novel energy-stable parametric finite element approximations for isotropic/anisotropic surface diffusion flows, achieving both first-order and second-order accuracy in time. Additionally, we apply the algorithms to simulate the solid-state dewetting of thin films. Finally, extensive numerical experiments validate the accuracy, energy stability, and efficiency of our developed numerical methods. The designed algorithms in this work exhibit strong versatility, as they can be readily extended to other high-order time discretization methods (e.g., BDFk schemes). Meanwhile, the algorithms achieve remarkable computational efficiency and maintain excellent mesh quality. More importantly, the algorithm can be theoretically proven to possess unconditional energy stability, with the energy nearly equal to the original energy.
Submitted 12 July, 2024;
originally announced July 2024.
-
Multiscale structures in three species magnetoplasmas with two positive ions
Authors:
Shafa Ullah,
Usman Shazad,
M. Iqbal
Abstract:
The self-organization in a multi-ion plasma composed of electrons and two species of positively charged ions is investigated. It is shown that when the canonical vorticities and velocities of all the plasma fluids are aligned, the magnetic field self-organizes to a Quadruple Beltrami state (a superposition of four Beltrami fields). The self-organized magnetic and velocity fields strongly depend on the relative strengths of the generalized vorticities, flows, inertia and densities of the plasma species. Thus, it is possible to generate a wide variety of multiscale magnetic field and flow structures. It is also shown that relaxed magnetic fields and velocities can vary on vastly different length scales simultaneously and are coupled together through the singular perturbation generated by the Hall effect. In these multi-Beltrami self-organized states, the dynamo mechanism then emerges naturally. The scale separation also suggests the heating of the plasma through a dissipative process. The work could be useful to study the dynamics and morphology of the multiscale magnetic field configurations in laboratory and astrophysical plasmas.
Submitted 12 July, 2024;
originally announced July 2024.
-
Thunderbolt: Causal Concurrent Consensus and Execution
Authors:
Junchao Chen,
Alberto Sonnino,
Lefteris Kokoris-Kogias,
Mohammad Sadoghi
Abstract:
In the realm of blockchain systems, smart contracts have gained widespread adoption owing to their programmability. Consequently, developing a system capable of facilitating high throughput and scalability is of paramount importance. Directed acyclic graph (DAG) consensus protocols have demonstrated notable enhancements in both throughput and latency; however, serial execution is now becoming a bottleneck. Numerous approaches prove impractical for smart contracts because they assume that read/write sets are known a priori. This paper introduces Thunderbolt, a novel architecture based on DAG-based protocols that aims to furnish scalable and concurrent execution for smart contract transactions. Inspired by Hyperledger, Thunderbolt also expands the Execute-Order-Validate architecture, in which transactions are distributed to distinct replicas, with execution outcomes determined prior to ordering through the DAG-based protocol. Existing protocols adopt serial execution after the ordering to avoid non-determinism. However, Thunderbolt provides parallel pre-execution before the ordering as well as parallel verification once any source of non-determinism is removed. Each replica validates the transaction results during the construction of the DAG, rather than after the ordering that follows the construction, to improve latency. In an effort to enhance smart contract execution, we implement an execution engine that constructs a dependency graph to dynamically assign transaction orders, thus mitigating abort rates due to execution conflicts. Additionally, we introduce a novel shard reconfiguration to withstand malicious attacks by relocating replicas from the current DAG to a new DAG and rotating the shards among different replicas. Our comparison of the results on SmallBank with serial execution on Narwhal-Tusk revealed a remarkable 50 times speedup with 64 replicas.
Submitted 12 July, 2024;
originally announced July 2024.
-
CAACS: A Carbon Aware Ant Colony System
Authors:
Marina Lin,
Laura P. Schaposnik
Abstract:
In an era where sustainability is becoming increasingly crucial, we introduce a new Carbon-Aware Ant Colony System (CAACS) Algorithm that addresses the Generalized Traveling Salesman Problem (GTSP) while minimizing carbon emissions. This novel approach leverages the natural efficiency of ant colony pheromone trails to find optimal routes, balancing both environmental and economic objectives. By integrating sustainability into transportation models, CAACS provides a powerful tool for real-world applications, including network design, delivery route planning, and commercial aircraft logistics. Our algorithm's unique bi-objective optimization advances the study of sustainable transportation solutions.
Submitted 12 July, 2024;
originally announced July 2024.
-
Cosmic topology. Part IIIa. Microwave background parity violation without parity-violating microphysics
Authors:
Amirhossein Samandar,
Javier Carrón Duque,
Craig J. Copi,
Mikel Martin Barandiaran,
Deyan P. Mihaylov,
Thiago S. Pereira,
Glenn D. Starkman,
Yashar Akrami,
Stefano Anselmi,
Fernando Cornet-Gomez,
Johannes R. Eskilt,
Andrew H. Jaffe,
Arthur Kosowsky,
Andrius Tamosiunas
Abstract:
The standard cosmological model, which assumes statistical isotropy and parity invariance, predicts the absence of correlations between even-parity and odd-parity observables of the cosmic microwave background (CMB). Contrary to these predictions, large-angle CMB temperature anomalies generically involve correlations between even-$\ell$ and odd-$\ell$ angular power spectrum $C_\ell$, while recent analyses of CMB polarization have revealed non-zero equal-$\ell$ $EB$ correlations. These findings challenge the conventional understanding, suggesting deviations from statistical isotropy, violations of parity, or both. Cosmic topology, which involves changing only the boundary conditions of space relative to standard cosmology, offers a compelling framework to potentially account for such parity-violating observations. Topology inherently breaks statistical isotropy, and can also break homogeneity and parity, providing a natural paradigm for explaining observations of parity-breaking observables without the need to add parity violation to the underlying microphysics. Our investigation delves into the harmonic space implications of topology for CMB correlations, using as an illustrative example $EB$ correlations generated by tensor perturbations under both parity-preserving and parity-violating scenarios. Consequently, these findings not only challenge the foundational assumptions of the standard cosmological model but also open new avenues for exploring the topological structure of the Universe through CMB observations.
Submitted 12 July, 2024;
originally announced July 2024.
-
Rich and diverse molecular gas environments of closely-separated dual quasars viewed by ALMA
Authors:
Shenli Tang,
John D. Silverman,
Zhaoxuan Liu,
Manda Banerji,
Tomoko Suzuki,
Seiji Fujimoto,
Andy Goulding,
Masatoshi Imanishi,
Toshihiro Kawaguchi,
Connor Bottrell,
Tilman Hartwig,
Knud Jahnke,
Masafusa Onoue,
Malte Schramm,
Yoshihiro Ueda
Abstract:
We present a study of the molecular gas in five closely-spaced ($R_{\perp}<20$ kpc) dual quasars ($L_{\rm bol}\gtrsim10^{44}~\mathrm{erg~s}^{-1}$) at redshifts $0.4<z<0.8$ with the Atacama Large Millimeter/submillimeter Array. The dual quasar phase represents a distinctive stage during the interaction between two galaxies for investigating quasar fueling and feedback effects on the gas reservoir. The dual quasars were selected from the Sloan Digital Sky Survey and Subaru/Hyper Suprime-Cam Subaru Strategic Program, with confirmatory spectroscopic validation. Based on the detection of the CO J=2--1 emission line with Band 4, we derived key properties including CO luminosities, line widths, and molecular gas masses for these systems. Among the ten quasars of the five pairs, eight have line detections exceeding $5σ$. The detected sources prominently harbor substantial molecular gas reservoirs, with molecular gas masses ($M_{\text{molgas}}$) between $10^{9.6-10.5}~\mathrm{M_{\odot}}$, and molecular gas-to-stellar mass ratios ($μ_{\text{molgas}}$) spanning $18-97\%$. The overall $μ_{\text{molgas}}$ of these dual quasars agrees with that of inactive star-forming main-sequence galaxies at comparable redshifts, indicating no clear evidence of quenching. However, intriguing features in each individual system show possible evidence of AGN feedback, matter transfer, and compaction processes.
Submitted 12 July, 2024;
originally announced July 2024.
-
6G: The Intelligent Network of Everything -- A Comprehensive Vision, Survey, and Tutorial
Authors:
Harri Pennanen,
Tuomo Hänninen,
Oskari Tervo,
Antti Tölli,
Matti Latva-aho
Abstract:
The global 6G vision has taken shape after years of international research and development efforts. This work culminated in ITU-R's Recommendation on "IMT-2030 Framework". While the definition phase of technological requirements is currently ongoing, 3GPP's standardization process on 6G networks is expected to start in 2025 and worldwide commercialization around 2030. This article serves as a comprehensive guide to 6G by providing an overall vision, a contemporary survey of the main literature, and an informative tutorial-type presentation style. In our vision, 6G will be based on three fundamental elements: wireless, artificial intelligence (AI), and the Internet of Everything (IoE). Consequently, 6G can ultimately become the Intelligent Network of Everything while serving as an enabling platform for the next major disruption in mobile communication, called mobile intelligence. The potential of mobile intelligence is that anything can be made connected, intelligent, and aware of its environment. This will revolutionize how devices, systems, and applications are designed; how they operate and interact with humans and each other; and how they can be used for the benefit of people, society, and the world in general. After high-level visioning, the main details of 6G are discussed, including fundamental elements, disruptive applications, key use cases, main performance requirements, potential technologies, and defining features. A special focus is given to a comprehensive set of potential 6G technologies, each of which is introduced in a tutorial manner. Finally, we speculate on what comes after 6G and sketch the first high-level vision of 7G. All in all, the objective of this article is to provide a thorough guide to 6G in order to serve as a source of knowledge and inspiration for further research and development work in academia, industry, and standardization bodies.
Submitted 12 July, 2024;
originally announced July 2024.
-
A Cordial Introduction to Double Scaled SYK
Authors:
Micha Berkooz,
Ohad Mamroud
Abstract:
We review recent progress regarding the double scaled Sachdev-Ye-Kitaev model and other $p$-local quantum mechanical random Hamiltonians. These models exhibit an expansion using chord diagrams, which can be solved by combinatorial methods. We describe exact results in these models, including their spectrum, correlation functions, and Lyapunov exponent. In a certain limit, these techniques manifest the relation to the Schwarzian quantum mechanics, a theory of quantum gravity in $AdS_2$. More generally, the theory is controlled by a rigid algebraic structure of a quantum group, suggesting a theory of quantum gravity on non-commutative $q$-deformed $AdS_2$. We conclude with discussion of related universality classes, and survey some of the current research directions.
Submitted 12 July, 2024;
originally announced July 2024.
-
PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents
Authors:
Saber Zerhoudi,
Michael Granitzer
Abstract:
Large Language Models (LLMs) struggle with generating reliable outputs due to outdated knowledge and hallucinations. Retrieval-Augmented Generation (RAG) models address this by enhancing LLMs with external knowledge, but often fail to personalize the retrieval process. This paper introduces PersonaRAG, a novel framework incorporating user-centric agents to adapt retrieval and generation based on real-time user data and interactions. Evaluated across various question answering datasets, PersonaRAG demonstrates superiority over baseline models, providing tailored answers to user needs. The results suggest promising directions for user-adapted information retrieval systems.
Submitted 12 July, 2024;
originally announced July 2024.
-
Open-Canopy: A Country-Scale Benchmark for Canopy Height Estimation at Very High Resolution
Authors:
Fajwel Fogel,
Yohann Perron,
Nikola Besic,
Laurent Saint-André,
Agnès Pellissier-Tanon,
Martin Schwartz,
Thomas Boudras,
Ibrahim Fayad,
Alexandre d'Aspremont,
Loic Landrieu,
Phillipe Ciais
Abstract:
Estimating canopy height and canopy height change at meter resolution from satellite imagery has numerous applications, such as monitoring forest health, logging activities, wood resources, and carbon stocks. However, many existing forest datasets are based on commercial or closed data sources, restricting the reproducibility and evaluation of new approaches. To address this gap, we introduce Open-Canopy, the first open-access and country-scale benchmark for very high resolution (1.5 m) canopy height estimation. Covering more than 87,000 km$^2$ across France, Open-Canopy combines SPOT satellite imagery with high resolution aerial LiDAR data. We also propose Open-Canopy-$Δ$, the first benchmark for canopy height change detection between two images taken at different years, a particularly challenging task even for recent models. To establish a robust foundation for these benchmarks, we evaluate a comprehensive list of state-of-the-art computer vision models for canopy height estimation. The dataset and associated codes can be accessed at https://github.com/fajwel/Open-Canopy.
Submitted 12 July, 2024;
originally announced July 2024.
-
Tail-robust factor modelling of vector and tensor time series in high dimensions
Authors:
Matteo Barigozzi,
Haeran Cho,
Hyeyoung Maeng
Abstract:
We study the problem of factor modelling vector- and tensor-valued time series in the presence of heavy tails in the data, which produce anomalous observations with non-negligible probability. For this, we propose to combine a two-step procedure with data truncation, which is easy to implement and does not require iteratively searching for a numerical solution. Departing from the light-tail assumptions often adopted in the time series factor modelling literature, we derive the theoretical properties of the proposed estimators while only assuming the existence of the $(2 + 2\epsilon)$-th moment for some $\epsilon \in (0, 1)$, fully characterising the effect of heavy tails on the rates of estimation as well as the level of truncation. Numerical experiments on simulated datasets demonstrate the good performance of the proposed estimator, which is further supported by applications to two macroeconomic datasets.
Submitted 12 July, 2024;
originally announced July 2024.
-
GAVEL: Generating Games Via Evolution and Language Models
Authors:
Graham Todd,
Alexander Padula,
Matthew Stephenson,
Éric Piette,
Dennis J. N. J. Soemers,
Julian Togelius
Abstract:
Automatically generating novel and interesting games is a complex task. Challenges include representing game rules in a computationally workable form, searching through the large space of potential games under most such representations, and accurately evaluating the originality and quality of previously unseen games. Prior work in automated game generation has largely focused on relatively restricted rule representations and relied on domain-specific heuristics. In this work, we explore the generation of novel games in the comparatively expansive Ludii game description language, which encodes the rules of over 1000 board games in a variety of styles and modes of play. We draw inspiration from recent advances in large language models and evolutionary computation in order to train a model that intelligently mutates and recombines games and mechanics expressed as code. We demonstrate both quantitatively and qualitatively that our approach is capable of generating new and interesting games, including in regions of the potential rules space not covered by existing games in the Ludii dataset. A sample of the generated games is available to play online through the Ludii portal.
Submitted 12 July, 2024;
originally announced July 2024.
-
Radiance Fields from Photons
Authors:
Sacha Jungerman,
Mohit Gupta
Abstract:
Neural radiance fields, or NeRFs, have become the de facto approach for high-quality view synthesis from a collection of images captured from multiple viewpoints. However, many issues remain when capturing images in-the-wild under challenging conditions, such as low light, high dynamic range, or rapid motion leading to smeared reconstructions with noticeable artifacts. In this work, we introduce quanta radiance fields, a novel class of neural radiance fields that are trained at the granularity of individual photons using single-photon cameras (SPCs). We develop theory and practical computational techniques for building radiance fields and estimating dense camera poses from unconventional, stochastic, and high-speed binary frame sequences captured by SPCs. We demonstrate, both via simulations and a SPC hardware prototype, high-fidelity reconstructions under high-speed motion, in low light, and for extreme dynamic range settings.
Submitted 12 July, 2024;
originally announced July 2024.
-
SynCOM: An Empirical Model for High-Resolution Simulations of Transient Solar Wind Flows
Authors:
Valmir P. Moraes Filho,
Vadim M. Uritsky,
Barbara J. Thompson,
Sarah E. Gibson,
Craig E. DeForest
Abstract:
The Synthetic Corona Outflow Model (SynCOM), an empirical model, simulates the solar corona's dynamics to match high-resolution observations, providing a useful resource for testing velocity measurement algorithms. SynCOM generates synthetic images depicting radial variability in polarized brightness and includes stochastic elements for plasma outflows and instrumental noise. It employs a predefined flow velocity probability distribution and an adjustable signal-to-noise ratio to evaluate different data analysis methods for coronal flows. By adjusting parameters to match specific coronal and instrumental conditions, SynCOM offers a platform to assess these methods for determining coronal velocity and acceleration. Validating these measurements would help to understand solar wind origins and support missions such as the Polarimeter to Unify the Corona and Heliosphere (PUNCH). In this study, we demonstrate how SynCOM can be employed to assess the precision and performance of two different flow tracking methods. By providing a ground-truth based on observational data, we highlight the importance of SynCOM in confirming observational standards for detecting coronal flows.
Submitted 12 July, 2024;
originally announced July 2024.