-
Improved online load balancing with known makespan
Authors:
Martin Böhm,
Matej Lieskovský,
Sören Schmitt,
Jiří Sgall,
Rob van Stee
Abstract:
We break the barrier of $3/2$ for the problem of online load balancing with known makespan, also known as bin stretching. In this problem, $m$ identical machines and the optimal makespan are given. The load of a machine is the total size of all the jobs assigned to it and the makespan is the maximum load of all the machines. Jobs arrive online and the goal is to assign each job to a machine while staying within a small factor (the competitive ratio) of the optimal makespan. We present an algorithm that maintains a competitive ratio of $139/93<1.495$ for sufficiently large values of $m$, improving the previous bound of $3/2$. The value 3/2 represents a natural bound for this problem: as long as the online bins are of size at least $3/2$ of the offline bin, all items that fit at least two times in an offline bin have two nice properties. They fit three times in an online bin and a single such item can be packed together with an item of any size in an online bin. These properties are now both lost, which means that putting even one job on a wrong machine can leave some job unassigned at the end. It also makes it harder to determine good thresholds for the item types. This was one of the main technical issues in getting below $3/2$. The analysis consists of an intricate mixture of size and weight arguments.
Submitted 11 July, 2024;
originally announced July 2024.
-
Photometric and Spectroscopic study of Ten Low Mass Ratio Contact Binary Systems: Orbital Stability, O'Connell Effect and Infra-red Calcium Line Filling
Authors:
Surjit S. Wadhwa,
Adam Popowicz,
Raul Michel,
Petar Kostic,
Oliver Vince,
Nick F. H. Tothill,
Ain Y. De Horta,
Miroslav D. Filipovic
Abstract:
Low mass ratio contact binary systems are more likely to have unstable orbits and potentially merge. In addition, such systems exhibit characteristics such as starspots and high energy emissions (UV) suggestive of chromospheric and magnetic activity. Light curve modelling of ten contact binary systems is reported. All were found to be of extremely low mass ratio ranging from 0.122 to 0.24 and three were found to be potentially unstable and possible merger candidates. Filling of the infrared Calcium absorption lines is a marker of increased chromospheric activity. We use the available LAMOST spectra along with matched standard spectra (broadened for rotation) to measure the excess filling of the central core depression flux of the two main infrared Calcium absorption lines at 8542 and 8662 angstroms. We find that all reported contact binaries have excess filling of the core flux in the infrared Calcium lines. Three of the systems reported were also observed by the GALEX mission and we find that all three have features of excess ultraviolet emissions, further adding evidence for increased chromospheric activity in low mass ratio contact binaries. Analysis of both orbital stability and absorption line filling is dependent on the determination of geometric and absolute parameters from light curve modelling. Not an insignificant number of contact binary light curves exhibit the O'Connell effect, usually attributed to starspots. We discuss the inclusion of starspots in light curve solutions and how they influence the geometric and absolute parameters.
Submitted 11 July, 2024;
originally announced July 2024.
-
Spine Vision X-Ray Image based GUI Planning of Pedicle Screws Using Enhanced YOLOv5 for Vertebrae Segmentation
Authors:
Yashwanth Rao,
Gaurisankar S,
Durga R,
Aparna Purayath,
Vivek Maik,
Manojkumar Lakshmanan,
Mohanasankar Sivaprakasm
Abstract:
In this paper, we propose an innovative Graphical User Interface (GUI) aimed at improving preoperative planning and intra-operative guidance for precise spinal screw placement through vertebrae segmentation. The methodology encompasses both front-end and back-end computations. The front end comprises a GUI that allows surgeons to precisely adjust the placement of screws on X-Ray images, thereby improving the simulation of surgical screw insertion in the patient's spine. On the other hand, the back-end processing involves several steps, including acquiring spinal X-ray images, performing pre-processing techniques to reduce noise, and training a neural network model to achieve real-time segmentation of the vertebrae. The integration of vertebral segmentation in the GUI ensures precise screw placement, reducing complications like nerve injury and ultimately improving surgical outcomes. The Spine-Vision provides a comprehensive solution with innovative features like synchronous AP-LP planning, accurate screw positioning via vertebrae segmentation, effective screw visualization, and dynamic position adjustments. This X-ray image-based GUI workflow emerges as a valuable tool, enhancing precision and safety in spinal screw placement and planning procedures.
Submitted 11 July, 2024;
originally announced July 2024.
-
Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On
Authors:
Liang Zeng,
Liangjun Zhong,
Liang Zhao,
Tianwen Wei,
Liu Yang,
Jujie He,
Cheng Cheng,
Rui Hu,
Yang Liu,
Shuicheng Yan,
Han Fang,
Yahui Zhou
Abstract:
In this paper, we investigate the underlying factors that potentially enhance the mathematical reasoning capabilities of large language models (LLMs). We argue that the data scaling law for math reasoning capabilities in modern LLMs is far from being saturated, highlighting how the model's quality improves with increases in data quantity. To support this claim, we introduce the Skywork-Math model series, supervised fine-tuned (SFT) on common 7B LLMs using our proposed 2.5M-instance Skywork-MathQA dataset. Skywork-Math 7B has achieved impressive accuracies of 51.2% on the competition-level MATH benchmark and 83.9% on the GSM8K benchmark using only SFT data, outperforming an early version of GPT-4 on MATH. The superior performance of Skywork-Math models is attributed to our novel two-stage data synthesis and model SFT pipelines, which include three different augmentation methods and a diverse seed problem set, ensuring both the quantity and quality of the Skywork-MathQA dataset across varying difficulty levels. Most importantly, we provide several practical takeaways to enhance math reasoning abilities in LLMs for both research and industry applications.
Submitted 11 July, 2024;
originally announced July 2024.
-
GUI-based Pedicle Screw Planning on Fluoroscopic Images Utilizing Vertebral Segmentation
Authors:
Vivek Maik,
Aparna Purayath,
Durga R,
Manojkumar Lakshmanan,
Mohanasankar Sivaprakasm
Abstract:
The proposed work establishes a novel Graphical User Interface (GUI) framework, primarily designed for intraoperative pedicle screw planning. The current planning workflow in Image Guided Surgeries primarily relies on pre-operative CT planning. Intraoperative CT planning can be time-consuming and expensive and thus is not a common practice. In situations where efficiency and cost-effectiveness are paramount, planning that utilizes fluoroscopic images acquired for image registration emerges as the optimal choice. The methodology proposed in this study employs a simulated 3D pedicle screw to calculate its coronal and sagittal projections for pedicle screw planning using anterior-posterior (AP) and lateral (LP) images. The initialization and placement of the pedicle screw are computed by utilizing the bounding box of vertebral segmentation, which is obtained by the application of enhanced YOLOv5. The GUI front end includes functionality that allows surgeons or medical practitioners to efficiently choose, set up, and dynamically maneuver the pedicle screw on AP and LP images. This is based on a novel feature called synchronous planning, which involves correlating pedicle screws from the coronal and sagittal planes. This correlation utilizes projective correspondence to ensure that any movement of the pedicle screw in either the AP or LP image will be reflected in the other image. The proposed GUI framework is a time-efficient and cost-effective tool for synchronizing and planning the movement of pedicle screws during intraoperative surgical procedures.
Submitted 11 July, 2024;
originally announced July 2024.
-
Polynomial tail solutions of the non-cutoff Boltzmann equation near local Maxwellians
Authors:
Renjun Duan,
Zongguang Li
Abstract:
This paper aims to incorporate Caflisch's decomposition into the macro-micro decomposition in Boltzmann theory, allowing the microscopic component to exhibit only a polynomial tail in large velocities. In particular, we treat the Cauchy problem on the non-cutoff Boltzmann equation under the compressible Euler scaling in the case of the three-dimensional whole space. Up to a finite time we construct the Boltzmann solution around a local Maxwellian corresponding to small-amplitude classical solutions of the full compressible Euler system around constant states. We design a new energy functional which can capture the convergence rate in the small Knudsen number $\varepsilon$ and allow the microscopic part of solutions to decay polynomially in large velocities. Moreover, the energy norm of perturbations can be of the order $\varepsilon^{1/2}$, which the usual method of Hilbert expansion fails to obtain. As a byproduct of the proof, our estimates immediately yield a global-in-time existence result when the Euler solutions are taken to be constant states.
Submitted 11 July, 2024;
originally announced July 2024.
-
Quantum Thermodynamic Integrability for Foundations of Statistical Physics
Authors:
Ruo-Xun Zhai,
C. P. Sun
Abstract:
We extend the Carathéodory principle of the Second Law to quantum thermodynamics with energy levels depending on macroscopic variables, such as volume and magnetic field. This extension introduces the concept of Quantum Thermodynamic Integrability (QTI), offering an alternative foundation for statistical mechanics. QTI is characterized by the path-independence of work and heat within the thermodynamic manifold, which is locally described by energy levels and specific thermodynamic parameters. Within this framework, temperature naturally emerges as an integrating factor, allowing for the derivation of both canonical and non-canonical distributions from the Entropy Integrable Equations (EIE) based on QTI. Notably, non-canonical states, which become particularly significant outside the thermodynamic limit, reveal the existence of informational correlations in finite-size thermodynamic systems.
Submitted 11 July, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Many wrong models approach to localize an odor source in turbulence: introducing the weighted Bayesian update
Authors:
Lorenzo Piro,
Robin A. Heinonen,
Massimo Cencini,
Luca Biferale
Abstract:
The problem of locating an odor source in turbulent environments is central to key applications such as environmental monitoring and disaster response. We address this challenge by designing an algorithm based on Bayesian inference, which uses odor measurements from an ensemble of static sensors to estimate the source position through a stochastic model of the environment. Given the practical impossibility of achieving a fully consistent turbulent model and guaranteeing convergence to the correct solution, we propose a method to rank 'many wrong models' and to blend their predictions. We evaluate our weighted Bayesian update algorithm by its ability to estimate the source location with predefined accuracy and/or within a specified time frame, and compare it to standard Monte Carlo sampling methods. To demonstrate the robustness and potential applications of both approaches under realistic environmental conditions, we use high-quality direct numerical simulations of the Navier-Stokes equations to mimic the transport of odors in the atmospheric boundary layer. Despite minimal prior information about the source and environmental conditions, our proposed approach consistently proves to be more accurate, reliable, and robust than Monte Carlo methods, thus showing promise as a new tool for addressing the odor source localization problem in real-world scenarios.
Submitted 11 July, 2024;
originally announced July 2024.
-
Contact operators in renormalization of attractive singular potentials
Authors:
Rui Peng,
Bingwei Long,
Fu-Rong Xu
Abstract:
We discuss renormalization of chiral nuclear forces in the 3P0 channel of NN scattering at next-to-next-to-leading order (N2LO) if the one-pion exchange is treated nonperturbatively at leading order. The matrix elements of the subleading contact potentials become nearly dependent on each other for the so-called exceptional ultraviolet momentum cutoff, making it difficult to determine the strengths of those contact potentials from the empirical phase shifts, as reported in Ref. [1]. We argue that this issue can be resolved by adjusting the strategy by which the low-energy constants are deduced from the data, thus making those exceptional cutoffs amenable to chiral effective field theory.
Submitted 11 July, 2024;
originally announced July 2024.
-
Constructively describing orbit spaces of finite groups by few inequalities
Authors:
Philippe Moustrou,
Cordian Riener,
Robin Schabert
Abstract:
Let $G$ be a finite group acting linearly on $\mathbb{R}^n$. A celebrated Theorem of Procesi and Schwarz gives an explicit description of the orbit space $\mathbb{R}^n /\!/G$ as a basic closed semi-algebraic set. We give a new proof of this statement and another description as a basic closed semi-algebraic set using elementary tools from real algebraic geometry. Bröcker was able to show that the number of inequalities needed to describe the orbit space generically depends only on the group $G$. Here, we construct such inequalities explicitly for abelian groups and in the case where only one inequality is needed. Furthermore, we answer an open question raised by Bröcker concerning the genericity of his result.
Submitted 11 July, 2024;
originally announced July 2024.
-
SR-Mamba: Effective Surgical Phase Recognition with State Space Model
Authors:
Rui Cao,
Jiangliu Wang,
Yun-Hui Liu
Abstract:
Surgical phase recognition is crucial for enhancing the efficiency and safety of computer-assisted interventions. One of the fundamental challenges involves modeling the long-distance temporal relationships present in surgical videos. Inspired by the recent success of Mamba, a state space model with linear scalability in sequence length, this paper presents SR-Mamba, a novel attention-free model specifically tailored to meet the challenges of surgical phase recognition. In SR-Mamba, we leverage a bidirectional Mamba decoder to effectively model the temporal context in overlong sequences. Moreover, the efficient optimization of the proposed Mamba decoder facilitates single-step neural network training, eliminating the need for separate training steps as in previous works. This single-step training approach not only simplifies the training process but also ensures higher accuracy, even with a lighter spatial feature extractor. Our SR-Mamba establishes a new benchmark in surgical video analysis by demonstrating state-of-the-art performance on the Cholec80 and CATARACTS Challenge datasets. The code is accessible at https://github.com/rcao-hk/SR-Mamba.
Submitted 11 July, 2024;
originally announced July 2024.
-
Shot noise in Aharonov-Bohm interferometers: Comparison of helical and conventional setups
Authors:
R. A. Niyazov,
I. V. Krainov,
D. N. Aristov,
V. Yu. Kachorovskii
Abstract:
We study the shot noise of current through the edge states of a two-dimensional topological insulator placed in a magnetic field and compare the obtained results with the shot noise in a conventional single-channel spinless Aharonov-Bohm interferometer. We find general formulas for the Fano factors of these setups, assuming that the temperature exceeds the level spacing in the system. We demonstrate that in both the helical and the conventional case the interference effects dramatically change the Fano factor and its magnetic field dependence. For weak tunneling coupling with the leads, the Fano factors of both setups exhibit a periodic series of sharp Aharonov-Bohm peaks with variation of the magnetic flux piercing the system. Our key finding is that the Fano factor in the helical interferometer provides information about the presence of backscattering defects violating topological protection. In particular, the amplitude of the Aharonov-Bohm peaks in the helical setup is proportional to the strength of the defect, in contrast to the conventional setup, where the peaks have finite amplitude even in the ballistic case.
Submitted 11 July, 2024;
originally announced July 2024.
-
Anisotropic Thermal Transport in Tunable Self-Assembled Nanocrystal Supercrystals
Authors:
Matias Feldman,
Charles Vernier,
Rahul Nag,
Juan Barrios,
Sébastien Royer,
Hervé Cruguel,
Emmanuelle Lacaze,
Emmanuel Lhuillier,
Danièle Fournier,
Florian Schulz,
Cyrille Hamon,
Hervé Portalès,
James K. Utterback
Abstract:
Realizing tunable functional materials with built-in nanoscale heat flow directionality represents a significant challenge with the potential to enable novel thermal management strategies. Here we use spatiotemporally-resolved thermoreflectance to visualize lateral thermal transport anisotropy in self-assembled supercrystals of anisotropic Au nanocrystals. Correlative electron and thermoreflectance microscopy reveal that heat predominantly flows along the long-axis of the anisotropic nanocrystals, and does so across grain boundaries and curved assemblies while voids disrupt heat flow. We finely control the anisotropy via the aspect ratio of constituent nanorods, and it exceeds the aspect ratio for nano-bipyramid supercrystals and certain nanorod arrangements. Finite element simulations and effective medium modeling rationalize the emergent anisotropic behavior in terms of a simple series resistance model, further providing a framework for estimating thermal anisotropy as a function of material and structural parameters. Self-assembly of colloidal nanocrystals promises a novel route to direct heat flow in a wide range of applications that utilize this important class of materials.
Submitted 11 July, 2024;
originally announced July 2024.
-
A Cantor-Kantorovich Metric Between Markov Decision Processes with Application to Transfer Learning
Authors:
Adrien Banse,
Venkatraman Renganathan,
Raphaël M. Jungers
Abstract:
We extend the notion of Cantor-Kantorovich distance between Markov chains introduced by Banse et al. (2023) to the context of Markov Decision Processes (MDPs). The proposed metric is well-defined and can be efficiently approximated given a finite horizon. Then, we provide numerical evidence that the latter metric can lead to interesting applications in the field of reinforcement learning. In particular, we show that it could be used for forecasting the performance of transfer learning algorithms.
Submitted 11 July, 2024;
originally announced July 2024.
-
Leveraging GPT for the Generation of Multi-Platform Social Media Datasets for Research
Authors:
Henry Tari,
Danial Khan,
Justus Rutten,
Darian Othman,
Rishabh Kaushal,
Thales Bertaglia,
Adriana Iamnitchi
Abstract:
Social media datasets are essential for research on disinformation, influence operations, social sensing, hate speech detection, cyberbullying, and other significant topics. However, access to these datasets is often restricted due to costs and platform regulations. As such, acquiring datasets that span multiple platforms, which are crucial for a comprehensive understanding of the digital ecosystem, is particularly challenging. This paper explores the potential of large language models to create lexically and semantically relevant social media datasets across multiple platforms, aiming to match the quality of real datasets. We employ ChatGPT to generate synthetic data from two real datasets, each consisting of posts from three different social media platforms. We assess the lexical and semantic properties of the synthetic data and compare them with those of the real data. Our empirical findings suggest that using large language models to generate synthetic multi-platform social media data is promising. However, further enhancements are necessary to improve the fidelity of the outputs.
Submitted 11 July, 2024;
originally announced July 2024.
-
Neutral hydrogen lensing simulations in the Hubble Frontier Fields
Authors:
Tariq Blecher,
Roger Deane,
Danail Obreschkow,
Ian Heywood
Abstract:
Cold gas evolution ties the formation of dark matter halos to the star formation history of the universe. A primary component of cold gas, neutral atomic hydrogen (HI), can be traced by its 21-cm emission line. However, the faintness of this emission typically limits individual detections to low redshifts ($z\lesssim 0.2$). To address this limitation, we investigate the potential of targeting gravitationally lensed systems. Building on our prior galaxy-galaxy simulations, we have developed a ray-tracing code to simulate lensed HI images for known galaxies situated behind the massive Hubble Frontier Field galaxy clusters. Our findings reveal the existence of high HI mass, high HI magnification systems in these cluster lensing scenarios. Through simulations of hundreds of sources, we have identified compelling targets within the redshift range $z\approx 0.7 - 1.5$. The most promising candidate from our simulations is the Great Arc at $z=0.725$ in Abell 370, which should be detectable by MeerKAT in approximately 50 hours. Importantly, the derived HI mass is predicted to be relatively insensitive to systematic uncertainties in the lensing model, and should be constrained within a factor of $\sim 2.5$ for a 95 per cent confidence interval.
Submitted 11 July, 2024;
originally announced July 2024.
-
Telescope control software and proto-model siderostat for the SDSS-V Local Volume Mapper
Authors:
Hojae Ahn,
Florian Briegel,
Jimin Han,
Mingyu Jeon,
Thomas M. Herbst,
Sumin Lee,
Woojin Park,
Sunwoo Lee,
Inhwan Jung,
Tae-Geun Ji,
Changgon Kim,
Geon Hee Kim,
Wolfgang Gaessler,
Markus Kuhlberg,
Hyun Chul Park,
Soojong Pak,
Nicholas P. Konidaris,
Niv Drory,
José R. Sánchez-Gallego,
Cynthia S. Froning,
Solange Ramirez,
Juna A. Kollmeier
Abstract:
The fifth Sloan Digital Sky Survey (SDSS-V) Local Volume Mapper (LVM) is a wide-field integral field unit (IFU) survey that uses an array of four 160 mm fixed telescopes with siderostats to minimize the number of moving parts. Each telescope observes the science field or calibration field independently and is synchronized with the science exposure. We developed the LVM Acquisition and Guiding Package (LVMAGP), a telescope control software program optimized for LVM observations, which can simultaneously control four focusers, three K-mirrors, one fiber selector, four mounts (siderostats), and seven guide cameras. This software is built on a hierarchical architecture and the SDSS framework and provides three key sequences: autofocus, field acquisition, and autoguide. We designed and fabricated a proto-model siderostat to test the telescope pointing model and LVMAGP software. The mirrors of the proto-model were designed as an isogrid open-back type, which reduced the weight by 46% and enabled reaching thermal equilibrium quickly. Additionally, deflection due to bolting torque, self-gravity, and thermal deformation was simulated, and the maximum scatter of the pointing model induced by the tilt of optomechanics was predicted to be $4'.4$, which can be compensated for by the field acquisition sequence. We performed a real sky test of LVMAGP with the proto-model siderostat and obtained field acquisition and autoguide accuracies of $0''.38$ and $1''.5$, respectively. It met all requirements except for the autoguide specification, which will be resolved by more precise alignment among the hardware components at Las Campanas Observatory.
Submitted 11 July, 2024;
originally announced July 2024.
-
Inference procedures in sequential trial emulation with survival outcomes: comparing confidence intervals based on the sandwich variance estimator, bootstrap and jackknife
Authors:
Juliette M. Limozin,
Shaun R. Seaman,
Li Su
Abstract:
Sequential trial emulation (STE) is an approach to estimating causal treatment effects by emulating a sequence of target trials from observational data. In STE, inverse probability weighting is commonly utilised to address time-varying confounding and/or dependent censoring. Then structural models for potential outcomes are applied to the weighted data to estimate treatment effects. For inference, the simple sandwich variance estimator is popular but conservative, while nonparametric bootstrap is computationally expensive, and a more efficient alternative, linearised estimating function (LEF) bootstrap, has not been adapted to STE. We evaluated the performance of various methods for constructing confidence intervals (CIs) of marginal risk differences in STE with survival outcomes by comparing the coverage of CIs based on nonparametric/LEF bootstrap, jackknife, and the sandwich variance estimator through simulations. LEF bootstrap CIs demonstrated the best coverage with small/moderate sample sizes, low event rates and low treatment prevalence, which were the motivating scenarios for STE. They were less affected by treatment group imbalance and faster to compute than nonparametric bootstrap CIs. With large sample sizes and medium/high event rates, the sandwich-variance-estimator-based CIs had the best coverage and were the fastest to compute. These findings offer guidance in constructing CIs in causal survival analysis using STE.
Submitted 12 July, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Robust quantum engineering of current flow in carbon nanostructures at room temperature
Authors:
Gaetano Calogero,
Isaac Alcón,
Onurcan Kaya,
Nick Papior,
Aron W. Cummings,
Mads Brandbyge,
Stephan Roche
Abstract:
Bottom-up on-surface synthesis enables the fabrication of carbon nanostructures with atomic precision. Good examples are graphene nanoribbons (GNRs), 1D conjugated polymers, and nanoporous graphenes (NPGs), which are gathering increasing attention for future carbon nanoelectronics. A key step is the ability to manipulate current flow within these nanomaterials. Destructive quantum interference (QI), long studied in the field of single-molecule electronics, has been proposed as the most effective way to achieve such control with molecular-scale precision. However, for practical applications, it is essential that such QI-engineering remains effective near or above room temperature. To assess this important point, here we combine large-scale molecular dynamics simulations and quantum transport calculations and focus our study on NPGs formed as arrays of laterally bonded GNRs. By considering various NPGs with different inter-GNR chemical connections we disentangle the different factors determining electronic transport in these carbon nanomaterials at 300 K. Our findings unequivocally demonstrate that QI survives at room temperature, with thermal vibrations weakly restricting current flow along GNRs while completely blocking transport across GNRs. Our results thus pave the way towards the future realization of QI-engineered carbon nanocircuitry operating at room temperature, which is a fundamental step towards carbon-based nanoelectronics and quantum technologies.
Submitted 11 July, 2024;
originally announced July 2024.
-
Characterizing a class of accelerating wormholes with periodic potential
Authors:
Soham Chatterjee,
Sagnik Roy,
Ratna Koley
Abstract:
The newly discovered Wormhole C--metric is a solution of Einstein's field equation coupled with a phantom scalar field which describes the accelerated wormholes. In the zero acceleration limit the solution reduces to an asymptotically flat wormhole. For a certain range of parameter space this solution doesn't possess any horizon, thus making it a viable wormhole candidate. To completely unveil this property we have studied the topological properties of this spacetime and shown that the throat is marginally connected. In the aforementioned range of parameters, the spacetime doesn't possess any photon orbit, confirming the absence of a shadow. We further analysed the stability of this spacetime under scalar perturbation. Under the usual boundary conditions (outgoing waves at both spatial infinities) there exists a continuous spectrum. On the contrary, one may achieve the quantization of the modes by exploiting a different but physically intuitive boundary condition. The lowest lying mode behaves as a normal mode, and the imaginary part comes into play for the modes corresponding to the first overtone number $(n=1)$, marking the onset of quasi-normal modes for all azimuthal quantum numbers $L$. We have also argued that the spacetime has a tendency to hold the excitation in it due to the external perturbation, rather than a fast de-excitation.
Submitted 11 July, 2024;
originally announced July 2024.
-
Bloch functions with wild boundary behaviour in $\mathbb{C}^N$
Authors:
Stéphane Charpentier,
Nicolas Espoullier,
Rachid Zarouf
Abstract:
We prove the existence of functions $f$ in the Bloch space of the unit ball $\mathbb{B}_N$ of $\mathbb{C}^N$ with the property that, given any measurable function $\varphi$ on the unit sphere $\mathbb{S}_N$, there exists a sequence $(r_n)_n$, $r_n\in (0,1)$, converging to $1$, such that for every $w\in \mathbb{B}_N$, $$f(r_n(ζ-w)+w) \to \varphi(ζ)\text{ as }n\to \infty\text{, for almost every }ζ\in \mathbb{S}_N.$$ The set of such functions is residual in the little Bloch space. A similar result is obtained for the Bloch space of the polydisc.
Submitted 11 July, 2024;
originally announced July 2024.
-
Nonlocal Locking of Observable Quantities: A Faithful Signature of Nonclassical Correlations
Authors:
Mir Alimuddin,
Snehasish Roy Chowdhury,
Ram Krishna Patra,
Subhendu B. Ghosh,
Tommaso Tufarelli,
Gerardo Adesso,
Manik Banik
Abstract:
Nonclassicality in composite quantum systems depicts several puzzling manifestations, with Einstein-Podolsky-Rosen entanglement, Schrödinger steering, and Bell nonlocality being the most celebrated ones. In addition to those, an unentangled quantum state can also exhibit nonclassicality, as evidenced from notions such as quantum discord and work deficit. Here, we propose a general framework to investigate nonclassical correlations in multipartite quantum states. The distinct signatures left on observable quantities, depending on whether the sub-parts of a composite system are probed separately or jointly, provide an operational avenue to construct different quantifiers that faithfully capture signatures of nonclassicality in quantum states. Along the line we unveil an intriguing phenomenon referred to as `nonlocal locking of observable quantities', where the value of an observable quantity gets locked in the correlation of a nonclassical state. Our approach reduces the experimental demand for verification of nonclassicality in composite systems and can find applications for enhanced energy storage in quantum thermodynamical devices.
Submitted 11 July, 2024;
originally announced July 2024.
-
Approximation of topological singularities through free discontinuity functionals: the critical and super-critical regimes
Authors:
Vito Crismale,
Lucia De Luca,
Riccardo Scala
Abstract:
We further investigate the properties of an approach to topological singularities through free discontinuity functionals of Mumford-Shah type proposed in \cite{DLSVG}. We prove the variational equivalence between such energies, Ginzburg-Landau, and Core-Radius for anti-plane screw dislocations energies in dimension two, in the relevant energetic regimes $|\log \varepsilon|^a$, $a\geq 1$, where $\varepsilon$ denotes the linear size of the process zone near the defects.
Further, we remove the \emph{a priori} restrictive assumptions that the approximating order parameters have compact jump set. This is obtained by proving a new density result for $\mathbb S^1$-valued $SBV^p$ functions, approximated through functions with essentially closed jump set, in the strong $BV$ norm.
Submitted 11 July, 2024;
originally announced July 2024.
-
Explainability of Sub-Field Level Crop Yield Prediction using Remote Sensing
Authors:
Hiba Najjar,
Miro Miranda,
Marlon Nuske,
Ribana Roscher,
Andreas Dengel
Abstract:
Crop yield forecasting plays a significant role in addressing growing concerns about food security and guiding decision-making for policymakers and farmers. When deep learning is employed, understanding the learning and decision-making processes of the models, as well as their interaction with the input data, is crucial for establishing trust in the models and gaining insight into their reliability. In this study, we focus on the task of crop yield prediction, specifically for soybean, wheat, and rapeseed crops in Argentina, Uruguay, and Germany. Our goal is to develop and explain predictive models for these crops, using a large dataset of satellite images, additional data modalities, and crop yield maps. We employ a long short-term memory network and investigate the impact of using different temporal samplings of the satellite data and the benefit of adding more relevant modalities. For model explainability, we utilize feature attribution methods to quantify input feature contributions, identify critical growth stages, analyze yield variability at the field level, and explain less accurate predictions.
The modeling results show an improvement when adding more modalities or using all available instances of satellite data. The explainability results reveal distinct feature importance patterns for each crop and region. We further found that the most influential growth stages on the prediction are dependent on the temporal sampling of the input data. We demonstrated how these critical growth stages, which hold significant agronomic value, closely align with the existing literature in agronomy and crop development biology.
Submitted 11 July, 2024;
originally announced July 2024.
-
RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL
Authors:
Zhenhe Wu,
Zhongqiu Li,
Jie Zhang,
Mengxiang Li,
Yu Zhao,
Ruiyu Fang,
Zhongjiang He,
Xuelong Li,
Zhoujun Li,
Shuangyong Song
Abstract:
Large language models (LLMs) with in-context learning have significantly improved the performance of the text-to-SQL task. Previous works generally focus on using exclusive SQL generation prompts to improve the LLMs' reasoning ability. However, they mostly struggle to handle large databases with numerous tables and columns, and usually ignore the significance of pre-processing the database and extracting valuable information for more efficient prompt engineering. Based on the above analysis, we propose RB-SQL, a novel retrieval-based LLM framework for in-context prompt engineering, which consists of three modules that retrieve concise tables and columns as schema, and targeted examples for in-context learning. Experimental results demonstrate that our model achieves better performance than several competitive baselines on the public datasets BIRD and Spider.
Submitted 12 July, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Enhancing Thermal Infrared Tracking with Natural Language Modeling and Coordinate Sequence Generation
Authors:
Miao Yan,
** Zhang,
Haofei Zhang,
Ruqian Hao,
Juanxiu Liu,
Xiaoyang Wang,
Lin Liu
Abstract:
Thermal infrared tracking is an essential topic in computer vision tasks because of its advantage of all-weather imaging. However, most conventional methods utilize only hand-crafted features, while deep learning-based correlation filtering methods are limited by simple correlation operations. Transformer-based methods ignore temporal and coordinate information, which is critical for TIR tracking that lacks texture and color information. In this paper, to address these issues, we apply natural language modeling to TIR tracking and propose a novel model called NLMTrack, which enhances the utilization of coordinate and temporal information. NLMTrack applies an encoder that unifies feature extraction and feature fusion, which simplifies the TIR tracking pipeline. To address the challenge of low detail and low contrast in TIR images, on the one hand, we design a multi-level progressive fusion module that enhances the semantic representation and incorporates multi-scale features. On the other hand, the decoder combines the TIR features and the coordinate sequence features using a causal transformer to generate the target sequence step by step. Moreover, we explore an adaptive loss aimed at elevating tracking accuracy and a simple template update strategy to accommodate the target's appearance variations. Experiments show that NLMTrack achieves state-of-the-art performance on multiple benchmarks. The Code is publicly available at \url{https://github.com/ELOESZHANG/NLMTrack}.
Submitted 11 July, 2024;
originally announced July 2024.
-
Verificarlo CI: continuous integration for numerical optimization and debugging
Authors:
Aurélien Delval,
François Coppens,
Eric Petit,
Roman Iakymchuk,
Pablo de Oliveira Castro
Abstract:
Floating-point accuracy is an important concern when developing numerical simulations or other compute-intensive codes. Tracking the introduction of numerical regressions is often delayed until they provoke an unexpected bug for the end-user. In this paper, we introduce Verificarlo CI, a continuous integration workflow for the numerical optimization and debugging of a code over the course of its development. We demonstrate the applicability of Verificarlo CI on two test-case applications.
Submitted 11 July, 2024;
originally announced July 2024.
-
The OPNV Data Collection: A Dataset for Infrastructure-Supported Perception Research with Focus on Public Transportation
Authors:
Marcel Vosshans,
Alexander Baumann,
Matthias Drueppel,
Omar Ait-Aider,
Ralf Woerner,
Youcef Mezouar,
Thao Dang,
Markus Enzweiler
Abstract:
In this paper, we present our vision and ongoing work for a novel dataset designed to advance research into the interoperability of intelligent vehicles and infrastructure, specifically aimed at enhancing cooperative perception and interaction in the realm of public transportation. Unlike conventional datasets centered on ego-vehicle data, this approach encompasses both a stationary sensor tower and a moving vehicle, each equipped with cameras, LiDARs, and GNSS, while the vehicle additionally includes an inertial navigation system. Our setup features comprehensive calibration and time synchronization, ensuring seamless and accurate sensor data fusion crucial for studying complex, dynamic scenes. Emphasizing public transportation, the dataset aims to include scenes like bus station maneuvers and driving on dedicated bus lanes, reflecting the specifics of small public buses. We introduce the open-source ".4mse" file format for the new dataset, accompanied by a research kit. This kit provides tools such as ego-motion compensation or LiDAR-to-camera projection enabling advanced research on intelligent vehicle-infrastructure integration. Our approach does not include annotations; however, we plan to implement automatically generated labels sourced from state-of-the-art public repositories. Several aspects are still up for discussion, and timely feedback from the community would be greatly appreciated. A sneak preview on one data frame will be available at a Google Colab Notebook. Moreover, we will use the related GitHub Repository to collect remarks and suggestions.
Submitted 11 July, 2024;
originally announced July 2024.
-
SALSA: Swift Adaptive Lightweight Self-Attention for Enhanced LiDAR Place Recognition
Authors:
Raktim Gautam Goswami,
Naman Patel,
Prashanth Krishnamurthy,
Farshad Khorrami
Abstract:
Large-scale LiDAR mappings and localization leverage place recognition techniques to mitigate odometry drifts, ensuring accurate mapping. These techniques utilize scene representations from LiDAR point clouds to identify previously visited sites within a database. Local descriptors, assigned to each point within a point cloud, are aggregated to form a scene representation for the point cloud. These descriptors are also used to re-rank the retrieved point clouds based on geometric fitness scores. We propose SALSA, a novel, lightweight, and efficient framework for LiDAR place recognition. It consists of a Sphereformer backbone that uses radial window attention to enable information aggregation for sparse distant points, an adaptive self-attention layer to pool local descriptors into tokens, and a multi-layer-perceptron Mixer layer for aggregating the tokens to generate a scene descriptor. The proposed framework outperforms existing methods on various LiDAR place recognition datasets in terms of both retrieval and metric localization while operating in real-time.
Submitted 11 July, 2024;
originally announced July 2024.
-
GeNet: A Multimodal LLM-Based Co-Pilot for Network Topology and Configuration
Authors:
Beni Ifland,
Elad Duani,
Rubin Krief,
Miro Ohana,
Aviram Zilberman,
Andres Murillo,
Ofir Manor,
Ortal Lavi,
Hikichi Kenji,
Asaf Shabtai,
Yuval Elovici,
Rami Puzis
Abstract:
Communication network engineering in enterprise environments is traditionally a complex, time-consuming, and error-prone manual process. Most research on network engineering automation has concentrated on configuration synthesis, often overlooking changes in the physical network topology. This paper introduces GeNet, a multimodal co-pilot for enterprise network engineers. GeNet is a novel framework that leverages a large language model (LLM) to streamline network design workflows. It uses visual and textual modalities to interpret and update network topologies and device configurations based on user intents. GeNet was evaluated on enterprise network scenarios adapted from Cisco certification exercises. Our results demonstrate GeNet's ability to interpret network topology images accurately, potentially reducing network engineers' efforts and accelerating network design processes in enterprise environments. Furthermore, we show the importance of precise topology understanding when handling intents that require modifications to the network's topology.
Submitted 11 July, 2024;
originally announced July 2024.
-
Integrated User Matching and Pricing in Round-Trip Car-Sharing
Authors:
Avalpreet Singh Brar,
Rong Su,
Gioele Zardini,
Jaskaranveer Kaur
Abstract:
Traditional round-trip car rental systems mandate users to return vehicles to their point of origin, limiting the system's adaptability to meet diverse mobility demands. This constraint often leads to fleet under-utilization and incurs high parking costs for idle vehicles. To address this inefficiency, we propose an N-user matching algorithm which is designed to facilitate one-way trips within the round-trip rental framework. Our algorithm addresses the joint problem of optimal pricing and user matching through a Two-Stage Integer Linear Programming (ILP)-based formulation. In the first stage, optimal rental prices are determined by setting a risk factor that governs the likelihood of matching a set of N users. The second stage involves maximizing expected profit through a novel ILP-based user-matching formulation. Testing our algorithm on real-world scenarios demonstrates an approximate 35% increase in demand fulfillment. Additionally, we assess the model's robustness under uncertainty by varying factors such as the risk factor (probability of user ride acceptance at the offered price), cost factor (rental cost-to-fare ratio), and maximum chain length.
Submitted 11 July, 2024;
originally announced July 2024.
-
SwishReLU: A Unified Approach to Activation Functions for Enhanced Deep Neural Networks Performance
Authors:
Jamshaid Ul Rahman,
Rubiqa Zulfiqar,
Asad Khan,
Nimra
Abstract:
ReLU, a commonly used activation function in deep neural networks, is prone to the issue of "Dying ReLU". Several enhanced versions, such as ELU, SeLU, and Swish, have been introduced and are considered to be less commonly utilized. However, replacing ReLU can be somewhat challenging due to its inconsistent advantages. While Swish offers a smoother transition similar to ReLU, its utilization generally incurs a greater computational burden compared to ReLU. This paper proposes SwishReLU, a novel activation function combining elements of ReLU and Swish. Our findings reveal that SwishReLU outperforms ReLU in performance with a lower computational cost than Swish. This paper undertakes an examination and comparison of different types of ReLU variants with SwishReLU. Specifically, we compare ELU and SeLU along with Tanh on three datasets: CIFAR-10, CIFAR-100 and MNIST. Notably, applying SwishReLU in the VGG16 model described in Algorithm 2 yields a 6% accuracy improvement on the CIFAR-10 dataset.
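For readers unfamiliar with the two baseline activations named above, the following minimal sketch gives their standard scalar definitions: ReLU as max(0, x) and Swish as x times the logistic sigmoid of beta times x. The abstract does not state the formula for SwishReLU itself, so it is deliberately not reproduced here; the function names and the `beta` parameter default are illustrative assumptions.

```python
import math


def sigmoid(x):
    """Numerically stable logistic sigmoid for a scalar input."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    """ReLU: passes positive inputs through and zeroes out the rest."""
    return max(0.0, x)


def swish(x, beta=1.0):
    """Swish: x * sigmoid(beta * x); smooth, but needs an exponential per call."""
    return x * sigmoid(beta * x)


if __name__ == "__main__":
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(value, relu(value), swish(value))
```

The exponential inside `sigmoid` is the extra work behind Swish's higher computational cost relative to ReLU, which needs only a comparison.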
Submitted 11 July, 2024;
originally announced July 2024.
-
Wasserstein $k$-Centres Clustering for Distributional Data
Authors:
Ryo Okano,
Masaaki Imaizumi
Abstract:
We develop a novel clustering method for distributional data, where each data point is regarded as a probability distribution on the real line. For distributional data, it has been challenging to develop a clustering method that utilizes the mode of variation of data because the space of probability distributions lacks a vector space structure, preventing the application of existing methods for functional data. In this study, we propose a novel clustering method for distributional data on the real line, which takes account of difference in both the mean and mode of variation structures of clusters, in the spirit of the $k$-centres clustering approach proposed for functional data. Specifically, we consider the space of distributions equipped with the Wasserstein metric and define a geodesic mode of variation of distributional data using geodesic principal component analysis. Then, we utilize the geodesic mode of each cluster to predict the cluster membership of each distribution. We theoretically show the validity of the proposed clustering criterion by studying the probability of correct membership. Through a simulation study and real data application, we demonstrate that the proposed distributional clustering method can improve cluster quality compared to conventional clustering algorithms.
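The Wasserstein metric mentioned above has a simple closed form on the real line: for two empirical distributions with the same number of equally weighted samples, the p-Wasserstein distance is the p-th root of the mean of |x_(i) - y_(i)|^p over the sorted samples. The sketch below computes only this distance; it is not the authors' clustering procedure, and the function name `wasserstein_1d` and the equal-sample-size restriction are assumptions made for brevity.

```python
def wasserstein_1d(xs, ys, p=2):
    """p-Wasserstein distance between two equal-size empirical distributions on R.

    On the real line the optimal coupling matches order statistics, so the
    distance reduces to comparing the sorted samples element by element.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    n = len(xs_sorted)
    total = sum(abs(a - b) ** p for a, b in zip(xs_sorted, ys_sorted))
    return (total / n) ** (1.0 / p)


if __name__ == "__main__":
    # Shifting every sample by the same amount moves the distribution by that amount.
    print(wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```

Because the distance depends only on the sorted samples (the empirical quantile function), the order in which observations are listed does not matter.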
Submitted 11 July, 2024;
originally announced July 2024.
-
PINN-Ray: A Physics-Informed Neural Network to Model Soft Robotic Fin Ray Fingers
Authors:
Xing Wang,
Joel Janek Dabrowski,
Josh Pinskier,
Lois Liow,
Vinoth Viswanathan,
Richard Scalzo,
David Howard
Abstract:
Modelling complex deformation for soft robotics provides a guideline to understand their behaviour, leading to safe interaction with the environment. However, building a surrogate model with high accuracy and fast inference speed can be challenging for soft robotics due to the nonlinearity from complex geometry, large deformation, material nonlinearity etc. The reality gap from surrogate models also prevents their further deployment in the soft robotics domain. In this study, we propose a physics-informed neural network (PINN) named PINN-Ray to model complex deformation for a Fin Ray soft robotic gripper, which embeds the minimum potential energy principle from elastic mechanics and additional high-fidelity experimental data into the loss function of the neural network for training. This method is significant in terms of its generalisation to complex geometry and robustness to data scarcity as compared to other data-driven neural networks. Furthermore, it has been extensively evaluated to model the deformation of the Fin Ray finger under external actuation. PINN-Ray demonstrates improved accuracy as compared with finite element modelling (FEM) after applying the data assimilation scheme to treat the sim-to-real gap. Additionally, we introduce our automated framework to design and fabricate soft robotic fingers and characterise their deformation by visual tracking, which provides a guideline for the fast prototyping of soft robotics.
Submitted 11 July, 2024;
originally announced July 2024.
-
GAURA: Generalizable Approach for Unified Restoration and Rendering of Arbitrary Views
Authors:
Vinayak Gupta,
Rongali Simhachala Venkata Girish,
Mukund Varma T,
Ayush Tewari,
Kaushik Mitra
Abstract:
Neural rendering methods can achieve near-photorealistic image synthesis of scenes from posed input images. However, when the images are imperfect, e.g., captured in very low-light conditions, state-of-the-art methods fail to reconstruct high-quality 3D scenes. Recent approaches have tried to address this limitation by modeling various degradation processes in the image formation model; however, this limits them to specific image degradations. In this paper, we propose a generalizable neural rendering method that can perform high-fidelity novel view synthesis under several degradations. Our method, GAURA, is learning-based and does not require any test-time scene-specific optimization. It is trained on a synthetic dataset that includes several degradation types. GAURA outperforms state-of-the-art methods on several benchmarks for low-light enhancement, dehazing, deraining, and on-par for motion deblurring. Further, our model can be efficiently fine-tuned to any new incoming degradation using minimal data. We thus demonstrate adaptation results on two unseen degradations, desnowing and removing defocus blur. Code and video results are available at vinayak-vg.github.io/GAURA.
Submitted 11 July, 2024;
originally announced July 2024.
-
On Tree Automata, Generating Functions, and Differential Equations
Authors:
Rida Ait El Manssour,
Vincent Cheval,
Mahsa Shirmohammadi,
James Worrell
Abstract:
In this paper we introduce holonomic tree automata: a common extension of weighted tree automata and holonomic recurrences. We show that the generating function of the tree series represented by such an automaton is differentially algebraic. Conversely, we give an algorithm that inputs a differentially algebraic power series, represented as a solution of a rational dynamical system, and outputs an automaton whose generating function is the given series. Such an automaton yields a recurrence that can be used to compute the terms of the power series. We use the algorithm to obtain automaton representations of exponential generating functions of families of combinatorial objects given as combinatorial species. Using techniques from differential algebra, we show that it is decidable both whether two automata represent the same formal tree series and whether they have the same generating function.
Submitted 11 July, 2024;
originally announced July 2024.
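The abstract above mentions that an automaton yields a recurrence for computing the terms of a power series. As a generic illustration of what a holonomic (P-recursive) recurrence looks like, and not the paper's construction, the sketch below uses the well-known Catalan recurrence $(n+2)\,C_{n+1} = (4n+2)\,C_n$ with $C_0 = 1$. The function name `catalan_terms` is hypothetical.

```python
def catalan_terms(count):
    """Return the first `count` Catalan numbers using the P-recursive
    recurrence (n + 2) * C_{n+1} = (4n + 2) * C_n, with C_0 = 1.

    Illustrative only: this is a textbook holonomic recurrence, not the
    recurrence produced by the automata described in the abstract.
    """
    terms = []
    c = 1  # C_0
    for n in range(count):
        terms.append(c)
        # The division is exact because every Catalan number is an integer.
        c = c * (4 * n + 2) // (n + 2)
    return terms


if __name__ == "__main__":
    print(catalan_terms(6))
```

Each term is obtained from the previous one with a constant number of arithmetic operations, which is the practical appeal of having such a recurrence.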
-
A description of classical and quantum cosmology for a single scalar field torsion gravity
Authors:
Dipankar Laya,
Roshni Bhaumik,
Sourav Dutta,
Subenoy Chakraborty
Abstract:
In the background of homogeneous and isotropic flat FLRW space-time, both classical and quantum cosmology have been studied for the teleparallel dark energy (DE) model. Using Noether symmetry analysis, not only the symmetry vector but also the coupling function in the Lagrangian and the potential of the scalar field have been determined. The symmetry analysis also identifies a cyclic variable in the Lagrangian along the symmetry vector, and as a result the Lagrangian simplifies to a great extent, so that a classical solution is obtained. Subsequently, in quantum cosmology, the Wheeler-DeWitt (WD) equation has been constructed; the quantum version of the conserved momenta corresponding to the Noether symmetry identifies the periodic part of the wave function of the universe, and as a result the Wheeler-DeWitt equation becomes solvable. Finally, the quantum description shows a finite non-zero probability at the classical big-bang singularity.
Submitted 11 July, 2024;
originally announced July 2024.
-
PrefCLM: Enhancing Preference-based Reinforcement Learning with Crowdsourced Large Language Models
Authors:
Ruiqi Wang,
Dezhong Zhao,
Ziqin Yuan,
Ike Obi,
Byung-Cheol Min
Abstract:
Preference-based reinforcement learning (PbRL) is emerging as a promising approach to teaching robots through human comparative feedback, sidestepping the need for complex reward engineering. However, the substantial volume of feedback required in existing PbRL methods often leads to reliance on synthetic feedback generated by scripted teachers. This approach necessitates intricate reward engineering again and struggles to adapt to the nuanced preferences particular to human-robot interaction (HRI) scenarios, where users may have unique expectations toward the same task. To address these challenges, we introduce PrefCLM, a novel framework that utilizes crowdsourced large language models (LLMs) as simulated teachers in PbRL. We utilize Dempster-Shafer Theory to fuse individual preferences from multiple LLM agents at the score level, efficiently leveraging their diversity and collective intelligence. We also introduce a human-in-the-loop pipeline that facilitates collective refinements based on user interactive feedback. Experimental results across various general RL tasks show that PrefCLM achieves competitive performance compared to traditional scripted teachers and excels in facilitating more natural and efficient behaviors. A real-world user study (N=10) further demonstrates its capability to tailor robot behaviors to individual user preferences, significantly enhancing user satisfaction in HRI scenarios.
Submitted 11 July, 2024;
originally announced July 2024.
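The PrefCLM abstract above says preferences from several LLM agents are fused with Dempster-Shafer Theory. The sketch below shows only the textbook Dempster rule of combination for two mass functions; how PrefCLM builds its masses from scores is not stated in the abstract, so the frame {"A", "B"} (which of two trajectories is preferred), the example masses, and the name `dempster_combine` are all assumptions made for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass, and the
    masses of one function sum to 1. This is the generic rule, not the
    PrefCLM implementation.
    """
    combined = {}
    conflict = 0.0
    for (set1, w1), (set2, w2) in product(m1.items(), m2.items()):
        common = set1 & set2
        if common:
            combined[common] = combined.get(common, 0.0) + w1 * w2
        else:
            # Mass assigned to contradictory hypotheses.
            conflict += w1 * w2
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {hyp: w / (1.0 - conflict) for hyp, w in combined.items()}


# Hypothetical agents judging whether trajectory A or B is preferred.
A, B = frozenset({"A"}), frozenset({"B"})
EITHER = A | B  # mass left on "undecided"
agent_1 = {A: 0.6, B: 0.1, EITHER: 0.3}
agent_2 = {A: 0.5, B: 0.2, EITHER: 0.3}
fused = dempster_combine(agent_1, agent_2)

if __name__ == "__main__":
    for hypothesis, mass in fused.items():
        print(sorted(hypothesis), round(mass, 4))
```

Because the rule is associative and commutative, more than two agents can be fused by applying `dempster_combine` repeatedly.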
-
Classical and Quantum Cosmology in Einstein-aether Scalar-tensor gravity: Noether Symmetry Analysis
Authors:
Dipankar Laya,
Roshni Bhaumik,
Sourav Dutta,
Subenoy Chakraborty
Abstract:
The present work deals with Einstein-aether scalar-tensor gravity in the background of a homogeneous and isotropic flat FLRW space-time model. The Noether symmetry vector identifies a transformation in the augmented space so that the field equations become solvable. The cosmological solutions are analyzed from the observational point of view. Finally, for quantum cosmology, the Wheeler-DeWitt (WD) equation has been formulated and solutions have been determined by identifying the periodic nature of the wave function using the conserved (Noether) charge.
Submitted 11 July, 2024;
originally announced July 2024.
-
Chromosomal Structural Abnormality Diagnosis by Homologous Similarity
Authors:
Juren Li,
Fanzhe Fu,
Ran Wei,
Yifei Sun,
Zeyu Lai,
Ning Song,
Xin Chen,
Yang Yang
Abstract:
Pathogenic chromosome abnormalities are very common among the general population. While numerical chromosome abnormalities can be quickly and precisely detected, structural chromosome abnormalities are far more complex and typically require considerable effort by human experts for identification. This paper focuses on investigating the modeling of chromosome features and the identification of chromosomes with structural abnormalities. Most existing data-driven methods concentrate on a single chromosome and consider each chromosome independently, overlooking the crucial aspect of homologous chromosomes. Homologous chromosomes normally share identical structures, except when one of them is abnormal. Therefore, we propose an adaptive method to align homologous chromosomes and diagnose structural abnormalities through homologous similarity. Inspired by the process of human expert diagnosis, we incorporate information from multiple pairs of homologous chromosomes simultaneously, aiming to reduce noise disturbance and improve prediction performance. Extensive experiments on real-world datasets validate the effectiveness of our model compared to baselines.
Submitted 11 July, 2024;
originally announced July 2024.
-
SRPose: Two-view Relative Pose Estimation with Sparse Keypoints
Authors:
Rui Yin,
Yulun Zhang,
Zherong Pan,
Jianjun Zhu,
Cheng Wang,
Biao Jia
Abstract:
Two-view pose estimation is essential for map-free visual relocalization and object pose tracking tasks. However, traditional matching methods suffer from time-consuming robust estimators, while deep learning-based pose regressors only cater to camera-to-world pose estimation, lacking generalizability to different image sizes and camera intrinsics. In this paper, we propose SRPose, a sparse keypoint-based framework for two-view relative pose estimation in camera-to-world and object-to-camera scenarios. SRPose consists of a sparse keypoint detector, an intrinsic-calibration position encoder, and promptable prior knowledge-guided attention layers. Given two RGB images of a fixed scene or a moving object, SRPose estimates the relative camera or 6D object pose transformation. Extensive experiments demonstrate that SRPose achieves competitive or superior performance compared to state-of-the-art methods in terms of accuracy and speed, showing generalizability to both scenarios. It is robust to different image sizes and camera intrinsics, and can be deployed with low computing resources.
Submitted 11 July, 2024;
originally announced July 2024.
-
On constacyclic codes over a class of non-chain rings
Authors:
Nikita Jain,
Sucheta Dutt,
Ranjeet Sehmi
Abstract:
In this paper, a unique form of generators of a constacyclic code of arbitrary length over a non-chain ring of the type $\mathtt{R_{_θ}}=Z_{4}+νZ_{4}, ν^{2}=θ\in Z_{4}+νZ_{4}$ has been obtained. Further, rank and cardinality of a constacyclic code of arbitrary length over a non-chain ring of the type $\mathtt{R_{_θ}}$ have been obtained by determining a minimal spanning set of the code. Also, necessary and sufficient conditions for a constacyclic code of arbitrary length over a non-chain ring of the type $\mathtt{R_{_θ}}$ to be reversible have been determined. Examples have also been presented in support of our results.
Submitted 11 July, 2024;
originally announced July 2024.
-
fairBERTs: Erasing Sensitive Information Through Semantic and Fairness-aware Perturbations
Authors:
**feng Li,
Yuefeng Chen,
Xiangyu Liu,
Longtao Huang,
Rong Zhang,
Hui Xue
Abstract:
Pre-trained language models (PLMs) have revolutionized both natural language processing research and applications. However, stereotypical biases (e.g., gender and racial discrimination) encoded in PLMs have raised negative ethical implications for PLMs, which critically limit their broader applications. To address the aforementioned unfairness issues, we present fairBERTs, a general framework for learning fair fine-tuned BERT series models by erasing the protected sensitive information via semantic and fairness-aware perturbations generated by a generative adversarial network. Through extensive qualitative and quantitative experiments on two real-world tasks, we demonstrate the great superiority of fairBERTs in mitigating unfairness while maintaining the model utility. We also verify the feasibility of transferring adversarial components in fairBERTs to other conventionally trained BERT-like models for yielding fairness improvements. Our findings may shed light on further research on building fairer fine-tuned PLMs.
Submitted 11 July, 2024;
originally announced July 2024.
-
ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation
Authors:
Ruijie Zhu,
Chuxin Wang,
Ziyang Song,
Li Liu,
Tianzhu Zhang,
Yongdong Zhang
Abstract:
Estimating depth from a single image is a challenging visual task. Compared to relative depth estimation, metric depth estimation attracts more attention due to its practical physical significance and critical applications in real-life scenarios. However, existing metric depth estimation methods are typically trained on specific datasets with similar scenes, facing challenges in generalizing across scenes with significant scale variations. To address this challenge, we propose a novel monocular depth estimation method called ScaleDepth. Our method decomposes metric depth into scene scale and relative depth, and predicts them through a semantic-aware scale prediction (SASP) module and an adaptive relative depth estimation (ARDE) module, respectively. The proposed ScaleDepth enjoys several merits. First, the SASP module can implicitly combine structural and semantic features of the images to predict precise scene scales. Second, the ARDE module can adaptively estimate the relative depth distribution of each image within a normalized depth space. Third, our method achieves metric depth estimation for both indoor and outdoor scenes in a unified framework, without the need for setting the depth range or fine-tuning the model. Extensive experiments demonstrate that our method attains state-of-the-art performance across indoor, outdoor, unconstrained, and unseen scenes. Project page: https://ruijiezhu94.github.io/ScaleDepth
Submitted 11 July, 2024;
originally announced July 2024.
-
Generalized Diffusive Epidemic Process with Permanent Immunity in Two Dimensions
Authors:
V. R. Carvalho,
T. F. A. Alves,
G. A. Alves,
D. S. M. Alencar,
F. W. S. Lima,
A. Macedo-Filho,
R. S. Ferreira
Abstract:
We introduce the generalized diffusive epidemic process, which is a metapopulation model for an epidemic outbreak where a non-sedentary population of walkers can jump along lattice edges with diffusion rates $D_S$ or $D_I$ if they are susceptible or infected, respectively, and recovered individuals possess permanent immunity. Individuals can be contaminated with rate $μ_c$ if they share the same lattice node with an infected individual and recover with rate $μ_r$, being removed from the dynamics. Therefore, the model does not conserve the number of active particles, which are the susceptible and infected individuals. The reaction-diffusion dynamics are separated into two stages: (i) Brownian diffusion, where the particles can jump to neighboring nodes, and (ii) contamination and recovery reactions. The dynamics are mapped into a growing process by activating lattice nodes with successful contaminations, where activated nodes are interpreted as infection sources. In all simulations, the epidemic starts with one infected individual in a lattice filled with susceptibles. Our results indicate a phase transition in the dynamic percolation universality class controlled by the population size, irrespective of the diffusion rates $D_S$ and $D_I$, and a subexponential growth of the epidemic at the percolation threshold.
Submitted 11 July, 2024;
originally announced July 2024.
-
On the interplay of electronic and lattice screening on exciton binding in two-dimensional lead halide perovskites
Authors:
Rohit Rana,
David T. Limmer
Abstract:
We use path integral Monte Carlo to study the energetics of excitons in layered, hybrid organic-inorganic perovskites in order to elucidate the relative contributions of dielectric confinement and electron-phonon coupling. While the dielectric mismatch between polar perovskite layers and non-polar ligand layers significantly increases the exciton binding energy relative to the three-dimensional bulk crystal counterparts, formation of exciton polarons attenuates this effect. Dielectric confinement is well described by a fractional dimension scaling law as a function of layer thickness. The contribution from polaron formation is found to be a non-monotonic function of the lead halide layer thickness, which is clarified by a general variational theory. Accounting for both of these effects provides a description of exciton binding energies in good agreement with experimental measurements. By studying isolated layers and stacked layered crystals of various thicknesses, with ligands of varying polarity, we provide a systematic understanding of the excitonic behavior of this class of materials and how to engineer their photophysics.
Submitted 11 July, 2024;
originally announced July 2024.
-
Hierarchical Consensus-Based Multi-Agent Reinforcement Learning for Multi-Robot Cooperation Tasks
Authors:
Pu Feng,
Junkang Liang,
Size Wang,
Xin Yu,
Rongye Shi,
Wenjun Wu
Abstract:
In multi-agent reinforcement learning (MARL), the Centralized Training with Decentralized Execution (CTDE) framework is pivotal but struggles due to a gap: global state guidance in training versus reliance on local observations in execution, lacking global signals. Inspired by human societal consensus mechanisms, we introduce the Hierarchical Consensus-based Multi-Agent Reinforcement Learning (HC-MARL) framework to address this limitation. HC-MARL employs contrastive learning to foster a global consensus among agents, enabling cooperative behavior without direct communication. This approach enables agents to form a global consensus from local observations, using it as an additional piece of information to guide collaborative actions during execution. To cater to the dynamic requirements of various tasks, consensus is divided into multiple layers, encompassing both short-term and long-term considerations. Short-term observations prompt the creation of an immediate, low-layer consensus, while long-term observations contribute to the formation of a strategic, high-layer consensus. This process is further refined through an adaptive attention mechanism that dynamically adjusts the influence of each consensus layer. This mechanism optimizes the balance between immediate reactions and strategic planning, tailoring it to the specific demands of the task at hand. Extensive experiments and real-world applications in multi-robot systems showcase our framework's superior performance, marking significant advancements over baselines.
Submitted 10 July, 2024;
originally announced July 2024.
-
Model-agnostic clean-label backdoor mitigation in cybersecurity environments
Authors:
Giorgio Severi,
Simona Boboila,
John Holodnak,
Kendra Kratkiewicz,
Rauf Izmailov,
Alina Oprea
Abstract:
The training phase of machine learning models is a delicate step, especially in cybersecurity contexts. Recent research has surfaced a series of insidious training-time attacks that inject backdoors in models designed for security classification tasks without altering the training labels. With this work, we propose new techniques that leverage insights in cybersecurity threat models to effectively mitigate these clean-label poisoning attacks, while preserving the model utility. By performing density-based clustering on a carefully chosen feature subspace, and progressively isolating the suspicious clusters through a novel iterative scoring procedure, our defensive mechanism can mitigate the attacks without requiring many of the common assumptions in the existing backdoor defense literature. To show the generality of our proposed mitigation, we evaluate it on two clean-label model-agnostic attacks on two different classic cybersecurity data modalities: network flows classification and malware classification, using gradient boosting and neural network models.
Submitted 10 July, 2024;
originally announced July 2024.
-
Giant graviton expansion from eigenvalue instantons
Authors:
Yiming Chen,
Raghu Mahajan,
Haifeng Tang
Abstract:
Recently, S. Murthy has proposed a convergent expansion of free partition functions and superconformal indices of finite-$N$ purely adjoint gauge theories based on a Fredholm determinant expansion. This expansion has been dubbed the giant graviton expansion and takes the form of an infinite series of corrections to the $N=\infty$ result, with the $m^\text{th}$ correction being of order $e^{-mN}$. We show that this expansion can be reproduced using eigenvalue instantons in unitary matrix integrals. This perspective allows us to obtain the giant graviton expansion proposed by S. Murthy without the intermediate step of the Hubbard-Stratonovich transformation.
Submitted 10 July, 2024;
originally announced July 2024.
-
SCPNet: Unsupervised Cross-modal Homography Estimation via Intra-modal Self-supervised Learning
Authors:
Runmin Zhang,
Jun Ma,
Si-Yuan Cao,
Lun Luo,
Beinan Yu,
Shu-Jie Chen,
Junwei Li,
Hui-Liang Shen
Abstract:
We propose a novel unsupervised cross-modal homography estimation framework based on intra-modal Self-supervised learning, Correlation, and consistent feature map Projection, namely SCPNet. The concept of intra-modal self-supervised learning is first presented to facilitate the unsupervised cross-modal homography estimation. The correlation-based homography estimation network and the consistent feature map projection are combined to form the learnable architecture of SCPNet, boosting the unsupervised learning framework. SCPNet is the first to achieve effective unsupervised homography estimation on the satellite-map image pair cross-modal dataset, GoogleMap, under [-32,+32] offset on a 128x128 image, leading the supervised approach MHN by 14.0% of mean average corner error (MACE). We further conduct extensive experiments on several cross-modal/spectral and manually-made inconsistent datasets, on which SCPNet achieves state-of-the-art (SOTA) performance among unsupervised approaches and attains 49.0%, 25.2%, 36.4%, and 10.7% lower MACEs than the supervised approach MHN. Source code is available at https://github.com/RM-Zhang/SCPNet.
Submitted 10 July, 2024;
originally announced July 2024.
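The SCPNet abstract above reports results as mean average corner error (MACE). A minimal sketch of how this metric is commonly computed follows, assuming the usual definition: the Euclidean distance between predicted and ground-truth positions of the four image corners, averaged per image pair and then over the dataset. The paper's exact evaluation protocol is not given in the abstract, and the function names and sample coordinates here are hypothetical.

```python
import math


def average_corner_error(pred_corners, true_corners):
    """Mean Euclidean distance between matching corners of one image pair.

    Both arguments are sequences of (x, y) tuples in the same order.
    """
    distances = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred_corners, true_corners)
    ]
    return sum(distances) / len(distances)


def mean_average_corner_error(samples):
    """Average the per-pair corner error over (pred, true) samples."""
    errors = [average_corner_error(pred, true) for pred, true in samples]
    return sum(errors) / len(errors)


# Hypothetical example on a 128x128 image: one prediction is off by a
# (3, 4) pixel shift at every corner, the other is exact.
true = [(0, 0), (127, 0), (127, 127), (0, 127)]
shifted = [(x + 3, y + 4) for x, y in true]
mace = mean_average_corner_error([(shifted, true), (true, true)])

if __name__ == "__main__":
    print(mace)
```

In practice the predicted corners would come from warping the reference corners with the estimated homography; that step is omitted here to keep the sketch focused on the metric.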