arXiv:2401.11690 [pdf, other]

Two Necessary and Sufficient Conditions to the Solvability of the Exterior Dirichlet Problem for the Monge-Ampère Equation

Authors: Cong Wang, Jiguang Bao

Abstract: The present paper provides two necessary and sufficient conditions for the existence of solutions to the exterior Dirichlet problem of the Monge-Ampère equation with prescribed asymptotic behavior at infinity. By an adapted smooth approximation argument, we prove that the problem is solvable if and only if the boundary value is semi-convex with respect to the inner boundary, a new concept that we propose here. Along the lines of Perron's method for the Laplace equation, we obtain the threshold for solvability in the asymptotic behavior at infinity of the solution, and remove the $C^2$ regularity assumptions on the boundary value and on the inner boundary which are required in the proofs of the corresponding existence theorems in the recent literature.

Submitted 22 January, 2024; originally announced January 2024.

Comments: 30 pages, 2 figures

MSC Class: 35J96; 35J25; 35B40

arXiv:2401.07918 [pdf, other]

Are the dynamics of wall turbulence in minimal channels and larger domain channels equivalent? A graph-theoretic approach

Authors: Ahmed Elnahhas, Emma Lenz, Parviz Moin, Adrián Lozano-Durán, H. Jane Bae

Abstract: This work proposes two algorithmic approaches to extract critical dynamical mechanisms in wall-bounded turbulence with minimum human bias. In both approaches, multiple types of coherent structures are spatiotemporally tracked, resulting in a complex multilayer network. Network motif analysis, i.e., extracting dominant non-random elemental patterns within these networks, is used to identify the most dominant dynamical mechanisms. Both approaches, combined with network motif analysis, are used to answer whether the main dynamical mechanisms of a minimal flow unit (MFU) and a larger unconstrained channel flow, labeled a full channel (FC), at $Re_\tau \approx 180$, are equivalent. The first approach tracks traditional coherent structures defined as low- and high-speed streaks, ejections, and sweeps. It is found that the roll-streak pairing, consistent with the current understanding of self-sustaining processes, is the most significant and simplest dynamical mechanism in both flows. However, the MFU has a timescale for this mechanism that is approximately $2.83$ times slower than that of the FC. In the second approach, we use semi-Lagrangian wavepackets and define coherent structures from their energetic streak, roll, and small-scale phase space. This method also shows similar motifs for both the MFU and FC. It indicates that, on average, the most dominant phase-space motifs are similar between the two flows, with the significant events taking place approximately $2.21$ times slower in the MFU than in the FC. This value is more consistent with the implied timescale ratio of only the slow speed streaks taking part in the roll-streak pairing extracted using the first multi-type spatiotemporal approach, which is approximately $2.17$ times slower in the MFU than in the FC.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.07828 [pdf]

Transient Magnetoelastic Coupling in CrSBr

Authors: Youn Jue Bae, Taketo Handa, Yanan Dai, Jue Wang, Huicong Liu, Allen Scheie, Daniel G. Chica, Michael E. Ziebel, Andrew D. Kent, Xiaodong Xu, Ka Shen, Xavier Roy, Xiaoyang Zhu

Abstract: Recent research has revealed remarkable properties of the two-dimensional (2D) van der Waals layered crystal CrSBr, which is both a semiconductor and an A-type antiferromagnet. Here we show the role of strong magnetoelastic coupling in the generation and propagation of coherent magnons in CrSBr. Time- and spatially resolved magneto-optical Kerr effect (tr-MOKE) microscopy reveals two time-varying transient strain fields induced by out-of-plane transverse and in-plane longitudinal lattice displacements. These transient strain fields launch coherent wavepackets of optical and acoustic magnons at 24.6 GHz and 33.4 GHz, respectively. These findings suggest mechanisms for controlling and manipulating coherent magnons from distinct magnetoelastic couplings in this 2D van der Waals magnetic semiconductor.

Submitted 15 January, 2024; originally announced January 2024.

Comments: 12 pages, 4 figures, SI

arXiv:2401.06295 [pdf, ps, other]

Linear and nonlinear Granger causality analysis of turbulent duct flows

Authors: Barbara Lopez-Doriga, Marco Atzori, Ricardo Vinuesa, H. Jane Bae, Ankit Srivastava, Scott T. M. Dawson

Abstract: This research focuses on the identification and causality analysis of coherent structures that arise in turbulent flows in square and rectangular ducts. Coherent structures are first identified from direct numerical simulation data via proper orthogonal decomposition (POD), both by using all velocity components, and after separating the streamwise and secondary components of the flow. The causal relations between the mode coefficients are analysed using pairwise-conditional Granger causality analysis. We also formulate a nonlinear Granger causality analysis that can account for nonlinear interactions between modes. Focusing on streamwise-constant structures within a duct of short streamwise extent, we show that the causal relationships are highly sensitive to whether the mode coefficients or their squared values are considered, whether nonlinear effects are explicitly accounted for, and whether streamwise and secondary flow structures are separated prior to causality analyses. We leverage these sensitivities to determine that linear mechanisms underpin causal relationships between modes that share the same symmetry or anti-symmetry properties about the corner bisector, while nonlinear effects govern the causal interactions between symmetric and antisymmetric modes. In all cases, we find that the secondary flow fluctuations (manifesting as streamwise vortical structures) are the primary cause of both the presence and movement of near-wall streaks towards and away from the duct corners.

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2401.05842 [pdf, ps, other]

A Categorical Approach to DIBI Models

Authors: Tao Gu, Jialu Bao, Justin Hsu, Alexandra Silva, Fabio Zanasi

Abstract: The logic of Dependence and Independence Bunched Implications (DIBI) is a logic to reason about conditional independence (CI); for instance, DIBI formulas can characterise CI in probability distributions and relational databases, using the probabilistic and relational DIBI models, respectively. Despite the similarity of the probabilistic and relational models, a uniform, more abstract account remains missing. The laborious case-by-case verification of the frame conditions required for constructing new models also calls for such a treatment. In this paper, we develop an abstract framework for systematically constructing DIBI models, using category theory as the unifying mathematical language. In particular, we use string diagrams -- a graphical presentation of monoidal categories -- to give a uniform definition of the parallel composition and subkernel relation in DIBI models. Our approach not only generalises known models, but also yields new models of interest and reduces properties of DIBI models to structures in the underlying categories. Furthermore, our categorical framework enables a logical notion of CI, in terms of the satisfaction of specific DIBI formulas. We compare it with string diagrammatic approaches to CI and show that it is an extension of string diagrammatic CI under reasonable conditions.

Submitted 11 January, 2024; originally announced January 2024.

Comments: 33 pages

arXiv:2401.05290 [pdf, other]

Analysis and Perspectives on the ANA Avatar XPRIZE Competition

Authors: Kris Hauser, Eleanor Watson, Joonbum Bae, Josh Bankston, Sven Behnke, Bill Borgia, Manuel G. Catalano, Stefano Dafarra, Jan B. F. van Erp, Thomas Ferris, Jeremy Fishel, Guy Hoffman, Serena Ivaldi, Fumio Kanehiro, Abderrahmane Kheddar, Gaelle Lannuzel, Jacqueline Ford Morie, Patrick Naughton, Steve NGuyen, Paul Oh, Taskin Padir, Jim Pippine, Jaeheung Park, Daniele Pucci, Jean Vaz, et al. (3 additional authors not shown)

Abstract: The ANA Avatar XPRIZE was a four-year competition to develop a robotic "avatar" system to allow a human operator to sense, communicate, and act in a remote environment as though physically present. The competition featured a unique requirement that judges would operate the avatars after less than one hour of training on the human-machine interfaces, and avatar systems were judged on both objective and subjective scoring metrics. This paper presents a unified summary and analysis of the competition from technical, judging, and organizational perspectives. We study the use of telerobotics technologies and innovations pursued by the competing teams in their avatar systems, and correlate the use of these technologies with judges' task performance and subjective survey ratings. We also summarize perspectives from team leads, judges, and organizers about the competition's execution and impact to inform the future development of telerobotics and telepresence.

Submitted 10 January, 2024; originally announced January 2024.

Comments: 26 pages, preprint of article appearing in International Journal of Social Robotics

arXiv:2401.03018 [pdf, other]

Mapping the Vertical Gas Structure of the Planet-hosting PDS 70 Disk

Authors: Charles J. Law, Myriam Benisty, Stefano Facchini, Richard Teague, Jaehan Bae, Andrea Isella, Inga Kamp, Karin I. Öberg, Bayron Portilla-Revelo, Luna Rampinelli

Abstract: PDS 70 hosts two massive, still-accreting planets and the inclined orientation of its protoplanetary disk presents a unique opportunity to directly probe the vertical gas structure of a planet-hosting disk. Here, we use high-spatial-resolution (${\approx}0.^{\prime\prime}1$; 10 au) observations in a set of CO isotopologue lines and HCO$^+$ J=4-3 to map the full 2D $(r,z)$ disk structure from the disk atmosphere, as traced by $^{12}$CO, to closer to the midplane, as probed by less abundant isotopologues and HCO$^+$. In the PDS 70 disk, $^{12}$CO traces a height of $z/r\approx0.3$, $^{13}$CO is found at $z/r\approx0.1$, and C$^{18}$O originates at, or near, the midplane. The HCO$^+$ surface arises from $z/r\approx0.2$ and is one of the few non-CO emission surfaces constrained with high fidelity in disks to date. In the $^{12}$CO J=3-2 line, we resolve a vertical dip and steep rise in height at the cavity wall, making PDS 70 the first transition disk where this effect is directly seen in line emitting heights. In the outer disk, the CO emission heights of PDS 70 appear typical for its stellar mass and disk size and are not substantially altered by the two inner embedded planets. By combining CO isotopologue and HCO$^+$ lines, we derive the 2D gas temperature structure and estimate a midplane CO snowline of ${\approx}$56-85 au. This implies that both PDS 70b and 70c are located interior to the CO snowline and are likely accreting gas with a high C/O ratio of ${\approx}$1.0, which provides context for future planetary atmospheric measurements from, e.g., JWST, and for properly modeling their formation histories.

Submitted 5 January, 2024; originally announced January 2024.

Comments: 24 pages, 14 figures, accepted for publication in ApJ

arXiv:2401.02792 [pdf, other]

doi 10.1007/JHEP03(2024)140

The Origin of Calabi-Yau Crystals in BPS States Counting

Authors: Jiakang Bao, Rak-Kyeong Seong, Masahito Yamazaki

Abstract: We study the counting problem of BPS D-branes wrapping holomorphic cycles of a general toric Calabi-Yau manifold. We evaluate the Jeffrey-Kirwan residues for the flavoured Witten index for the supersymmetric quiver quantum mechanics on the worldvolume of the D-branes, and find that BPS degeneracies are described by a statistical mechanical model of crystal melting. For Calabi-Yau threefolds, we reproduce the crystal melting models long known in the literature. For Calabi-Yau fourfolds, however, we find that the crystal does not contain the full information for the BPS degeneracy and we need to explicitly evaluate non-trivial weights assigned to the crystal configurations. Our discussions treat Calabi-Yau threefolds and fourfolds on equal footing, and include discussions on elliptic and rational generalizations of the BPS states counting, connections to the mathematical definition of generalized Donaldson-Thomas invariants, examples of wall crossings, and of trialities in quiver gauge theories.

Submitted 27 March, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

Comments: 53 pages; v3 minor corrections, references added

Journal ref: J. High Energ. Phys. 2024, 140 (2024)

arXiv:2312.17510 [pdf, other]

Testing Database Engines via Query Plan Guidance

Authors: Jinsheng Ba, Manuel Rigger

Abstract: Database systems are widely used to store and query data. Test oracles have been proposed to find logic bugs in such systems, that is, bugs that cause the database system to compute an incorrect result. To realize a fully automated testing approach, such test oracles are paired with a test case generation technique; a test case refers to a database state and a query on which the test oracle can be applied. In this work, we propose the concept of Query Plan Guidance (QPG) for guiding automated testing towards "interesting" test cases. SQL and other query languages are declarative. Thus, to execute a query, the database system translates every operator in the source language to one of potentially many so-called physical operators that can be executed; the tree of physical operators is referred to as the query plan. Our intuition is that by steering testing towards exploring diverse query plans, we also explore more interesting behaviors, some of which are potentially incorrect. To this end, we propose a mutation technique that gradually applies promising mutations to the database state, causing the DBMS to create diverse query plans for subsequent queries. We applied our method to three mature, widely-used, and extensively-tested database systems (SQLite, TiDB, and CockroachDB) and found 53 unique, previously unknown bugs. Our method exercises 4.85X to 408.48X more unique query plans than a naive random generation method and 7.46X more than a code coverage guidance method. Since most database systems, including commercial ones, expose query plans to the user, we consider QPG a generally applicable, black-box approach and believe that the core idea could also be applied in other contexts (e.g., to measure the quality of a test suite).

Submitted 29 December, 2023; originally announced December 2023.

Comments: ACM SIGSOFT Distinguished Paper Award in The 45th International Conference on Software Engineering (ICSE 2023)

arXiv:2312.16514 [pdf]

Global gyrokinetic simulation of magnetic island induced ion temperature gradient turbulence in toroidal plasma

Authors: Jingchun Li, J. Bao, Z. Lin, J. Q. Dong, Yong Liu, Y. R. Qu

Abstract: The characteristics of ion temperature gradient (ITG) turbulence in the presence of a magnetic island are numerically investigated using a gyrokinetic model. We observe that in the absence of the usual ITG drive gradient, a solitary magnetic island alone can drive ITG instability. The magnetic island not only drives high-n modes of ITG instability but also induces low-n modes of vortex flow. Moreover, as the magnetic island width increases, the width of the vortex flow also increases. This implies that wider islands may more easily induce vortex flows. The study further indicates that the saturated amplitude and transport level of magnetic-island-induced ITG turbulence vary with magnetic island width. In general, larger magnetic islands enhance both particle and heat transport. When the magnetic island width is of the order of 21 times the ion gyroradius ($21\rho_i$), the turbulence-driven transport can reach the same level as in cases where ITG is driven by pressure gradients.

Submitted 27 December, 2023; originally announced December 2023.

Comments: 12 pages, 4 figures

arXiv:2312.15465 [pdf, ps, other]

Transient growth of wavelet-based resolvent modes in the buffer layer of wall-bounded turbulence

Authors: Eric Ballouz, Scott T. M. Dawson, H. Jane Bae

Abstract: In this work, we study the transient growth of the principal resolvent modes in the minimal flow unit using a reformulation of resolvent analysis in a time-localized wavelet basis. We target the most energetic spatial wavenumbers for the minimal flow unit and obtain modes that are constant in the streamwise direction and once-periodic in the spanwise direction. The forcing modes are in the shape of streamwise rolls, though pulse-like in time, and the response modes are in the form of transiently growing streaks. We inject the principal transient forcing mode at different intensities into a simulation of the minimal flow unit and compare the resulting nonlinear response to the linear one. The peak energy amplification scales quadratically with the intensity of the injected mode, and this peak occurs roughly at the same time for all forcing intensities. However, the larger energy amplification intensifies the magnitude of the nonlinear terms, which play an important role in damping the energy growth and accelerating energy decay of the principal resolvent mode. We also observe that the damping effect of the nonlinearities is less prominent close to the wall. Finally, we find that the principal resolvent forcing mode is more effective than other structures at amplifying the streak energy in the turbulent minimal flow unit. In addition to lending support to the claim that linear mechanisms are important to near-wall turbulence, this work identifies time scales for the nonlinear breakdown of linearly-generated streaks.

Submitted 24 December, 2023; originally announced December 2023.

arXiv:2312.11459 [pdf, other]

VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder

Authors: Zhicong Tang, Shuyang Gu, Chunyu Wang, Ting Zhang, Jianmin Bao, Dong Chen, Baining Guo

Abstract: This paper introduces a pioneering 3D volumetric encoder designed for text-to-3D generation. To scale up the training data for the diffusion model, a lightweight network is developed to efficiently acquire feature volumes from multi-view images. A diffusion model with a 3D U-Net is then trained on the 3D volumes for text-to-3D generation. This research further addresses the challenges of inaccurate object captions and high-dimensional feature volumes. The proposed model, trained on the public Objaverse dataset, demonstrates promising outcomes in producing diverse and recognizable samples from text prompts. Notably, it empowers finer control over object part characteristics through textual cues, fostering model creativity by seamlessly combining multiple concepts within a single object. This research significantly contributes to the progress of 3D generation by introducing an efficient, flexible, and scalable representation methodology. Code is available at https://github.com/checkcrab/VolumeDiffusion.

Submitted 28 April, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.11210 [pdf, other]

Axion-Mediated Inelastic Dark Matter

Authors: Kyu Jung Bae, Jongkuk Kim

Abstract: We consider the axion-mediated scattering processes between dark matter (DM) and nucleons. The substantial contributions are made via the CP-odd gluonic current. Since the QCD axion is too feebly coupled to the visible particles, non-QCD axions are necessary to reach the sensitivity of the current DM experiments. In the case of multi-component DM models, the inelastic scattering processes also make sizable contributions to the direct detection. The supersymmetry (SUSY) and clockwork (CW) mechanism provide a realistic model for the QCD axion and the axion-mediated DM scattering processes. In the SUSY CW axion model, the lightest axino is the DM particle and the axions mediate the elastic and inelastic scattering processes between the DM axino and nucleons. We show that the current and future XENONnT can produce relevant constraints for some parameter space of the model.

Submitted 18 December, 2023; originally announced December 2023.

Comments: 12 pages, 7 figures

arXiv:2312.08985 [pdf, other]

OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers

Authors: Han Liang, Jiacheng Bao, Ruichi Zhang, Sihan Ren, Yuecheng Xu, Sibei Yang, Xin Chen, Jingyi Yu, Lan Xu

Abstract: We have recently seen tremendous progress in realistic text-to-motion generation. Yet, the existing methods often fail or produce implausible motions with unseen text inputs, which limits the applications. In this paper, we present OMG, a novel framework, which enables compelling motion generation from zero-shot open-vocabulary text prompts. Our key idea is to carefully tailor the pretrain-then-finetune paradigm to text-to-motion generation. At the pre-training stage, our model improves the generation ability by learning the rich out-of-domain inherent motion traits. To this end, we scale a large unconditional diffusion model up to 1B parameters, so as to utilize massive unlabeled motion data of over 20M motion instances. At the subsequent fine-tuning stage, we introduce motion ControlNet, which incorporates text prompts as conditioning information, through a trainable copy of the pre-trained model and the proposed novel Mixture-of-Controllers (MoC) block. The MoC block adaptively recognizes various ranges of the sub-motions with a cross-attention mechanism and processes them separately with the text-token-specific experts. Such a design effectively aligns the CLIP token embeddings of text prompts to various ranges of compact and expressive motion features. Extensive experiments demonstrate that our OMG achieves significant improvements over the state-of-the-art methods on zero-shot text-to-motion generation. Project page: https://tr3e.github.io/omg-page.

Submitted 19 March, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: accepted by CVPR 2024

arXiv:2312.04528 [pdf, other]

Using Large Language Models for Hyperparameter Optimization

Authors: Michael R. Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba

Abstract: This paper studies using foundational large language models (LLMs) to make decisions during hyperparameter optimization (HPO). Empirical evaluations demonstrate that in settings with constrained search budgets, LLMs can perform comparably to or better than traditional HPO methods like random search and Bayesian optimization on standard benchmarks. Furthermore, we propose to treat the code specifying our model as a hyperparameter, which the LLM outputs, going beyond the capabilities of existing HPO approaches. Our findings suggest that LLMs are a promising tool for improving efficiency in the traditional decision-making problem of hyperparameter optimization.

Submitted 7 December, 2023; originally announced December 2023.

Comments: 29 pages

arXiv:2312.02728 [pdf, other]

Overview of RIS-Enabled Secure Transmission in 6G Wireless Networks

Authors: JungSook Bae, Waqas Khalid, Anseok Lee, Heesoo Lee, Song Noh, Heejung Yu

Abstract: As sixth-generation (6G) wireless communication networks evolve, privacy concerns are expected due to the transmission of vast amounts of security-sensitive private information. In this context, a reconfigurable intelligent surface (RIS) emerges as a promising technology capable of enhancing transmission efficiency and strengthening information security. This study demonstrates how RISs can play a crucial role in making 6G networks more secure against eavesdropping attacks. We discuss the fundamentals and standardization aspects of RISs, along with an in-depth analysis of physical-layer security (PLS). Our discussion centers on PLS design using RIS, highlighting aspects like beamforming, resource allocation, artificial noise, and cooperative communications. We also identify the research issues, propose potential solutions, and explore future perspectives. Finally, numerical results are provided to support our discussions and demonstrate the enhanced security enabled by RIS.

Submitted 5 December, 2023; originally announced December 2023.

Comments: Accepted for Digital Communications and Networks (DCN)

arXiv:2312.02520 [pdf, other]

Towards More Unified In-context Visual Understanding

Authors: Dianmo Sheng, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Tao Gong, Bin Liu, Shengwei Xu, Nenghai Yu

Abstract: The rapid advancement of large language models (LLMs) has accelerated the emergence of in-context learning (ICL) as a cutting-edge approach in the natural language processing domain. Recently, ICL has been employed in visual understanding tasks, such as semantic segmentation and image captioning, yielding promising results. However, existing visual ICL frameworks cannot produce content across multiple modalities, which limits their potential usage scenarios. To address this issue, we present a new ICL framework for visual understanding with multi-modal output enabled. First, we quantize and embed both text and visual prompts into a unified representational space, structured as interleaved in-context sequences. Then a decoder-only sparse transformer architecture is employed to perform generative modeling on them, facilitating in-context learning. Thanks to this design, the model is capable of handling in-context vision understanding tasks with multimodal output in a unified pipeline. Experimental results demonstrate that our model achieves competitive performance compared with specialized models and previous ICL baselines. Overall, our research takes a further step toward unified multimodal in-context learning.

Submitted 16 March, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

Comments: Accepted by CVPR 2024

arXiv:2311.18834 [pdf, other]

ART$\boldsymbol{\cdot}$V: Auto-Regressive Text-to-Video Generation with Diffusion Models

Authors: Wenming Weng, Ruoyu Feng, Yanhui Wang, Qi Dai, Chunyu Wang, Dacheng Yin, Zhiyuan Zhao, Kai Qiu, Jianmin Bao, Yuhui Yuan, Chong Luo, Yueyi Zhang, Zhiwei Xiong

Abstract: We present ART$\boldsymbol{\cdot}$V, an efficient framework for auto-regressive video generation with diffusion models. Unlike existing methods that generate entire videos in one-shot, ART$\boldsymbol{\cdot}$V generates a single frame at a time, conditioned on the previous ones. The framework offers three distinct advantages. First, it only learns simple continual motions between adjacent frames, therefore avoiding modeling complex long-range motions that require huge training data. Second, it preserves the high-fidelity generation ability of the pre-trained image diffusion models by making only minimal network modifications. Third, it can generate arbitrarily long videos conditioned on a variety of prompts such as text, image or their combinations, making it highly versatile and flexible. To combat the common drifting issue in AR models, we propose masked diffusion model which implicitly learns which information can be drawn from reference images rather than network predictions, in order to reduce the risk of generating inconsistent appearances that cause drifting. Moreover, we further enhance generation coherence by conditioning it on the initial frame, which typically contains minimal noise. This is particularly useful for long video generation. When trained for only two weeks on four GPUs, ART$\boldsymbol{\cdot}$V already can generate videos with natural motions, rich details and a high level of aesthetic quality. Besides, it enables various appealing applications, e.g., composing a long video from multiple text prompts.

Submitted 30 November, 2023; originally announced November 2023.

Comments: 24 pages, 21 figures. Project page at https://warranweng.github.io/art.v

arXiv:2311.18829 [pdf, other]

MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation

Authors: Yanhui Wang, Jianmin Bao, Wenming Weng, Ruoyu Feng, Dacheng Yin, Tao Yang, **gxu Zhang, Qi Dai, Zhiyuan Zhao, Chunyu Wang, Kai Qiu, Yuhui Yuan, Chuanxin Tang, Xiaoyan Sun, Chong Luo, Baining Guo

Abstract: We present MicroCinema, a straightforward yet effective framework for high-quality and coherent text-to-video generation. Unlike existing approaches that align text prompts with video directly, MicroCinema introduces a Divide-and-Conquer strategy which divides the text-to-video into a two-stage process: text-to-image generation and image\&text-to-video generation. This strategy offers two significant advantages. a) It allows us to take full advantage of the recent advances in text-to-image models, such as Stable Diffusion, Midjourney, and DALLE, to generate photorealistic and highly detailed images. b) Leveraging the generated image, the model can allocate less focus to fine-grained appearance details, prioritizing the efficient learning of motion dynamics. To implement this strategy effectively, we introduce two core designs. First, we propose the Appearance Injection Network, enhancing the preservation of the appearance of the given image. Second, we introduce the Appearance Noise Prior, a novel mechanism aimed at maintaining the capabilities of pre-trained 2D diffusion models. These design elements empower MicroCinema to generate high-quality videos with precise motion, guided by the provided text prompts. Extensive experiments demonstrate the superiority of the proposed framework. Concretely, MicroCinema achieves SOTA zero-shot FVD of 342.86 on UCF-101 and 377.40 on MSR-VTT. See https://wangyanhui666.github.io/MicroCinema.github.io/ for video samples.

Submitted 29 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

Comments: Project page: https://wangyanhui666.github.io/MicroCinema.github.io/

arXiv:2311.17866 [pdf, other]

Rabinowitz Floer homology for prequantization bundles and Floer Gysin sequence

Authors: Joonghyun Bae, Jungsoo Kang, Sungho Kim

Abstract: Let $Y$ be a prequantization bundle over a closed spherically monotone symplectic manifold $Σ$. Adapting an idea due to Diogo and Lisi, we study a split version of Rabinowitz Floer homology for $Y$ in the following two settings. First, $Σ$ is a symplectic hyperplane section of a closed symplectic manifold $X$ satisfying a certain monotonicity condition; in this case, $X \setminus Σ$ is a Liouville filling of $Y$. Second, the minimal Chern number of $Σ$ is greater than one, which is the case where the Rabinowitz Floer homology of the symplectization $\mathbb{R} \times Y$ is defined. In both cases, we construct a Gysin-type exact sequence connecting the Rabinowitz Floer homology of $X \setminus Σ$ or $\mathbb{R} \times Y$ and the quantum homology of $Σ$. As applications, we discuss the invertibility of a symplectic hyperplane section class in quantum homology, the isotopy problem for fibered Dehn twists, the orderability problem for prequantization bundles, and the existence of translated points. We also provide computational results based on the exact sequence that we construct.

Submitted 9 April, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

Comments: 80 pages, 7 figures. Comments welcome! v2: minor revision, mistakes in Section 6.2.3 corrected

MSC Class: 53D40

arXiv:2311.16401 [pdf, ps, other]

On the quantum time complexity of divide and conquer

Authors: Jonathan Allcock, **ge Bao, Aleksandrs Belovs, Troy Lee, Miklos Santha

Abstract: We initiate a systematic study of the time complexity of quantum divide and conquer algorithms for classical problems. We establish generic conditions under which search and minimization problems with classical divide and conquer algorithms are amenable to quantum speedup and apply these theorems to an array of problems involving strings, integers, and geometric objects. They include LONGEST DISTINCT SUBSTRING, KLEE'S COVERAGE, several optimization problems on stock transactions, and k-INCREASING SUBSEQUENCE. For most of these results, our quantum time upper bound matches the quantum query lower bound for the problem, up to polylogarithmic factors.

Submitted 27 November, 2023; originally announced November 2023.

Comments: 48 pages, accepted to QIP 2024

arXiv:2311.15615 [pdf, other]

Technical Report for Argoverse Challenges on Unified Sensor-based Detection, Tracking, and Forecasting

Authors: Zhepeng Wang, Feng Chen, Kanokphan Lertniphonphan, Siwei Chen, **yao Bao, Pengfei Zheng, **bao Zhang, Kaer Huang, Tao Zhang

Abstract: This report presents our Le3DE2E solution for unified sensor-based detection, tracking, and forecasting in Argoverse Challenges at CVPR 2023 Workshop on Autonomous Driving (WAD). We propose a unified network that incorporates three tasks, including detection, tracking, and forecasting. This solution adopts a strong Bird's Eye View (BEV) encoder with spatial and temporal fusion and generates unified representations for multi-tasks. The solution was tested in the Argoverse 2 sensor dataset to evaluate the detection, tracking, and forecasting of 26 object categories. We achieved 1st place in Detection, Tracking, and Forecasting on the E2E Forecasting track in Argoverse Challenges at CVPR 2023 WAD.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.13308 [pdf, ps, other]

Superconductivity and Charge-density-wave-like Transition in Th2Cu4As5

Authors: Qing-Chen Duan, Shao-Hua Liu, Bai-Zhuo Li, Jiao-Jiao Meng, Wu-Zhang Yang, Yi Liu, Yi-Qiang Lin, Si-Qi Wu, Jia-Yi Lu, **-Ke Bao, Yu-Sen Xiao, Xin-Yu Zhao, Yu-Xue Mei, Yu-** Sun, Dan Yu, Shu-Gang Tan, Qiang **g, Rui-Dan Zhong, Yong-Liang Chen, Yong Zhao, Zhi Ren, Cao Wang, Guang-Han Cao

Abstract: We report the synthesis, crystal structure, and physical properties of a novel ternary compound, Th$_2$Cu$_4$As$_5$. The material crystallizes in a tetragonal structure with lattice parameters $a=4.0716(1)$ Å and $c=24.8131(4)$ Å. Its structure can be described as an alternating stacking of fluorite-type Th$_2$As$_2$ layers with antifluorite-type double-layered Cu$_4$As$_3$ slabs. The measurement of electrical resistivity, magnetic susceptibility and specific heat reveals that Th$_2$Cu$_4$As$_5$ undergoes bulk superconducting transition at 4.2 K. Moreover, all these physical quantities exhibit anomalies at 48 K, where the Hall coefficient change the sign. These findings suggest a charge-density-wave-like (CDW) transition, making Th$_2$Cu$_4$As$_5$ a rare example for studying the interplay between CDW and superconductivity.

Submitted 22 November, 2023; originally announced November 2023.

Comments: 11 pages, 6 figures, and 1 table

arXiv:2311.10827 [pdf, other]

A well-balanced lattice Boltzmann model for binary fluids based on the incompressible phase-field theory

Authors: Long Ju, Peiyao Liu, Bicheng Yan, ** Bao, Shuyu Sun, Zhaoli Guo

Abstract: Spurious velocities arising from the imperfect offset of the undesired term at the discrete level are frequently observed in numerical simulations of equilibrium multiphase flow systems using the lattice Boltzmann equation (LBE) method. To capture the physical equilibrium state of two-phase fluid systems and eliminate spurious velocities, a well-balanced LBE model based on the incompressible phase-field theory is developed. In this model, the equilibrium distribution function for the Cahn-Hilliard (CH) equation is designed by treating the convection term as a source to avoid the introduction of undesired terms, enabling achievement of possible discrete force balance. Furthermore, this approach allows for the attainment of a divergence-free velocity field, effectively mitigating the impact of artificial compression effects and enhancing numerical stability. Numerical tests, including a flat interface problem, a stationary droplet, and the coalescence of two droplets, demonstrate the well-balanced properties and improvements in the stability of the present model.

Submitted 17 November, 2023; originally announced November 2023.

arXiv:2311.04496 [pdf, other]

PersonMAE: Person Re-Identification Pre-Training with Masked AutoEncoders

Authors: Hezhen Hu, Xiaoyi Dong, Jianmin Bao, Dongdong Chen, Lu Yuan, Dong Chen, Houqiang Li

Abstract: Pre-training is playing an increasingly important role in learning generic feature representation for Person Re-identification (ReID). We argue that a high-quality ReID representation should have three properties, namely, multi-level awareness, occlusion robustness, and cross-region invariance. To this end, we propose a simple yet effective pre-training framework, namely PersonMAE, which involves two core designs into masked autoencoders to better serve the task of Person Re-ID. 1) PersonMAE generates two regions from the given image with RegionA as the input and RegionB as the prediction target. RegionA is corrupted with block-wise masking to mimic common occlusion in ReID and its remaining visible parts are fed into the encoder. 2) Then PersonMAE aims to predict the whole RegionB at both pixel level and semantic feature level. It encourages its pre-trained feature representations with the three properties mentioned above. These properties make PersonMAE compatible with downstream Person ReID tasks, leading to state-of-the-art performance on four downstream ReID tasks, i.e., supervised (holistic and occluded setting), and unsupervised (UDA and USL setting). Notably, on the commonly adopted supervised setting, PersonMAE with ViT-B backbone achieves 79.8% and 69.5% mAP on the MSMT17 and OccDuke datasets, surpassing the previous state-of-the-art by a large margin of +8.0 mAP, and +5.3 mAP, respectively.

Submitted 8 November, 2023; originally announced November 2023.

arXiv:2311.03558 [pdf]

Replication and study of anomalies in LK-99--the alleged ambient-pressure, room-temperature superconductor

Authors: T. Habamahoro, T. Bontke, M. Chirom, Z. Wu, J. M. Bao, L. Z. Deng, C. W. Chu

Abstract: We have studied LK-99 [Pb$_{10-x}$Cu$_x$(PO$_4$)$_6$O], alleged by Lee et al. to exhibit superconductivity above room temperature and at ambient pressure, and have reproduced all anomalies in electric and magnetic measurements that they reported as evidence for the claim of LK-99 being an ambient-pressure, room-temperature superconductor. We found that these anomalies are associated with the structural transition of the Cu$_2$S impurity in their sample and not with superconductivity.

Submitted 6 November, 2023; originally announced November 2023.

Comments: 15 pages, 7 figures

arXiv:2310.19808 [pdf]

doi 10.1016/j.ces.2020.115693

Investigation of countercurrent flow profile and liquid holdup in random packed column with local CFD data

Authors: Yucheng Fu, Jie Bao, Rajesh Kumar Singh, Chao Wang, Zhijie Xu

Abstract: Liquid holdup and mass transfer area are critical parameters for packed column design and CO2 capture efficiency prediction. In this paper, a framework was established for modeling the liquid-gas countercurrent flow hydrodynamics in a random packed column with pall rings. Besides the column-averaged information, the radial pall ring distribution, velocity, and liquid holdup profiles are obtained to study the entrance effect and the wall influence in the packed column. With local CFD data, the validated packing specific area ap and liquid velocity uL range for liquid holdup correlation is significantly expanded with respect to existing experimental or column-averaged CFD data. The proposed liquid holdup correlation $h_L \propto u_L^{0.44}$ indicates the random packed column falls in a viscous to turbulent transition regime and it covers a Reynolds Number range of [6.7-40.2]. The derived liquid holdup correlation is in good agreement with existing correlations developed using the column-averaged experimental data.

Submitted 16 October, 2023; originally announced October 2023.

Journal ref: Chemical Engineering Science, Vol. 221 115693 2020

arXiv:2310.19264 [pdf, other]

Sound of Story: Multi-modal Storytelling with Audio

Authors: Jaeyeon Bae, Seokhoon Jeong, Seokun Kang, Namgi Han, Jae-Yon Lee, Hyounghun Kim, Taehwan Kim

Abstract: Storytelling is multi-modal in the real world. When one tells a story, one may use all of the visualizations and sounds along with the story itself. However, prior studies on storytelling datasets and tasks have paid little attention to sound even though sound also conveys meaningful semantics of the story. Therefore, we propose to extend story understanding and telling areas by establishing a new component called "background sound" which is story context-based audio without any linguistic information. For this purpose, we introduce a new dataset, called "Sound of Story (SoS)", which has paired image and text sequences with corresponding sound or background music for a story. To the best of our knowledge, this is the largest well-curated dataset for storytelling with sound. Our SoS dataset consists of 27,354 stories with 19.6 images per story and 984 hours of speech-decoupled audio such as background music and other sounds. As benchmark tasks for storytelling with sound and the dataset, we propose retrieval tasks between modalities, and audio generation tasks from image-text sequences, introducing strong baselines for them. We believe the proposed dataset and tasks may shed light on the multi-modal understanding of storytelling in terms of sound. Downloading the dataset and baseline codes for each task will be released in the link: https://github.com/Sosdatasets/SoS_Dataset.

Submitted 30 October, 2023; originally announced October 2023.

Comments: Findings of EMNLP 2023, project: https://github.com/Sosdatasets/SoS_Dataset/

arXiv:2310.16873 [pdf, other]

doi 10.1051/0004-6361/202347109

Disk Evolution Study Through Imaging of Nearby Young Stars (DESTINYS): HD 34700 A unveils an inner ring

Authors: G. Columba, E. Rigliaco, R. Gratton, D. Mesa, V. D'Orazi, C. Ginski, N. Engler, J. P. Williams, J. Bae, M. Benisty, T. Birnstiel, P. Delorme, C. Dominik, S. Facchini, F. Menard, P. Pinilla, C. Rab, Á. Ribas, V. Squicciarini, R. G. van Holstein, A. Zurlo

Abstract: Context. The study of protoplanetary disks is fundamental to understand their evolution and interaction with the surrounding environment, and to constrain planet formation mechanisms. Aims. We aim at characterising the young binary system HD 34700 A, which shows a wealth of structures. Methods. Taking advantage of the high-contrast imaging instruments SPHERE at the VLT, LMIRCam at the LBT, and of ALMA observations, we analyse this system at multiple wavelengths. We study the rings and spiral arms morphology and the scattering properties of the dust. We discuss the possible causes of all the observed features. Results. We detect for the first time, in the H$α$ band, a ring extending from $\sim$65 au to ${\sim}$120 au, inside the ring already known from recent studies. These two have different physical and geometrical properties. Based on the scattering properties, the outer ring may consist of grains of typical size $a_{out} > 4 μm$, while the inner ring of smaller grains ($a_{in} <= 0.4 {μm}$). Two extended logarithmic spiral arms stem from opposite sides of the disk. The outer ring appears as a spiral arm itself, with a variable radial distance from the centre and extended substructures. ALMA data confirm the presence of a millimetric dust substructure centred just outside the outer ring, and detect misaligned gas rotation patterns for HD 34700 A and B. Conclusions. The complexity of HD 34700 A, revealed by the variety of observed features, suggests the existence of one or more disk-shaping physical mechanisms. Possible scenarios, compatible with our findings, involve the presence inside the disk of a yet undetected planet of several Jupiter masses and the system interaction with the surroundings by means of gas cloudlet capture or flybys. Further observations with JWST/MIRI or ALMA (gas kinematics) could shed more light on these.

Submitted 25 October, 2023; originally announced October 2023.

Comments: Accepted for publication on A&A. 14 + 5 pages, 9 + 7 figures (text + appendix)

Journal ref: A&A 681, A19 (2024)

arXiv:2310.15430 [pdf, other]

A Companion in V1247 Ori Supported by Spiral Arm Pattern Motion

Authors: Bin B. Ren, Chen Xie, Myriam Benisty, Ruobing Dong, Jaehan Bae, Tomas Stolker, Rob G. van Holstein, John H. Debes, Antonio Garufi, Christian Ginski, Stefan Kraus

Abstract: While there have been nearly two dozen of spiral arms detected from planet-forming disks in near-infrared scattered light, none of their substellar drivers have been confirmed. By observing spiral systems in at least two epochs spanning multiple years, and measuring the motion of the spirals, we can distinguish the cause of the spirals, and locate the orbits of the driving planets if they trigger the spirals. Upon a recent validation of this approach using the co-motion between a stellar companion and a spiral, we obtained a second epoch observation for the spiral system in the disk of V1247 Ori in the $H$-band polarized scattered light using VLT/SPHERE/IRDIS. Combining our observations with archival IRDIS data, we established a $4.8$ yr timeline to constrain the V1247 Ori spiral motion. We obtained a pattern speed of $0.40^{\circ} \pm 0.09^{\circ}$ yr$^{-1}$ for the north-east spiral. This corresponds to an orbital period of $900\pm200$ yr, and thus the semi-major axis of the hidden planetary driver is $118\pm19$ au for a 2.0 $\pm$ 0.1 M$_\odot$ central star. The location agrees with the gap in ALMA dust continuum observations, providing joint support for the existence of a companion driving the scattered-light spirals while carving a millimeter gap. With an angular separation of 0.29" $\pm$ 0.05", this hidden companion is an ideal target for JWST imaging.

Submitted 7 December, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted for publication in Astronomy and Astrophysics; 6 pages, 5 figures

arXiv:2310.13356

Sync-NeRF: Generalizing Dynamic NeRFs to Unsynchronized Videos

Authors: Seoha Kim, Jeongmin Bae, Youngsik Yun, Hahyun Lee, Gun Bang, Youngjung Uh

Abstract: Recent advancements in 4D scene reconstruction using neural radiance fields (NeRF) have demonstrated the ability to represent dynamic scenes from multi-view videos. However, they fail to reconstruct the dynamic scenes and struggle to fit even the training views in unsynchronized settings. It happens because they employ a single latent embedding for a frame while the multi-view images at the same frame were actually captured at different moments. To address this limitation, we introduce time offsets for individual unsynchronized videos and jointly optimize the offsets with NeRF. By design, our method is applicable for various baselines and improves them with large margins. Furthermore, finding the offsets naturally works as synchronizing the videos without manual effort. Experiments are conducted on the common Plenoptic Video Dataset and a newly built Unsynchronized Dynamic Blender Dataset to verify the performance of our method. Project page: https://seoha-kim.github.io/sync-nerf

Submitted 21 May, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

Comments: I need to revise the text (it takes more than a month)

arXiv:2310.09579 [pdf, other]

Existence and asymptotic behavior of entire large solutions for Hessian equations

Authors: Xiang Li, Jiguang Bao

Abstract: In this paper, we give some existence and nonexistence results for nonradial entire large solutions of the Hessian equation $S_k\left(D^2 u\right)=b(x) u^γ$ in the sublinear case $0<γ<k$. The exact asymptotic behavior of large solutions at infinity is also studied when $b(x)$ is the oscillation of a radial function $|x|^{-l}$ at infinity for $l\leq k-1$.

Submitted 14 October, 2023; originally announced October 2023.

arXiv:2310.08589 [pdf, other]

doi 10.1051/0004-6361/202347353

Protoplanetary disks in $K_s$-band total intensity and polarized light

Authors: Bin B. Ren, Myriam Benisty, Christian Ginski, Ryo Tazaki, Nicole L. Wallack, Julien Milli, Antonio Garufi, Jaehan Bae, Stefano Facchini, François Ménard, Paola Pinilla, C. Swastik, Richard Teague, Zahed Wahhaj

Abstract: Diverse protoplanetary disk morphology can result from planet-disk interaction, suggesting planetary presence. To date, most scattered light imaging campaigns have probed polarized light, which is only a fraction of the total light and not very sensitive to planets. To observe and characterize protoplanetary disk systems in the near-infrared in both polarized and total intensity light, we carried out an unprecedented study of scattering properties of disks, as well as of any planetary companions. Using SPHERE with star-hopping at the Very Large Telescope, we observed 29 disk hosts and their reference stars in $K_s$-band polarized light. We extracted disks in total intensity by adopting the data imputation concept with sequential non-negative matrix factorization (DI-sNMF). We obtained high-quality disk images in total intensity for 15 systems and in polarized light for 23. For well-recovered disks in polarized light and total intensity, we parameterized the polarization fraction phase functions using scaled beta distribution: the peak of polarization fraction tentatively correlates with the peak scattering angle, which could be reproduced using certain compact dust, yet more detailed modeling studies are needed. We investigated the empirical DI-sNMF detectability of disks using logistic regression: total intensity detectability of disks primarily depends on host star brightness. For disks with SPHERE data in $Y$-/$J$-/$H$-band, we summarized their polarized color at ~90 deg scattering angle: most of disks are blue in polarized $J-K_s$ color, and they are relatively redder as stellar luminosity increases, indicating larger scatterers. High-quality disk imagery in both total intensity and polarized light thus allows for disk characterization in polarization fraction, and reduces the confusion between disk and planetary signals.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 25 pages, 16 figures, 3 tables, A&A accepted. Data files in FITS format will be publicly available

Journal ref: A&A 680, A114 (2023)

arXiv:2310.06786 [pdf, other]

OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text

Authors: Keiran Paster, Marco Dos Santos, Zhangir Azerbayev, Jimmy Ba

Abstract: There is growing evidence that pretraining on high quality, carefully thought-out tokens such as code or mathematics plays an important role in improving the reasoning abilities of large language models. For example, Minerva, a PaLM model finetuned on billions of tokens of mathematical documents from arXiv and the web, reported dramatically improved performance on problems that require quantitative reasoning. However, because all known open source web datasets employ preprocessing that does not faithfully preserve mathematical notation, the benefits of large scale training on quantitative web documents are unavailable to the research community. We introduce OpenWebMath, an open dataset inspired by these works containing 14.7B tokens of mathematical webpages from Common Crawl. We describe in detail our method for extracting text and LaTeX content and removing boilerplate from HTML documents, as well as our methods for quality filtering and deduplication. Additionally, we run small-scale experiments by training 1.4B parameter language models on OpenWebMath, showing that models trained on 14.7B tokens of our dataset surpass the performance of models trained on over 20x the amount of general language data. We hope that our dataset, openly released on the Hugging Face Hub, will help spur advances in the reasoning abilities of large language models.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.05366 [pdf, other]

Rotation Matters: Generalized Monocular 3D Object Detection for Various Camera Systems

Authors: SungHo Moon, JinWoo Bae, SungHoon Im

Abstract: Monocular 3D object detection is being actively studied, and as a result, performance has been steadily improving. However, 3D object detection performance is significantly reduced when applied to a camera system different from the system used to capture the training datasets. For example, a 3D detector trained on datasets from a passenger car mostly fails to regress accurate 3D bounding boxes for a camera mounted on a bus. In this paper, we conduct extensive experiments to analyze the factors that cause performance degradation. We find that changing the camera pose, especially camera orientation, relative to the road plane causes performance degradation. In addition, we propose a generalized 3D object detection method that can be universally applied to various camera systems. We design a new compensation module that corrects the estimated 3D bounding box location and heading direction. The proposed module can be applied to most of the recent 3D object detection networks. It increases the AP3D score (KITTI moderate, IoU $> 70\%$) by about 6 to 10 times over the baselines without additional training. Both quantitative and qualitative results show the effectiveness of the proposed method.

Submitted 8 October, 2023; originally announced October 2023.

Comments: Accepted to CVPRw 2023

arXiv:2310.04846 [pdf, other]

Soft finger rotational stability for precision grasps

Authors: Hun Jang, Valentyn Petrichenko, Joonbum Bae, Kevin Haninger

Abstract: Soft robotic fingers can safely grasp fragile or variable form objects, but their force capacity is limited, especially with less contact area: precision grasps and when objects are smaller or not spherical. Current research is improving force capacity through mechanical design by increasing contact area or stiffness, typically without models which explain soft finger force limitations. To address this, this paper considers two types of soft grip failure, slip and dynamic rotational stability. For slip, the validity of a Coulomb model is investigated, identifying the effect of contact area, pressure, and relative pose. For rotational stability, bulk linear stiffness of the fingers is used to develop conditions for dynamic stability and identify when rotation leads to slip. Together, these models suggest contact area improves force capacity by increasing transverse stiffness and normal force. The models are validated on pneumatic fingers, both custom PneuNets-based and commercially available. The models are used to find grip parameters which increase force capacity without failure.

Submitted 24 March, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

Comments: Submitted IROS24

arXiv:2310.03538 [pdf, other]

Latent Filling: Latent Space Data Augmentation for Zero-shot Speech Synthesis

Authors: Jae-Sung Bae, Joun Yeop Lee, Ji-Hyun Lee, Seongkyu Mun, Taehwa Kang, Hoon-Young Cho, Chanwoo Kim

Abstract: Previous works in zero-shot text-to-speech (ZS-TTS) have attempted to enhance its systems by enlarging the training data through crowd-sourcing or augmenting existing speech data. However, the use of low-quality data has led to a decline in the overall system performance. To avoid such degradation, instead of directly augmenting the input data, we propose a latent filling (LF) method that adopts simple but effective latent space data augmentation in the speaker embedding space of the ZS-TTS system. By incorporating a consistency loss, LF can be seamlessly integrated into existing ZS-TTS systems without the need for additional training stages. Experimental results show that LF significantly improves speaker similarity while preserving speech quality.

Submitted 22 January, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

Comments: Accepted to ICASSP 2024

arXiv:2310.02266 [pdf]

doi 10.1016/j.cej.2020.125548

Hydrodynamics of countercurrent flows in a structured packed column: effects of initial wetting and dynamic contact angle

Authors: Rajesh Kumar Singh, Jie Bao, Chao Wang, Yucheng Fu, Zhijie Xu

Abstract: Computational countercurrent flow investigation in the structured packed column is a multiscale problem. Multiphase flow studies using the volume of fluid (VOF) method in the representative elementary unit (REU) of the packed column can give insight into the local hydrodynamics such as interfacial area, film thickness, etc. The interfacial area dictates the mass transfer in the absorption process and thereby the overall efficiency of the column. Impacts of the solvent's physical properties, liquid loads and static contact angle (SCA) on the interfacial area were examined earlier. In the present study, the dynamic contact angle (DCA) was used to explore the impact of contact angle hysteresis on the interfacial area. DCA has a more pronounced impact on the interfacial area (10%) for an aqueous solvent of 0.10M sodium hydroxide (NaOH). The interfacial area shows undulation and does not achieve the pseudo-steady state. In contrast, the interfacial area reaches a net pseudo-steady value for the aqueous solvent having 40% monoethanolamine (MEA) by weight. The wetting hysteresis was also explored via simulations conducted with initially dry and wetted sheets. For 0.10M NaOH aqueous solvent, the initially wetted sheets lead to a slightly higher value of the interfacial area (10%) as compared to the initially dry sheets at the same liquid load and DCA. As expected, wetting hysteresis reduces with increasing liquid loads. On the other hand, wetting hysteresis is not significant for the 40% MEA aqueous solvent, which might be due to its lower surface tension and higher viscosity. Overall, the effect of the dynamic contact angle is not pronounced compared to that found on a flat surface.

Submitted 31 December, 2022; originally announced October 2023.

Journal ref: Chem. Eng. J. 398 (2020) 125548

arXiv:2310.00839 [pdf, other]

Subsurface Characterization using Ensemble-based Approaches with Deep Generative Models

Authors: Jichao Bao, Hongkyu Yoon, Jonghyun Lee

Abstract: Estimating spatially distributed properties such as hydraulic conductivity (K) from available sparse measurements is a great challenge in subsurface characterization. However, the use of inverse modeling is limited for ill-posed, high-dimensional applications due to computational costs and poor prediction accuracy with sparse datasets. In this paper, we combine Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP), a deep generative model that can accurately capture complex subsurface structure, and Ensemble Smoother with Multiple Data Assimilation (ES-MDA), an ensemble-based inversion method, for accurate and accelerated subsurface characterization. WGAN-GP is trained to generate high-dimensional K fields from a low-dimensional latent space and ES-MDA then updates the latent variables by assimilating available measurements. Several subsurface examples are used to evaluate the accuracy and efficiency of the proposed method and the main features of the unknown K fields are characterized accurately with reliable uncertainty quantification. Furthermore, the estimation performance is compared with a widely-used variational, i.e., optimization-based, inversion approach, and the proposed approach outperforms the variational inversion method, especially for the channelized and fractured field examples. We explain such superior performance by visualizing the objective function in the latent space: because of nonlinear and aggressive dimension reduction via generative modeling, the objective function surface becomes extremely complex while the ensemble approximation can smooth out the multi-modal surface during the minimization. This suggests that the ensemble-based approach works well over the variational approach when combined with deep generative models at the cost of forward model runs unless convergence-ensuring modifications are implemented in the variational inversion.

Submitted 9 October, 2023; v1 submitted 1 October, 2023; originally announced October 2023.

arXiv:2309.16496 [pdf, other]

CCEdit: Creative and Controllable Video Editing via Diffusion Models

Authors: Ruoyu Feng, Wenming Weng, Yanhui Wang, Yuhui Yuan, Jianmin Bao, Chong Luo, Zhibo Chen, Baining Guo

Abstract: In this paper, we present CCEdit, a versatile generative video editing framework based on diffusion models. Our approach employs a novel trident network structure that separates structure and appearance control, ensuring precise and creative editing capabilities. Utilizing the foundational ControlNet architecture, we maintain the structural integrity of the video during editing. The incorporation of an additional appearance branch enables users to exert fine-grained control over the edited key frame. These two side branches seamlessly integrate into the main branch, which is constructed upon existing text-to-image (T2I) generation models, through learnable temporal layers. The versatility of our framework is demonstrated through a diverse range of choices in both structure representations and personalized T2I models, as well as the option to provide the edited key frame. To facilitate comprehensive evaluation, we introduce the BalanceCC benchmark dataset, comprising 100 videos and 4 target prompts for each video. Our extensive user studies compare CCEdit with eight state-of-the-art video editing methods. The outcomes demonstrate CCEdit's substantial superiority over all other methods.

Submitted 6 April, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

arXiv:2309.15817 [pdf, other]

Identifying the Risks of LM Agents with an LM-Emulated Sandbox

Authors: Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, Tatsunori Hashimoto

Abstract: Recent advances in Language Model (LM) agents and tool use, exemplified by applications like ChatGPT Plugins, enable a rich set of capabilities but also amplify potential risks - such as leaking private data or causing financial losses. Identifying these risks is labor-intensive, necessitating implementing the tools, setting up the environment for each test scenario manually, and finding risky cases. As tools and agents become more complex, the high cost of testing these agents will make it increasingly difficult to find high-stakes, long-tailed risks. To address these challenges, we introduce ToolEmu: a framework that uses an LM to emulate tool execution and enables the testing of LM agents against a diverse range of tools and scenarios, without manual instantiation. Alongside the emulator, we develop an LM-based automatic safety evaluator that examines agent failures and quantifies associated risks. We test both the tool emulator and evaluator through human evaluation and find that 68.8% of failures identified with ToolEmu would be valid real-world agent failures. Using our curated initial benchmark consisting of 36 high-stakes tools and 144 test cases, we provide a quantitative risk analysis of current LM agents and identify numerous failures with potentially severe outcomes. Notably, even the safest LM agent exhibits such failures 23.9% of the time according to our evaluator, underscoring the need to develop safer LM agents for real-world deployment.

Submitted 17 May, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

arXiv:2309.13555 [pdf, ps, other]

Sensitivity analysis of wall-modeled large-eddy simulation for separated turbulent flow

Authors: Di Zhou, H. Jane Bae

Abstract: In this study, we conduct a parametric analysis to evaluate the sensitivities of wall-modeled large-eddy simulation (LES) with respect to subgrid-scale (SGS) models, mesh resolution, wall boundary conditions and mesh anisotropy. While such investigations have been conducted for attached/flat-plate flow configurations, systematic studies specifically targeting turbulent flows with separation are notably sparse. To bridge this gap, our study focuses on the flow over a two-dimensional Gaussian-shaped bump at a moderately high Reynolds number, which involves smooth-body separation of a turbulent boundary layer under pressure-gradient and surface-curvature effects. In the simulations, the no-slip condition at the wall is replaced by three different forms of boundary condition based on the thin boundary layer equations and the mean wall-shear stress from high-fidelity numerical simulation to avoid the additional complexity of modeling the wall-shear stress. Various statistics, including the mean separation bubble size, mean velocity profile, and dissipation from SGS model, are compared and analyzed. The results reveal that capturing the separation bubble strongly depends on the choice of SGS model. While simulations approach grid convergence with resolutions nearing those of wall-resolved LES meshes, above this limit, the LES predictions exhibit intricate sensitivities to mesh resolution. Furthermore, both wall boundary conditions and the anisotropy of mesh cells exert discernible impacts on the turbulent flow predictions, yet the magnitudes of these impacts vary based on the specific SGS model chosen for the simulation.

Submitted 23 March, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

arXiv:2309.13514 [pdf, ps, other]

Superconductivity emerging from density-wave-like order in a correlated kagome metal

Authors: Yi Liu, Zi-Yi Liu, Jin-Ke Bao, Peng-Tao Yang, Liang-Wen Ji, Si-Qi Wu, Qin-Xin Shen, Jun Luo, Jie Yang, Ji-Yong Liu, Chen-Chao Xu, Wu-Zhang Yang, Wan-Li Chai, Jia-Yi Lu, Chang-Chao Liu, Bo-Sen Wang, Hao Jiang, Qian Tao, Zhi Ren, Xiao-Feng Xu, Chao Cao, Zhu-An Xu, Rui Zhou, Jin-Guang Cheng, Guang-Han Cao

Abstract: Unconventional superconductivity (USC) in a highly correlated kagome system has been theoretically proposed for years, yet the experimental realization is hard to achieve. The recently discovered vanadium-based kagome materials, which exhibit both superconductivity and charge density wave (CDW) orders, are nonmagnetic and weakly correlated, and thus unlikely to host USC as theories proposed. Here we report the discovery of a chromium-based kagome metal, CsCr$_3$Sb$_5$, which is contrastingly characterised by strong electron correlations, frustrated magnetism, and characteristic flat bands close to the Fermi level. Under ambient pressure, it undergoes a concurrent structural and magnetic phase transition at 55 K, accompanied by a stripe-like $4a_0$ structural modulation. At high pressure, the phase transition evolves into two transitions, probably associated with CDW and antiferromagnetic spin-density-wave orderings, respectively. These density-wave (DW)-like orders are gradually suppressed with pressure and, remarkably, a superconducting dome emerges at 3.65-8.0 GPa. The maximum of the superconducting transition temperature, $T_\mathrm{c}^{\mathrm{max}}=$ 6.4 K, appears when the DW-like orders are completely suppressed at 4.2 GPa, and the normal state exhibits a non-Fermi-liquid behaviour, reminiscent of USC and quantum criticality in iron-based superconductors. Our work offers an unprecedented platform for investigating possible USC in a correlated kagome system.

Submitted 16 March, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

Comments: 32 pages, 14 figures

arXiv:2309.11319 [pdf, other]

WFTNet: Exploiting Global and Local Periodicity in Long-term Time Series Forecasting

Authors: Peiyuan Liu, Beiliang Wu, Naiqi Li, Tao Dai, Fengmao Lei, Jigang Bao, Yong Jiang, Shu-Tao Xia

Abstract: Recent CNN and Transformer-based models tried to utilize frequency and periodicity information for long-term time series forecasting. However, most existing work is based on Fourier transform, which cannot capture fine-grained and local frequency structure. In this paper, we propose a Wavelet-Fourier Transform Network (WFTNet) for long-term time series forecasting. WFTNet utilizes both Fourier and wavelet transforms to extract comprehensive temporal-frequency information from the signal, where Fourier transform captures the global periodic patterns and wavelet transform captures the local ones. Furthermore, we introduce a Periodicity-Weighted Coefficient (PWC) to adaptively balance the importance of global and local frequency patterns. Extensive experiments on various time series datasets show that WFTNet consistently outperforms other state-of-the-art baselines. Code is available at https://github.com/Hank0626/WFTNet.

Submitted 4 January, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

Comments: Accepted to ICASSP 2024

arXiv:2309.08977 [pdf]

Operando Insights on the Degradation Mechanisms of Rhenium-doped and Undoped Molybdenum Disulfide Nanocatalysts for Electrolyzer Applications

Authors: Raquel Aymerich-Armengol, Miquel Vega-Paredes, Zhenbin Wang, Andrea M. Mingers, Luca Camuti, Jeeung Kim, Jeongwook Bae, Ilias Efthimiopoulos, Rajib Sahu, Filip Podjaski, Martin Rabe, Christina Scheu, Joohyun Lim, Siyuan Zhang

Abstract: MoS2 nanostructures are promising catalysts for proton-exchange-membrane (PEM) electrolyzers to replace expensive noble metals. Their broadscale application demands high activity for the hydrogen evolution reaction (HER) as well as robust durability. Doping is commonly applied to enhance the HER activity of MoS2-based nanocatalysts, but the effect of dopants on the electrochemical and structural stability is yet to be discussed. Herein, we correlate operando electrochemical measurements to the structural evolution of the materials down to the nanometric scale by identical location electron microscopy and spectroscopy. The range of stable operation for MoS2 nanocatalysts with and without rhenium doping is experimentally defined. The responsible degradation mechanisms at first electrolyte contact, open circuit stabilization and HER conditions are experimentally identified and confirmed with the calculated Pourbaix diagram of Re-doped MoS2. Doping MoS2-based nanocatalysts is validated as a promising strategy for the continuous improvement of high performance and durable PEM electrolyzers.

Submitted 21 April, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

arXiv:2309.08959 [pdf, other]

doi 10.1103/PhysRevResearch.6.023277

Non-centrosymmetric, transverse structural modulation in SrAl4, and elucidation of its origin in the BaAl4 family of compounds

Authors: Sitaram Ramakrishnan, Surya Rohith Kotla, Hanqi Pi, Bishal Baran Maity, Jia Chen, Jin-Ke Bao, Zhaopeng Guo, Masaki Kado, Harshit Agarwal, Claudio Eisele, Minoru Nohara, Leila Noohinejad, Hongming Weng, Srinivasan Ramakrishnan, Arumugam Thamizhavel, Sander van Smaalen

Abstract: At ambient conditions SrAl4 adopts the BaAl4 structure type with space group I4/mmm. It undergoes a charge-density-wave (CDW) transition at TCDW = 243 K, followed by a structural transition at TS = 87 K. Temperature-dependent single-crystal X-ray diffraction (SXRD) leads to the observation of incommensurate superlattice reflections at q = σc* with σ = 0.1116 at 200 K. The CDW has orthorhombic symmetry with the acentric superspace group F222(00σ)00s, where F222 is a subgroup of Fmmm as well as of I4/mmm. Atomic displacements mainly represent a transverse wave, with displacements that are 90 deg out of phase between the two diagonal directions of the I-centered unit cell, resulting in a helical wave. Small longitudinal displacements are provided by the second harmonic modulation. The orthorhombic phase realized in SrAl4 is similar to that found in EuAl4. Electronic structure calculations and phonon calculations by density functional theory (DFT) have failed to reveal the mechanism of CDW formation. However, DFT reveals that Al atoms dominate the density of states near the Fermi level, thus corroborating the SXRD measurements. SrAl4 remains incommensurately modulated at the structural transition, where the symmetry lowers from orthorhombic to b-unique monoclinic. We have identified a simple criterion that correlates the presence of a phase transition with the interatomic distances. Only those compounds XAl4-xGax (X = Ba, Eu, Sr, Ca; 0 < x < 4) undergo phase transitions for which the ratio c/a falls within the narrow range 2.51 < c/a < 2.54.

Submitted 16 March, 2024; v1 submitted 16 September, 2023; originally announced September 2023.

Journal ref: Physical Review Research 2024

arXiv:2309.06076 [pdf, other]

Grain Growth and Dust Segregation Revealed by Multi-wavelength Analysis of the Class I Protostellar Disk WL 17

Authors: Ilseung Han, Woojin Kwon, Yusuke Aso, Jaehan Bae, Patrick Sheehan

Abstract: The first step toward planet formation is grain growth from (sub-)micrometer to millimeter/centimeter sizes. Grain growth has been reported not only in Class II protoplanetary disks but also in Class 0/I protostellar envelopes. However, early-stage grain growth occurring in Class 0/I stages has rarely been observed on the protostellar disk scale. Here we present the results from the ALMA Band 3 ($λ$ = 3.1 mm) and 7 ($λ$ = 0.87 mm) archival data of the Class I protostellar disk WL 17 in the $ρ$ Ophiuchus molecular cloud. Disk substructures are found in both bands, but they are different: while a central hole and a symmetric ring appear in Band 3, an off-center hole and an asymmetric ring are shown in Band 7. Furthermore, we obtain an asymmetric spectral index map with a low mean value of $α$ = 2.28 $\pm$ 0.02, suggestive of grain growth and dust segregation on the protostellar disk scale. Our radiative transfer modeling verifies these two features by demonstrating that 10 cm-sized large grains are symmetrically distributed, whereas 10 $μ$m-sized small grains are asymmetrically distributed. Also, the analysis shows that the disk is expected to be massive and gravitationally unstable. We thus suggest a single Jupiter-mass protoplanet formed by gravitational instability as the origin of the ring-like structure, grain growth, and dust segregation identified in WL 17.

Submitted 12 September, 2023; originally announced September 2023.

Comments: 23 pages, 7 figures; to be published in The Astrophysical Journal

arXiv:2309.03895 [pdf, other]

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

Authors: Zigang Geng, Binxin Yang, Tiankai Hang, Chen Li, Shuyang Gu, Ting Zhang, Jianmin Bao, Zheng Zhang, Han Hu, Dong Chen, Baining Guo

Abstract: We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions. Unlike existing approaches that integrate prior knowledge and pre-define the output space (e.g., categories and coordinates) for each vision task, we cast diverse vision tasks into a human-intuitive image-manipulating process whose output space is a flexible and interactive pixel space. Concretely, the model is built upon the diffusion process and is trained to predict pixels according to user instructions, such as encircling the man's left shoulder in red or applying a blue mask to the left car. InstructDiffusion could handle a variety of vision tasks, including understanding tasks (such as segmentation and keypoint detection) and generative tasks (such as editing and enhancement). It even exhibits the ability to handle unseen tasks and outperforms prior methods on novel datasets. This represents a significant step towards a generalist modeling interface for vision tasks, advancing artificial general intelligence in the field of computer vision.

Submitted 7 September, 2023; originally announced September 2023.

arXiv:2308.16223 [pdf, other]

UV-Optical Emission of AB Aur b is Consistent with Scattered Stellar Light

Authors: Yifan Zhou, Brendan P. Bowler, Haifeng Yang, Aniket Sanghi, Gregory J. Herczeg, Adam L. Kraus, Jaehan Bae, Feng Long, Katherine B. Follette, Kimberley Ward-Duong, Zhaohuan Zhu, Lauren I. Biddle, Laird M. Close, Lillian Yushu Jiang, Ya-Lin Wu

Abstract: The proposed protoplanet AB Aur b is a spatially concentrated emission source imaged in the mm-wavelength disk gap of the Herbig Ae/Be star AB Aur. Its near-infrared spectrum and absence of strong polarized light have been interpreted as evidence supporting the protoplanet interpretation. However, the complex scattered light structures in the AB Aur disk pose challenges in resolving the emission source and interpreting the true nature of AB Aur b. We present new images of the AB Aur system obtained using the Hubble Space Telescope Wide Field Camera 3 in the ultraviolet (UV) and optical bands. AB Aur b and the known disk spirals are recovered in the F336W, F410M, and F645N bands. The spectral energy distribution of AB Aur b shows absorption in the Balmer jump, mimicking those of early-type stars. By comparing the colors of AB Aur b to those of the host star, the disk spirals, and predictions from scattered light and self-luminous models, we find that the emission from AB Aur b is inconsistent with planetary photospheric or accretion shock models. Instead, it is consistent with those measured in the circumstellar disks that trace scattered light. We conclude that the UV and visible emission from AB Aur b does not necessitate the presence of a protoplanet. We synthesize observational constraints on AB Aur b and discuss inconsistent interpretations of AB Aur b among different datasets. Considering the significance of the AB Aur b discovery, we advocate for further observational evidence to verify its planetary nature.

Submitted 30 August, 2023; originally announced August 2023.

Comments: 11 pages, 6 figures. Accepted for publication in AJ

arXiv:2308.15939 [pdf, other]

Bootstrap Fine-Grained Vision-Language Alignment for Unified Zero-Shot Anomaly Localization

Authors: Hanqiu Deng, Zhaoxiang Zhang, Jinan Bao, Xingyu Li

Abstract: Contrastive Language-Image Pre-training (CLIP) models have shown promising performance on zero-shot visual recognition tasks by learning visual representations under natural language supervision. Recent studies attempt the use of CLIP to tackle zero-shot anomaly detection by matching images with normal and abnormal state prompts. However, since CLIP focuses on building correspondence between paired text prompts and global image-level representations, the lack of fine-grained patch-level vision to text alignment limits its capability on precise visual anomaly localization. In this work, we propose AnoCLIP for zero-shot anomaly localization. In the visual encoder, we introduce a training-free value-wise attention mechanism to extract intrinsic local tokens of CLIP for patch-level local description. From the perspective of text supervision, we particularly design a unified domain-aware contrastive state prompting template for fine-grained vision-language matching. On top of the proposed AnoCLIP, we further introduce a test-time adaptation (TTA) mechanism to refine visual anomaly localization results, where we optimize a lightweight adapter in the visual encoder using AnoCLIP's pseudo-labels and noise-corrupted tokens. With both AnoCLIP and TTA, we significantly exploit the potential of CLIP for zero-shot anomaly localization and demonstrate the effectiveness of AnoCLIP on various datasets.

Submitted 26 February, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

Showing 51–100 of 917 results for author: Bae, J