
Showing 1–50 of 275 results for author: Xiang, J

  1. arXiv:2406.19055  [pdf, other]

    cs.CV

    SimpleFusion: A Simple Fusion Framework for Infrared and Visible Images

    Authors: Ming Chen, Yuxuan Cheng, Xinwei He, Xinyue Wang, Yan Aze, Jinhai Xiang

    Abstract: Integrating visible and infrared images into one high-quality image, also known as visible and infrared image fusion, is a challenging yet critical task for many downstream vision tasks. Most existing works utilize pretrained deep neural networks or design sophisticated frameworks with strong priors for this task, which may be unsuitable or lack flexibility. This paper presents SimpleFusion, a sim…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/hxwxss/SimpleFusion-A-Simple-Fusion-Framework-for-Infrared-and-Visible-Images

  2. arXiv:2406.09904  [pdf, other]

    cs.LG

    QQQ: Quality Quattuor-Bit Quantization for Large Language Models

    Authors: Ying Zhang, Peng Zhang, Mincong Huang, Jingyang Xiang, Yujie Wang, Chao Wang, Yineng Zhang, Lei Yu, Chuan Liu, Wei Lin

    Abstract: Quantization is a proven effective method for compressing large language models. Although popular techniques like W8A8 and W4A16 effectively maintain model performance, they often fail to concurrently speed up the prefill and decoding stages of inference. W4A8 is a promising strategy to accelerate both of them, but it usually leads to a significant performance degradation. To address these issues, w…

    Submitted 28 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:2406.09455  [pdf, other]

    cs.CV cs.AI cs.CL

    Pandora: Towards General World Model with Natural Language Actions and Video States

    Authors: Jiannan Xiang, Guangyi Liu, Yi Gu, Qiyue Gao, Yuting Ning, Yuheng Zha, Zeyu Feng, Tianhua Tao, Shibo Hao, Yemin Shi, Zhengzhong Liu, Eric P. Xing, Zhiting Hu

    Abstract: World models simulate future states of the world in response to different actions. They facilitate interactive content creation and provide a foundation for grounded, long-horizon reasoning. Current foundation models do not fully meet the capabilities of general world models: large language models (LLMs) are constrained by their reliance on language modality and their limited understanding of the…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Website: https://world-model.maitrix.org/

  4. arXiv:2406.06534  [pdf, other]

    cs.CV eess.IV physics.optics

    Compressed Meta-Optical Encoder for Image Classification

    Authors: Anna Wirth-Singh, Jinlin Xiang, Minho Choi, Johannes E. Fröch, Luocheng Huang, Shane Colburn, Eli Shlizerman, Arka Majumdar

    Abstract: Optical and hybrid convolutional neural networks (CNNs) recently have become of increasing interest to achieve low-latency, low-power image classification and computer vision tasks. However, implementing optical nonlinearity is challenging, and omitting the nonlinear layers in a standard CNN comes at a significant reduction in accuracy. In this work, we use knowledge distillation to compress modif…

    Submitted 14 June, 2024; v1 submitted 22 April, 2024; originally announced June 2024.

  5. arXiv:2406.00671  [pdf, other]

    cs.RO

    An Efficient Trajectory Generation for Bi-copter Flight in Tight Space

    Authors: Xin Dong, Yangjie Cui, Jingwu Xiang, Daochun Li, Zhan Tu

    Abstract: Unlike squared (or alike) quadrotors, elongated bi-copters leverage natural superiority in crossing tight spaces. To date, extensive works have focused on the design, modeling, and control of bi-copters. Besides, a proper motion planner utilizing bi-copters' shape characteristics is essential to efficiently and safely traverse tight spaces, yet it has rarely been studied. Current motion planning m…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 8 pages, 8 figures

  6. arXiv:2405.15889  [pdf, other]

    astro-ph.CO gr-qc hep-ph

    Dual Inflation and Bounce Cosmologies Interpretation of Pulsar Timing Array Data

    Authors: Changhong Li, Junrong Lai, Jinjie Xiang, Chaofan Wu

    Abstract: We explore a dual scenario of generalized inflation and bounce cosmologies, producing a scale-invariant curvature perturbation spectrum. Bayesian analysis with pulsar timing array data identifies, for the first time, viable regions from inflation and bounce that simultaneously explain stochastic gravitational wave background (SGWB) signals and CMB anisotropies. Bayes factor calculations strongly f…

    Submitted 6 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 29 pages, 6 figures

  7. arXiv:2405.07468  [pdf]

    cs.CL cs.AI

    Evaluating large language models in medical applications: a survey

    Authors: Xiaolan Chen, Jiayang Xiang, Shanfu Lu, Yexin Liu, Mingguang He, Danli Shi

    Abstract: Large language models (LLMs) have emerged as powerful tools with transformative potential across numerous domains, including healthcare and medicine. In the medical domain, LLMs hold promise for tasks ranging from clinical decision support to patient education. However, evaluating the performance of LLMs in medical contexts presents unique challenges due to the complex and critical nature of medic…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 4 figures, 1 table

  8. arXiv:2404.19152  [pdf]

    cond-mat.mtrl-sci

    Symmetry Strategy for Rapid Discovery of Abundant Fractional Quantum Ferroelectrics

    Authors: Guoliang Yu, Junyi Ji, Changsong Xu, H. J. Xiang

    Abstract: Traditional ferroelectrics are limited by Neumann's principle, which confines exploration of ferroelectrics within polar point groups. Our recent work [Nat. Commun. 15, 135, (2024)] proposes the concept of fractional quantum ferroelectricity (FQFE) that extends the playground of ferroelectricity to non-polar point groups. Here, we apply group theory and introduce an efficient symmetry strategy to i…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 16 pages, 4 figures

  9. arXiv:2404.15997  [pdf, other]

    cond-mat.str-el

    Spin Supersolid Phase and Double Magnon-Roton Excitations in a Cobalt-based Triangular Lattice

    Authors: Yuan Gao, Chuandi Zhang, Junsen Xiang, Dehong Yu, Xingye Lu, Peijie Sun, Wentao Jin, Gang Su, Wei Li

    Abstract: Supersolid is an exotic quantum state of matter that hosts spontaneously the features of both solid and superfluidity, which breaks the lattice translational symmetry and U(1) gauge symmetry. Here we conduct inelastic neutron scattering (INS) measurements and tensor-network calculations on the triangular-lattice cobaltate Na$_2$BaCo(PO$_4$)$_2$, which is proposed in [Xiang ${\it et al.}$, Nature 6…

    Submitted 24 April, 2024; originally announced April 2024.

  10. arXiv:2404.12833  [pdf, other]

    cs.SE

    How Far Can We Go with Practical Function-Level Program Repair?

    Authors: Jiahong Xiang, Xiaoyang Xu, Fanchu Kong, Mingyuan Wu, Haotian Zhang, Yuqun Zhang

    Abstract: Recently, multiple Automated Program Repair (APR) techniques based on Large Language Models (LLMs) have been proposed to enhance the repair performance. While these techniques mainly focus on the single-line or hunk-level repair, they face significant challenges in real-world application due to the limited repair task scope and costly statement-level fault localization. However, the more practical…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: https://github.com/GhabiX/SRepair/

  11. arXiv:2404.07833  [pdf]

    cs.CV cs.LG

    Streamlined Photoacoustic Image Processing with Foundation Models: A Training-Free Solution

    Authors: Handi Deng, Yucheng Zhou, Jiaxuan Xiang, Liujie Gu, Yan Luo, Hai Feng, Mingyuan Liu, Cheng Ma

    Abstract: Foundation models have rapidly evolved and have achieved significant accomplishments in computer vision tasks. Specifically, the prompt mechanism conveniently allows users to integrate image prior information into the model, making it possible to apply models without any training. Therefore, we propose a method based on foundation models and zero training to solve the tasks of photoacoustic (PA) i…

    Submitted 11 April, 2024; originally announced April 2024.

  12. arXiv:2404.06821  [pdf, other]

    math.AP

    Uniqueness to inverse acoustic and elastic medium scattering problems with hyper-singular source method

    Authors: Chun Liu, Guanghui Hu, Jianli Xiang, Jiayi Zhang

    Abstract: This paper is concerned with inverse scattering problems of determining the support of an isotropic and homogeneous penetrable body from knowledge of multi-static far-field patterns in acoustics and in linear elasticity. The normal derivative of the total fields admits no jump on the interface of the scatterer in the trace sense. If the contrast function of the refractive index function or the den…

    Submitted 10 April, 2024; originally announced April 2024.

  13. arXiv:2404.06760  [pdf, other]

    cs.CL cs.AI

    DiffusionDialog: A Diffusion Model for Diverse Dialog Generation with Latent Space

    Authors: Jianxiang Xiang, Zhenhua Liu, Haodong Liu, Yin Bai, Jia Cheng, Wenliang Chen

    Abstract: In real-life conversations, the content is diverse, and there exists the one-to-many problem that requires diverse generation. Previous studies attempted to introduce discrete or Gaussian-based continuous latent variables to address the one-to-many problem, but the diversity is limited. Recently, diffusion models have made breakthroughs in computer vision, and some attempts have been made in natur…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: LREC-COLING 2024 camera ready

  14. arXiv:2404.01600  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    C-type antiferromagnetic structure of topological semimetal CaMnSb$_2$

    Authors: Bo Li, Xu-Tao Zeng, Qianhui Xu, Fan Yang, Junsen Xiang, Hengyang Zhong, Sihao Deng, Lunhua He, Juping Xu, Wen Yin, Xingye Lu, Huiying Liu, Xian-Lei Sheng, Wentao Jin

    Abstract: Determination of the magnetic structure and confirmation of the presence or absence of inversion ($\mathcal{P}$) and time reversal ($\mathcal{T}$) symmetry is imperative for correctly understanding the topological magnetic materials. Here high-quality single crystals of the layered manganese pnictide CaMnSb$_2$ are synthesized using the self-flux method. De Haas-van Alphen oscillations indicate a…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 7 Pages, 6 figures

    Journal ref: Chinese Physics Letters 41, 037104 (2024)

  15. arXiv:2404.01592  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Structural, magnetic and magnetocaloric properties of triangular-lattice transition-metal phosphates

    Authors: Chuandi Zhang, Junsen Xiang, Quanliang Zhu, Longfei Wu, Shanfeng Zhang, Juping Xu, Wen Yin, Peijie Sun, Wei Li, Gang Su, Wentao Jin

    Abstract: The recent discovery of the spin supersolid candidate Na$_2$BaCo(PO$_4$)$_2$ stimulates numerous research interest on the triangular-lattice transition-metal phosphates. Here we report a comprehensive study on the structural, magnetic and magnetocaloric properties of polycrystalline Na$_2$$A$$T$(PO$_4$)$_2$ ($A$ = Ba, Sr; $T$ = Co, Ni, Mn). X-ray and neutron diffraction measurements confirm that N…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 10 Pages, 6 figures, accepted for publication in Physical Review Materials

    Journal ref: Physical Review Materials 8, 044409 (2024)

  16. arXiv:2404.00361  [pdf, other]

    cs.CL

    Controllable and Diverse Data Augmentation with Large Language Model for Low-Resource Open-Domain Dialogue Generation

    Authors: Zhenhua Liu, Tong Zhu, Jianxiang Xiang, Wenliang Chen

    Abstract: Data augmentation (DA) is crucial to mitigate model training instability and over-fitting problems in low-resource open-domain dialogue generation. However, traditional DA methods often neglect semantic data diversity, restricting the overall quality. Recently, large language models (LLM) have been used for DA to generate diversified dialogues. However, they have limited controllability and tend t…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 13 pages, 5 figures

  17. arXiv:2403.11503  [pdf, other]

    cs.CV

    Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors

    Authors: Ruicheng Wang, Jianfeng Xiang, Jiaolong Yang, Xin Tong

    Abstract: We propose a novel image editing technique that enables 3D manipulations on single images, such as object rotation and translation. Existing 3D-aware image editing approaches typically rely on synthetic multi-view datasets for training specialized models, thus constraining their effectiveness on open-domain images featuring significantly more varied layouts and styles. In contrast, our method dire…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Project page: https://wangrc.site/DiffCriticEdit/

  18. arXiv:2403.08204  [pdf, other]

    cs.LG cs.CV

    AutoDFP: Automatic Data-Free Pruning via Channel Similarity Reconstruction

    Authors: Siqi Li, Jun Chen, Jingyang Xiang, Chengrui Zhu, Yong Liu

    Abstract: Structured pruning methods are developed to bridge the gap between the massive scale of neural networks and the limited hardware resources. Most current structured pruning methods rely on training datasets to fine-tune the compressed model, resulting in high computational burdens and being inapplicable for scenarios with stringent requirements on privacy and security. As an alternative, some data-…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 11 pages, 16 figures

  19. arXiv:2403.05829  [pdf, ps, other]

    eess.SY cs.CR cs.ET cs.LO

    Measuring Robustness in Cyber-Physical Systems under Sensor Attacks

    Authors: Jian Xiang, Ruggero Lanotte, Simone Tini, Stephen Chong, Massimo Merro

    Abstract: This paper contributes a formal framework for quantitative analysis of bounded sensor attacks on cyber-physical systems, using the formalism of differential dynamic logic. Given a precondition and postcondition of a system, we formalize two quantitative safety notions, quantitative forward and backward safety, which respectively express (1) how strong the strongest postcondition of the system is w…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Preprint submitted to Elsevier

  20. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  21. arXiv:2403.02110  [pdf, ps, other]

    astro-ph.SR

    Generalized Coronal Loop Scaling Laws and Their Implication for Turbulence in Solar Active Region Loops

    Authors: Y. Dai, J. J. Xiang, M. D. Ding

    Abstract: Recent coronal loop modeling has emphasized the importance of combining both Coulomb collisions and turbulent scattering to characterize field-aligned thermal conduction, which invokes a hybrid loop model. In this work we generalize the hybrid model by incorporating nonuniform heating and cross section that are both formulated by a power-law function of temperature. Based on the hybrid model solut…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 18 pages, 8 figures and 2 tables, accepted for publication in The Astrophysical Journal

  22. arXiv:2402.17262  [pdf, other]

    cs.CL cs.AI

    Speak Out of Turn: Safety Vulnerability of Large Language Models in Multi-turn Dialogue

    Authors: Zhenhong Zhou, Jiuyang Xiang, Haopeng Chen, Quan Liu, Zherui Li, Sen Su

    Abstract: Large Language Models (LLMs) have been demonstrated to generate illegal or unethical responses, particularly when subjected to "jailbreak." Research on jailbreak has highlighted the safety issues of LLMs. However, prior studies have predominantly focused on single-turn dialogue, ignoring the potential complexities and risks presented by multi-turn dialogue, a crucial mode through which humans deri…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Work in progress. 23 pages, 18 figures

  23. arXiv:2402.16043  [pdf, other]

    cs.CR cs.SE

    LuaTaint: A Static Taint Analysis System for Web Interface Framework Vulnerability of IoT Devices

    Authors: Jiahui Xiang, Wenhai Wang, Tong Ye, Peiyu Liu

    Abstract: IoT devices are currently facing continuous malicious attacks due to their widespread use. Among these IoT devices, web vulnerabilities are also widely exploited because of their inherent characteristics, such as improper permission controls and insecure interfaces. Recently, the embedded system web interface framework has become highly diverse, and specific vulnerabilities can arise if developers…

    Submitted 25 February, 2024; originally announced February 2024.

  24. arXiv:2402.07788  [pdf, other]

    cs.CL

    Multi-Intent Attribute-Aware Text Matching in Searching

    Authors: Mingzhe Li, Xiuying Chen, Jing Xiang, Qishen Zhang, Changsheng Ma, Chenchen Dai, Jinxiong Chang, Zhongyi Liu, Guannan Zhang

    Abstract: Text matching systems have become a fundamental service in most searching platforms. For instance, they are responsible for matching user queries to relevant candidate items, or rewriting the user-input query to a pre-selected high-performing one for a better search experience. In practice, both the queries and items often contain multiple attributes, such as the category of the item and the locat…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 9 pages

  25. arXiv:2401.08743  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    MMToM-QA: Multimodal Theory of Mind Question Answering

    Authors: Chuanyang Jin, Yutong Wu, Jing Cao, Jiannan Xiang, Yen-Ling Kuo, Zhiting Hu, Tomer Ullman, Antonio Torralba, Joshua B. Tenenbaum, Tianmin Shu

    Abstract: Theory of Mind (ToM), the ability to understand people's mental states, is an essential ingredient for developing machines with human-level social intelligence. Recent machine learning models, particularly large language models, seem to show some aspects of ToM understanding. However, existing ToM benchmarks use unimodal datasets - either video or text. Human ToM, on the other hand, is more than v…

    Submitted 15 June, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: ACL 2024. 26 pages, 11 figures, 7 tables

  26. arXiv:2312.15430  [pdf, other]

    cs.CV

    Make-A-Character: High Quality Text-to-3D Character Generation within Minutes

    Authors: Jianqiang Ren, Chao He, Lin Liu, Jiahao Chen, Yutong Wang, Yafei Song, Jianfang Li, Tangli Xue, Siqi Hu, Tao Chen, Kunkun Zheng, Jianjing Xiang, Liefeng Bo

    Abstract: There is a growing demand for customized and expressive 3D characters with the emergence of AI agents and Metaverse, but creating 3D characters using traditional computer graphics tools is a complex and time-consuming task. To address these challenges, we propose a user-friendly framework named Make-A-Character (Mach) to create lifelike 3D avatars from text descriptions. The framework leverages th…

    Submitted 24 December, 2023; originally announced December 2023.

    Comments: Technical Report

  27. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  28. arXiv:2312.11555  [pdf, other]

    cs.CV

    CR-SFP: Learning Consistent Representation for Soft Filter Pruning

    Authors: Jingyang Xiang, Zhuangzhi Chen, Jianbiao Mei, Siqi Li, Jun Chen, Yong Liu

    Abstract: Soft filter pruning (SFP) has emerged as an effective pruning technique for allowing pruned filters to update and the opportunity for them to regrow to the network. However, this pruning strategy applies training and pruning in an alternative manner, which inevitably causes inconsistent representations between the reconstructed network (R-NN) at the training and the pruned network (P-NN) at the in…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: 11 pages, 4 figures

  29. arXiv:2312.07061  [pdf, other]

    cs.CV

    MaxQ: Multi-Axis Query for N:M Sparsity Network

    Authors: Jingyang Xiang, Siqi Li, Junhao Chen, Zhuangzhi Chen, Tianxin Huang, Linpeng Peng, Yong Liu

    Abstract: N:M sparsity has received increasing attention due to its remarkable performance and latency trade-off compared with structured and unstructured sparsity. However, existing N:M sparsity methods do not differentiate the relative importance of weights among blocks and leave important weights underappreciated. Besides, they directly apply N:M sparsity to the whole network, which will cause severe inf…

    Submitted 16 March, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted by the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (CVPR2024)

  30. arXiv:2312.02214  [pdf, other]

    cs.CV cs.GR

    FlashAvatar: High-fidelity Head Avatar with Efficient Gaussian Embedding

    Authors: Jun Xiang, Xuan Gao, Yudong Guo, Juyong Zhang

    Abstract: We propose FlashAvatar, a novel and lightweight 3D animatable avatar representation that could reconstruct a digital avatar from a short monocular video sequence in minutes and render high-fidelity photo-realistic images at 300FPS on a consumer-grade GPU. To achieve this, we maintain a uniform 3D Gaussian field embedded in the surface of a parametric face model and learn extra spatial offset to mo…

    Submitted 29 March, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: Project page: https://ustc3dv.github.io/FlashAvatar/

  31. arXiv:2312.01791  [pdf, other]

    nucl-th

    Coupling shape and pairing vibrations in a collective Hamiltonian based on nuclear energy density functionals (II): low-energy excitation spectra of triaxial nuclei

    Authors: J. Xiang, Z. P. Li, T. Nikšić, D. Vretenar, W. H. Long, X. Y. Wu

    Abstract: The triaxial quadrupole collective Hamiltonian, based on relativistic energy density functionals, is extended to include a pairing collective coordinate. In addition to triaxial shape vibrations and rotations, the model describes pairing vibrations and the coupling between triaxial shape and pairing degrees of freedom. The parameters of the collective Hamiltonian are determined by a covariant ener…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: 11 pages, 8 figures, Submitted to Phys. Rev. C. arXiv admin note: text overlap with arXiv:2002.00327

  32. arXiv:2311.12185  [pdf, other]

    cs.RO

    Kitchen Artist: Precise Control of Liquid Dispensing for Gourmet Plating

    Authors: Hung-Jui Huang, Jingyi Xiang, Wenzhen Yuan

    Abstract: Manipulating liquid is widely required for many tasks, especially in cooking. A common way to address this is extruding viscous liquid from a squeeze bottle. In this work, our goal is to create a sauce plating robot, which requires precise control of the thickness of squeezed liquids on a surface. Different liquids demand different manipulation policies. We command the robot to tilt the container…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Submitted to ICRA 2024

  33. arXiv:2310.15437  [pdf]

    physics.optics

    Ultrabroadband, high color purity multispectral color filter arrays

    Authors: Jiewei Xiang, Meiting Song, Yi Zhang, Jennifer Kruschwitz, Jaime Cardenas

    Abstract: Multispectral imagers that capture spatial and spectral information are of growing importance in various fields, particularly in remote sensing and metrology. To enable integrated snapshot multispectral imagers and eliminate the drawbacks of traditional systems, such as bulkiness and slow scanning mechanisms, miniature, broadband multispectral filter arrays with narrow line widths, high transmissi…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 24 pages, 5 figures, article

  34. arXiv:2310.13245  [pdf, other]

    cs.RO

    Simultaneous Shape Tracking of Multiple Deformable Linear Objects with Global-Local Topology Preservation

    Authors: Jingyi Xiang, Holly Dinkel

    Abstract: This work presents an algorithm for tracking the shape of multiple entangling Deformable Linear Objects (DLOs) from a sequence of RGB-D images. This algorithm runs in real-time and improves on previous single-DLO tracking approaches by enabling tracking of multiple objects. This is achieved using Global-Local Topology Preservation (GLTP). This work uses the geodesic distance in GLTP to define the…

    Submitted 23 October, 2023; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: 3 pages, 3 figures, presented at the 3rd Workshop on Representing and Manipulating Deformable Objects at the IEEE International Conference on Robotics and Automation. Video presentation [https://youtu.be/hfiqwMxitqA]. 3rd Workshop on Representing and Manipulating Deformable Objects [https://deformable-workshop.github.io/icra2023/]

  35. arXiv:2310.12987  [pdf, other]

    eess.IV cs.CV cs.GR

    Spec-NeRF: Multi-spectral Neural Radiance Fields

    Authors: Jiabao Li, Yuqi Li, Ciliang Sun, Chong Wang, Jinhui Xiang

    Abstract: We propose Multi-spectral Neural Radiance Fields (Spec-NeRF) for jointly reconstructing a multispectral radiance field and spectral sensitivity functions (SSFs) of the camera from a set of color images filtered by different filters. The proposed method focuses on modeling the physical imaging process, and applies the estimated SSFs and radiance field to synthesize novel views of multispectral scenes…

    Submitted 14 September, 2023; originally announced October 2023.

  36. arXiv:2310.12004  [pdf, other]

    cs.CV

    Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach

    Authors: Feng Luo, Jinxi Xiang, Jun Zhang, Xiao Han, Wei Yang

    Abstract: The recent use of diffusion prior, enhanced by pre-trained text-image models, has markedly elevated the performance of image super-resolution (SR). To alleviate the huge computational cost required by pixel-based diffusion SR, latent-based methods utilize a feature encoder to transform the image and then implement the SR image generation in a compact latent space. Nevertheless, there are two major…

    Submitted 13 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: 15 pages, 7 figures

  37. arXiv:2310.10292  [pdf, other]

    cs.CV cs.MM

    Effortless Cross-Platform Video Codec: A Codebook-Based Method

    Authors: Kuan Tian, Yonghang Guan, Jinxi Xiang, Jun Zhang, Xiao Han, Wei Yang

    Abstract: Under certain circumstances, advanced neural video codecs can surpass the most complex traditional codecs in their rate-distortion (RD) performance. One of the main reasons for the high performance of existing neural video codecs is the use of the entropy model, which can provide more accurate probability distribution estimations for compressing the latents. This also implies the rigorous requirem…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 15 pages, 11 figures

  38. arXiv:2310.06218  [pdf, other]

    cs.LG cs.AI

    SUBP: Soft Uniform Block Pruning for 1xN Sparse CNNs Multithreading Acceleration

    Authors: Jingyang Xiang, Siqi Li, Jun Chen, Shipeng Bai, Yukai Ma, Guang Dai, Yong Liu

    Abstract: The study of sparsity in Convolutional Neural Networks (CNNs) has become widespread to compress and accelerate models in environments with limited resources. By constraining N consecutive weights along the output channel to be group-wise non-zero, the recent network with 1$\times$N sparsity has received tremendous popularity for its three outstanding advantages: 1) A large amount of storage space…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: 14 pages, 4 figures, Accepted by 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  39. arXiv:2310.05391  [pdf, other]

    cs.GR cs.CV

    Neural Impostor: Editing Neural Radiance Fields with Explicit Shape Manipulation

    Authors: Ruiyang Liu, Jinxu Xiang, Bowen Zhao, Ran Zhang, Jingyi Yu, Changxi Zheng

    Abstract: Neural Radiance Fields (NeRF) have significantly advanced the generation of highly realistic and expressive 3D scenes. However, the task of editing NeRF, particularly in terms of geometry modification, poses a significant challenge. This issue has obstructed NeRF's wider adoption across various applications. To tackle the problem of efficiently editing neural implicit fields, we introduce Neural I…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted at Pacific Graphics 2023 and Computer Graphics Forum

  40. Towards Real-Time Neural Video Codec for Cross-Platform Application Using Calibration Information

    Authors: Kuan Tian, Yonghang Guan, Jinxi Xiang, Jun Zhang, Xiao Han, Wei Yang

    Abstract: The state-of-the-art neural video codecs have outperformed the most sophisticated traditional codecs in terms of RD performance in certain cases. However, utilizing them for practical applications is still challenging for two major reasons. 1) Cross-platform computational errors resulting from floating point operations can lead to inaccurate decoding of the bitstream. 2) The high computational com…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: 14 pages

  41. Non-parametric Ensemble Empirical Mode Decomposition for extracting weak features to identify bearing defects

    Authors: Anil Kumar, Yaakoub Berrouche, Radosław Zimroz, Govind Vashishtha, Sumika Chauhan, C. P. Gandhi, Hesheng Tang, Jiawei Xiang

    Abstract: A non-parametric complementary ensemble empirical mode decomposition (NPCEEMD) is proposed for identifying bearing defects using weak features. NPCEEMD is non-parametric because, unlike existing decomposition methods such as ensemble empirical mode decomposition, it does not require defining the ideal SNR of noise and the number of ensembles, every time while processing the signals. The simulation…

    Submitted 2 October, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

    Journal ref: Measurement 211, 112615 (2023)

  42. arXiv:2309.02186  [pdf, other]

    cs.CV cs.AI cs.GR

    AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections

    Authors: Yue Wu, Sicheng Xu, Jianfeng Xiang, Fangyun Wei, Qifeng Chen, Jiaolong Yang, Xin Tong

    Abstract: Previous animatable 3D-aware GANs for human generation have primarily focused on either the human head or full body. However, head-only videos are relatively uncommon in real life, and full body generation typically does not deal with facial expression control and still has challenges in generating high-quality results. Towards applicable video avatars, we present an animatable 3D-aware GAN that g…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: SIGGRAPH Asia 2023. Project Page: https://yuewuhkust.github.io/AniPortraitGAN/

  43. arXiv:2308.15727  [pdf, other]

    cs.CL

    Quantifying and Analyzing Entity-level Memorization in Large Language Models

    Authors: Zhenhong Zhou, Jiuyang Xiang, Chaomeng Chen, Sen Su

    Abstract: Large language models (LLMs) have been proven capable of memorizing their training data, which can be extracted through specifically designed prompts. As the scale of datasets continues to grow, privacy risks arising from memorization have attracted increasing attention. Quantifying language model memorization helps evaluate potential privacy risks. However, prior works on quantifying memorization…

    Submitted 5 November, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

    Comments: 9 pages, 7 figures

  44. arXiv:2308.07733  [pdf, other]

    eess.IV cs.CV cs.MM

    Dynamic Low-Rank Instance Adaptation for Universal Neural Image Compression

    Authors: Yue Lv, Jinxi Xiang, Jun Zhang, Wenming Yang, Xiao Han, Wei Yang

    Abstract: The latest advancements in neural image compression show great potential in surpassing the rate-distortion performance of conventional standard codecs. Nevertheless, there exists an indelible domain gap between the datasets utilized for training (i.e., natural images) and those utilized for inference (e.g., artistic images). Our proposal involves a low-rank adaptation approach aimed at addressing…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023, 13 pages, 12 figures

    ACM Class: I.4.2; E.4

  45. arXiv:2308.02195  [pdf, ps, other]

    math.PR

    Stochastic averaging principle and stability for multi-valued McKean-Vlasov stochastic differential equations with jumps

    Authors: Guangjun Shen, Jie Xiang, Jiang-Lun Wu

    Abstract: In this paper, we consider the stochastic averaging principle and stability for multi-valued McKean-Vlasov stochastic differential equations with jumps. First, under certain averaging conditions, we are able to show that the solutions of the equations concerned can be approximated by solutions of the associated averaged multi-valued McKean-Vlasov stochastic differential equations with jumps in the…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: 31 pages. arXiv admin note: text overlap with arXiv:2106.12080 by other authors

  46. arXiv:2307.13300  [pdf, other]

    cs.CV

    Mini-PointNetPlus: a local feature descriptor in deep learning model for 3d environment perception

    Authors: Chuanyu Luo, Nuo Cheng, Sikun Ma, Jun Xiang, Xiaohan Li, Shengguang Lei, Pu Li

    Abstract: Common deep learning models for 3D environment perception often use pillarization/voxelization methods to convert point cloud data into pillars/voxels and then process it with a 2D/3D convolutional neural network (CNN). The pioneer work PointNet has been widely applied as a local feature descriptor, a fundamental component in deep learning models for 3D perception, to extract features of a point c…

    Submitted 25 July, 2023; originally announced July 2023.

  47. arXiv:2307.09831  [pdf, other]

    cs.AI

    A Fast and Map-Free Model for Trajectory Prediction in Traffics

    Authors: Junhong Xiang, Jingmin Zhang, Zhixiong Nan

    Abstract: To handle the two shortcomings of existing methods, (i) nearly all models rely on high-definition (HD) maps, yet the map information is not always available in real traffic scenes and HD map-building is expensive and time-consuming and (ii) existing models usually focus on improving prediction accuracy at the expense of reducing computing efficiency, yet the efficiency is crucial for various real a…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: 7 pages, 3 figures

  48. arXiv:2306.08177  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall physics.comp-ph quant-ph

    Nonlinear phonon Hall effects in ferroelectrics: its existence and non-volatile electrical control

    Authors: W. Luo, J. Y. Ji, P. Chen, Y. Xu, L. F. Zhang, H. J. Xiang, L. Bellaiche

    Abstract: Nonlinear Hall effects have been previously investigated in non-centrosymmetric systems for electronic systems. However, they only exist in metallic systems and are not compatible with ferroelectrics since these latter are insulators, hence limiting their applications. On the other hand, ferroelectrics naturally break inversion symmetry and can induce a non-zero Berry curvature. Here, we show that…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: 16 pages, 2 figures

    Report number: https://journals.aps.org/prb/pdf/10.1103/PhysRevB.107.L241107

    Journal ref: Letter in PRB (2023)

  49. arXiv:2305.10626  [pdf, other]

    cs.CL cs.AI cs.LG

    Language Models Meet World Models: Embodied Experiences Enhance Language Models

    Authors: Jiannan Xiang, Tianhua Tao, Yi Gu, Tianmin Shu, Zirui Wang, Zichao Yang, Zhiting Hu

    Abstract: While large language models (LMs) have shown remarkable capabilities across numerous tasks, they often struggle with simple reasoning and planning in physical environments, such as understanding object permanence or planning household activities. The limitation arises from the fact that LMs are trained only on written text and miss essential embodied knowledge and skills. In this paper, we propose…

    Submitted 28 October, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  50. arXiv:2304.13240  [pdf, other]

    cs.CV cs.LG

    Structure Diagram Recognition in Financial Announcements

    Authors: Meixuan Qiao, Jun Wang, Junfu Xiang, Qiyu Hou, Ruixuan Li

    Abstract: Accurately extracting structured data from structure diagrams in financial announcements is of great practical importance for building financial knowledge graphs and further improving the efficiency of various financial applications. First, we proposed a new method for recognizing structure diagrams in financial announcements, which can better detect and extract different types of connecting lines…

    Submitted 1 May, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: ICDAR 2023