Showing 1–50 of 289 results for author: Zhang, E

  1. arXiv:2406.16828  [pdf, other]

    cs.IR cs.AI cs.CL

    Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track

    Authors: Ronak Pradeep, Nandan Thakur, Sahel Sharifymoghaddam, Eric Zhang, Ryan Nguyen, Daniel Campos, Nick Craswell, Jimmy Lin

    Abstract: Did you try out the new Bing Search? Or maybe you fiddled around with Google AI Overviews? These might sound familiar because the modern-day search stack has recently evolved to include retrieval-augmented generation (RAG) systems. They allow searching and incorporating real-time data into large language models (LLMs) to provide a well-informed, attributed, concise summary in contrast to the tradi…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2406.13869  [pdf, other]

    cs.LG q-bio.BM

    Global Human-guided Counterfactual Explanations for Molecular Properties via Reinforcement Learning

    Authors: Danqing Wang, Antonis Antoniades, Kha-Dinh Luong, Edwin Zhang, Mert Kosan, Jiachen Li, Ambuj Singh, William Yang Wang, Lei Li

    Abstract: Counterfactual explanations of Graph Neural Networks (GNNs) offer a powerful way to understand data that can naturally be represented by a graph structure. Furthermore, in many domains, it is highly desirable to derive data-driven global explanations or rules that can better explain the high-level properties of the models and data in question. However, evaluating global counterfactual explanations…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024

  3. arXiv:2406.13085  [pdf]

    physics.app-ph cond-mat.mes-hall cond-mat.mtrl-sci

    Ultralow thermal conductance across the [FePt/h-BN/FePt] interface

    Authors: Chengchao Xu, Enbo Zhang, Bo-Yuan Yang, B. S. D. Ch. S. Varaprasad, David E. Laughlin, Jian-Gang Zhu

    Abstract: Heat transfer in nanocomposite materials has attracted great interest for various applications. Multilayer structures provide an important platform to study interfacial thermal transport and to engineer materials with ultralow thermal conductivity. Here we report on the fabrication and thermal characterization of [h-BN/$L1_0$-FePt]xN multilayers, where hexagonal boron nitride (h-BN) nanosheets (2.…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 22 pages, 5 figures

  4. arXiv:2406.11741  [pdf, other]

    cs.LG cs.AI

    Transcendence: Generative Models Can Outperform The Experts That Train Them

    Authors: Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham M. Kakade, Eran Malach

    Abstract: Generative models are trained with the simple objective of imitating the conditional probability distribution induced by the data they are trained on. Therefore, when trained on data generated by humans, we may not expect the artificial model to outperform the humans on their original objectives. In this work, we study the phenomenon of transcendence: when a generative model achieves capabilities…

    Submitted 28 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Code, models, and data at https://transcendence.eddie.win

  5. arXiv:2406.10338  [pdf, other]

    astro-ph.GA astro-ph.CO astro-ph.HE

    Bursty Star Formation in Dwarfs is Sensitive to Numerical Choices in Supernova Feedback Models

    Authors: Eric Zhang, Laura V Sales, Federico Marinacci, Paul Torrey, Mark Vogelsberger, Volker Springel, Hui Li, Rüdiger Pakmor, Thales A Gutcke

    Abstract: Simulations of galaxy formation are mostly unable to resolve the energy-conserving phase of individual supernova events, having to resort to subgrid models to distribute the energy and momentum resulting from stellar feedback. However, the properties of these simulated galaxies, including the morphology, stellar mass formed and the burstiness of the star formation history, are highly sensitive to…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Submitted ApJ; 15 pages, 12 figures; comments welcome

  6. arXiv:2406.10057  [pdf, other]

    cs.CV cs.AI

    First Multi-Dimensional Evaluation of Flowchart Comprehension for Multimodal Large Language Models

    Authors: Enming Zhang, Ruobing Yao, Huanyong Liu, Junhui Yu, Jiale Wang

    Abstract: With the development of Multimodal Large Language Models (MLLMs) technology, its general capabilities are increasingly powerful. To evaluate the various abilities of MLLMs, numerous evaluation systems have emerged. But now there is still a lack of a comprehensive method to evaluate MLLMs in the tasks related to flowcharts, which are very important in daily life and work. We propose the first compr…

    Submitted 18 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  7. arXiv:2406.09215  [pdf, other]

    cs.IR cs.AI

    On Softmax Direct Preference Optimization for Recommendation

    Authors: Yuxin Chen, Junfei Tan, An Zhang, Zhengyi Yang, Leheng Sheng, Enzhi Zhang, Xiang Wang, Tat-Seng Chua

    Abstract: Recommender systems aim to predict personalized rankings based on user preference data. With the rise of Language Models (LMs), LM-based recommenders have been widely explored due to their extensive world knowledge and powerful reasoning abilities. Most of the LM-based recommenders convert historical interactions into language prompts, pairing with a positive item as the target response and fine-t…

    Submitted 14 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  8. arXiv:2406.05633  [pdf, other]

    stat.ML cs.LG econ.EM

    Heterogeneous Treatment Effects in Panel Data

    Authors: Retsef Levi, Elisabeth Paulson, Georgia Perakis, Emily Zhang

    Abstract: We address a core problem in causal inference: estimating heterogeneous treatment effects using panel data with general treatment patterns. Many existing methods either do not utilize the potential underlying structure in panel data or have limitations in the allowable treatment patterns. In this work, we propose and evaluate a new method that first partitions observations into disjoint clusters w…

    Submitted 9 June, 2024; originally announced June 2024.

  9. arXiv:2406.05436  [pdf, other]

    cs.NE

    Introducing Competitive Mechanism to Differential Evolution for Numerical Optimization

    Authors: Rui Zhong, Yang Cao, Enzhi Zhang, Masaharu Munetomo

    Abstract: This paper introduces a novel competitive mechanism into differential evolution (DE), presenting an effective DE variant named competitive DE (CDE). CDE features a simple yet efficient mutation strategy: DE/winner-to-best/1. Essentially, the proposed DE/winner-to-best/1 strategy can be recognized as an intelligent integration of the existing mutation strategies of DE/rand-to-best/1 and DE/cur-to-b…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted by The 30th Int'l Conf on Parallel and Distributed Processing Techniques and Applications (PDPTA'24)

  10. arXiv:2406.00005  [pdf, other]

    cs.IR cs.AI

    Disentangling Specificity for Abstractive Multi-document Summarization

    Authors: Congbo Ma, Wei Emma Zhang, Hu Wang, Haojie Zhuang, Mingyu Guo

    Abstract: Multi-document summarization (MDS) generates a summary from a document set. Each document in a set describes topic-relevant concepts, while per document also has its unique contents. However, the document specificity receives little attention from existing MDS approaches. Neglecting specific information for each document limits the comprehensiveness of the generated summaries. To solve this proble…

    Submitted 12 May, 2024; originally announced June 2024.

    Comments: The IEEE World Congress on Computational Intelligence (WCCI 2024)

  11. arXiv:2405.14225  [pdf, other]

    q-bio.QM cs.CL cs.MM

    ReactXT: Understanding Molecular "Reaction-ship" via Reaction-Contextualized Molecule-Text Pretraining

    Authors: Zhiyuan Liu, Yaorui Shi, An Zhang, Sihang Li, Enzhi Zhang, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua

    Abstract: Molecule-text modeling, which aims to facilitate molecule-relevant tasks with a textual interface and textual knowledge, is an emerging research direction. Beyond single molecules, studying reaction-text modeling holds promise for helping the synthesis of new materials and drugs. However, previous works mostly neglect reaction-text modeling: they primarily focus on modeling individual molecule-tex…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: ACL 2024 Findings, 9 pages

  12. arXiv:2405.12833  [pdf, other]

    cs.CV

    A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data

    Authors: Xinyi Wang, Grazziela Figueredo, Ruizhe Li, Wei Emma Zhang, Weitong Chen, Xin Chen

    Abstract: Automatic radiology report generation can alleviate the workload for physicians and minimize regional disparities in medical resources, therefore becoming an important topic in the medical image analysis field. It is a challenging task, as the computational model needs to mimic physicians to obtain information from multi-modal input data (i.e., medical images, clinical information, medical knowled…

    Submitted 21 May, 2024; originally announced May 2024.

  13. arXiv:2405.12735  [pdf, other]

    astro-ph.GA

    Multiple chemical tracers finally unveil the intricate NGC 1333 IRAS 4A outflow system. FAUST XVI

    Authors: Layal Chahine, Cecilia Ceccarelli, Marta De Simone, Claire J. Chandler, Claudio Codella, Linda Podio, Ana López-Sepulcre, Nami Sakai, Laurent Loinard, Mathilde Bouvier, Paola Caselli, Charlotte Vastel, Eleonora Bianchi, Nicolás Cuello, Francesco Fontani, Doug Johnstone, Giovanni Sabatini, Tomoyuki Hanawa, Ziwei E. Zhang, Yuri Aikawa, Gemma Busquet, Emmanuel Caux, Aurore Durán, Eric Herbst, François Ménard, et al. (32 additional authors not shown)

    Abstract: The exploration of outflows in protobinary systems presents a challenging yet crucial endeavour, offering valuable insights into the dynamic interplay between protostars and their evolution. In this study, we examine the morphology and dynamics of jets and outflows within the IRAS 4A protobinary system. This analysis is based on ALMA observations of SiO(5--4), H$_2$CO(3$_{0,3}$--2$_{0,3}$), and H…

    Submitted 21 May, 2024; originally announced May 2024.

  14. arXiv:2405.12564  [pdf, other]

    q-bio.QM cs.CL cs.MM

    ProtT3: Protein-to-Text Generation for Text-based Protein Understanding

    Authors: Zhiyuan Liu, An Zhang, Hao Fei, Enzhi Zhang, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua

    Abstract: Language Models (LMs) excel in understanding textual descriptions of proteins, as evident in biomedical question-answering tasks. However, their capability falters with raw protein data, such as amino acid sequences, due to a deficit in pretraining on such data. Conversely, Protein Language Models (PLMs) can understand and convert protein data into high-quality representations, but struggle to pro…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: ACL 2024, 9 pages

  15. arXiv:2405.12380  [pdf, other]

    cs.LG physics.comp-ph

    Large scale scattering using fast solvers based on neural operators

    Authors: Zongren Zou, Adar Kahana, Enrui Zhang, Eli Turkel, Rishikesh Ranade, Jay Pathak, George Em Karniadakis

    Abstract: We extend a recently proposed machine-learning-based iterative solver, i.e. the hybrid iterative transferable solver (HINTS), to solve the scattering problem described by the Helmholtz equation in an exterior domain with a complex absorbing boundary condition. The HINTS method combines neural operators (NOs) with standard iterative solvers, e.g. Jacobi and Gauss-Seidel (GS), to achieve better perf…

    Submitted 20 May, 2024; originally announced May 2024.

  16. arXiv:2405.01461  [pdf, other]

    cs.CV

    SATO: Stable Text-to-Motion Framework

    Authors: Wenshuo Chen, Hongru Xiao, Erhang Zhang, Lijie Hu, Lei Wang, Mengyuan Liu, Chen Chen

    Abstract: Is the Text to Motion model robust? Recent advancements in Text to Motion models primarily stem from more accurate predictions of specific actions. However, the text modality typically relies solely on pre-trained Contrastive Language-Image Pretraining (CLIP) models. Our research has uncovered a significant issue with the text-to-motion model: its predictions often exhibit inconsistent outputs, re…

    Submitted 3 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

  17. arXiv:2404.16935  [pdf, other]

    cond-mat.str-el

    Disentangling spin excitation continua in classical and quantum magnets using 2D nonlinear spectroscopy

    Authors: Emily Z. Zhang, Ciarán Hickey, Yong Baek Kim

    Abstract: Inelastic neutron scattering (INS) has traditionally been one of the primary methods for investigating quantum magnets, particularly in identifying a continuum of excitations as a hallmark of spin fractionalization in quantum spin liquids (QSLs). However, INS faces severe limitations due to its inability to distinguish between such QSL signatures and similar excitation continua arising from highly…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 10 pages, 5+1 figures

  18. arXiv:2404.14949  [pdf, other]

    cs.CV

    Multi-Modal Prompt Learning on Blind Image Quality Assessment

    Authors: Wensheng Pan, Timin Gao, Yan Zhang, Runze Hu, Xiawu Zheng, Enwei Zhang, Yuting Gao, Yutao Liu, Yunhang Shen, Ke Li, Shengchuan Zhang, Liujuan Cao, Rongrong Ji

    Abstract: Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly. Currently, leveraging semantic information to enhance IQA is a crucial research direction. Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semant…

    Submitted 18 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  19. arXiv:2404.10357  [pdf, other]

    cs.CV

    Optimization of Prompt Learning via Multi-Knowledge Representation for Vision-Language Models

    Authors: Enming Zhang, Bingke Zhu, Yingying Chen, Qinghai Miao, Ming Tang, Jinqiao Wang

    Abstract: Vision-Language Models (VLMs), such as CLIP, play a foundational role in various cross-modal applications. To fully leverage VLMs' potential in adapting to downstream tasks, context optimization methods like Prompt Tuning are essential. However, one key limitation is the lack of diversity in prompt templates, whether they are hand-crafted or learned through additional modules. This limitation rest…

    Submitted 16 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  20. arXiv:2404.09707  [pdf, other]

    cs.CV cs.AI cs.LG

    Adaptive Patching for High-resolution Image Segmentation with Transformers

    Authors: Enzhi Zhang, Isaac Lyngaas, Peng Chen, Xiao Wang, Jun Igarashi, Yuankai Huo, Mohamed Wahib, Masaharu Munetomo

    Abstract: Attention-based models are proliferating in the space of image analytics, including segmentation. The standard method of feeding images to transformer encoders is to divide the images into patches and then feed the patches to the model as a linear sequence of tokens. For high-resolution images, e.g. microscopic pathology images, the quadratic compute and memory cost prohibits the use of an attenti…

    Submitted 15 April, 2024; originally announced April 2024.

  21. arXiv:2404.09235  [pdf, other]

    astro-ph.GA

    PDRs4All IX. Sulfur elemental abundance in the Orion Bar

    Authors: Asunción Fuente, Evelyne Roueff, Franck Le Petit, Jacques Le Bourlot, Emeric Bron, Mark G. Wolfire, James F. Babb, Pei-Gen Yan, Takashi Onaka, John H. Black, Ilane Schroetter, Dries Van De Putte, Ameek Sidhu, Amélie Canin, Boris Trahin, Felipe Alarcón, Ryan Chown, Olga Kannavou, Olivier Berné, Emilie Habart, Els Peeters, Javier R. Goicoechea, Marion Zannese, Raphael Meshaka, Yoko Okada, et al. (9 additional authors not shown)

    Abstract: One of the main problems in astrochemistry is determining the amount of sulfur in volatiles and refractories in the interstellar medium. The detection of the main sulfur reservoirs (icy H$_2$S and atomic gas) has been challenging, and estimates are based on the reliability of models to account for the abundances of species containing less than 1% of the total sulfur. The high sensitivity of the Ja…

    Submitted 4 June, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: 16 pages, 6 figures. Accepted for publication in Astronomy and Astrophysics

  22. arXiv:2404.08675  [pdf, other]

    cs.IR cs.AI cs.CL

    RecGPT: Generative Personalized Prompts for Sequential Recommendation via ChatGPT Training Paradigm

    Authors: Yabin Zhang, Wenhui Yu, Erhan Zhang, Xu Chen, Lantao Hu, Peng Jiang, Kun Gai

    Abstract: ChatGPT has achieved remarkable success in natural language understanding. Considering that recommendation is indeed a conversation between users and the system with items as words, which has similar underlying pattern with ChatGPT, we design a new chat framework in item index level for the recommendation task. Our novelty mainly contains three parts: model, training and inference. For the model p…

    Submitted 6 April, 2024; originally announced April 2024.

  23. PDRs4All VIII: Mid-IR emission line inventory of the Orion Bar

    Authors: Dries Van De Putte, Raphael Meshaka, Boris Trahin, Emilie Habart, Els Peeters, Olivier Berné, Felipe Alarcón, Amélie Canin, Ryan Chown, Ilane Schroetter, Ameek Sidhu, Christiaan Boersma, Emeric Bron, Emmanuel Dartois, Javier R. Goicoechea, Karl D. Gordon, Takashi Onaka, Alexander G. G. M. Tielens, Laurent Verstraete, Mark G. Wolfire, Alain Abergel, Edwin A. Bergin, Jeronimo Bernard-Salas, Jan Cami, Sara Cuadrado, et al. (113 additional authors not shown)

    Abstract: Mid-infrared emission features probe the properties of ionized gas, and hot or warm molecular gas. The Orion Bar is a frequently studied photodissociation region (PDR) containing large amounts of gas under these conditions, and was observed with the MIRI IFU aboard JWST as part of the "PDRs4All" program. The resulting IR spectroscopic images of high angular resolution (0.2") reveal a rich observat…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 26 pages, 12 figures, 3 tables. Submitted to A&A, under review (1st revision)

    Journal ref: A&A 687, A86 (2024)

  24. arXiv:2403.18108  [pdf, other]

    astro-ph.GA astro-ph.SR

    FAUST XIII. Dusty cavity and molecular shock driven by IRS7B in the Corona Australis cluster

    Authors: G. Sabatini, L. Podio, C. Codella, Y. Watanabe, M. De Simone, E. Bianchi, C. Ceccarelli, C. J. Chandler, N. Sakai, B. Svoboda, L. Testi, Y. Aikawa, N. Balucani, M. Bouvier, P. Caselli, E. Caux, L. Chahine, S. Charnley, N. Cuello, F. Dulieu, L. Evans, D. Fedele, S. Feng, F. Fontani, T. Hama, et al. (32 additional authors not shown)

    Abstract: The origin of the chemical diversity observed around low-mass protostars probably resides in the earliest history of these systems. We aim to investigate the impact of protostellar feedback on the chemistry and grain growth in the circumstellar medium of multiple stellar systems. In the context of the ALMA Large Program FAUST, we present high-resolution (50 au) observations of CH$_3$OH, H$_2$CO, a…

    Submitted 2 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: 12 pages, 8 figures, 3 tables. Accepted Letter in Astronomy & Astrophysics

  25. arXiv:2403.15315  [pdf, other]

    cond-mat.str-el quant-ph

    Quantum Fluctuations Suppress the Critical Fields in BaCo$_2$(AsO$_4$)$_2$

    Authors: Shiva Safari, William Bateman-Hemphill, Asimpunya Mitra, Félix Desrochers, Emily Z. Zhang, Lubuna Shafeek, Austin Ferrenti, Tyrel M. McQueen, Arkady Shekhter, Zoltán Köllö, Yong Baek Kim, B. J. Ramshaw, K. A. Modic

    Abstract: Early efforts to realize exotic quantum ground states in frustrated magnets focused on frustration arising from the lattice geometry alone. Attention has shifted to bond-dependent anisotropic interactions, as well as further-neighbor interactions, on non-geometrically-frustrated lattices due to their greater versatility. The honeycomb magnet BaCo$_2$(AsO$_4$)$_2$ recently emerged as a candidate ho…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 16 pages, 12 figures

  26. arXiv:2403.14598  [pdf, other]

    cs.CV

    PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model

    Authors: Zheng Zhang, Yeyao Ma, Enming Zhang, Xiang Bai

    Abstract: PSALM is a powerful extension of the Large Multi-modal Model (LMM) to address the segmentation task challenges. To overcome the limitation of the LMM being limited to textual output, PSALM incorporates a mask decoder and a well-designed input schema to handle a variety of segmentation tasks. This schema includes images, task instructions, conditional prompts, and mask tokens, which enable the mode…

    Submitted 21 March, 2024; originally announced March 2024.

  27. arXiv:2403.09142  [pdf, other]

    cs.IR cs.AI

    USimAgent: Large Language Models for Simulating Search Users

    Authors: Erhan Zhang, Xingzhu Wang, Peiyuan Gong, Yankai Lin, Jiaxin Mao

    Abstract: Due to the advantages in the cost-efficiency and reproducibility, user simulation has become a promising solution to the user-centric evaluation of information retrieval systems. Nonetheless, accurately simulating user search behaviors has long been a challenge, because users' actions in search are highly complex and driven by intricate cognitive processes such as learning, reasoning, and planning…

    Submitted 14 March, 2024; originally announced March 2024.

  28. arXiv:2403.01259  [pdf, other]

    physics.ins-det hep-ex

    Improved Modelling of Detector Response Effects in Phonon-based Crystal Detectors used for Dark Matter Searches

    Authors: M. J. Wilson, A. Zaytsev, B. von Krosigk, I. Alkhatib, M. Buchanan, R. Chen, M. D. Diamond, E. Figueroa-Feliciano, S. A. S. Harms, Z. Hong, K. T. Kennard, N. A. Kurinsky, R. Mahapatra, N. Mirabolfathi, V. Novati, M. Platt, R. Ren, A. Sattari, B. Schmidt, Y. Wang, S. Zatschler, E. Zhang, A. Zuniga

    Abstract: Various dark matter search experiments employ phonon-based crystal detectors operated at cryogenic temperatures. Some of these detectors, including certain silicon detectors used by the SuperCDMS Collaboration, are able to achieve single-charge sensitivity when a voltage bias is applied across the detector. The total amount of phonon energy measured by such a detector is proportional to the number…

    Submitted 24 June, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: 19 pages, 7 figures

    Journal ref: Phys. Rev. D 109, 112018 (2024)

  29. arXiv:2403.00160  [pdf, other]

    astro-ph.GA

    A far-ultraviolet-driven photoevaporation flow observed in a protoplanetary disk

    Authors: Olivier Berné, Emilie Habart, Els Peeters, Ilane Schroetter, Amélie Canin, Ameek Sidhu, Ryan Chown, Emeric Bron, Thomas J. Haworth, Pamela Klaassen, Boris Trahin, Dries Van De Putte, Felipe Alarcón, Marion Zannese, Alain Abergel, Edwin A. Bergin, Jeronimo Bernard-Salas, Christiaan Boersma, Jan Cami, Sara Cuadrado, Emmanuel Dartois, Daniel Dicken, Meriem Elyajouri, Asunción Fuente, Javier R. Goicoechea, et al. (121 additional authors not shown)

    Abstract: Most low-mass stars form in stellar clusters that also contain massive stars, which are sources of far-ultraviolet (FUV) radiation. Theoretical models predict that this FUV radiation produces photo-dissociation regions (PDRs) on the surfaces of protoplanetary disks around low-mass stars, impacting planet formation within the disks. We report JWST and Atacama Large Millimeter Array observations of…

    Submitted 29 February, 2024; originally announced March 2024.

    Journal ref: Science, 383, 6686, 2024

  30. arXiv:2402.17110  [pdf, other]

    cs.LG cs.CL

    Sinkhorn Distance Minimization for Knowledge Distillation

    Authors: Xiao Cui, Yulei Qin, Yuting Gao, Enwei Zhang, Zihan Xu, Tong Wu, Ke Li, Xing Sun, Wengang Zhou, Houqiang Li

    Abstract: Knowledge distillation (KD) has been widely adopted to compress large language models (LLMs). Existing KD methods investigate various divergence measures including the Kullback-Leibler (KL), reverse Kullback-Leibler (RKL), and Jensen-Shannon (JS) divergences. However, due to limitations inherent in their assumptions and definitions, these measures fail to deliver effective supervision when few dis…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted by COLING 2024

  31. arXiv:2402.16641  [pdf, other]

    cs.CV

    Towards Open-ended Visual Quality Comparison

    Authors: Haoning Wu, Hanwei Zhu, Zicheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Chunyi Li, Annan Wang, Wenxiu Sun, Qiong Yan, Xiaohong Liu, Guangtao Zhai, Shiqi Wang, Weisi Lin

    Abstract: Comparative settings (e.g. pairwise choice, listwise ranking) have been adopted by a wide range of subjective studies for image quality assessment (IQA), as it inherently standardizes the evaluation criteria across different observers and offer more clear-cut responses. In this work, we extend the edge of emerging large multi-modality models (LMMs) to further advance visual quality comparison into…

    Submitted 4 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Fix typos

  32. arXiv:2402.14807  [pdf, other]

    cs.MA cs.AI cs.LG

    A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health

    Authors: Nikhil Behari, Edwin Zhang, Yunfan Zhao, Aparna Taneja, Dheeraj Nagaraj, Milind Tambe

    Abstract: Restless multi-armed bandits (RMAB) have demonstrated success in optimizing resource allocation for large beneficiary populations in public health settings. Unfortunately, RMAB models lack flexibility to adapt to evolving public health policy priorities. Concurrently, Large Language Models (LLMs) have emerged as adept automated planners across domains of robotic control and navigation. In this pap…

    Submitted 26 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  33. arXiv:2402.14090  [pdf, other]

    cs.AI econ.GN stat.ML

    Social Environment Design

    Authors: Edwin Zhang, Sadie Zhao, Tonghan Wang, Safwan Hossain, Henry Gasztowtt, Stephan Zheng, David C. Parkes, Milind Tambe, Yiling Chen

    Abstract: Artificial Intelligence (AI) holds promise as a technology that can be used to improve government and economic policy-making. This paper proposes a new research agenda towards this end by introducing Social Environment Design, a general framework for the use of AI for automated policy-making that connects with the Reinforcement Learning, EconCS, and Computational Social Choice communities. The fra…

    Submitted 17 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: ICML 2024 Position Paper. Website at https://sed.eddie.win

  34. arXiv:2402.10258  [pdf, other]

    astro-ph.SR astro-ph.GA

    FAUST XII. Accretion streamers and jets in the VLA 1623--2417 protocluster

    Authors: C. Codella, L. Podio, M. De Simone, C. Ceccarelli, S. Ohashi, C. J. Chandler, N. Sakai, J. E. Pineda, D. M. Segura-Cox, E. Bianchi, N. Cuello, A. López-Sepulcre, D. Fedele, P. Caselli, S. Charnley, D. Johnstone, Z. E. Zhang, M. J. Maureira, Y. Zhang, G. Sabatini, B. Svoboda, I. Jiménez-Serra, L. Loinard, S. Mercimek, N. Murillo, et al. (1 additional author not shown)

    Abstract: The ALMA interferometer has played a key role in revealing a new component of the Sun-like star forming process: the molecular streamers, i.e. structures up to thousands of au long funneling material non-axisymmetrically to disks. In the context of the FAUST ALMA LP, the archetypical VLA1623-2417 protostellar cluster has been imaged at 1.3 mm in the SO(5$_6$--4$_5$), SO(6$_6$--5$_5$), and SiO(5--4…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: accepted by MNRAS

  35. arXiv:2402.09705  [pdf, other]

    quant-ph cs.AR

    Linear Depth QFT over IBM Heavy-hex Architecture

    Authors: Xiangyu Gao, Yuwei Jin, Minghao Guo, Henry Chen, Eddy Z. Zhang

    Abstract: Compiling a given quantum algorithm into a target hardware architecture is a challenging optimization problem. The compiler must take into consideration the coupling graph of physical qubits and the gate operation dependencies. The existing noise in hardware architectures requires the compilation to use as few running cycles as possible. Existing approaches include using SAT solver or heuristics t…

    Submitted 14 February, 2024; originally announced February 2024.

  36. arXiv:2402.07116  [pdf, other]

    cs.CV

    A Benchmark for Multi-modal Foundation Models on Low-level Vision: from Single Images to Pairs

    Authors: Zicheng Zhang, Haoning Wu, Erli Zhang, Guangtao Zhai, Weisi Lin

    Abstract: The rapid development of Multi-modality Large Language Models (MLLMs) has navigated a paradigm shift in computer vision, moving towards versatile foundational models. However, evaluating MLLMs in low-level visual perception and understanding remains a yet-to-explore domain. To this end, we design benchmark settings to emulate human language responses related to low-level vision: the low-level visu…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2309.14181

  37. arXiv:2402.01512  [pdf, other]

    cs.CL

    Distractor Generation for Multiple-Choice Questions: A Survey of Methods, Datasets, and Evaluation

    Authors: Elaf Alhazmi, Quan Z. Sheng, Wei Emma Zhang, Munazza Zaib, Ahoud Alhazmi

    Abstract: Distractors are important in learning evaluation. This paper surveys distractor generation tasks using English multiple-choice question datasets for textual and multimodal contexts. In particular, this paper presents a thorough literature review of the recent studies on distractor generation tasks, discusses multiple choice components and their characteristics, analyzes the related datasets, and s…

    Submitted 2 February, 2024; originally announced February 2024.

  38. arXiv:2401.17654  [pdf, other]

    cs.CV

    All Beings Are Equal in Open Set Recognition

    Authors: Chaohua Li, Enhao Zhang, Chuanxing Geng, SongCan Chen

    Abstract: In open-set recognition (OSR), a promising strategy is exploiting pseudo-unknown data outside given $K$ known classes as an additional $K$+$1$-th class to explicitly model potential open space. However, treating unknown classes without distinction is unequal for them relative to known classes due to the category-agnostic and scale-agnostic of the unknowns. This inevitably not only disrupts the inh…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: Accepted by the main track The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

  39. arXiv:2401.12197  [pdf, other]

    math.PR

    Empirical martingale projections via the adapted Wasserstein distance

    Authors: Jose Blanchet, Johannes Wiesel, Erica Zhang, Zhenyuan Zhang

    Abstract: Given a collection of multidimensional pairs $\{(X_i,Y_i):1 \leq i\leq n\}$, we study the problem of projecting the associated suitably smoothed empirical measure onto the space of martingale couplings (i.e. distributions satisfying $\mathbb{E}[Y|X]=X$) using the adapted Wasserstein distance. We call the resulting distance the smoothed empirical martingale projection distance (SE-MPD), for which w…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 55 pages, 7 figures

  40. arXiv:2312.17090  [pdf, other]

    cs.CV cs.CL cs.LG

    Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

    Authors: Haoning Wu, Zicheng Zhang, Weixia Zhang, Chaofeng Chen, Liang Liao, Chunyi Li, Yixuan Gao, Annan Wang, Erli Zhang, Wenxiu Sun, Qiong Yan, Xiongkuo Min, Guangtao Zhai, Weisi Lin

    Abstract: The explosion of visual content available online underscores the requirement for an accurate machine assessor to robustly evaluate scores across diverse types of visual contents. While recent studies have demonstrated the exceptional potentials of large multi-modality models (LMMs) on a wide range of related fields, in this work, we explore how to teach them for visual rating aligned with human op…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Technical Report

  41. arXiv:2312.16114  [pdf, other]

    quant-ph

    Quantum Fourier Transformation Circuits Compilation

    Authors: Yuwei Jin, Xiangyu Gao, Minghao Guo, Henry Chen, Fei Hua, Chi Zhang, Eddy Z. Zhang

    Abstract: In this research paper, our primary focus revolves around the domain-specific hardware mapping strategy tailored for Quantum Fourier Transformation (QFT) circuits. While previous approaches have heavily relied on SAT solvers or heuristic methods to generate hardware-compatible QFT circuits by inserting SWAP gates to realign logical qubits with physical qubits at various stages, they encountered si…

    Submitted 17 December, 2023; originally announced December 2023.

  42. arXiv:2312.15300  [pdf, other]

    cs.CV

    Q-Boost: On Visual Quality Assessment Ability of Low-level Multi-Modality Foundation Models

    Authors: Zicheng Zhang, Haoning Wu, Zhongpeng Ji, Chunyi Li, Erli Zhang, Wei Sun, Xiaohong Liu, Xiongkuo Min, Fengyu Sun, Shangling Jui, Weisi Lin, Guangtao Zhai

    Abstract: Recent advancements in Multi-modality Large Language Models (MLLMs) have demonstrated remarkable capabilities in complex high-level vision tasks. However, the exploration of MLLM potential in visual quality assessment, a vital aspect of low-level vision, remains limited. To address this gap, we introduce Q-Boost, a novel strategy designed to enhance low-level MLLMs in image quality assessment (IQA…

    Submitted 23 December, 2023; originally announced December 2023.

  43. arXiv:2312.09983  [pdf, other]

    cs.LG cs.AI stat.ML

    Toward Computationally Efficient Inverse Reinforcement Learning via Reward Shaping

    Authors: Lauren H. Cooke, Harvey Klyne, Edwin Zhang, Cassidy Laidlaw, Milind Tambe, Finale Doshi-Velez

    Abstract: Inverse reinforcement learning (IRL) is computationally challenging, with common approaches requiring the solution of multiple reinforcement learning (RL) sub-problems. This work motivates the use of potential-based reward shaping to reduce the computational burden of each RL sub-problem. This work serves as a proof-of-concept and we hope will inspire future developments towards computationally ef…

    Submitted 18 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

  44. arXiv:2312.08653  [pdf, other]

    cs.CV

    SKDF: A Simple Knowledge Distillation Framework for Distilling Open-Vocabulary Knowledge to Open-world Object Detector

    Authors: Shuailei Ma, Yuefeng Wang, Ying Wei, Jiaqi Fan, Enming Zhang, Xinyu Sun, Peihao Chen

    Abstract: In this paper, we attempt to specialize the VLM model for OWOD tasks by distilling its open-world knowledge into a language-agnostic detector. Surprisingly, we observe that the combination of a simple knowledge distillation approach and the automatic pseudo-labeling mechanism in OWOD can achieve better performance for unknown object detection, even with a small amount of data. Unfortunate…

    Submitted 30 March, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.11623

  45. arXiv:2312.06363  [pdf, other]

    cs.AI cs.CL cs.LG

    MMICT: Boosting Multi-Modal Fine-Tuning with In-Context Examples

    Authors: Tao Chen, Enwei Zhang, Yuting Gao, Ke Li, Xing Sun, Yan Zhang, Hui Li

    Abstract: Although In-Context Learning (ICL) brings remarkable performance gains to Large Language Models (LLMs), the improvements remain lower than fine-tuning on downstream tasks. This paper introduces Multi-Modal In-Context Tuning (MMICT), a novel multi-modal fine-tuning paradigm that boosts multi-modal fine-tuning by fully leveraging the promising ICL capability of multi-modal LLMs (MM-LLMs). We propose…

    Submitted 12 December, 2023; v1 submitted 11 December, 2023; originally announced December 2023.

  46. arXiv:2312.02409  [pdf, other]

    cs.CV cs.RO

    MGTR: Multi-Granular Transformer for Motion Prediction with LiDAR

    Authors: Yiqian Gan, Hao Xiao, Yizhe Zhao, Ethan Zhang, Zhe Huang, Xin Ye, Lingting Ge

    Abstract: Motion prediction has been an essential component of autonomous driving systems since it handles highly uncertain and complex scenarios involving moving agents of different types. In this paper, we propose a Multi-Granular TRansformer (MGTR) framework, an encoder-decoder network that exploits context features in different granularities for different kinds of traffic agents. To further enhance MGTR…

    Submitted 5 February, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: Accepted to ICRA 2024

  47. arXiv:2312.02189  [pdf, other]

    cs.CV cs.AI

    StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D

    Authors: Pengsheng Guo, Hans Hao, Adam Caccavale, Zhongzheng Ren, Edward Zhang, Qi Shan, Aditya Sankar, Alexander G. Schwing, Alex Colburn, Fangchang Ma

    Abstract: In the realm of text-to-3D generation, utilizing 2D diffusion models through score distillation sampling (SDS) frequently leads to issues such as blurred appearances and multi-faced geometry, primarily due to the intrinsically noisy nature of the SDS loss. Our analysis identifies the core of these challenges as the interaction among noise levels in the 2D diffusion process, the architecture of the…

    Submitted 1 December, 2023; originally announced December 2023.

  48. arXiv:2312.00591  [pdf, other]

    cs.CV cs.AI

    Less is More: Learning Reference Knowledge Using No-Reference Image Quality Assessment

    Authors: Xudong Li, Jingyuan Zheng, Xiawu Zheng, Runze Hu, Enwei Zhang, Yuting Gao, Yunhang Shen, Ke Li, Yutao Liu, Pingyang Dai, Yan Zhang, Rongrong Ji

    Abstract: Image Quality Assessment (IQA) with reference images has achieved great success by imitating the human vision system, in which the image quality is effectively assessed by comparing the query image with its pristine reference image. However, for the images in the wild, it is quite difficult to access accurate reference images. We argue that it is possible to learn reference knowledge under the No…

    Submitted 1 December, 2023; originally announced December 2023.

  49. arXiv:2311.18259  [pdf, other]

    cs.CV cs.AI

    Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

    Authors: Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, et al. (76 additional authors not shown)

    Abstract: We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric and exocentric video of skilled human activities (e.g., sports, music, dance, bike repair). 740 participants from 13 cities worldwide performed these activities in 123 different natural scene contexts, yielding long-form captures from…

    Submitted 29 April, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: updated baseline results and dataset statistics to match the released v2 data; added table to appendix comparing stats of Ego-Exo4D alongside other datasets

  50. arXiv:2311.17624  [pdf, other]

    eess.SP cs.NI

    Combating Multi-path Interference to Improve Chirp-based Underwater Acoustic Communication

    Authors: Wenjun Xie, Enqi Zhang, Lizhao You, Deqing Wang, Zhaorui Wang, Liqun Fu

    Abstract: Linear chirp-based underwater acoustic communication has been widely used due to its reliability and long-range transmission capability. However, unlike the counterpart chirp technology in wireless -- LoRa, its throughput is severely limited by the number of modulated chirps in a symbol. The fundamental challenge lies in the underwater multi-path channel, where the delayed copies of one symbol may…

    Submitted 29 November, 2023; originally announced November 2023.