Showing 1–50 of 185 results for author: Pan, R

  1. arXiv:2407.03203  [pdf, other]

    cs.FL cs.AI

    TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

    Authors: Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang

    Abstract: Proving mathematical theorems using computer-verifiable formal languages like Lean significantly impacts mathematical reasoning. One approach to formal theorem proving involves generating complete proofs using Large Language Models (LLMs) based on Natural Language (NL) proofs. Similar methods have shown promising results in code generation. However, most modern LLMs exhibit suboptimal performance…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2406.19976  [pdf, other]

    cs.LG math.OC

    ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting

    Authors: Rui Pan, Jipeng Zhang, Xingyuan Pan, Renjie Pi, Xiaoyu Wang, Tong Zhang

    Abstract: Bilevel optimization has shown its utility across various machine learning settings, yet most algorithms in practice require second-order information, making it challenging to scale them up. Only recently, a paradigm of first-order algorithms emerged, capable of effectively addressing bilevel optimization problems. Nevertheless, the practical efficiency of this paradigm remains unverified, particu…

    Submitted 28 June, 2024; originally announced June 2024.

  3. arXiv:2406.15244  [pdf, other]

    cs.LG math.OC

    Large Batch Analysis for Adagrad Under Anisotropic Smoothness

    Authors: Yuxing Liu, Rui Pan, Tong Zhang

    Abstract: Adaptive gradient algorithms have been widely adopted in training large-scale deep neural networks, especially large foundation models. Despite their huge success in practice, their theoretical advantages over stochastic gradient descent (SGD) have not been fully understood, especially in the large batch-size setting commonly used in practice. This is because the only theoretical result that can d…

    Submitted 21 June, 2024; originally announced June 2024.

  4. arXiv:2406.11937  [pdf, other]

    physics.ins-det hep-ex physics.data-an

    Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter

    Authors: M. Aamir, B. Acar, G. Adamov, T. Adams, C. Adloff, S. Afanasiev, C. Agrawal, C. Agrawal, A. Ahmad, H. A. Ahmed, S. Akbar, N. Akchurin, B. Akgul, B. Akgun, R. O. Akpinar, E. Aktas, A. AlKadhim, V. Alexakhin, J. Alimena, J. Alison, A. Alpana, W. Alshehri, P. Alvarez Dominguez, M. Alyari, C. Amendola , et al. (550 additional authors not shown)

    Abstract: A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles readout by SiPMs and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadr…

    Submitted 30 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Prepared for submission to JINST

  5. arXiv:2406.07502  [pdf, other]

    cs.CV cs.CL

    Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions

    Authors: Renjie Pi, Jianshu Zhang, Jipeng Zhang, Rui Pan, Zhekai Chen, Tong Zhang

    Abstract: Image description datasets play a crucial role in the advancement of various applications such as image understanding, text-to-image generation, and text-image retrieval. Currently, image description datasets primarily originate from two sources. One source is the scraping of image-text pairs from the web. Despite their abundance, these descriptions are often of low quality and noisy. Another is t…

    Submitted 11 June, 2024; originally announced June 2024.

  6. arXiv:2406.06745  [pdf, other]

    gr-qc astro-ph.CO hep-ph hep-th

    Universal properties of the evolution of the Universe in modified loop quantum cosmology

    Authors: Jamal Saeed, Rui Pan, Christian Brown, Gerald Clevear, Anzhong Wang

    Abstract: In this paper, we systematically study the evolution of the Universe in the framework of a modified loop quantum cosmological model (mLQC-I) with various inflationary potentials, including chaotic, Starobinsky, generalized Starobinsky, polynomials of the first and second kinds, generalized T- models and natural inflation. In all these models, the big bang singularity is represented by a quantum bo…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 28 pages, 32 Figures

  7. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding , et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  8. arXiv:2405.03515  [pdf]

    physics.geo-ph

    Linear correlations of Gibbs free energy for rare earth element oxide, hydroxide, chloride, fluoride, carbonate, and ferrite minerals and crystalline solids

    Authors: Ruiguang Pan, Chen Zhu

    Abstract: Rare Earth Elements (REE) are critical minerals (metals) for the transition from fossil fuels to renewable and clean energy. Accurate thermodynamic properties of REE minerals and other crystalline solids are crucial for geochemical modeling of the solubility, speciation, and transport of REE in ore formation, extraction, chemical processing, and recycling processes. However, the Gibbs free energie…

    Submitted 6 May, 2024; originally announced May 2024.

  9. arXiv:2404.17582  [pdf, other]

    cs.HC cs.LG stat.AP

    Data Quality in Crowdsourcing and Spamming Behavior Detection

    Authors: Yang Ba, Michelle V. Mancenido, Erin K. Chiou, Rong Pan

    Abstract: As crowdsourcing emerges as an efficient and cost-effective method for obtaining labels for machine learning datasets, it is important to assess the quality of crowd-provided data, so as to improve analysis performance and reduce biases in subsequent machine learning tasks. Given the lack of ground truth in most cases of crowdsourcing, we refer to data quality as annotators' consistency and credib…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Preprint paper, under review on Behavior Research Methods. 45 pages, 10 figures

  10. arXiv:2404.13132  [pdf, other]

    astro-ph.GA

    Medium Bands, Mega Science: a JWST/NIRCam Medium-Band Imaging Survey of Abell 2744

    Authors: Katherine A. Suess, John R. Weaver, Sedona H. Price, Richard Pan, Bingjie Wang, Rachel Bezanson, Gabriel Brammer, Sam E. Cutler, Ivo Labbe, Joel Leja, Christina C. Williams, Katherine E. Whitaker, Pratika Dayal, Anna de Graaff, Robert Feldmann, Marijn Franx, Yoshinobu Fudamoto, Seiji Fujimoto, Lukas J. Furtak, Andy D. Goulding, Jenny E. Greene, Gourav Khullar, Vasily Kokorev, Mariska Kriek, Brian Lorenz , et al. (17 additional authors not shown)

    Abstract: In this paper, we describe the "Medium Bands, Mega Science" JWST Cycle 2 survey (JWST-GO-4111) and demonstrate the power of these data to reveal both the spatially-integrated and spatially-resolved properties of galaxies from the local universe to the era of cosmic dawn. Executed in November 2023, MegaScience obtained ~30 arcmin^2 of deep multiband NIRCam imaging centered on the z~0.3 Abell 2744 c…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 21 pages, 10 figures. Fully reduced imaging, photometric catalogs, and photometric redshift fits publicly available at https://jwst-uncover.github.io/megascience/

  11. arXiv:2404.12678  [pdf, other]

    cs.CV

    Exploring Interactive Semantic Alignment for Efficient HOI Detection with Vision-language Model

    Authors: Jihao Dong, Renjie Pan, Hua Yang

    Abstract: Human-Object Interaction (HOI) detection aims to localize human-object pairs and comprehend their interactions. Recently, two-stage transformer-based methods have demonstrated competitive performance. However, these methods frequently focus on object appearance features and ignore global contextual information. Besides, vision-language model CLIP which effectively aligns visual and text embeddings…

    Submitted 24 May, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  12. arXiv:2404.10179  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG

    Scaling Instructable Agents Across Many Simulated Worlds

    Authors: SIMA Team, Maria Abi Raad, Arun Ahuja, Catarina Barros, Frederic Besse, Andrew Bolt, Adrian Bolton, Bethanie Brownfield, Gavin Buttimore, Max Cant, Sarah Chakera, Stephanie C. Y. Chan, Jeff Clune, Adrian Collister, Vikki Copeman, Alex Cullum, Ishita Dasgupta, Dario de Cesare, Julia Di Trapani, Yani Donchev, Emma Dunleavy, Martin Engelcke, Ryan Faulkner, Frankie Garcia, Charles Gbadamosi , et al. (68 additional authors not shown)

    Abstract: Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground language in perception and embodied actions, in order to accomplish complex tasks. The Scalable, Instructable, Multiworld Agent (SIMA) project tackles this by training agents to follow free-form instructio…

    Submitted 17 April, 2024; v1 submitted 13 March, 2024; originally announced April 2024.

  13. arXiv:2404.08457  [pdf, other]

    stat.ME

    A Latent Factor Model for High-Dimensional Binary Data

    Authors: Jiaxin Shi, Yuan Gao, Rui Pan, Hansheng Wang

    Abstract: In this study, we develop a latent factor model for analysing high-dimensional binary data. Specifically, a standard probit model is used to describe the regression relationship between the observed binary data and the continuous latent variables. Our method assumes that the dependency structure of the observed binary data can be fully captured by the continuous latent factors. To estimate the mod…

    Submitted 12 April, 2024; originally announced April 2024.

  14. arXiv:2404.06809  [pdf, other]

    cs.CL

    Not All Contexts Are Equal: Teaching LLMs Credibility-aware Generation

    Authors: Ruotong Pan, Boxi Cao, Hongyu Lin, Xianpei Han, Jia Zheng, Sirui Wang, Xunliang Cai, Le Sun

    Abstract: The rapid development of large language models has led to the widespread adoption of Retrieval-Augmented Generation (RAG), which integrates external knowledge to alleviate knowledge bottlenecks and mitigate hallucinations. However, the existing RAG paradigm inevitably suffers from the impact of flawed information introduced during the retrieval phase, thereby diminishing the reliability and corre…

    Submitted 8 May, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: Our code, benchmark, and models are available at https://github.com/panruotong/CAG

  15. arXiv:2404.01630  [pdf, other]

    cs.NI

    SMaRTT-REPS: Sender-based Marked Rapidly-adapting Trimmed & Timed Transport with Recycled Entropies

    Authors: Tommaso Bonato, Abdul Kabbani, Daniele De Sensi, Rong Pan, Yanfang Le, Costin Raiciu, Mark Handley, Timo Schneider, Nils Blach, Ahmad Ghalayini, Daniel Alves, Michael Papamichael, Adrian Caulfield, Torsten Hoefler

    Abstract: With the rapid growth of machine learning (ML) workloads in datacenters, existing congestion control (CC) algorithms fail to deliver the required performance at scale. ML traffic is bursty and bulk-synchronous and thus requires quick reaction and strong fairness. We show that existing CC algorithms that use delay as a main signal react too slowly and are not always fair. We design SMaRTT, a simple…

    Submitted 27 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Fixed typo and wrong y axis of one plot

  16. arXiv:2403.17919  [pdf, other]

    cs.LG cs.AI cs.CL math.OC

    LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning

    Authors: Rui Pan, Xiang Liu, Shizhe Diao, Renjie Pi, Jipeng Zhang, Chi Han, Tong Zhang

    Abstract: The machine learning community has witnessed impressive advancements since large language models (LLMs) first appeared. Yet, their massive memory consumption has become a significant roadblock to large-scale training. For instance, a 7B model typically requires at least 60 GB of GPU memory with full parameter training, which presents challenges for researchers without access to high-resource envir…

    Submitted 25 May, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  17. arXiv:2403.12789  [pdf, other]

    stat.ME

    Bivariate temporal dependence via mixtures of rotated copulas

    Authors: Ruyi Pan, Luis E. Nieto-Barajas, Radu Craiu

    Abstract: Parametric bivariate copula families have been known to flexibly capture enough various dependence patterns, e.g., either positive or negative dependence in either the lower or upper tails of bivariate distributions. However, to the best of our knowledge, there is not a single parametric model adaptable enough to capture several of these features simultaneously. To address this, we propose a mixtu…

    Submitted 19 March, 2024; originally announced March 2024.

  18. arXiv:2403.11163  [pdf, ps, other]

    stat.ME cs.LG math.ST stat.CO

    A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques

    Authors: Xuetong Li, Yuan Gao, Hong Chang, Danyang Huang, Yingying Ma, Rui Pan, Haobo Qi, Feifei Wang, Shuyuan Wu, Ke Xu, Jing Zhou, Xuening Zhu, Yingqiu Zhu, Hansheng Wang

    Abstract: This paper presents a selective review of statistical computation methods for massive data analysis. A huge amount of statistical methods for massive data computation have been rapidly developed in the past decades. In this work, we focus on three categories of statistical computation methods: (1) distributed computing, (2) subsampling methods, and (3) minibatch gradient techniques. The first clas…

    Submitted 17 March, 2024; originally announced March 2024.

  19. arXiv:2403.08730  [pdf, other]

    cs.CL cs.CV

    Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization

    Authors: Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang

    Abstract: Multimodal Large Language Models (MLLMs) excel in generating responses based on visual inputs. However, they often suffer from a bias towards generating responses similar to their pretraining corpus, overshadowing the importance of visual information. We treat this bias as a "preference" for pretraining statistics, which hinders the model's grounding in visual input. To mitigate this issue, we pro…

    Submitted 3 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  20. arXiv:2403.00783  [pdf, other]

    cs.AI

    On the Roles of LLMs in Planning: Embedding LLMs into Planning Graphs

    Authors: Hankz Hankui Zhuo, Xin Chen, Rong Pan

    Abstract: Plan synthesis aims to generate a course of actions or policies to transit given initial states to goal states, provided domain models that could be designed by experts or learnt from training data or interactions with the world. Intrigued by the claims of emergent planning capabilities in large language models (LLMs), works have been proposed to investigate the planning effectiveness of LLMs, wit…

    Submitted 18 February, 2024; originally announced March 2024.

  21. arXiv:2402.05664  [pdf, other]

    astro-ph.GA

    UNCOVER NIRSpec/PRISM Spectroscopy Unveils Evidence of Early Core Formation in a Massive, Centrally Dusty Quiescent Galaxy at $z_{spec}=3.97$

    Authors: David J. Setton, Gourav Khullar, Tim B. Miller, Rachel Bezanson, Jenny E. Greene, Katherine A. Suess, Katherine E. Whitaker, Jacqueline Antwi-Danso, Hakim Atek, Gabriel Brammer, Sam E. Cutler, Pratika Dayal, Robert Feldmann, Lukas J. Furtak, Seiji Fujimoto, Karl Glazebrook, Andy D. Goulding, Vasily Kokorev, Ivo Labbe, Joel Leja, Yilun Ma, Danilo Marchesini, Themiya Nanayakkara, Richard Pan, Sedona H. Price , et al. (6 additional authors not shown)

    Abstract: We report the spectroscopic confirmation of a massive ($\log(M_\star/M_\odot)=10.34 \pm_{0.07}^{0.06}$), HST-dark ($m_\mathrm{F150W} - m_\mathrm{F444W} = 3.6$) quiescent galaxy at $z_{spec}=3.97$ in the UNCOVER survey. NIRSpec/PRISM spectroscopy and a non-detection in deep ALMA imaging surprisingly reveals that the galaxy is consistent with a low ($<$10 $M_\odot \ \mathrm{yr^{-1}}$) star formation…

    Submitted 12 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: 17 pages, 9 figures, 2 tables. Resubmitted to ApJ after response to referee and update to include new medium band imaging from the JWST MEGASCIENCE program. Comments welcome!

  22. arXiv:2402.03757  [pdf, other]

    cs.CV cs.CL cs.LG

    The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs

    Authors: Tianyang Han, Qing Lian, Rui Pan, Renjie Pi, Jipeng Zhang, Shizhe Diao, Yong Lin, Tong Zhang

    Abstract: Large language models (LLMs) have recently experienced remarkable progress, where the advent of multi-modal large language models (MLLMs) has endowed LLMs with visual capabilities, leading to impressive performances in various multi-modal tasks. However, those powerful MLLMs such as GPT-4V still fail spectacularly when presented with certain image and text inputs. In this paper, we identify a typi…

    Submitted 6 February, 2024; originally announced February 2024.

  23. arXiv:2401.15880  [pdf, other]

    q-bio.GN q-bio.QM

    Deciphering regulatory architectures from synthetic single-cell expression patterns

    Authors: Rosalind Wenshan Pan, Tom Roeschinger, Kian Faizi, Hernan Garcia, Rob Phillips

    Abstract: For the vast majority of genes in sequenced genomes, there is limited understanding of how they are regulated. Without such knowledge, it is not possible to perform a quantitative theory-experiment dialogue on how such genes give rise to physiological and evolutionary adaptation. One category of high-throughput experiments used to understand the sequence-phenotype relationship of the transcriptome…

    Submitted 5 June, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  24. arXiv:2401.11839  [pdf, other]

    cs.CL cs.CY

    AI for social science and social science of AI: A Survey

    Authors: Ruoxi Xu, Yingfei Sun, Mengjie Ren, Shiguang Guo, Ruotong Pan, Hongyu Lin, Le Sun, Xianpei Han

    Abstract: Recent advancements in artificial intelligence, particularly with the emergence of large language models (LLMs), have sparked a rethinking of artificial general intelligence possibilities. The increasing human-like capabilities of AI are also attracting attention in social science research, leading to various studies exploring the combination of these two fields. In this survey, we systematically…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted by Information Processing and Management (IP&M)

  25. arXiv:2401.04217  [pdf, other]

    cond-mat.soft

    Force Propagation in Active Cytoskeletal Networks

    Authors: Shichen Liu, Rosalind Wenshan Pan, Heun Jin Lee, Shahriar Shadkhoo, Fan Yang, Chunhe Li, Zijie Qu, Rob Phillips, Matt Thomson

    Abstract: In biological systems, molecular-scale forces and motions are pivotal for enabling processes like motility, shape change, and replication. These forces and motions are organized, amplified, and transmitted across macroscopic scales by active materials such as the cytoskeleton, which drives micron-scale cellular movement and re-organization. Despite the integral role of active materials, understand…

    Submitted 12 April, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: 15 pages, 4 figures

  26. arXiv:2401.02906  [pdf, other]

    cs.CR cs.CL cs.CV

    MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance

    Authors: Renjie Pi, Tianyang Han, Jianshu Zhang, Yueqi Xie, Rui Pan, Qing Lian, Hanze Dong, Jipeng Zhang, Tong Zhang

    Abstract: The deployment of multimodal large language models (MLLMs) has brought forth a unique vulnerability: susceptibility to malicious attacks through visual inputs. This paper investigates the novel challenge of defending MLLMs against such attacks. Compared to large language models (LLMs), MLLMs include an additional image modality. We discover that images act as a ``foreign language" that is not cons…

    Submitted 17 June, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

  27. arXiv:2401.01916  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA astro-ph.SR cs.CL cs.LG

    AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets

    Authors: Ernest Perkowski, Rui Pan, Tuan Dung Nguyen, Yuan-Sen Ting, Sandor Kruk, Tong Zhang, Charlie O'Neill, Maja Jablonska, Zechang Sun, Michael J. Smith, Huiling Liu, Kevin Schawinski, Kartheik Iyer, Ioana Ciucă for UniverseTBD

    Abstract: We explore the potential of enhancing LLM performance in astronomy-focused question-answering through targeted, continual pre-training. By employing a compact 7B-parameter LLaMA-2 model and focusing exclusively on a curated set of astronomy corpora -- comprising abstracts, introductions, and conclusions -- we achieve notable improvements in specialized topic comprehension. While general LLMs like…

    Submitted 5 January, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: 4 pages, 1 figure, model is available at https://huggingface.co/universeTBD, published in RNAAS

  28. arXiv:2312.15012  [pdf, other]

    astro-ph.GA

    Two Distinct Classes of Quiescent Galaxies at Cosmic Noon Revealed by JWST PRIMER and UNCOVER

    Authors: Sam E. Cutler, Katherine E. Whitaker, John R. Weaver, Bingjie Wang, Richard Pan, Rachel Bezanson, Lukas J. Furtak, Ivo Labbe, Joel Leja, Sedona H. Price, Yingjie Cheng, Maike Clausen, Fergus Cullen, Pratika Dayal, Anna de Graaff, Mark Dickinson, James S. Dunlop, Robert Feldmann, Marijn Franx, Mauro Giavalisco, Karl Glazebrook, Jenny E. Greene, Norman A. Grogin, Garth Illingworth, Anton M. Koekemoer , et al. (9 additional authors not shown)

    Abstract: We present a measurement of the low-mass quiescent size-mass relation at Cosmic Noon (1<z<3) from the JWST PRIMER and UNCOVER treasury surveys, which highlights two distinct classes of quiescent galaxies. While the massive population is well studied at these redshifts, the low-mass end has been previously under-explored due to a lack of observing facilities with sufficient sensitivity and spatial…

    Submitted 23 April, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: 16 pages, 6 figures, 1 table. Submitted to ApJL. Revised 2024 April 18

  29. arXiv:2312.14567  [pdf, other]

    cs.LG math.OC

    Accelerated Convergence of Stochastic Heavy Ball Method under Anisotropic Gradient Noise

    Authors: Rui Pan, Yuxing Liu, Xiaoyu Wang, Tong Zhang

    Abstract: Heavy-ball momentum with decaying learning rates is widely used with SGD for optimizing deep learning models. In contrast to its empirical popularity, the understanding of its theoretical property is still quite limited, especially under the standard anisotropic gradient noise condition for quadratic regression problems. Although it is widely conjectured that heavy-ball momentum method can provide…

    Submitted 17 March, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: Published at ICLR 2024

  30. arXiv:2312.05385  [pdf, other]

    cs.DC cs.LG

    Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving

    Authors: Yinwei Dai, Rui Pan, Anand Iyer, Kai Li, Ravi Netravali

    Abstract: Machine learning (ML) inference platforms are tasked with balancing two competing goals: ensuring high throughput given many requests, and delivering low-latency responses to support interactive applications. Unfortunately, existing platform knobs (e.g., batch sizes) fail to ease this fundamental tension, and instead only enable users to harshly trade off one property for the other. This paper exp…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: The first two authors contributed equally and are alphabetically ordered

  31. arXiv:2312.05030  [pdf, other]

    astro-ph.GA

    JWST UNCOVER: The Overabundance of Ultraviolet-luminous Galaxies at $z>9$

    Authors: Iryna Chemerynska, Hakim Atek, Lukas J. Furtak, Adi Zitrin, Jenny E. Greene, Pratika Dayal, Andrea Weibel, Vasily Kokorev, Andy D. Goulding, Christina C. Williams, Themiya Nanayakkara, Rachel Bezanson, Gabriel Brammer, Sam E. Cutler, Ivo Labbe, Joel Leja, Richard Pan, Sedona H. Price, Bingjie Wang, John R. Weaver, Katherine E. Whitaker

    Abstract: Over the past year, JWST has uncovered galaxies at record-breaking distances up to $z \sim 13$. The JWST UNCOVER (ultra-deep NIRSpec and NIRcam observations before the epoch of reionization) program has obtained ultra-deep multiwavelength NIRCam imaging of the massive galaxy cluster Abell 2744 over $\sim 45$ arcmin$^{2}$ down to $\sim 29.5$ AB mag. Here, we present a robust ultraviolet (UV) lumino…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Submitted to MNRAS

  32. arXiv:2312.03176  [pdf, other]

    cs.LG

    Active Learning for Abrupt Shifts Change-point Detection via Derivative-Aware Gaussian Processes

    Authors: Hao Zhao, Rong Pan

    Abstract: Change-point detection (CPD) is crucial for identifying abrupt shifts in data, which influence decision-making and efficient resource allocation across various domains. To address the challenges posed by the costly and time-intensive data acquisition in CPD, we introduce the Derivative-Aware Change Detection (DACD) method. It leverages the derivative process of a Gaussian process (GP) for Active L…

    Submitted 5 December, 2023; originally announced December 2023.

  33. arXiv:2312.02654  [pdf, other]

    cond-mat.mtrl-sci physics.optics

    THz-Driven Coherent Magnetization Dynamics in a Labyrinth Domain State

    Authors: M Riepp, A Philippi-Kobs, L Mueller, R Froemter, W Roseker, R Rysov, M Walther, K Bagschik, M Hennes, D Gupta, S Marotzke, S Bajt, R Pan, T Golz, N Stojanovic, C Boeglin, G Gruebel

    Abstract: Terahertz (THz) light pulses can be used for an ultrafast coherent manipulation of the magnetization. Driving the magnetization at THz frequencies is currently the fastest way of writing magnetic information in ferromagnets. Using time-resolved resonant magnetic scattering, we gain new insights to the THz-driven coherent magnetization dynamics on nanometer length scales. We observe ultrafast demag…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 10 pages, 8 figures and 54 references

  34. arXiv:2311.08364  [pdf, other]

    cs.LG cs.AI cs.DM

    Plum: Prompt Learning using Metaheuristic

    Authors: Rui Pan, Shuo Xing, Shizhe Diao, Wenhe Sun, Xiang Liu, Kashun Shum, Renjie Pi, Jipeng Zhang, Tong Zhang

    Abstract: Since the emergence of large language models, prompt learning has become a popular method for optimizing and customizing these models. Special prompts, such as Chain-of-Thought, have even revealed previously unknown reasoning capabilities within these models. However, the progress of discovering effective prompts has been slow, driving a desire for general prompt optimization methods. Unfortunatel…

    Submitted 30 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Published at Findings of ACL 2024

  35. arXiv:2311.00047  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?

    Authors: Yichi Zhang, Jiayi Pan, Yuchen Zhou, Rui Pan, Joyce Chai

    Abstract: Vision-Language Models (VLMs) are trained on vast amounts of data captured by humans emulating our understanding of the world. However, known as visual illusions, human's perception of reality isn't always faithful to the physical world. This raises a key question: do VLMs have the similar kind of illusions as humans do, or do they faithfully learn to represent reality? To investigate this questio…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: Accepted at EMNLP 2023 main conference

  36. Quantifying the Effects of Known Unknowns on Inferred High-redshift Galaxy Properties: Burstiness, the IMF, and Nebular Physics

    Authors: Bingjie Wang, Joel Leja, Hakim Atek, Ivo Labbe, Yijia Li, Rachel Bezanson, Gabriel Brammer, Sam E. Cutler, Pratika Dayal, Lukas J. Furtak, Jenny E. Greene, Vasily Kokorev, Richard Pan, Sedona H. Price, Katherine A. Suess, John R. Weaver, Katherine E. Whitaker, Christina C. Williams

    Abstract: The era of the James Webb Space Telescope ushers stellar population models into uncharted territories, particularly at the high-redshift frontier. In a companion paper, we apply the \texttt{Prospector} Bayesian framework to jointly infer galaxy redshifts and stellar population properties from broad-band photometry as part of the UNCOVER survey. Here we present a comprehensive error budget in spect…

    Submitted 8 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted for publication in ApJ. 22 pages, 10 figures, 2 tables

    Journal ref: The Astrophysical Journal, 963, 74, (2024)

  37. arXiv:2310.02500  [pdf, other]

    astro-ph.GA

    UNCOVER: The rest ultraviolet to near infrared multiwavelength structures and dust distributions of sub-millimeter-detected galaxies in Abell 2744

    Authors: Sedona H. Price, Katherine A. Suess, Christina C. Williams, Rachel Bezanson, Gourav Khullar, Erica J. Nelson, Bingjie Wang, John R. Weaver, Seiji Fujimoto, Vasily Kokorev, Jenny E. Greene, Gabriel Brammer, Sam E. Cutler, Pratika Dayal, Lukas J. Furtak, Ivo Labbe, Joel Leja, Tim B. Miller, Themiya Nanayakkara, Richard Pan, Katherine E. Whitaker

    Abstract: With the wavelength coverage, sensitivity, and high spatial resolution of JWST, it is now possible to peer through the dust attenuation to probe the rest-frame near infrared (NIR) and stellar structures of extremely dusty galaxies at cosmic noon (z~1-3). In this paper we leverage the combined ALMA and JWST/HST coverage in Abell 2744 to study the multiwavelength (0.5-4.4um) structures of 11 sub-mil…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: Submitted to ApJ. 13 pages, 6 figures

  38. The UNCOVER Survey: A First-look HST+JWST Catalog of Galaxy Redshifts and Stellar Population Properties Spanning $0.2 \lesssim z \lesssim 15$

    Authors: Bingjie Wang, Joel Leja, Ivo Labbé, Rachel Bezanson, Katherine E. Whitaker, Gabriel Brammer, Lukas J. Furtak, John R. Weaver, Sedona H. Price, Adi Zitrin, Hakim Atek, Dan Coe, Sam E. Cutler, Pratika Dayal, Pieter van Dokkum, Robert Feldmann, Danilo Marchesini, Marijn Franx, Natascha Förster Schreiber, Seiji Fujimoto, Marla Geha, Karl Glazebrook, Anna de Graaff, Jenny E. Greene, Stéphanie Juneau, et al. (19 additional authors not shown)

    Abstract: The recent UNCOVER survey with the James Webb Space Telescope (JWST) exploits the nearby cluster Abell 2744 to create the deepest view of our universe to date by leveraging strong gravitational lensing. In this work, we perform photometric fitting of more than 50,000 robustly detected sources out to $z \sim 15$. We show the redshift evolution of stellar ages, star formation rates, and rest-frame c…

    Submitted 16 April, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Corrected typos: Eq.1 should've been (1-kappa)^2, and the lens maps are normalized to D_ds/D_s=1. These errors were only in the writing; no data products or results were affected. The SPS catalogs are accessible via the UNCOVER survey webpage: https://jwst-uncover.github.io/DR2.html#SPSCatalogs, with a copy deposited to Zenodo: https://doi.org/10.5281/zenodo.8401181

    Journal ref: The Astrophysical Journal Supplement Series, 270, 12 (2024)

  39. arXiv:2309.07834  [pdf, other]

    astro-ph.GA astro-ph.CO

    DUALZ: Deep UNCOVER-ALMA Legacy High-Z Survey

    Authors: Seiji Fujimoto, Rachel Bezanson, Ivo Labbe, Gabriel Brammer, Sedona H. Price, Bingjie Wang, John R. Weaver, Yoshinobu Fudamoto, Pascal A. Oesch, Christina C. Williams, Pratika Dayal, Robert Feldmann, Jenny E. Greene, Joel Leja, Katherine E. Whitaker, Adi Zitrin, Sam E. Cutler, Lukas J. Furtak, Richard Pan, Iryna Chemerynska, Vasily Kokorev, Tim B. Miller, Hakim Atek, Pieter van Dokkum, Stephanie Juneau, et al. (7 additional authors not shown)

    Abstract: We present the survey design and initial results of the ALMA Cycle 9 program of DUALZ, which aims to establish a joint ALMA and JWST public legacy field targeting the massive galaxy cluster Abell 2744. DUALZ features a contiguous $4'\times6'$ ALMA 30-GHz-wide mosaic in Band 6, covering areas of $μ>2$ down to a sensitivity of $σ=32.7~μ$Jy. Through a blind search, we identified 69 dust continuum sou…

    Submitted 16 September, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: 33 pages, 16 figures, and 5 tables. Submitted to ApJS. The ALMA products are fully available from here: https://jwst-uncover.github.io/DR2.html#DUALZ

  40. arXiv:2309.06256  [pdf, other]

    cs.LG

    Mitigating the Alignment Tax of RLHF

    Authors: Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang

    Abstract: LLMs acquire a wide range of abilities during pre-training, but aligning LLMs under Reinforcement Learning with Human Feedback (RLHF) can lead to forgetting, which is also known as the alignment tax. To empirically verify this hypothesis, we conducted experiments with existing RLHF algorithms using OpenLLaMA-3B, which revealed a pronounced alignment tax in NLP tasks. On the other hand, despite var…

    Submitted 5 February, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

    Comments: 28 Pages

  41. arXiv:2309.05714  [pdf, other]

    astro-ph.GA

    UNCOVER spectroscopy confirms a surprising ubiquity of AGN in red galaxies at $z>5$

    Authors: Jenny E. Greene, Ivo Labbe, Andy D. Goulding, Lukas J. Furtak, Iryna Chemerynska, Vasily Kokorev, Pratika Dayal, Christina C. Williams, Bingjie Wang, David J. Setton, Adam J. Burgasser, Rachel Bezanson, Hakim Atek, Gabriel Brammer, Sam E. Cutler, Robert Feldmann, Seiji Fujimoto, Karl Glazebrook, Anna de Graaff, Joel Leja, Danilo Marchesini, Michael V. Maseda, Jorryt Matthee, Tim B. Miller, Rohan P. Naidu, et al. (9 additional authors not shown)

    Abstract: JWST is revealing a new population of dust-reddened broad-line active galactic nuclei (AGN) at redshifts $z\gtrsim5$. Here we present deep NIRSpec/Prism spectroscopy from the Cycle 1 Treasury program UNCOVER of 15 AGN candidates selected to be compact, with red continua in the rest-frame optical but with blue slopes in the UV. From NIRCam photometry alone, they could have been dominated by dusty s…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 23 pages, 9 figures, 5 tables, submitted to ApJ

  42. arXiv:2309.03327  [pdf, other]

    gr-qc

    Uniform Asymptotic Approximation Method with Pöschl-Teller Potential

    Authors: Rui Pan, John Joseph Marchetta, Jamal Saeed, Gerald Cleaver, Bao-Fei Li, Anzhong Wang, Tao Zhu

    Abstract: In this paper, we study analytical approximate solutions of the second-order homogeneous differential equations with the existence of only two turning points (but without poles), by using the uniform asymptotic approximation (UAA) method. To be more concrete, we consider the Pöschl-Teller (PT) potential, for which analytical solutions are known. Depending on the values of the parameters involved i…

    Submitted 5 January, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: Universe 2023, 9 (11), 471. In Honor of Prof. Jorge Pullin on His 60th Anniversary

    Journal ref: Universe 2023, 9 (11), 471

  43. arXiv:2308.12107  [pdf, other]

    astro-ph.SR astro-ph.EP astro-ph.GA

    UNCOVER: JWST Spectroscopy of Three Cold Brown Dwarfs at Kiloparsec-scale Distances

    Authors: Adam J. Burgasser, Rachel Bezanson, Ivo Labbe, Gabriel Brammer, Sam E. Cutler, Lukas J. Furtak, Jenny E. Greene, Roman Gerasimov, Joel Leja, Richard Pan, Sedona H. Price, Bingjie Wang, John R. Weaver, Katherine E. Whitaker, Seiji Fujimoto, Vasily Kokorev, Pratika Dayal, Themiya Nanayakkara, Christina C. Williams, Danilo Marchesini, Adi Zitrin, Pieter van Dokkum

    Abstract: We report JWST/NIRSpec spectra of three distant T-type brown dwarfs identified in the Ultradeep NIRSpec and NIRCam ObserVations before the Epoch of Reionization (UNCOVER) survey of the Abell 2744 lensing field. One source was previously reported as a candidate T dwarf on the basis of NIRCam photometry, while two sources were initially identified as candidate active galactic nuclei. Low-resolution…

    Submitted 7 February, 2024; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: revised, accepted by ApJ 22 Nov 2023

  44. arXiv:2308.11610  [pdf, other]

    astro-ph.GA

    UNCOVER: A NIRSpec Identification of a Broad Line AGN at z = 8.50

    Authors: Vasily Kokorev, Seiji Fujimoto, Ivo Labbe, Jenny E. Greene, Rachel Bezanson, Pratika Dayal, Erica J. Nelson, Hakim Atek, Gabriel Brammer, Karina I. Caputi, Iryna Chemerynska, Sam E. Cutler, Robert Feldmann, Yoshinobu Fudamoto, Lukas J. Furtak, Andy D. Goulding, Anna de Graaff, Joel Leja, Danilo Marchesini, Tim B. Miller, Themiya Nanayakkara, Pascal Oesch, Richard Pan, Sedona H. Price, David J. Setton, et al. (7 additional authors not shown)

    Abstract: Deep observations with JWST have revealed an emerging population of red point-like sources that could provide a link between the postulated supermassive black hole seeds and observed quasars. In this work we present a JWST/NIRSpec spectrum from the JWST Cycle 1 UNCOVER Treasury survey, of a massive accreting black hole at $z=8.50$, displaying a clear broad-line component as inferred from the H$β$…

    Submitted 15 October, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 14 pages, 6 figures, 2 tables. Accepted for publication in ApJL

  45. arXiv:2308.11609  [pdf, other]

    astro-ph.GA astro-ph.CO

    UNCOVER: A NIRSpec Census of Lensed Galaxies at z=8.50-13.08 Probing a High AGN Fraction and Ionized Bubbles in the Shadow

    Authors: Seiji Fujimoto, Bingjie Wang, John Weaver, Vasily Kokorev, Hakim Atek, Rachel Bezanson, Ivo Labbe, Gabriel Brammer, Jenny E. Greene, Iryna Chemerynska, Pratika Dayal, Anna de Graaff, Lukas J. Furtak, Pascal A. Oesch, David J. Setton, Sedona H. Price, Tim B. Miller, Christina C. Williams, Katherine E. Whitaker, Adi Zitrin, Sam E. Cutler, Joel Leja, Richard Pan, Dan Coe, Pieter van Dokkum, et al. (11 additional authors not shown)

    Abstract: We present JWST NIRSpec prism spectroscopy of gravitationally lensed galaxies at $z\gtrsim9$ found behind the massive galaxy cluster Abell 2744 in the UNCOVER Cycle 1 Treasury Program. We confirm the source redshift via emission lines and/or the Ly$α$ break feature for ten galaxies at z=8.50-13.08 down to $M_{\rm UV}=-17.3$. We achieve a high confirmation rate of 100\% for $z>9$ candidates reporte…

    Submitted 25 August, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 27 pages, 11 figures, 4 tables, submitted to ApJ (See also arXiv:2308.11610)

  46. arXiv:2308.11287  [pdf, other]

    stat.AP

    Large-scale Multi-layer Academic Networks Derived from Statistical Publications

    Authors: Tianchen Gao, Yan Zhang, Rui Pan, Hansheng Wang

    Abstract: The utilization of multi-layer network structures now enables the explanation of complex systems in nature from multiple perspectives. Multi-layer academic networks capture diverse relationships among academic entities, facilitating the study of academic development and the prediction of future directions. However, there are currently few academic network datasets that simultaneously consider mult…

    Submitted 22 August, 2023; originally announced August 2023.

  47. Most of the photons that reionized the Universe came from dwarf galaxies

    Authors: Hakim Atek, Ivo Labbé, Lukas J. Furtak, Iryna Chemerynska, Seiji Fujimoto, David J. Setton, Tim B. Miller, Pascal Oesch, Rachel Bezanson, Sedona H. Price, Pratika Dayal, Adi Zitrin, Vasily Kokorev, John R. Weaver, Gabriel Brammer, Pieter van Dokkum, Christina C. Williams, Sam E. Cutler, Robert Feldmann, Yoshinobu Fudamoto, Jenny E. Greene, Joel Leja, Michael V. Maseda, Adam Muzzin, Richard Pan, et al. (8 additional authors not shown)

    Abstract: The identification of sources driving cosmic reionization, a major phase transition from neutral Hydrogen to ionized plasma around 600-800 Myr after the Big Bang (Dayal et al. 2018, Mason et al. 2019, Robertson et al. 2022), has been a matter of intense debate (Robertson et al. 2022). Some models suggest that high ionizing emissivity and escape fractions ($f_{\rm esc}$) from quasars support their…

    Submitted 30 April, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: 29 pages, 7 figures, 2 tables. Published in Nature

    Journal ref: Nature Volume 626, 2024, 975-978

  48. arXiv:2308.05735  [pdf, other]

    astro-ph.GA

    A supermassive black hole in the early universe growing in the shadows

    Authors: Lukas J. Furtak, Ivo Labbé, Adi Zitrin, Jenny E. Greene, Pratika Dayal, Iryna Chemerynska, Vasily Kokorev, Tim B. Miller, Andy D. Goulding, Rachel Bezanson, Gabriel B. Brammer, Sam E. Cutler, Joel Leja, Richard Pan, Sedona H. Price, Bingjie Wang, John R. Weaver, Katherine E. Whitaker, Hakim Atek, Ákos Bogdán, Stéphane Charlot, Emma Curtis-Lake, Pieter van Dokkum, Ryan Endsley, Yoshinobu Fudamoto, et al. (12 additional authors not shown)

    Abstract: Early JWST observations have uncovered a new, substantial population of red sources that might represent a previously overlooked phase of actively growing supermassive black holes (Kocevski et al. 2023, Matthee et al. 2023, Labbe et al. 2023). One of the most intriguing examples is an extremely red, point-like object that was found to be triply-imaged by the strong lensing galaxy cluster Abell 274…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: Submitted. Comments welcome!

  49. UNCOVER: Illuminating the Early Universe -- JWST/NIRSpec Confirmation of $z > 12$ Galaxies

    Authors: Bingjie Wang, Seiji Fujimoto, Ivo Labbe, Lukas J. Furtak, Tim B. Miller, David J. Setton, Adi Zitrin, Hakim Atek, Rachel Bezanson, Gabriel Brammer, Joel Leja, Pascal A. Oesch, Sedona H. Price, Iryna Chemerynska, Sam E. Cutler, Pratika Dayal, Pieter van Dokkum, Andy D. Goulding, Jenny E. Greene, Y. Fudamoto, Gourav Khullar, Vasily Kokorev, Danilo Marchesini, Richard Pan, John R. Weaver, et al. (2 additional authors not shown)

    Abstract: Observations of high-redshift galaxies provide a critical direct test to the theories of early galaxy formation, yet to date, only three have been spectroscopically confirmed at $z>12$. Due to strong gravitational lensing over a wide area, the galaxy cluster field A2744 is ideal for searching for the earliest galaxies. Here we present JWST/NIRSpec observations of two galaxies: a robust detection a…

    Submitted 10 October, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: ApJL in press; 16 pages, 6 figures, 2 tables

    Journal ref: The Astrophysical Journal Letters, 957, L34 (2023)

  50. Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code

    Authors: Rangeet Pan, Ali Reza Ibrahimzada, Rahul Krishna, Divya Sankar, Lambert Pouguem Wassi, Michele Merler, Boris Sobolev, Raju Pavuluri, Saurabh Sinha, Reyhaneh Jabbarvand

    Abstract: Code translation aims to convert source code from one programming language (PL) to another. Given the promising abilities of large language models (LLMs) in code synthesis, researchers are exploring their potential to automate code translation. The prerequisite for advancing the state of LLM-based code translation is to understand their promises and limitations over existing techniques. To that en…

    Submitted 16 January, 2024; v1 submitted 6 August, 2023; originally announced August 2023.

    Comments: Published in ICSE 2024