
Showing 1–50 of 162 results for author: Hao, H

Searching in archive cs.
  1. arXiv:2407.00487  [pdf, other]

    cs.CL

    It's Morphing Time: Unleashing the Potential of Multiple LLMs via Multi-objective Optimization

    Authors: Bingdong Li, Zixiang Di, Yanting Yang, Hong Qian, Peng Yang, Hao Hao, Ke Tang, Aimin Zhou

    Abstract: In this paper, we introduce a novel approach for large language model merging via black-box multi-objective optimization algorithms. The goal of model merging is to combine multiple models, each excelling in different tasks, into a single model that outperforms any of the individual source models. However, model merging faces two significant challenges: First, existing methods rely heavily on huma…

    Submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2406.10675  [pdf, other]

    cs.NE

    Large Language Models as Surrogate Models in Evolutionary Algorithms: A Preliminary Study

    Authors: Hao Hao, Xiaoqun Zhang, Aimin Zhou

    Abstract: Large Language Models (LLMs) have achieved significant progress across various fields and have exhibited strong potential in evolutionary computation, such as generating new solutions and automating algorithm design. Surrogate-assisted selection is a core step in evolutionary algorithms to solve expensive optimization problems by reducing the number of real evaluations. Traditionally, this has rel…

    Submitted 15 June, 2024; originally announced June 2024.

  3. arXiv:2406.00276  [pdf]

    cs.LG cs.AI cs.CE physics.data-an

    Non-destructive Degradation Pattern Decoupling for Ultra-early Battery Prototype Verification Using Physics-informed Machine Learning

    Authors: Shengyu Tao, Mengtian Zhang, Zixi Zhao, Haoyang Li, Ruifei Ma, Yunhong Che, Xin Sun, Lin Su, Xiangyu Chen, Zihao Zhou, Heng Chang, Tingwei Cao, Xiao Xiao, Yaojun Liu, Wenjun Yu, Zhongling Xu, Yang Li, Han Hao, Xuan Zhang, Xiaosong Hu, Guangmin Zhou

    Abstract: Manufacturing complexities and uncertainties have impeded the transition from material prototypes to commercial batteries, making prototype verification critical to quality assessment. A fundamental challenge involves deciphering intertwined chemical processes to characterize degradation patterns and their quantitative relationship with battery performance. Here we show that a physics-informed mac…

    Submitted 31 May, 2024; originally announced June 2024.

    ACM Class: J.2; G.3

  4. arXiv:2405.16494  [pdf, other]

    cs.NE

    A First Look at Kolmogorov-Arnold Networks in Surrogate-assisted Evolutionary Algorithms

    Authors: Hao Hao, Xiaoqun Zhang, Bingdong Li, Aimin Zhou

    Abstract: Surrogate-assisted Evolutionary Algorithm (SAEA) is an essential method for solving expensive problems. Utilizing surrogate models to substitute the optimization function can significantly reduce reliance on the function evaluations during the search process, thereby lowering the optimization costs. The construction of surrogate models is a critical component in SAEAs, with numerous mach…

    Submitted 26 May, 2024; originally announced May 2024.

  5. arXiv:2405.11966  [pdf, other]

    cs.CL

    Multiple-Choice Questions are Efficient and Robust LLM Evaluators

    Authors: Ziyin Zhang, Zhaokun Jiang, Lizhen Xu, Hongkun Hao, Rui Wang

    Abstract: We present GSM-MC, a multiple-choice (MC) dataset constructed by collecting answers and incorrect predictions on GSM8K from 60 open-source models. Through extensive experiments, we show that LLMs' performance on the MC version of this popular benchmark is strongly correlated with their performance on the original version and is quite robust to distractor choices and option orders, while the evalua…

    Submitted 26 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: data at https://github.com/Geralt-Targaryen/MC-Evaluation

  6. arXiv:2405.09330  [pdf, other]

    cs.SE

    BARO: Robust Root Cause Analysis for Microservices via Multivariate Bayesian Online Change Point Detection

    Authors: Luan Pham, Huong Ha, Hongyu Zhang

    Abstract: Detecting failures and identifying their root causes promptly and accurately is crucial for ensuring the availability of microservice systems. A typical failure troubleshooting pipeline for microservices consists of two phases: anomaly detection and root cause analysis. While various existing works on root cause analysis require accurate anomaly detection, there is no guarantee of accurate estimat…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted to FSE'24

  7. arXiv:2405.06424  [pdf, other]

    cs.CL cs.AI cs.LG

    Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation

    Authors: JoonHo Lee, Jae Oh Woo, Juree Seok, Parisa Hassanzadeh, Wooseok Jang, JuYoun Son, Sima Didari, Baruch Gutow, Heng Hao, Hankyu Moon, Wenjun Hu, Yeong-Dae Kwon, Taehee Lee, Seungjai Min

    Abstract: Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for t…

    Submitted 19 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted to ICML 2024

  8. Investigating Interaction Modes and User Agency in Human-LLM Collaboration for Domain-Specific Data Analysis

    Authors: Jiajing Guo, Vikram Mohanty, Jorge Piazentin Ono, Hongtao Hao, Liang Gou, Liu Ren

    Abstract: Despite demonstrating robust capabilities in performing tasks related to general-domain data-operation tasks, Large Language Models (LLMs) may exhibit shortcomings when applied to domain-specific tasks. We consider the design of domain-specific AI-powered data analysis tools from two dimensions: interaction and user agency. We implemented two design probes that fall on the two ends of the two dime…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: CHI'24 Late-Breaking Work

    ACM Class: H.5.2

  9. arXiv:2405.03202  [pdf, other]

    cs.CV

    Hierarchical Space-Time Attention for Micro-Expression Recognition

    Authors: Haihong Hao, Shuo Wang, Huixia Ben, Yanbin Hao, Yansong Wang, Weiwei Wang

    Abstract: Micro-expression recognition (MER) aims to recognize the short and subtle facial movements from the Micro-expression (ME) video clips, which reveal real emotions. Recent MER methods mostly only utilize special frames from ME video clips or extract optical flow from these special frames. However, they neglect the relationship between movements and space-time, while facial cues are hidden within the…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  10. arXiv:2404.18343  [pdf, other]

    cs.MM cs.CV

    G-Refine: A General Quality Refiner for Text-to-Image Generation

    Authors: Chunyi Li, Haoning Wu, Hongkun Hao, Zicheng Zhang, Tengchaun Kou, Chaofeng Chen, Lei Bai, Xiaohong Liu, Weisi Lin, Guangtao Zhai

    Abstract: With the evolution of Text-to-Image (T2I) models, the quality defects of AI-Generated Images (AIGIs) pose a significant barrier to their widespread adoption. In terms of both perception and alignment, existing models cannot always guarantee high-quality results. To mitigate this limitation, we introduce G-Refine, a general image quality refiner designed to enhance low-quality images without compro…

    Submitted 28 April, 2024; originally announced April 2024.

  11. arXiv:2404.17644  [pdf, other]

    stat.ML cs.AI cs.LG

    A Conditional Independence Test in the Presence of Discretization

    Authors: Boyang Sun, Yu Yao, Huangyuan Hao, Yumou Qiu, Kun Zhang

    Abstract: Testing conditional independence has many applications, such as in Bayesian network learning and causal discovery. Different test methods have been proposed. However, existing methods generally can not work when only discretized observations are available. Specifically, consider $X_1$, $\tilde{X}_2$ and $X_3$ are observed variables, where $\tilde{X}_2$ is a discretization of latent variables…

    Submitted 3 May, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  12. arXiv:2404.11792  [pdf, other]

    cs.AI

    Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study

    Authors: Zooey Nguyen, Anthony Annunziata, Vinh Luong, Sang Dinh, Quynh Le, Anh Hai Ha, Chanh Le, Hong An Phan, Shruti Raghavan, Christopher Nguyen

    Abstract: This paper investigates the impact of domain-specific model fine-tuning and of reasoning mechanisms on the performance of question-answering (Q&A) systems powered by large language models (LLMs) and Retrieval-Augmented Generation (RAG). Using the FinanceBench SEC financial filings dataset, we observe that, for RAG, combining a fine-tuned embedding model with a fine-tuned LLM achieves better accura…

    Submitted 19 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Fixed typo of OODA's score on harder-question set in Table 2

  13. arXiv:2404.05662  [pdf, other]

    cs.CV

    Towards Accurate Binarization of Diffusion Model

    Authors: Xingyu Zheng, Haotong Qin, Xudong Ma, Mingyuan Zhang, Haojie Hao, Jiakai Wang, Zixiang Zhao, Jinyang Guo, Xianglong Liu

    Abstract: With the advancement of diffusion models (DMs) and the substantially increased computational requirements, quantization emerges as a practical solution to obtain compact and efficient low-bit DMs. However, the highly discrete representation leads to severe accuracy degradation, hindering the quantization of diffusion models to ultra-low bit-widths. This paper proposes a novel quantization-aware tr…

    Submitted 28 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: The code is available at https://github.com/Xingyu-Zheng/BinaryDM

  14. arXiv:2404.00656  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    WavLLM: Towards Robust and Adaptive Speech Large Language Model

    Authors: Shujie Hu, Long Zhou, Shujie Liu, Sanyuan Chen, Hongkun Hao, Jing Pan, Xunying Liu, Jinyu Li, Sunit Sivasankaran, Linquan Liu, Furu Wei

    Abstract: The recent advancements in large language models (LLMs) have revolutionized the field of natural language processing, progressively broadening their scope to multimodal perception and generation. However, effectively integrating listening capabilities into LLMs poses significant challenges, particularly with respect to generalizing across varied contexts and executing complex auditory tasks. In th…

    Submitted 31 March, 2024; originally announced April 2024.

  15. arXiv:2403.17392  [pdf, other]

    cs.RO eess.SY nlin.AO

    Natural-artificial hybrid swarm: Cyborg-insect group navigation in unknown obstructed soft terrain

    Authors: Yang Bai, Phuoc Thanh Tran Ngoc, Huu Duoc Nguyen, Duc Long Le, Quang Huy Ha, Kazuki Kai, Yu Xiang See To, Yaosheng Deng, Jie Song, Naoki Wakamiya, Hirotaka Sato, Masaki Ogura

    Abstract: Navigating multi-robot systems in complex terrains has always been a challenging task. This is due to the inherent limitations of traditional robots in collision avoidance, adaptation to unknown environments, and sustained energy efficiency. In order to overcome these limitations, this research proposes a solution by integrating living insects with miniature electronic controllers to enable roboti…

    Submitted 27 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  16. arXiv:2403.15664  [pdf, other]

    cs.CV

    What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation

    Authors: Yihua Cheng, Yaning Zhu, Zongji Wang, Hongquan Hao, Yongwei Liu, Shiqing Cheng, Xi Wang, Hyung Jin Chang

    Abstract: Driver's eye gaze holds a wealth of cognitive and intentional cues crucial for intelligent vehicles. Despite its significance, research on in-vehicle gaze estimation remains limited due to the scarcity of comprehensive and well-annotated datasets in real driving scenarios. In this paper, we present three novel elements to advance in-vehicle gaze research. Firstly, we introduce IVGaze, a pioneering…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: CVPR24

  17. arXiv:2403.14413  [pdf, other]

    cs.NE cs.LG

    Model Uncertainty in Evolutionary Optimization and Bayesian Optimization: A Comparative Analysis

    Authors: Hao Hao, Xiaoqun Zhang, Aimin Zhou

    Abstract: Black-box optimization problems, which are common in many real-world applications, require optimization through input-output interactions without access to internal workings. This often leads to significant computational resources being consumed for simulations. Bayesian Optimization (BO) and Surrogate-Assisted Evolutionary Algorithm (SAEA) are two widely used gradient-free optimization techniques…

    Submitted 22 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

  18. arXiv:2403.12945  [pdf, other]

    cs.RO

    DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    Authors: Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis, Peter David Fagan, Joey Hejna, Masha Itkina, Marion Lepert, Yecheng Jason Ma, Patrick Tree Miller, Jimmy Wu, Suneel Belkhale, Shivin Dass, Huy Ha, Arhan Jain, Abraham Lee, Youngwoon Lee, Marius Memmel, Sungjae Park , et al. (74 additional authors not shown)

    Abstract: The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a resu…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Project website: https://droid-dataset.github.io/

  19. arXiv:2403.10468  [pdf, other]

    cs.SE

    An Empirical Study on Developers Shared Conversations with ChatGPT in GitHub Pull Requests and Issues

    Authors: Huizi Hao, Kazi Amit Hasan, Hong Qin, Marcos Macedo, Yuan Tian, Steven H. H. Ding, Ahmed E. Hassan

    Abstract: ChatGPT has significantly impacted software development practices, providing substantial assistance to developers in a variety of tasks, including coding, testing, and debugging. Despite its widespread adoption, the impact of ChatGPT as an assistant in collaborative coding remains largely unexplored. In this paper, we analyze a dataset of 210 and 370 developers shared conversations with ChatGPT in…

    Submitted 15 March, 2024; originally announced March 2024.

  20. arXiv:2403.09566  [pdf, other]

    cs.RO

    PaperBot: Learning to Design Real-World Tools Using Paper

    Authors: Ruoshi Liu, Junbang Liang, Sruthi Sudhakar, Huy Ha, Cheng Chi, Shuran Song, Carl Vondrick

    Abstract: Paper is a cheap, recyclable, and clean material that is often used to make practical tools. Traditional tool design either relies on simulation or physical analysis, which is often inaccurate and time-consuming. In this paper, we propose PaperBot, an approach that directly learns to design and use a tool in the real world using paper without human intervention. We demonstrated the effectiveness a…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Project Website: https://paperbot.cs.columbia.edu/

  21. arXiv:2403.07592  [pdf, other]

    cs.CV

    Accurate Spatial Gene Expression Prediction by integrating Multi-resolution features

    Authors: Youngmin Chung, Ji Hun Ha, Kyeong Chan Im, Joo Sang Lee

    Abstract: Recent advancements in Spatial Transcriptomics (ST) technology have facilitated detailed gene expression analysis within tissue contexts. However, the high costs and methodological limitations of ST necessitate a more robust predictive model. In response, this paper introduces TRIPLEX, a novel deep learning framework designed to predict spatial gene expression from Whole Slide Images (WSIs). TRIPL…

    Submitted 25 April, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  22. arXiv:2403.04264  [pdf, other]

    cs.AI

    Competitive Facility Location under Random Utilities and Routing Constraints

    Authors: Hoang Giang Pham, Tien Thanh Dam, Ngan Ha Duong, Tien Mai, Minh Hoang Ha

    Abstract: In this paper, we study a facility location problem within a competitive market context, where customer demand is predicted by a random utility choice model. Unlike prior research, which primarily focuses on simple constraints such as a cardinality constraint on the number of selected locations, we introduce routing constraints that necessitate the selection of locations in a manner that guarantee…

    Submitted 9 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  23. arXiv:2403.01898  [pdf, other]

    cs.CV eess.IV

    Revisiting Learning-based Video Motion Magnification for Real-time Processing

    Authors: Hyunwoo Ha, Oh Hyun-Bin, Kim Jun-Seong, Kwon Byung-Ki, Kim Sung-Bin, Linh-Tam Tran, Ji-Yun Kim, Sung-Ho Bae, Tae-Hyun Oh

    Abstract: Video motion magnification is a technique to capture and amplify subtle motion in a video that is invisible to the naked eye. The deep learning-based prior work successfully demonstrates the modelling of the motion magnification problem with outstanding quality compared to conventional signal processing-based ones. However, it still lags behind real-time performance, which prevents it from being e…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 19 pages

  24. arXiv:2402.18223  [pdf, other]

    cs.CL

    Improving Open-Ended Text Generation via Adaptive Decoding

    Authors: Wenhong Zhu, Hongkun Hao, Zhiwei He, Yiming Ai, Rui Wang

    Abstract: Current language models decode text token by token according to probabilistic distribution, and determining the appropriate candidates for the next token is crucial to ensure generation quality. This study introduces adaptive decoding, a mechanism that dynamically empowers language models to ascertain a sensible candidate set during generation. Specifically, we introduce an entropy-based metric ca…

    Submitted 2 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: ICML2024

  25. arXiv:2402.15038  [pdf, other]

    cs.RO cs.AI cs.LG

    Dynamics-Guided Diffusion Model for Robot Manipulator Design

    Authors: Xiaomeng Xu, Huy Ha, Shuran Song

    Abstract: We present Dynamics-Guided Diffusion Model, a data-driven framework for generating manipulator geometry designs for a given manipulation task. Instead of training different design models for each task, our approach employs a learned dynamics network shared across tasks. For a new manipulation task, we first decompose it into a collection of individual motion targets which we call target interactio…

    Submitted 22 February, 2024; originally announced February 2024.

  26. arXiv:2402.14679  [pdf, other]

    cs.CL cs.CY

    Is Cognition and Action Consistent or Not: Investigating Large Language Model's Personality

    Authors: Yiming Ai, Zhiwei He, Ziyin Zhang, Wenhong Zhu, Hongkun Hao, Kai Yu, Lingjun Chen, Rui Wang

    Abstract: In this study, we investigate the reliability of Large Language Models (LLMs) in professing human-like personality traits through responses to personality questionnaires. Our goal is to evaluate the consistency between LLMs' professed personality inclinations and their actual "behavior", examining the extent to which these models can emulate human-like personality patterns. Through a comprehensive…

    Submitted 22 February, 2024; originally announced February 2024.

  27. arXiv:2402.14007  [pdf, other]

    cs.CL cs.AI

    Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models

    Authors: Zhiwei He, Binglin Zhou, Hongkun Hao, Aiwei Liu, Xing Wang, Zhaopeng Tu, Zhuosheng Zhang, Rui Wang

    Abstract: Text watermarking technology aims to tag and identify content produced by large language models (LLMs) to prevent misuse. In this study, we introduce the concept of cross-lingual consistency in text watermarking, which assesses the ability of text watermarks to maintain their effectiveness after being translated into other languages. Preliminary empirical results from two LLMs and three watermarki…

    Submitted 4 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: ACL 2024 (main conference)

  28. arXiv:2402.03104  [pdf, other]

    stat.ML cs.LG

    High-dimensional Bayesian Optimization via Covariance Matrix Adaptation Strategy

    Authors: Lam Ngo, Huong Ha, Jeffrey Chan, Vu Nguyen, Hongyu Zhang

    Abstract: Bayesian Optimization (BO) is an effective method for finding the global optimum of expensive black-box functions. However, it is well known that applying BO to high-dimensional optimization problems is challenging. To address this issue, a promising solution is to use a local search strategy that partitions the search domain into local regions with high likelihood of containing the global optimum…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 31 pages, 17 figures

    Journal ref: Transactions on Machine Learning Research 2024

  29. arXiv:2401.06949  [pdf, other]

    cs.RO cs.AI

    ORGANA: A Robotic Assistant for Automated Chemistry Experimentation and Characterization

    Authors: Kourosh Darvish, Marta Skreta, Yuchi Zhao, Naruki Yoshikawa, Sagnik Som, Miroslav Bogdanovic, Yang Cao, Han Hao, Haoping Xu, Alán Aspuru-Guzik, Animesh Garg, Florian Shkurti

    Abstract: Chemistry experimentation is often resource- and labor-intensive. Despite the many benefits incurred by the integration of advanced and special-purpose lab equipment, many aspects of experimentation are still manually conducted by chemists, for example, polishing an electrode in electrochemistry experiments. Traditional lab automation infrastructure faces challenges when it comes to flexibly adapt…

    Submitted 12 January, 2024; originally announced January 2024.

  30. arXiv:2401.06469  [pdf, other]

    cs.LG cs.CL

    Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning

    Authors: Kaiyi Zhang, Ang Lv, Yuhan Chen, Hansen Ha, Tao Xu, Rui Yan

    Abstract: In this paper, by treating in-context learning (ICL) as a meta-optimization process, we explain why LLMs are sensitive to the order of ICL examples. This understanding leads us to the development of Batch-ICL, an effective, efficient, and order-agnostic inference algorithm for ICL. Differing from the standard N-shot learning approach, Batch-ICL employs $N$ separate 1-shot forward computations and…

    Submitted 5 June, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: This paper has been accepted by ACL 2024 (Findings)

  31. arXiv:2401.04849  [pdf]

    econ.EM cs.AI

    A Deep Learning Representation of Spatial Interaction Model for Resilient Spatial Planning of Community Business Clusters

    Authors: Haiyan Hao, Yan Wang

    Abstract: Existing Spatial Interaction Models (SIMs) are limited in capturing the complex and context-aware interactions between business clusters and trade areas. To address the limitation, we propose a SIM-GAT model to predict spatiotemporal visitation flows between community business clusters and their trade areas. The model innovatively represents the integrated system of business clusters, trade areas,…

    Submitted 9 January, 2024; originally announced January 2024.

  32. arXiv:2401.01117  [pdf, other]

    cs.CV eess.IV

    Q-Refine: A Perceptual Quality Refiner for AI-Generated Image

    Authors: Chunyi Li, Haoning Wu, Zicheng Zhang, Hongkun Hao, Kaiwei Zhang, Lei Bai, Xiaohong Liu, Xiongkuo Min, Weisi Lin, Guangtao Zhai

    Abstract: With the rapid evolution of the Text-to-Image (T2I) model in recent years, their unsatisfactory generation result has become a challenge. However, uniformly refining AI-Generated Images (AIGIs) of different qualities not only limited optimization capabilities for low-quality AIGIs but also brought negative optimization to high-quality AIGIs. To address this issue, a quality-award refiner named Q-R…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 6 pages, 5 figures

  33. arXiv:2401.00246  [pdf, other]

    cs.CL cs.SD eess.AS

    Boosting Large Language Model for Speech Synthesis: An Empirical Study

    Authors: Hongkun Hao, Long Zhou, Shujie Liu, Jinyu Li, Shujie Hu, Rui Wang, Furu Wei

    Abstract: Large language models (LLMs) have made significant advancements in natural language processing and are concurrently extending the language ability to other modalities, such as speech and vision. Nevertheless, most of the previous work focuses on prompting LLMs with perception abilities like auditory comprehension, and the effective approach for augmenting LLMs with speech synthesis capabilities re…

    Submitted 30 December, 2023; originally announced January 2024.

  34. arXiv:2312.09551  [pdf, other]

    eess.IV cs.CV

    Learning-based Axial Video Motion Magnification

    Authors: Kwon Byung-Ki, Oh Hyun-Bin, Kim Jun-Seong, Hyunwoo Ha, Tae-Hyun Oh

    Abstract: Video motion magnification amplifies invisible small motions to be perceptible, which provides humans with a spatially dense and holistic understanding of small motions in the scene of interest. This is based on the premise that magnifying small motions enhances the legibility of motions. In the real world, however, vibrating objects often possess convoluted systems that have complex natural frequ…

    Submitted 26 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: main paper: 12 pages, supplementary: 10 pages, 20 figures, 1 table

  35. arXiv:2312.03687  [pdf, other]

    cond-mat.mtrl-sci cs.AI

    MatterGen: a generative model for inorganic materials design

    Authors: Claudio Zeni, Robert Pinsler, Daniel Zügner, Andrew Fowler, Matthew Horton, Xiang Fu, Sasha Shysheya, Jonathan Crabbé, Lixin Sun, Jake Smith, Bichlien Nguyen, Hannes Schulz, Sarah Lewis, Chin-Wei Huang, Ziheng Lu, Yichi Zhou, Han Yang, Hongxia Hao, Jielan Li, Ryota Tomioka, Tian Xie

    Abstract: The design of functional materials with desired properties is essential in driving technological advances in areas like energy storage, catalysis, and carbon capture. Generative models provide a new paradigm for materials design by directly generating entirely novel materials given desired property constraints. Despite recent progress, current generative models have low success rate in proposing s…

    Submitted 29 January, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: 13 pages main text, 35 pages supplementary information

  36. arXiv:2312.02189  [pdf, other]

    cs.CV cs.AI

    StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D

    Authors: Pengsheng Guo, Hans Hao, Adam Caccavale, Zhongzheng Ren, Edward Zhang, Qi Shan, Aditya Sankar, Alexander G. Schwing, Alex Colburn, Fangchang Ma

    Abstract: In the realm of text-to-3D generation, utilizing 2D diffusion models through score distillation sampling (SDS) frequently leads to issues such as blurred appearances and multi-faced geometry, primarily due to the intrinsically noisy nature of the SDS loss. Our analysis identifies the core of these challenges as the interaction among noise levels in the 2D diffusion process, the architecture of the…

    Submitted 1 December, 2023; originally announced December 2023.

  37. arXiv:2311.09154  [pdf, other]

    cs.CL

    CLEAN-EVAL: Clean Evaluation on Contaminated Large Language Models

    Authors: Wenhong Zhu, Hongkun Hao, Zhiwei He, Yunze Song, Yumeng Zhang, Hanxu Hu, Yiran Wei, Rui Wang, Hongyuan Lu

    Abstract: We are currently in an era of fierce competition among various large language models (LLMs) continuously pushing the boundaries of benchmark performance. However, genuinely assessing the capabilities of these LLMs has become a challenging and critical issue due to potential data contamination, and it wastes dozens of time and effort for researchers and engineers to download and try those contamina…

    Submitted 2 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: NAACL2024(findings)

  38. arXiv:2310.14971  [pdf, other]

    cs.CL

    Penalty Decoding: Well Suppress the Self-Reinforcement Effect in Open-Ended Text Generation

    Authors: Wenhong Zhu, Hongkun Hao, Rui Wang

    Abstract: The decoding algorithm is critical for open-ended text generation, transforming latent representations into coherent and meaningful outputs. This paper investigates the self-reinforcement effect in text generation and the effectiveness of a repetition penalty to mitigate it. However, determining the optimal repetition penalty value is challenging. To tackle this, we propose a forgetting mechanism…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP2023

  39. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  40. arXiv:2310.05186  [pdf, other]

    cs.AI

    Evolutionary Retrosynthetic Route Planning

    Authors: Yan Zhang, Hao Hao, Xiao He, Shuanhu Gao, Aimin Zhou

    Abstract: Molecular retrosynthesis is a significant and complex problem in the field of chemistry, however, traditional manual synthesis methods not only need well-trained experts but also are time-consuming. With the development of big data and machine learning, artificial intelligence (AI) based retrosynthesis is attracting more attention and is becoming a valuable tool for molecular retrosynthesis. At pr…

    Submitted 8 October, 2023; originally announced October 2023.

  41. arXiv:2310.04993  [pdf, other]

    cs.LG

    Prompt-augmented Temporal Point Process for Streaming Event Sequence

    Authors: Siqiao Xue, Yan Wang, Zhixuan Chu, Xiaoming Shi, Caigao Jiang, Hongyan Hao, Gangwei Jiang, Xiaoyun Feng, James Y. Zhang, Jun Zhou

    Abstract: Neural Temporal Point Processes (TPPs) are the prevalent paradigm for modeling continuous-time event sequences, such as user activities on the web and financial transactions. In real-world applications, event data is typically received in a streaming manner, where the distribution of patterns may shift over time. Additionally, privacy and memory constraints are commonly observed in p…

    Submitted 13 October, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023 camera ready version

  42. arXiv:2309.16102  [pdf, other]

    cs.AI cs.DB

    Discovering Utility-driven Interval Rules

    Authors: Chunkai Zhang, Maohua Lyu, Huaijin Hao, Wensheng Gan, Philip S. Yu

    Abstract: For artificial intelligence, high-utility sequential rule mining (HUSRM) is a knowledge discovery method that can reveal the associations between events in the sequences. Recently, abundant methods have been proposed to discover high-utility sequence rules. However, the existing methods are all related to point-based sequences. Interval events that persist for some time are common. Traditional int…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: Preprint. 11 figures, 5 tables

  43. arXiv:2309.11994  [pdf, ps, other]

    cs.NE cs.LG

    Enhancing SAEAs with Unevaluated Solutions: A Case Study of Relation Model for Expensive Optimization

    Authors: Hao Hao, Xiaoqun Zhang, Aimin Zhou

    Abstract: Surrogate-assisted evolutionary algorithms (SAEAs) hold significant importance in resolving expensive optimization problems (EOPs). Extensive efforts have been devoted to improving the efficacy of SAEAs through the development of proficient model-assisted selection methods. However, generating high-quality solutions is a prerequisite for selection. The fundamental paradigm of evaluating a limited…

    Submitted 8 October, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: 18 pages, 9 figures

  44. arXiv:2309.02868  [pdf, other]

    cs.LG

    Enhancing Asynchronous Time Series Forecasting with Contrastive Relational Inference

    Authors: Yan Wang, Zhixuan Chu, Tao Zhou, Caigao Jiang, Hongyan Hao, Minjie Zhu, Xindong Cai, Qing Cui, Longfei Li, James Y Zhang, Siqiao Xue, Jun Zhou

    Abstract: Asynchronous time series, also known as temporal event sequences, are the basis of many applications throughout different industries. Temporal point processes (TPPs) are the standard method for modeling such data. Existing TPP models have focused on parameterizing the conditional distribution of future events instead of explicitly modeling event interactions, imposing challenges for event predictio…

    Submitted 6 October, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: ICDM 2023 AI4TS Workshop

  45. arXiv:2308.10837  [pdf, other]

    cs.IR

    Leveraging Large Language Models for Pre-trained Recommender Systems

    Authors: Zhixuan Chu, Hongyan Hao, Xin Ouyang, Simeng Wang, Yan Wang, Yue Shen, Jinjie Gu, Qing Cui, Longfei Li, Siqiao Xue, James Y Zhang, Sheng Li

    Abstract: Recent advancements in recommendation systems have shifted towards more comprehensive and personalized recommendations by utilizing large language models (LLM). However, effectively integrating LLM's commonsense knowledge and reasoning abilities into recommendation systems remains a challenging problem. In this paper, we propose RecSysLLM, a novel pre-trained recommendation model based on LLMs. Re…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: 13 pages, 4 figures

  46. arXiv:2308.10835  [pdf, other]

    cs.IR

    Enhancing Recommender Systems with Large Language Model Reasoning Graphs

    Authors: Yan Wang, Zhixuan Chu, Xin Ouyang, Simeng Wang, Hongyan Hao, Yue Shen, Jinjie Gu, Siqiao Xue, James Y Zhang, Qing Cui, Longfei Li, Jun Zhou, Sheng Li

    Abstract: Recommendation systems aim to provide users with relevant suggestions, but often lack interpretability and fail to capture higher-level semantic relationships between user behaviors and profiles. In this paper, we propose a novel approach that leverages large language models (LLMs) to construct personalized reasoning graphs. These graphs link a user's profile and behavioral sequences through causa…

    Submitted 24 January, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: 12 pages, 6 figures

  47. arXiv:2308.05361  [pdf, other]

    cs.CL

    WeaverBird: Empowering Financial Decision-Making with Large Language Model, Knowledge Base, and Search Engine

    Authors: Siqiao Xue, Fan Zhou, Yi Xu, Ming Jin, Qingsong Wen, Hongyan Hao, Qingyang Dai, Caigao Jiang, Hongyu Zhao, Shuo Xie, Jianshan He, James Zhang, Hongyuan Mei

    Abstract: We present WeaverBird, an intelligent dialogue system designed specifically for the finance domain. Our system harnesses a large language model of GPT architecture that has been tuned using extensive corpora of finance-related text. As a result, our system possesses the capability to understand complex financial queries, such as "How should I manage my investments during inflation?", and provide i…

    Submitted 6 April, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

    Comments: revise abstract

  48. Continual Learning in Predictive Autoscaling

    Authors: Hongyan Hao, Zhixuan Chu, Shiyi Zhu, Gangwei Jiang, Yan Wang, Caigao Jiang, James Zhang, Wei Jiang, Siqiao Xue, Jun Zhou

    Abstract: Predictive Autoscaling is used to forecast the workloads of servers and prepare the resources in advance to ensure service level objectives (SLOs) in dynamic cloud environments. However, in practice, its prediction task often suffers from performance degradation under abnormal traffics caused by external events (such as sales promotional activities and applications re-configurations), for which a…

    Submitted 14 August, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

  49. arXiv:2307.14535  [pdf, other]

    cs.RO

    Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition

    Authors: Huy Ha, Pete Florence, Shuran Song

    Abstract: We present a framework for robot skill acquisition, which 1) efficiently scales up data generation of language-labelled robot data and 2) effectively distills this data down into a robust multi-task language-conditioned visuo-motor policy. For (1), we use a large language model (LLM) to guide high-level planning, and sampling-based robot planners (e.g. motion or grasp samplers) for generating diver…

    Submitted 30 September, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: 25 pages, 9 figures, videos and code links on website https://www.cs.columbia.edu/~huy/scalingup/

    ACM Class: I.2.9

  50. arXiv:2307.08771  [pdf, other]

    cs.CV

    UPSCALE: Unconstrained Channel Pruning

    Authors: Alvin Wan, Hanxiang Hao, Kaushik Patnaik, Yueyang Xu, Omer Hadad, David Güera, Zhile Ren, Qi Shan

    Abstract: As neural networks grow in size and complexity, inference speeds decline. To combat this, one of the most effective compression techniques -- channel pruning -- removes channels from weights. However, for multi-branch segments of a model, channel removal can introduce inference-time memory copies. In turn, these copies increase inference latency -- so much so that the pruned model can be slower th…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 29 pages, 26 figures, accepted to ICML 2023