Search | arXiv e-print repository

Emergent Crowd Grou** via Heuristic Self-Organization

Authors: Xiao-Cheng Liao, Wei-Neng Chen, Xiang-Ling Chen, Yi Mei

Abstract: Modeling crowds has many important applications in games and computer animation. Inspired by the emergent following effect in real-life crowd scenarios, in this work, we develop a method for implicitly grou** moving agents. We achieve this by analyzing local information around each agent and rotating its preferred velocity accordingly. Each agent could automatically form an implicit group with i… ▽ More Modeling crowds has many important applications in games and computer animation. Inspired by the emergent following effect in real-life crowd scenarios, in this work, we develop a method for implicitly grou** moving agents. We achieve this by analyzing local information around each agent and rotating its preferred velocity accordingly. Each agent could automatically form an implicit group with its neighboring agents that have similar directions. In contrast to an explicit group, there are no strict boundaries for an implicit group. If an agent's direction deviates from its group as a result of positional changes, it will autonomously exit the group or join another implicitly formed neighboring group. This implicit grou** is autonomously emergent among agents rather than deliberately controlled by the algorithm. The proposed method is compared with many crowd simulation models, and the experimental results indicate that our approach achieves the lowest congestion levels in some classic scenarios. In addition, we demonstrate that adjusting the preferred velocity of agents can actually reduce the dissimilarity between their actual velocity and the original preferred velocity. Our work is available online. △ Less

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2406.16578 [pdf, other]

QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds

Authors: Ye Wang, Yuting Mei, Sipeng Zheng, Qin **

Abstract: While pets offer companionship, their limited intelligence restricts advanced reasoning and autonomous interaction with humans. Considering this, we propose QuadrupedGPT, a versatile agent designed to master a broad range of complex tasks with agility comparable to that of a pet. To achieve this goal, the primary challenges include: i) effectively leveraging multimodal observations for decision-ma… ▽ More While pets offer companionship, their limited intelligence restricts advanced reasoning and autonomous interaction with humans. Considering this, we propose QuadrupedGPT, a versatile agent designed to master a broad range of complex tasks with agility comparable to that of a pet. To achieve this goal, the primary challenges include: i) effectively leveraging multimodal observations for decision-making; ii) mastering agile control of locomotion and path planning; iii) develo** advanced cognition to execute long-term objectives. QuadrupedGPT processes human command and environmental contexts using a large multimodal model (LMM). Empowered by its extensive knowledge base, our agent autonomously assigns appropriate parameters for adaptive locomotion policies and guides the agent in planning a safe but efficient path towards the goal, utilizing semantic-aware terrain analysis. Moreover, QuadrupedGPT is equipped with problem-solving capabilities that enable it to decompose long-term goals into a sequence of executable subgoals through high-level reasoning. Extensive experiments across various benchmarks confirm that QuadrupedGPT can adeptly handle multiple tasks with intricate instructions, demonstrating a significant step towards the versatile quadruped agents in open-ended worlds. Our website and codes can be found at https://quadruped-hub.github.io/Quadruped-GPT/. △ Less

Submitted 24 June, 2024; originally announced June 2024.

Comments: Under review

arXiv:2406.16301 [pdf, other]

doi 10.1145/3652583.3658038

UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos

Authors: Yuting Mei, Linli Yao, Qin **

Abstract: With the surge in the amount of video data, video summarization techniques, including visual-modal(VM) and textual-modal(TM) summarization, are attracting more and more attention. However, unimodal summarization inevitably loses the rich semantics of the video. In this paper, we focus on a more comprehensive video summarization task named Bimodal Semantic Summarization of Videos (BiSSV). Specifica… ▽ More With the surge in the amount of video data, video summarization techniques, including visual-modal(VM) and textual-modal(TM) summarization, are attracting more and more attention. However, unimodal summarization inevitably loses the rich semantics of the video. In this paper, we focus on a more comprehensive video summarization task named Bimodal Semantic Summarization of Videos (BiSSV). Specifically, we first construct a large-scale dataset, BIDS, in (video, VM-Summary, TM-Summary) triplet format. Unlike traditional processing methods, our construction procedure contains a VM-Summary extraction algorithm aiming to preserve the most salient content within long videos. Based on BIDS, we propose a Unified framework UBiSS for the BiSSV task, which models the saliency information in the video and generates a TM-summary and VM-summary simultaneously. We further optimize our model with a list-wise ranking-based objective to improve its capacity to capture highlights. Lastly, we propose a metric, $NDCG_{MS}$, to provide a joint evaluation of the bimodal summary. Experiments show that our unified framework achieves better performance than multi-stage summarization pipelines. Code and data are available at https://github.com/MeiYutingg/UBiSS. △ Less

Submitted 23 June, 2024; originally announced June 2024.

Comments: Accepted by ACM International Conference on Multimedia Retrieval (ICMR'24)

Journal ref: Proceedings of the 2024 International Conference on Multimedia Retrieval, May 2024, Pages 1034-1042

arXiv:2406.11844 [pdf]

Prompting the E-Brushes: Users as Authors in Generative AI

Authors: Yiyang Mei

Abstract: Since its introduction in 2022, Generative AI has significantly impacted the art world, from winning state art fairs to creating complex videos from simple prompts. Amid this renaissance, a pivotal issue emerges: should users of Generative AI be recognized as authors eligible for copyright protection? The Copyright Office, in its March 2023 Guidance, argues against this notion. By comparing the pr… ▽ More Since its introduction in 2022, Generative AI has significantly impacted the art world, from winning state art fairs to creating complex videos from simple prompts. Amid this renaissance, a pivotal issue emerges: should users of Generative AI be recognized as authors eligible for copyright protection? The Copyright Office, in its March 2023 Guidance, argues against this notion. By comparing the prompts to clients' instructions for commissioned art, the Office denies users authorship due to their limited role in the creative process. This Article challenges this viewpoint and advocates for the recognition of Generative AI users who incorporate these tools into their creative endeavors. It argues that the current policy fails to consider the intricate and dynamic interaction between Generative AI users and the models, where users actively influence the output through a process of adjustment, refinement, selection, and arrangement. Rather than dismissing the contributions generated by AI, this Article suggests a simplified and streamlined registration process that acknowledges the role of AI in creation. This approach not only aligns with the constitutional goal of promoting the progress of science and useful arts but also encourages public engagement in the creative process, which contributes to the pool of training data for AI. Moreover, it advocates for a flexible framework that evolves alongside technological advancements while ensuring safety and public interest. In conclusion, by examining text-to-image generators and addressing misconceptions about Generative AI and user interaction, this Article calls for a regulatory framework that adapts to technological developments and safeguards public interests △ Less

Submitted 24 March, 2024; originally announced June 2024.

Journal ref: International Journal of Law, Ethics, and Technology 2024

arXiv:2406.10373 [pdf, other]

Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections

Authors: Jiacong Xu, Yiqun Mei, Vishal M. Patel

Abstract: Photographs captured in unstructured tourist environments frequently exhibit variable appearances and transient occlusions, challenging accurate scene reconstruction and inducing artifacts in novel view synthesis. Although prior approaches have integrated the Neural Radiance Field (NeRF) with additional learnable modules to handle the dynamic appearances and eliminate transient objects, their exte… ▽ More Photographs captured in unstructured tourist environments frequently exhibit variable appearances and transient occlusions, challenging accurate scene reconstruction and inducing artifacts in novel view synthesis. Although prior approaches have integrated the Neural Radiance Field (NeRF) with additional learnable modules to handle the dynamic appearances and eliminate transient objects, their extensive training demands and slow rendering speeds limit practical deployments. Recently, 3D Gaussian Splatting (3DGS) has emerged as a promising alternative to NeRF, offering superior training and inference efficiency along with better rendering quality. This paper presents Wild-GS, an innovative adaptation of 3DGS optimized for unconstrained photo collections while preserving its efficiency benefits. Wild-GS determines the appearance of each 3D Gaussian by their inherent material attributes, global illumination and camera properties per image, and point-level local variance of reflectance. Unlike previous methods that model reference features in image space, Wild-GS explicitly aligns the pixel appearance features to the corresponding local Gaussians by sampling the triplane extracted from the reference image. This novel design effectively transfers the high-frequency detailed appearance of the reference view to 3D space and significantly expedites the training process. Furthermore, 2D visibility maps and depth regularization are leveraged to mitigate the transient effects and constrain the geometry, respectively. Extensive experiments demonstrate that Wild-GS achieves state-of-the-art rendering performance and the highest efficiency in both training and inference among all the existing techniques. △ Less

Submitted 14 June, 2024; originally announced June 2024.

Comments: 15 pages, 7 figures

arXiv:2406.01566 [pdf, other]

Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs

Authors: Yixuan Mei, Yonghao Zhuang, Xupeng Miao, Juncheng Yang, Zhihao Jia, Rashmi Vinayak

Abstract: This paper introduces Helix, a distributed system for high-throughput, low-latency large language model (LLM) serving on heterogeneous GPU clusters. A key idea behind Helix is to formulate inference computation of LLMs over heterogeneous GPUs and network connections as a max-flow problem for a directed, weighted graph, whose nodes represent GPU instances and edges capture both GPU and network hete… ▽ More This paper introduces Helix, a distributed system for high-throughput, low-latency large language model (LLM) serving on heterogeneous GPU clusters. A key idea behind Helix is to formulate inference computation of LLMs over heterogeneous GPUs and network connections as a max-flow problem for a directed, weighted graph, whose nodes represent GPU instances and edges capture both GPU and network heterogeneity through their capacities. Helix then uses a mixed integer linear programming (MILP) algorithm to discover highly optimized strategies to serve LLMs. This approach allows Helix to jointly optimize model placement and request scheduling, two highly entangled tasks in heterogeneous LLM serving. Our evaluation on several heterogeneous cluster settings ranging from 24 to 42 GPU nodes shows that Helix improves serving throughput by up to 2.7$\times$ and reduces prompting and decoding latency by up to 2.8$\times$ and 1.3$\times$, respectively, compared to best existing approaches. △ Less

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2405.14268 [pdf, other]

Multi-Representation Genetic Programming: A Case Study on Tree-based and Linear Representations

Authors: Zhixing Huang, Yi Mei, Fangfang Zhang, Mengjie Zhang, Wolfgang Banzhaf

Abstract: Existing genetic programming (GP) methods are typically designed based on a certain representation, such as tree-based or linear representations. These representations show various pros and cons in different domains. However, due to the complicated relationships among representation and fitness landscapes of GP, it is hard to intuitively determine which GP representation is the most suitable for s… ▽ More Existing genetic programming (GP) methods are typically designed based on a certain representation, such as tree-based or linear representations. These representations show various pros and cons in different domains. However, due to the complicated relationships among representation and fitness landscapes of GP, it is hard to intuitively determine which GP representation is the most suitable for solving a certain problem. Evolving programs (or models) with multiple representations simultaneously can alternatively search on different fitness landscapes since representations are highly related to the search space that essentially defines the fitness landscape. Fully using the latent synergies among different GP individual representations might be helpful for GP to search for better solutions. However, existing GP literature rarely investigates the simultaneous effective use of evolving multiple representations. To fill this gap, this paper proposes a multi-representation GP algorithm based on tree-based and linear representations, which are two commonly used GP representations. In addition, we develop a new cross-representation crossover operator to harness the interplay between tree-based and linear representations. Empirical results show that navigating the learned knowledge between basic tree-based and linear representations successfully improves the effectiveness of GP with solely tree-based or linear representation in solving symbolic regression and dynamic job shop scheduling problems. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.04759 [pdf, ps, other]

Multi-Label Out-of-Distribution Detection with Spectral Normalized Joint Energy

Authors: Yihan Mei, Xinyu Wang, Dell Zhang, Xiaoling Wang

Abstract: In today's interconnected world, achieving reliable out-of-distribution (OOD) detection poses a significant challenge for machine learning models. While numerous studies have introduced improved approaches for multi-class OOD detection tasks, the investigation into multi-label OOD detection tasks has been notably limited. We introduce Spectral Normalized Joint Energy (SNoJoE), a method that consol… ▽ More In today's interconnected world, achieving reliable out-of-distribution (OOD) detection poses a significant challenge for machine learning models. While numerous studies have introduced improved approaches for multi-class OOD detection tasks, the investigation into multi-label OOD detection tasks has been notably limited. We introduce Spectral Normalized Joint Energy (SNoJoE), a method that consolidates label-specific information across multiple labels through the theoretically justified concept of an energy-based function. Throughout the training process, we employ spectral normalization to manage the model's feature space, thereby enhancing model efficacy and generalization, in addition to bolstering robustness. Our findings indicate that the application of spectral normalization to joint energy scores notably amplifies the model's capability for OOD detection. We perform OOD detection experiments utilizing PASCAL-VOC as the in-distribution dataset and ImageNet-22K or Texture as the out-of-distribution datasets. Our experimental results reveal that, in comparison to prior top performances, SNoJoE achieves 11% and 54% relative reductions in FPR95 on the respective OOD datasets, thereby defining the new state of the art in this field of study. △ Less

Submitted 12 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.04652 [pdf, ps, other]

AffirmativeAI: Towards LGBTQ+ Friendly Audit Frameworks for Large Language Models

Authors: Yinru Long, Zilin Ma, Yiyang Mei, Zhaoyuan Su

Abstract: LGBTQ+ community face disproportionate mental health challenges, including higher rates of depression, anxiety, and suicidal ideation. Research has shown that LGBTQ+ people have been using large language model-based chatbots, such as ChatGPT, for their mental health needs. Despite the potential for immediate support and anonymity these chatbots offer, concerns regarding their capacity to provide e… ▽ More LGBTQ+ community face disproportionate mental health challenges, including higher rates of depression, anxiety, and suicidal ideation. Research has shown that LGBTQ+ people have been using large language model-based chatbots, such as ChatGPT, for their mental health needs. Despite the potential for immediate support and anonymity these chatbots offer, concerns regarding their capacity to provide empathetic, accurate, and affirming responses remain. In response to these challenges, we propose a framework for evaluating the affirmativeness of LLMs based on principles of affirmative therapy, emphasizing the need for attitudes, knowledge, and actions that support and validate LGBTQ+ experiences. We propose a combination of qualitative and quantitative analyses, ho** to establish benchmarks for "Affirmative AI," ensuring that LLM-based chatbots can provide safe, supportive, and effective mental health support to LGBTQ+ individuals. We benchmark LLM affirmativeness not as a mental health solution for LGBTQ+ individuals or to claim it resolves their mental health issues, as we highlight the need to consider complex discrimination in the LGBTQ+ community when designing technological aids. Our goal is to evaluate LLMs for LGBTQ+ mental health support since many in the community already use them, aiming to identify potential harms of using general-purpose LLMs in this context. △ Less

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2404.10252 [pdf, other]

Learning from Offline and Online Experiences: A Hybrid Adaptive Operator Selection Framework

Authors: Jiyuan Pei, Jialin Liu, Yi Mei

Abstract: In many practical applications, usually, similar optimisation problems or scenarios repeatedly appear. Learning from previous problem-solving experiences can help adjust algorithm components of meta-heuristics, e.g., adaptively selecting promising search operators, to achieve better optimisation performance. However, those experiences obtained from previously solved problems, namely offline experi… ▽ More In many practical applications, usually, similar optimisation problems or scenarios repeatedly appear. Learning from previous problem-solving experiences can help adjust algorithm components of meta-heuristics, e.g., adaptively selecting promising search operators, to achieve better optimisation performance. However, those experiences obtained from previously solved problems, namely offline experiences, may sometimes provide misleading perceptions when solving a new problem, if the characteristics of previous problems and the new one are relatively different. Learning from online experiences obtained during the ongoing problem-solving process is more instructive but highly restricted by limited computational resources. This paper focuses on the effective combination of offline and online experiences. A novel hybrid framework that learns to dynamically and adaptively select promising search operators is proposed. Two adaptive operator selection modules with complementary paradigms cooperate in the framework to learn from offline and online experiences and make decisions. An adaptive decision policy is maintained to balance the use of those two modules in an online manner. Extensive experiments on 170 widely studied real-value benchmark optimisation problems and a benchmark set with 34 instances for combinatorial optimisation show that the proposed hybrid framework outperforms the state-of-the-art methods. Ablation study verifies the effectiveness of each component of the framework. △ Less

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2403.17328 [pdf, other]

Learning Traffic Signal Control via Genetic Programming

Authors: Xiao-Cheng Liao, Yi Mei, Mengjie Zhang

Abstract: The control of traffic signals is crucial for improving transportation efficiency. Recently, learning-based methods, especially Deep Reinforcement Learning (DRL), garnered substantial success in the quest for more efficient traffic signal control strategies. However, the design of rewards in DRL highly demands domain knowledge to converge to an effective policy, and the final policy also presents… ▽ More The control of traffic signals is crucial for improving transportation efficiency. Recently, learning-based methods, especially Deep Reinforcement Learning (DRL), garnered substantial success in the quest for more efficient traffic signal control strategies. However, the design of rewards in DRL highly demands domain knowledge to converge to an effective policy, and the final policy also presents difficulties in terms of explainability. In this work, a new learning-based method for signal control in complex intersections is proposed. In our approach, we design a concept of phase urgency for each signal phase. During signal transitions, the traffic light control strategy selects the next phase to be activated based on the phase urgency. We then proposed to represent the urgency function as an explainable tree structure. The urgency function can calculate the phase urgency for a specific phase based on the current road conditions. Genetic programming is adopted to perform gradient-free optimization of the urgency function. We test our algorithm on multiple public traffic signal control datasets. The experimental results indicate that the tree-shaped urgency function evolved by genetic programming outperforms the baselines, including a state-of-the-art method in the transportation field and a well-known DRL-based method. △ Less

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.15872 [pdf, other]

RAAMove: A Corpus for Analyzing Moves in Research Article Abstracts

Authors: Hongzheng Li, Ruo** Wang, Ge Shi, Xing Lv, Lei Lei, Chong Feng, Fang Liu, **kun Lin, Yangguang Mei, Lingnan Xu

Abstract: Move structures have been studied in English for Specific Purposes (ESP) and English for Academic Purposes (EAP) for decades. However, there are few move annotation corpora for Research Article (RA) abstracts. In this paper, we introduce RAAMove, a comprehensive multi-domain corpus dedicated to the annotation of move structures in RA abstracts. The primary objective of RAAMove is to facilitate mov… ▽ More Move structures have been studied in English for Specific Purposes (ESP) and English for Academic Purposes (EAP) for decades. However, there are few move annotation corpora for Research Article (RA) abstracts. In this paper, we introduce RAAMove, a comprehensive multi-domain corpus dedicated to the annotation of move structures in RA abstracts. The primary objective of RAAMove is to facilitate move analysis and automatic move identification. This paper provides a thorough discussion of the corpus construction process, including the scheme, data collection, annotation guidelines, and annotation procedures. The corpus is constructed through two stages: initially, expert annotators manually annotate high-quality data; subsequently, based on the human-annotated data, a BERT-based model is employed for automatic annotation with the help of experts' modification. The result is a large-scale and high-quality corpus comprising 33,988 annotated instances. We also conduct preliminary move identification experiments using the BERT-based model to verify the effectiveness of the proposed corpus and model. The annotated corpus is available for academic research purposes and can serve as essential resources for move analysis, English language teaching and writing, as well as move/discourse-related tasks in Natural Language Processing (NLP). △ Less

Submitted 23 March, 2024; originally announced March 2024.

Comments: Accepted by LREC-COLING 2024

arXiv:2403.09632 [pdf, other]

Holo-Relighting: Controllable Volumetric Portrait Relighting from a Single Image

Authors: Yiqun Mei, Yu Zeng, He Zhang, Zhixin Shu, Xuaner Zhang, Sai Bi, Jianming Zhang, HyunJoon Jung, Vishal M. Patel

Abstract: At the core of portrait photography is the search for ideal lighting and viewpoint. The process often requires advanced knowledge in photography and an elaborate studio setup. In this work, we propose Holo-Relighting, a volumetric relighting method that is capable of synthesizing novel viewpoints, and novel lighting from a single image. Holo-Relighting leverages the pretrained 3D GAN (EG3D) to rec… ▽ More At the core of portrait photography is the search for ideal lighting and viewpoint. The process often requires advanced knowledge in photography and an elaborate studio setup. In this work, we propose Holo-Relighting, a volumetric relighting method that is capable of synthesizing novel viewpoints, and novel lighting from a single image. Holo-Relighting leverages the pretrained 3D GAN (EG3D) to reconstruct geometry and appearance from an input portrait as a set of 3D-aware features. We design a relighting module conditioned on a given lighting to process these features, and predict a relit 3D representation in the form of a tri-plane, which can render to an arbitrary viewpoint through volume rendering. Besides viewpoint and lighting control, Holo-Relighting also takes the head pose as a condition to enable head-pose-dependent lighting effects. With these novel designs, Holo-Relighting can generate complex non-Lambertian lighting effects (e.g., specular highlights and cast shadows) without using any explicit physical lighting priors. We train Holo-Relighting with data captured with a light stage, and propose two data-rendering techniques to improve the data quality for training the volumetric relighting system. Through quantitative and qualitative experiments, we demonstrate Holo-Relighting can achieve state-of-the-arts relighting quality with better photorealism, 3D consistency and controllability. △ Less

Submitted 14 March, 2024; originally announced March 2024.

Comments: CVPR2024

arXiv:2402.13777 [pdf, other]

Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions

Authors: Jiayu Chen, Bhargav Ganguly, Yang Xu, Yongsheng Mei, Tian Lan, Vaneet Aggarwal

Abstract: Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control also necessitate learning a generator function from the offline data to serve as the strategy or policy. In this case, applying deep generative models in offline… ▽ More Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control also necessitate learning a generator function from the offline data to serve as the strategy or policy. In this case, applying deep generative models in offline policy learning exhibits great potential, and numerous studies have explored in this direction. However, this field still lacks a comprehensive review and so developments of different branches are relatively independent. In this paper, we provide the first systematic review on the applications of deep generative models for offline policy learning. In particular, we cover five mainstream deep generative models, including Variational Auto-Encoders, Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion Models, and their applications in both offline reinforcement learning (offline RL) and imitation learning (IL). Offline RL and IL are two main branches of offline policy learning and are widely-adopted techniques for sequential decision-making. Notably, for each type of DGM-based offline policy learning, we distill its fundamental scheme, categorize related works based on the usage of the DGM, and sort out the development process of algorithms in that field. Subsequent to the main content, we provide in-depth discussions on deep generative models and offline policy learning as a summary, based on which we present our perspectives on future research directions. This work offers a hands-on reference for the research progress in deep generative models for offline policy learning, and aims to inspire improved DGM-based offline RL or IL algorithms. For convenience, we maintain a paper list on https://github.com/LucasCJYSDL/DGMs-for-Offline-Policy-Learning. △ Less

Submitted 25 May, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: We restructured the paper and added more discussion

arXiv:2402.09260 [pdf, other]

doi 10.1145/3613904.3642482

Evaluating the Experience of LGBTQ+ People Using Large Language Model Based Chatbots for Mental Health Support

Authors: Zilin Ma, Yiyang Mei, Yinru Long, Zhaoyuan Su, Krzysztof Z. Gajos

Abstract: LGBTQ+ individuals are increasingly turning to chatbots powered by large language models (LLMs) to meet their mental health needs. However, little research has explored whether these chatbots can adequately and safely provide tailored support for this demographic. We interviewed 18 LGBTQ+ and 13 non-LGBTQ+ participants about their experiences with LLM-based chatbots for mental health needs. LGBTQ+… ▽ More LGBTQ+ individuals are increasingly turning to chatbots powered by large language models (LLMs) to meet their mental health needs. However, little research has explored whether these chatbots can adequately and safely provide tailored support for this demographic. We interviewed 18 LGBTQ+ and 13 non-LGBTQ+ participants about their experiences with LLM-based chatbots for mental health needs. LGBTQ+ participants relied on these chatbots for mental health support, likely due to an absence of support in real life. Notably, while LLMs offer prompt support, they frequently fall short in gras** the nuances of LGBTQ-specific challenges. Although fine-tuning LLMs to address LGBTQ+ needs can be a step in the right direction, it isn't the panacea. The deeper issue is entrenched in societal discrimination. Consequently, we call on future researchers and designers to look beyond mere technical refinements and advocate for holistic strategies that confront and counteract the societal biases burdening the LGBTQ+ community. △ Less

Submitted 14 February, 2024; originally announced February 2024.

arXiv:2402.01439 [pdf, other]

From Words to Molecules: A Survey of Large Language Models in Chemistry

Authors: Chang Liao, Yemin Yu, Yu Mei, Ying Wei

Abstract: In recent years, Large Language Models (LLMs) have achieved significant success in natural language processing (NLP) and various interdisciplinary areas. However, applying LLMs to chemistry is a complex task that requires specialized domain knowledge. This paper provides a thorough exploration of the nuanced methodologies employed in integrating LLMs into the field of chemistry, delving into the c… ▽ More In recent years, Large Language Models (LLMs) have achieved significant success in natural language processing (NLP) and various interdisciplinary areas. However, applying LLMs to chemistry is a complex task that requires specialized domain knowledge. This paper provides a thorough exploration of the nuanced methodologies employed in integrating LLMs into the field of chemistry, delving into the complexities and innovations at this interdisciplinary juncture. Specifically, our analysis begins with examining how molecular information is fed into LLMs through various representation and tokenization methods. We then categorize chemical LLMs into three distinct groups based on the domain and modality of their input data, and discuss approaches for integrating these inputs for LLMs. Furthermore, this paper delves into the pretraining objectives with adaptations to chemical LLMs. After that, we explore the diverse applications of LLMs in chemistry, including novel paradigms for their application in chemistry tasks. Finally, we identify promising research directions, including further integration with chemical knowledge, advancements in continual learning, and improvements in model interpretability, paving the way for groundbreaking developments in the field. △ Less

Submitted 2 February, 2024; originally announced February 2024.

Comments: Submitted to IJCAI 2024 survey track

arXiv:2402.00404 [pdf, other]

Improving Critical Node Detection Using Neural Network-based Initialization in a Genetic Algorithm

Authors: Chanjuan Liu, Shike Ge, Zhihan Chen, Wenbin Pei, Enqiang Zhu, Yi Mei, Hisao Ishibuchi

Abstract: The Critical Node Problem (CNP) is concerned with identifying the critical nodes in a complex network. These nodes play a significant role in maintaining the connectivity of the network, and removing them can negatively impact network performance. CNP has been studied extensively due to its numerous real-world applications. Among the different versions of CNP, CNP-1a has gained the most popularity… ▽ More The Critical Node Problem (CNP) is concerned with identifying the critical nodes in a complex network. These nodes play a significant role in maintaining the connectivity of the network, and removing them can negatively impact network performance. CNP has been studied extensively due to its numerous real-world applications. Among the different versions of CNP, CNP-1a has gained the most popularity. The primary objective of CNP-1a is to minimize the pair-wise connectivity in the remaining network after deleting a limited number of nodes from a network. Due to the NP-hard nature of CNP-1a, many heuristic/metaheuristic algorithms have been proposed to solve this problem. However, most existing algorithms start with a random initialization, leading to a high cost of obtaining an optimal solution. To improve the efficiency of solving CNP-1a, a knowledge-guided genetic algorithm named K2GA has been proposed. Unlike the standard genetic algorithm framework, K2GA has two main components: a pretrained neural network to obtain prior knowledge on possible critical nodes, and a hybrid genetic algorithm with local search for finding an optimal set of critical nodes based on the knowledge given by the trained neural network. The local search process utilizes a cut node-based greedy strategy. The effectiveness of the proposed knowledgeguided genetic algorithm is verified by experiments on 26 realworld instances of complex networks. Experimental results show that K2GA outperforms the state-of-the-art algorithms regarding the best, median, and average objective values, and improves the best upper bounds on the best objective values for eight realworld instances. △ Less

Submitted 1 February, 2024; originally announced February 2024.

Comments: 14 pages, 13 figures

arXiv:2401.15279 [pdf, other]

FabHacks: Transform Everyday Objects into Functional Fixtures

Authors: Yuxuan Mei, Benjamin Jones, Dan Cascaval, Jennifer Mankoff, Etienne Vouga, Adriana Schulz

Abstract: Storage, organizing, and decorating are an important part of home design. While one can buy commercial items for many of these tasks, this can be costly, and re-use is more sustainable. An alternative is a "home hack", a functional assembly that can be constructed from existing household items. However, coming up with such hacks requires combining objects to make a physically valid design, which m… ▽ More Storage, organizing, and decorating are an important part of home design. While one can buy commercial items for many of these tasks, this can be costly, and re-use is more sustainable. An alternative is a "home hack", a functional assembly that can be constructed from existing household items. However, coming up with such hacks requires combining objects to make a physically valid design, which might be difficult to test if they are large, require nailing or screwing something to the wall, or the designer has mobility limitations. In this work, we present a design and visualization system for creating workable functional assemblies, FabHacks, which is based on a solver-aided domain-specific language (S-DSL) FabHaL. By analyzing existing home hacks shared online, we create a design abstraction for connecting household items using predefined types of connections. We provide a UI for FabHaL that can be used to design assemblies that fulfill a given specification. Our system leverages a physics-based solver that takes an assembly design and finds its expected physical configuration. Our validation includes a user study showing that users can create assemblies successfully using our UI and explore a range of designs. △ Less

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.14544 [pdf, other]

Bayesian Optimization through Gaussian Cox Process Models for Spatio-temporal Data

Authors: Yongsheng Mei, Mahdi Imani, Tian Lan

Abstract: Bayesian optimization (BO) has established itself as a leading strategy for efficiently optimizing expensive-to-evaluate functions. Existing BO methods mostly rely on Gaussian process (GP) surrogate models and are not applicable to (doubly-stochastic) Gaussian Cox processes, where the observation process is modulated by a latent intensity function modeled as a GP. In this paper, we propose a novel… ▽ More Bayesian optimization (BO) has established itself as a leading strategy for efficiently optimizing expensive-to-evaluate functions. Existing BO methods mostly rely on Gaussian process (GP) surrogate models and are not applicable to (doubly-stochastic) Gaussian Cox processes, where the observation process is modulated by a latent intensity function modeled as a GP. In this paper, we propose a novel maximum a posteriori inference of Gaussian Cox processes. It leverages the Laplace approximation and change of kernel technique to transform the problem into a new reproducing kernel Hilbert space, where it becomes more tractable computationally. It enables us to obtain both a functional posterior of the latent intensity function and the covariance of the posterior, thus extending existing works that often focus on specific link functions or estimating the posterior mean. Using the result, we propose a BO framework based on the Gaussian Cox process model and further develop a Nyström approximation for efficient computation. Extensive evaluations on various synthetic and real-world datasets demonstrate significant improvement over state-of-the-art inference solutions for Gaussian Cox processes, as well as effective BO with a wide range of acquisition functions designed through the underlying Gaussian Cox process model. △ Less

Submitted 25 January, 2024; originally announced January 2024.

Comments: 2024 International Conference on Learning Representations (ICLR)

arXiv:2401.06979 [pdf, ps, other]

Distance-aware Attention Resha**: Enhance Generalization of Neural Solver for Large-scale Vehicle Routing Problems

Authors: Yang Wang, Ya-Hui Jia, Wei-Neng Chen, Yi Mei

Abstract: Neural solvers based on attention mechanism have demonstrated remarkable effectiveness in solving vehicle routing problems. However, in the generalization process from small scale to large scale, we find a phenomenon of the dispersion of attention scores in existing neural solvers, which leads to poor performance. To address this issue, this paper proposes a distance-aware attention resha** meth… ▽ More Neural solvers based on attention mechanism have demonstrated remarkable effectiveness in solving vehicle routing problems. However, in the generalization process from small scale to large scale, we find a phenomenon of the dispersion of attention scores in existing neural solvers, which leads to poor performance. To address this issue, this paper proposes a distance-aware attention resha** method, assisting neural solvers in solving large-scale vehicle routing problems. Specifically, without the need for additional training, we utilize the Euclidean distance information between current nodes to adjust attention scores. This enables a neural solver trained on small-scale instances to make rational choices when solving a large-scale problem. Experimental results show that the proposed method significantly outperforms existing state-of-the-art neural solvers on the large-scale CVRPLib dataset. △ Less

Submitted 13 January, 2024; originally announced January 2024.

arXiv:2401.06377 [pdf, other]

Design and Nonlinear Modeling of a Modular Cable Driven Soft Robotic Arm

Authors: Xinda Qi, Yu Mei, Dong Chen, Zhaojian Li, Xiaobo Tan

Abstract: We propose a novel multi-section cable-driven soft robotic arm inspired by octopus tentacles along with a new modeling approach. Each section of the modular manipulator is made of a soft tubing backbone, a soft silicon arm body, and two rigid endcaps, which connect adjacent sections and decouple the actuation cables of different sections. The soft robotic arm is made with casting after the rigid e… ▽ More We propose a novel multi-section cable-driven soft robotic arm inspired by octopus tentacles along with a new modeling approach. Each section of the modular manipulator is made of a soft tubing backbone, a soft silicon arm body, and two rigid endcaps, which connect adjacent sections and decouple the actuation cables of different sections. The soft robotic arm is made with casting after the rigid endcaps are 3D-printed, achieving low-cost and convenient fabrication. To capture the nonlinear effect of cables pushing into the soft silicon arm body, which results from the absence of intermediate rigid cable guides for higher compliance, an analytical static model is developed to capture the relationship between the bending curvature and the cable lengths. The proposed model shows superior prediction performance in experiments over that of a baseline model, especially under large bending conditions. Based on the nonlinear static model, a kinematic model of a multi-section arm is further developed and used to derive a motion planning algorithm. Experiments show that the proposed soft arm has high flexibility and a large workspace, and the tracking errors under the algorithm based on the proposed modeling approach are up to 52$\%$ smaller than those with the algorithm derived from the baseline model. The presented modeling approach is expected to be applicable to a broad range of soft cable-driven actuators and manipulators. △ Less

Submitted 15 May, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

Comments: The paper has been accepted by IEEE Transactions on Mechatronics

arXiv:2401.00283 [pdf, other]

Near-Space Communications: the Last Piece of 6G Space-Air-Ground-Sea Integrated Network Puzzle

Authors: Hongshan Liu, Tong Qin, Zhen Gao, Tianqi Mao, Keke Ying, Ziwei Wan, Li Qiao, Rui Na, Zhongxiang Li, Chun Hu, Yikun Mei, Tuan Li, Guanghui Wen, Lei Chen, Zhonghuai Wu, Ruiqi Liu, Gaojie Chen, Shuo Wang, Dezhi Zheng

Abstract: This article presents a comprehensive study on the emerging near-space communications (NS-COM) within the context of space-air-ground-sea integrated network (SAGSIN). Specifically, we firstly explore the recent technical developments of NS-COM, followed by the discussions about motivations behind integrating NS-COM into SAGSIN. To further demonstrate the necessity of NS-COM, a comparative analysis… ▽ More This article presents a comprehensive study on the emerging near-space communications (NS-COM) within the context of space-air-ground-sea integrated network (SAGSIN). Specifically, we firstly explore the recent technical developments of NS-COM, followed by the discussions about motivations behind integrating NS-COM into SAGSIN. To further demonstrate the necessity of NS-COM, a comparative analysis between the NS-COM network and other counterparts in SAGSIN is conducted, covering aspects of deployment, coverage, channel characteristics and unique problems of NS-COM network. Afterwards, the technical aspects of NS-COM, including channel modeling, random access, channel estimation, array-based beam management and joint network optimization, are examined in detail. Furthermore, we explore the potential applications of NS-COM, such as structural expansion in SAGSIN communication, civil aviation communication, remote and urgent communication, weather monitoring and carbon neutrality. Finally, some promising research avenues are identified, including stratospheric satellite (StratoSat) -to-ground direct links for mobile terminals, reconfigurable multiple-input multiple-output (MIMO) and holographic MIMO, federated learning in NS-COM networks, maritime communication, electromagnetic spectrum sensing and adversarial game, integrated sensing and communications, StratoSat-based radar detection and imaging, NS-COM assisted enhanced global navigation system, NS-COM assisted intelligent unmanned system and free space optical (FSO) communication. Overall, this paper highlights that the NS-COM plays an indispensable role in the SAGSIN puzzle, providing substantial performance and coverage enhancement to the traditional SAGSIN architecture. △ Less

Submitted 4 March, 2024; v1 submitted 30 December, 2023; originally announced January 2024.

Comments: 28 pages, 8 figures, 2 tables

arXiv:2312.07760 [pdf, other]

doi 10.1007/978-981-99-8391-9_33

XC-NAS: A New Cellular Encoding Approach for Neural Architecture Search of Multi-path Convolutional Neural Networks

Authors: Trevor Londt, Xiaoying Gao, Peter Andreae, Yi Mei

Abstract: Convolutional Neural Networks (CNNs) continue to achieve great success in classification tasks as innovative techniques and complex multi-path architecture topologies are introduced. Neural Architecture Search (NAS) aims to automate the design of these complex architectures, reducing the need for costly manual design work by human experts. Cellular Encoding (CE) is an evolutionary computation tech… ▽ More Convolutional Neural Networks (CNNs) continue to achieve great success in classification tasks as innovative techniques and complex multi-path architecture topologies are introduced. Neural Architecture Search (NAS) aims to automate the design of these complex architectures, reducing the need for costly manual design work by human experts. Cellular Encoding (CE) is an evolutionary computation technique which excels in constructing novel multi-path topologies of varying complexity and has recently been applied with NAS to evolve CNN architectures for various classification tasks. However, existing CE approaches have severe limitations. They are restricted to only one domain, only partially implement the theme of CE, or only focus on the micro-architecture search space. This paper introduces a new CE representation and algorithm capable of evolving novel multi-path CNN architectures of varying depth, width, and complexity for image and text classification tasks. The algorithm explicitly focuses on the macro-architecture search space. Furthermore, by using a surrogate model approach, we show that the algorithm can evolve a performant CNN architecture in less than one GPU day, thereby allowing a sufficient number of experiment runs to be conducted to achieve scientific robustness. Experiment results show that the approach is highly competitive, defeating several state-of-the-art methods, and is generalisable to both the image and text domains. △ Less

Submitted 12 December, 2023; originally announced December 2023.

Comments: Australasian Joint Conference on Artificial Intelligence 2023

arXiv:2312.07696 [pdf, ps, other]

Real-time Network Intrusion Detection via Decision Transformers

Authors: **gdi Chen, Hanhan Zhou, Yongsheng Mei, Gina Adam, Nathaniel D. Bastian, Tian Lan

Abstract: Many cybersecurity problems that require real-time decision-making based on temporal observations can be abstracted as a sequence modeling problem, e.g., network intrusion detection from a sequence of arriving packets. Existing approaches like reinforcement learning may not be suitable for such cybersecurity decision problems, since the Markovian property may not necessarily hold and the underlyin… ▽ More Many cybersecurity problems that require real-time decision-making based on temporal observations can be abstracted as a sequence modeling problem, e.g., network intrusion detection from a sequence of arriving packets. Existing approaches like reinforcement learning may not be suitable for such cybersecurity decision problems, since the Markovian property may not necessarily hold and the underlying network states are often not observable. In this paper, we cast the problem of real-time network intrusion detection as casual sequence modeling and draw upon the power of the transformer architecture for real-time decision-making. By conditioning a causal decision transformer on past trajectories, consisting of the rewards, network packets, and detection decisions, our proposed framework will generate future detection decisions to achieve the desired return. It enables decision transformers to be applied to real-time network intrusion detection, as well as a novel tradeoff between the accuracy and timeliness of detection. The proposed solution is evaluated on public network intrusion detection datasets and outperforms several baseline algorithms using reinforcement learning and sequence modeling, in terms of detection accuracy and timeliness. △ Less

Submitted 16 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.04948 [pdf, other]

Scientific Preparation for CSST: Classification of Galaxy and Nebula/Star Cluster Based on Deep Learning

Authors: Yuquan Zhang, Zhong Cao, Feng Wang, Lam, Man I, Hui Deng, Ying Mei, Lei Tan

Abstract: The Chinese Space Station Telescope (abbreviated as CSST) is a future advanced space telescope. Real-time identification of galaxy and nebula/star cluster (abbreviated as NSC) images is of great value during CSST survey. While recent research on celestial object recognition has progressed, the rapid and efficient identification of high-resolution local celestial images remains challenging. In this… ▽ More The Chinese Space Station Telescope (abbreviated as CSST) is a future advanced space telescope. Real-time identification of galaxy and nebula/star cluster (abbreviated as NSC) images is of great value during CSST survey. While recent research on celestial object recognition has progressed, the rapid and efficient identification of high-resolution local celestial images remains challenging. In this study, we conducted galaxy and NSC image classification research using deep learning methods based on data from the Hubble Space Telescope. We built a Local Celestial Image Dataset and designed a deep learning model named HR-CelestialNet for classifying images of the galaxy and NSC. HR-CelestialNet achieved an accuracy of 89.09% on the testing set, outperforming models such as AlexNet, VGGNet and ResNet, while demonstrating faster recognition speeds. Furthermore, we investigated the factors influencing CSST image quality and evaluated the generalization ability of HR-CelestialNet on the blurry image dataset, demonstrating its robustness to low image quality. The proposed method can enable real-time identification of celestial images during CSST survey mission. △ Less

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2312.01728 [pdf, other]

ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation

Authors: Tong Nie, Guoyang Qin, Wei Ma, Yuewen Mei, Jian Sun

Abstract: Missing data is a pervasive issue in both scientific and engineering tasks, especially for the modeling of spatiotemporal data. This problem attracts many studies to contribute to data-driven solutions. Existing imputation solutions mainly include low-rank models and deep learning models. The former assumes general structural priors but has limited model capacity. The latter possesses salient feat… ▽ More Missing data is a pervasive issue in both scientific and engineering tasks, especially for the modeling of spatiotemporal data. This problem attracts many studies to contribute to data-driven solutions. Existing imputation solutions mainly include low-rank models and deep learning models. The former assumes general structural priors but has limited model capacity. The latter possesses salient features of expressivity but lacks prior knowledge of the underlying spatiotemporal structures. Leveraging the strengths of both two paradigms, we demonstrate a low rankness-induced Transformer to achieve a balance between strong inductive bias and high model expressivity. The exploitation of the inherent structures of spatiotemporal data enables our model to learn balanced signal-noise representations, making it generalizable for a variety of imputation problems. We demonstrate its superiority in terms of accuracy, efficiency, and versatility in heterogeneous datasets, including traffic flow, solar energy, smart meters, and air quality. Promising empirical results provide strong conviction that incorporating time series primitives, such as low-rankness, can substantially facilitate the development of a generalizable model to approach a wide range of spatiotemporal imputation problems. △ Less

Submitted 28 May, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

Comments: Accepted by KDD'24 (Research Track)

arXiv:2311.15920 [pdf, other]

A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning

Authors: Jianxiong Li, Shichao Lin, Tianyu Shi, Chujie Tian, Yu Mei, Jian Song, Xianyuan Zhan, Ruimin Li

Abstract: The optimization of traffic signal control (TSC) is critical for an efficient transportation system. In recent years, reinforcement learning (RL) techniques have emerged as a popular approach for TSC and show promising results for highly adaptive control. However, existing RL-based methods suffer from notably poor real-world applicability and hardly have any successful deployments. The reasons for… ▽ More The optimization of traffic signal control (TSC) is critical for an efficient transportation system. In recent years, reinforcement learning (RL) techniques have emerged as a popular approach for TSC and show promising results for highly adaptive control. However, existing RL-based methods suffer from notably poor real-world applicability and hardly have any successful deployments. The reasons for such failures are mostly due to the reliance on over-idealized traffic simulators for policy optimization, as well as using unrealistic fine-grained state observations and reward signals that are not directly obtainable from real-world sensors. In this paper, we propose a fully Data-Driven and simulator-free framework for realistic Traffic Signal Control (D2TSC). Specifically, we combine well-established traffic flow theory with machine learning to construct a reward inference model to infer the reward signals from coarse-grained traffic data. With the inferred rewards, we further propose a sample-efficient offline RL method to enable direct signal control policy learning from historical offline datasets of real-world intersections. To evaluate our approach, we collect historical traffic data from a real-world intersection, and develop a highly customized simulation environment that strictly follows real data characteristics. We demonstrate through extensive experiments that our approach achieves superior performance over conventional and offline RL baselines, and also enjoys much better real-world applicability. △ Less

Submitted 27 November, 2023; originally announced November 2023.

Comments: 15 pages, 6 figures

arXiv:2311.09611 [pdf, other]

DeltaLCA: Comparative Life-Cycle Assessment for Electronics Design

Authors: Zhihan Zhang, Felix Hähnlein, Yuxuan Mei, Zachary Englhardt, Shwetak Patel, Adriana Schulz, Vikram Iyer

Abstract: Reducing the environmental footprint of electronics and computing devices requires new tools that empower designers to make informed decisions about sustainability during the design process itself. This is not possible with current tools for life cycle assessment (LCA) which require substantial domain expertise and time to evaluate the numerous chips and other components that make up a device. We… ▽ More Reducing the environmental footprint of electronics and computing devices requires new tools that empower designers to make informed decisions about sustainability during the design process itself. This is not possible with current tools for life cycle assessment (LCA) which require substantial domain expertise and time to evaluate the numerous chips and other components that make up a device. We observe first that informed decision-making does not require absolute metrics and can instead be done by comparing designs. Second, we can use domain-specific heuristics to perform these comparisons. We combine these insights to develop DeltaLCA, an open-source interactive design tool that addresses the dual challenges of automating life cycle inventory generation and data availability by performing comparative analyses of electronics designs. Users can upload standard design files from Electronic Design Automation (EDA) software and the tool will guide them through determining which one has greater carbon footprint. DeltaLCA leverages electronics-specific LCA datasets and heuristics and tries to automatically rank the two designs, prompting users to provide additional information only when necessary. We show through case studies DeltaLCA achieves the same result as evaluating full LCAs, and that it accelerates LCA comparisons from eight expert-hours to a single click for devices with ~30 components, and 15 minutes for more complex devices with ~100 components. △ Less

Submitted 16 November, 2023; originally announced November 2023.

arXiv:2311.06770 [pdf, other]

Compressive Sensing-Based Grant-Free Massive Access for 6G Massive Communication

Authors: Zhen Gao, Malong Ke, Yikun Mei, Li Qiao, Sheng Chen, Derrick Wing Kwan Ng, H. Vincent Poor

Abstract: The advent of the sixth-generation (6G) of wireless communications has given rise to the necessity to connect vast quantities of heterogeneous wireless devices, which requires advanced system capabilities far beyond existing network architectures. In particular, such massive communication has been recognized as a prime driver that can empower the 6G vision of future ubiquitous connectivity, suppor… ▽ More The advent of the sixth-generation (6G) of wireless communications has given rise to the necessity to connect vast quantities of heterogeneous wireless devices, which requires advanced system capabilities far beyond existing network architectures. In particular, such massive communication has been recognized as a prime driver that can empower the 6G vision of future ubiquitous connectivity, supporting Internet of Human-Machine-Things for which massive access is critical. This paper surveys the most recent advances toward massive access in both academic and industry communities, focusing primarily on the promising compressive sensing-based grant-free massive access paradigm. We first specify the limitations of existing random access schemes and reveal that the practical implementation of massive communication relies on a dramatically different random access paradigm from the current ones mainly designed for human-centric communications. Then, a compressive sensing-based grant-free massive access roadmap is presented, where the evolutions from single-antenna to large-scale antenna array-based base stations, from single-station to cooperative massive multiple-input multiple-output systems, and from unsourced to sourced random access scenarios are detailed. Finally, we discuss the key challenges and open issues to shed light on the potential future research directions of grant-free massive access. △ Less

Submitted 12 November, 2023; originally announced November 2023.

Comments: Accepted by IEEE IoT Journal

arXiv:2310.16310 [pdf, other]

Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process with Uncertainty Quantification

Authors: Zichong Li, Qunzhi Xu, Zhenghao Xu, Yajun Mei, Tuo Zhao, Hongyuan Zha

Abstract: Spatio-temporal point processes (STPPs) are potent mathematical tools for modeling and predicting events with both temporal and spatial features. Despite their versatility, most existing methods for learning STPPs either assume a restricted form of the spatio-temporal distribution, or suffer from inaccurate approximations of the intractable integral in the likelihood training objective. These issu… ▽ More Spatio-temporal point processes (STPPs) are potent mathematical tools for modeling and predicting events with both temporal and spatial features. Despite their versatility, most existing methods for learning STPPs either assume a restricted form of the spatio-temporal distribution, or suffer from inaccurate approximations of the intractable integral in the likelihood training objective. These issues typically arise from the normalization term of the probability density function. Moreover, current techniques fail to provide uncertainty quantification for model predictions, such as confidence intervals for the predicted event's arrival time and confidence regions for the event's location, which is crucial given the considerable randomness of the data. To tackle these challenges, we introduce SMASH: a Score MAtching-based pSeudolikeliHood estimator for learning marked STPPs with uncertainty quantification. Specifically, our framework adopts a normalization-free objective by estimating the pseudolikelihood of marked STPPs through score-matching and offers uncertainty quantification for the predicted event time, location and mark by computing confidence regions over the generated samples. The superior performance of our proposed framework is demonstrated through extensive experiments in both event prediction and uncertainty quantification. △ Less

Submitted 24 October, 2023; originally announced October 2023.

arXiv:2308.16818 [pdf, other]

Irregular Traffic Time Series Forecasting Based on Asynchronous Spatio-Temporal Graph Convolutional Network

Authors: Weijia Zhang, Le Zhang, **dong Han, Hao Liu, **gbo Zhou, Yu Mei, Hui Xiong

Abstract: Accurate traffic forecasting at intersections governed by intelligent traffic signals is critical for the advancement of an effective intelligent traffic signal control system. However, due to the irregular traffic time series produced by intelligent intersections, the traffic forecasting task becomes much more intractable and imposes three major new challenges: 1) asynchronous spatial dependency,… ▽ More Accurate traffic forecasting at intersections governed by intelligent traffic signals is critical for the advancement of an effective intelligent traffic signal control system. However, due to the irregular traffic time series produced by intelligent intersections, the traffic forecasting task becomes much more intractable and imposes three major new challenges: 1) asynchronous spatial dependency, 2) irregular temporal dependency among traffic data, and 3) variable-length sequence to be predicted, which severely impede the performance of current traffic forecasting methods. To this end, we propose an Asynchronous Spatio-tEmporal graph convolutional nEtwoRk (ASeer) to predict the traffic states of the lanes entering intelligent intersections in a future time window. Specifically, by linking lanes via a traffic diffusion graph, we first propose an Asynchronous Graph Diffusion Network to model the asynchronous spatial dependency between the time-misaligned traffic state measurements of lanes. After that, to capture the temporal dependency within irregular traffic state sequence, a learnable personalized time encoding is devised to embed the continuous time for each lane. Then we propose a Transformable Time-aware Convolution Network that learns meta-filters to derive time-aware convolution filters with transformable filter sizes for efficient temporal convolution on the irregular sequence. Furthermore, a Semi-Autoregressive Prediction Network consisting of a state evolution unit and a semiautoregressive predictor is designed to effectively and efficiently predict variable-length traffic state sequences. Extensive experiments on two real-world datasets demonstrate the effectiveness of ASeer in six metrics. △ Less

Submitted 1 September, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

arXiv:2307.15810 [pdf]

Understanding the Benefits and Challenges of Using Large Language Model-based Conversational Agents for Mental Well-being Support

Authors: Zilin Ma, Yiyang Mei, Zhaoyuan Su

Abstract: Conversational agents powered by large language models (LLM) have increasingly been utilized in the realm of mental well-being support. However, the implications and outcomes associated with their usage in such a critical field remain somewhat ambiguous and unexplored. We conducted a qualitative analysis of 120 posts, encompassing 2917 user comments, drawn from the most popular subreddit focused o… ▽ More Conversational agents powered by large language models (LLM) have increasingly been utilized in the realm of mental well-being support. However, the implications and outcomes associated with their usage in such a critical field remain somewhat ambiguous and unexplored. We conducted a qualitative analysis of 120 posts, encompassing 2917 user comments, drawn from the most popular subreddit focused on mental health support applications powered by large language models (u/Replika). This exploration aimed to shed light on the advantages and potential pitfalls associated with the integration of these sophisticated models in conversational agents intended for mental health support. We found the app (Replika) beneficial in offering on-demand, non-judgmental support, boosting user confidence, and aiding self-discovery. Yet, it faced challenges in filtering harmful content, sustaining consistent communication, remembering new information, and mitigating users' overdependence. The stigma attached further risked isolating users socially. We strongly assert that future researchers and designers must thoroughly evaluate the appropriateness of employing LLMs for mental well-being support, ensuring their responsible and effective application. △ Less

Submitted 28 July, 2023; originally announced July 2023.

arXiv:2307.10120 [pdf, other]

Quarl: A Learning-Based Quantum Circuit Optimizer

Authors: Zikun Li, **jun Peng, Yixuan Mei, Sina Lin, Yi Wu, Oded Padon, Zhihao Jia

Abstract: Optimizing quantum circuits is challenging due to the very large search space of functionally equivalent circuits and the necessity of applying transformations that temporarily decrease performance to achieve a final performance improvement. This paper presents Quarl, a learning-based quantum circuit optimizer. Applying reinforcement learning (RL) to quantum circuit optimization raises two main ch… ▽ More Optimizing quantum circuits is challenging due to the very large search space of functionally equivalent circuits and the necessity of applying transformations that temporarily decrease performance to achieve a final performance improvement. This paper presents Quarl, a learning-based quantum circuit optimizer. Applying reinforcement learning (RL) to quantum circuit optimization raises two main challenges: the large and varying action space and the non-uniform state representation. Quarl addresses these issues with a novel neural architecture and RL-training procedure. Our neural architecture decomposes the action space into two parts and leverages graph neural networks in its state representation, both of which are guided by the intuition that optimization decisions can be mostly guided by local reasoning while allowing global circuit-wide reasoning. Our evaluation shows that Quarl significantly outperforms existing circuit optimizers on almost all benchmark circuits. Surprisingly, Quarl can learn to perform rotation merging, a complex, non-local circuit optimization implemented as a separate pass in existing optimizers. △ Less

Submitted 17 July, 2023; originally announced July 2023.

arXiv:2307.01482 [pdf, other]

Contextualizing MLP-Mixers Spatiotemporally for Urban Data Forecast at Scale

Authors: Tong Nie, Guoyang Qin, Lijun Sun, Wei Ma, Yu Mei, Jian Sun

Abstract: Spatiotemporal urban data (STUD) displays complex correlational patterns. Extensive advanced techniques have been designed to capture these patterns for effective forecasting. However, because STUD is often massive in scale, practitioners need to strike a balance between effectiveness and efficiency by choosing computationally efficient models. An alternative paradigm called MLP-Mixer has the pote… ▽ More Spatiotemporal urban data (STUD) displays complex correlational patterns. Extensive advanced techniques have been designed to capture these patterns for effective forecasting. However, because STUD is often massive in scale, practitioners need to strike a balance between effectiveness and efficiency by choosing computationally efficient models. An alternative paradigm called MLP-Mixer has the potential for both simplicity and effectiveness. Taking inspiration from its success in other domains, we propose an adapted version, named NexuSQN, for STUD forecast at scale. We identify the challenges faced when directly applying MLP-Mixers as series- and window-wise multivaluedness and propose the ST-contextualization to distinguish between spatial and temporal patterns. Experimental results surprisingly demonstrate that MLP-Mixers with ST-contextualization can rival SOTA performance when tested on several urban benchmarks. Furthermore, it was deployed in a collaborative urban congestion project with Baidu, specifically evaluating its ability to forecast traffic states in megacities like Bei**g and Shanghai. Our findings contribute to the exploration of simple yet effective models for real-world STUD forecasting. △ Less

Submitted 7 February, 2024; v1 submitted 4 July, 2023; originally announced July 2023.

arXiv:2306.15334 [pdf, other]

Understanding Client Reactions in Online Mental Health Counseling

Authors: Anqi Li, Lizhi Ma, Yaling Mei, Hongliang He, Shuai Zhang, Huachuan Qiu, Zhenzhong Lan

Abstract: Communication success relies heavily on reading participants' reactions. Such feedback is especially important for mental health counselors, who must carefully consider the client's progress and adjust their approach accordingly. However, previous NLP research on counseling has mainly focused on studying counselors' intervention strategies rather than their clients' reactions to the intervention.… ▽ More Communication success relies heavily on reading participants' reactions. Such feedback is especially important for mental health counselors, who must carefully consider the client's progress and adjust their approach accordingly. However, previous NLP research on counseling has mainly focused on studying counselors' intervention strategies rather than their clients' reactions to the intervention. This work aims to fill this gap by develo** a theoretically grounded annotation framework that encompasses counselors' strategies and client reaction behaviors. The framework has been tested against a large-scale, high-quality text-based counseling dataset we collected over the past two years from an online welfare counseling platform. Our study shows how clients react to counselors' strategies, how such reactions affect the final counseling outcomes, and how counselors can adjust their strategies in response to these reactions. We also demonstrate that this study can help counselors automatically predict their clients' states. △ Less

Submitted 27 June, 2023; originally announced June 2023.

Comments: Accept to ACL 2023, oral. For code and data, see https://github.com/dll-wu/Client-React

arXiv:2306.00187 [pdf, other]

AccMER: Accelerating Multi-Agent Experience Replay with Cache Locality-aware Prioritization

Authors: Kailash Gogineni, Yongsheng Mei, Peng Wei, Tian Lan, Guru Venkataramani

Abstract: Multi-Agent Experience Replay (MER) is a key component of off-policy reinforcement learning~(RL) algorithms. By remembering and reusing experiences from the past, experience replay significantly improves the stability of RL algorithms and their learning efficiency. In many scenarios, multiple agents interact in a shared environment during online training under centralized training and decentralize… ▽ More Multi-Agent Experience Replay (MER) is a key component of off-policy reinforcement learning~(RL) algorithms. By remembering and reusing experiences from the past, experience replay significantly improves the stability of RL algorithms and their learning efficiency. In many scenarios, multiple agents interact in a shared environment during online training under centralized training and decentralized execution~(CTDE) paradigm. Current multi-agent reinforcement learning~(MARL) algorithms consider experience replay with uniform sampling or based on priority weights to improve transition data sample efficiency in the sampling phase. However, moving transition data histories for each agent through the processor memory hierarchy is a performance limiter. Also, as the agents' transitions continuously renew every iteration, the finite cache capacity results in increased cache misses. To this end, we propose \name, that repeatedly reuses the transitions~(experiences) for a window of $n$ steps in order to improve the cache locality and minimize the transition data movement, instead of sampling new transitions at each step. Specifically, our optimization uses priority weights to select the transitions so that only high-priority transitions will be reused frequently, thereby improving the cache performance. Our experimental results on the Predator-Prey environment demonstrate the effectiveness of reusing the essential transitions based on the priority weights, where we observe an end-to-end training time reduction of $25.4\%$~(for $32$ agents) compared to existing prioritized MER algorithms without notable degradation in the mean reward. △ Less

Submitted 31 May, 2023; originally announced June 2023.

Comments: Accepted to ASAP'23

arXiv:2305.02805 [pdf, other]

Local Optima Correlation Assisted Adaptive Operator Selection

Authors: Jiyuan Pei, Hao Tong, Jialin Liu, Yi Mei, Xin Yao

Abstract: For solving combinatorial optimisation problems with metaheuristics, different search operators are applied for sampling new solutions in the neighbourhood of a given solution. It is important to understand the relationship between operators for various purposes, e.g., adaptively deciding when to use which operator to find optimal solutions efficiently. However, it is difficult to theoretically an… ▽ More For solving combinatorial optimisation problems with metaheuristics, different search operators are applied for sampling new solutions in the neighbourhood of a given solution. It is important to understand the relationship between operators for various purposes, e.g., adaptively deciding when to use which operator to find optimal solutions efficiently. However, it is difficult to theoretically analyse this relationship, especially in the complex solution space of combinatorial optimisation problems. In this paper, we propose to empirically analyse the relationship between operators in terms of the correlation between their local optima and develop a measure for quantifying their relationship. The comprehensive analyses on a wide range of capacitated vehicle routing problem benchmark instances show that there is a consistent pattern in the correlation between commonly used operators. Based on this newly proposed local optima correlation metric, we propose a novel approach for adaptively selecting among the operators during the search process. The core intention is to improve search efficiency by preventing wasting computational resources on exploring neighbourhoods where the local optima have already been reached. Experiments on randomly generated instances and commonly used benchmark datasets are conducted. Results show that the proposed approach outperforms commonly used adaptive operator selection methods. △ Less

Submitted 3 May, 2023; originally announced May 2023.

arXiv:2303.12950 [pdf, other]

LightPainter: Interactive Portrait Relighting with Freehand Scribble

Authors: Yiqun Mei, He Zhang, Xuaner Zhang, Jianming Zhang, Zhixin Shu, Yilin Wang, Zijun Wei, Shi Yan, HyunJoon Jung, Vishal M. Patel

Abstract: Recent portrait relighting methods have achieved realistic results of portrait lighting effects given a desired lighting representation such as an environment map. However, these methods are not intuitive for user interaction and lack precise lighting control. We introduce LightPainter, a scribble-based relighting system that allows users to interactively manipulate portrait lighting effect with e… ▽ More Recent portrait relighting methods have achieved realistic results of portrait lighting effects given a desired lighting representation such as an environment map. However, these methods are not intuitive for user interaction and lack precise lighting control. We introduce LightPainter, a scribble-based relighting system that allows users to interactively manipulate portrait lighting effect with ease. This is achieved by two conditional neural networks, a delighting module that recovers geometry and albedo optionally conditioned on skin tone, and a scribble-based module for relighting. To train the relighting module, we propose a novel scribble simulation procedure to mimic real user scribbles, which allows our pipeline to be trained without any human annotations. We demonstrate high-quality and flexible portrait lighting editing capability with both quantitative and qualitative experiments. User study comparisons with commercial lighting editing tools also demonstrate consistent user preference for our method. △ Less

Submitted 22 March, 2023; originally announced March 2023.

Comments: CVPR2023

arXiv:2302.10418 [pdf, other]

MAC-PO: Multi-Agent Experience Replay via Collective Priority Optimization

Authors: Yongsheng Mei, Hanhan Zhou, Tian Lan, Guru Venkataramani, Peng Wei

Abstract: Experience replay is crucial for off-policy reinforcement learning (RL) methods. By remembering and reusing the experiences from past different policies, experience replay significantly improves the training efficiency and stability of RL algorithms. Many decision-making problems in practice naturally involve multiple agents and require multi-agent reinforcement learning (MARL) under centralized t… ▽ More Experience replay is crucial for off-policy reinforcement learning (RL) methods. By remembering and reusing the experiences from past different policies, experience replay significantly improves the training efficiency and stability of RL algorithms. Many decision-making problems in practice naturally involve multiple agents and require multi-agent reinforcement learning (MARL) under centralized training decentralized execution paradigm. Nevertheless, existing MARL algorithms often adopt standard experience replay where the transitions are uniformly sampled regardless of their importance. Finding prioritized sampling weights that are optimized for MARL experience replay has yet to be explored. To this end, we propose MAC-PO, which formulates optimal prioritized experience replay for multi-agent problems as a regret minimization over the sampling weights of transitions. Such optimization is relaxed and solved using the Lagrangian multiplier approach to obtain the close-form optimal sampling weights. By minimizing the resulting policy regret, we can narrow the gap between the current policy and a nominal optimal policy, thus acquiring an improved prioritization scheme for multi-agent tasks. Our experimental results on Predator-Prey and StarCraft Multi-Agent Challenge environments demonstrate the effectiveness of our method, having a better ability to replay important transitions and outperforming other state-of-the-art baselines. △ Less

Submitted 27 February, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: The 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023). arXiv admin note: text overlap with arXiv:2302.05593

arXiv:2302.05593 [pdf, other]

ReMIX: Regret Minimization for Monotonic Value Function Factorization in Multiagent Reinforcement Learning

Authors: Yongsheng Mei, Hanhan Zhou, Tian Lan

Abstract: Value function factorization methods have become a dominant approach for cooperative multiagent reinforcement learning under a centralized training and decentralized execution paradigm. By factorizing the optimal joint action-value function using a monotonic mixing function of agents' utilities, these algorithms ensure the consistency between joint and local action selections for decentralized dec… ▽ More Value function factorization methods have become a dominant approach for cooperative multiagent reinforcement learning under a centralized training and decentralized execution paradigm. By factorizing the optimal joint action-value function using a monotonic mixing function of agents' utilities, these algorithms ensure the consistency between joint and local action selections for decentralized decision-making. Nevertheless, the use of monotonic mixing functions also induces representational limitations. Finding the optimal projection of an unrestricted mixing function onto monotonic function classes is still an open problem. To this end, we propose ReMIX, formulating this optimal projection problem for value function factorization as a regret minimization over the projection weights of different state-action values. Such an optimization problem can be relaxed and solved using the Lagrangian multiplier method to obtain the close-form optimal projection weights. By minimizing the resulting policy regret, we can narrow the gap between the optimal and the restricted monotonic mixing functions, thus obtaining an improved monotonic value function factorization. Our experimental results on Predator-Prey and StarCraft Multiagent Challenge environments demonstrate the effectiveness of our method, indicating the better capabilities of handling environments with non-monotonic value functions. △ Less

Submitted 10 February, 2023; originally announced February 2023.

arXiv:2302.02521 [pdf, other]

Exploiting Partial Common Information Microstructure for Multi-Modal Brain Tumor Segmentation

Authors: Yongsheng Mei, Guru Venkataramani, Tian Lan

Abstract: Learning with multiple modalities is crucial for automated brain tumor segmentation from magnetic resonance imaging data. Explicitly optimizing the common information shared among all modalities (e.g., by maximizing the total correlation) has been shown to achieve better feature representations and thus enhance the segmentation performance. However, existing approaches are oblivious to partial com… ▽ More Learning with multiple modalities is crucial for automated brain tumor segmentation from magnetic resonance imaging data. Explicitly optimizing the common information shared among all modalities (e.g., by maximizing the total correlation) has been shown to achieve better feature representations and thus enhance the segmentation performance. However, existing approaches are oblivious to partial common information shared by subsets of the modalities. In this paper, we show that identifying such partial common information can significantly boost the discriminative power of image segmentation models. In particular, we introduce a novel concept of partial common information mask (PCI-mask) to provide a fine-grained characterization of what partial common information is shared by which subsets of the modalities. By solving a masked correlation maximization and simultaneously learning an optimal PCI-mask, we identify the latent microstructure of partial common information and leverage it in a self-attention module to selectively weight different feature representations in multi-modal data. We implement our proposed framework on the standard U-Net. Our experimental results on the Multi-modal Brain Tumor Segmentation Challenge (BraTS) datasets outperform those of state-of-the-art segmentation baselines, with validation Dice similarity coefficients of 0.920, 0.897, 0.837 for the whole tumor, tumor core, and enhancing tumor on BraTS-2020. △ Less

Submitted 14 July, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

Comments: 2023 ICML Workshop on Machine Learning for Multimodal Healthcare Data (ML4MHD)

arXiv:2301.01333 [pdf]

oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation

Authors: Jianhui Li, Zhennan Qin, Yijie Mei, **gze Cui, Yunfei Song, Ciyong Chen, Yifei Zhang, Longsheng Du, Xianhang Cheng, Baihui **, Yan Zhang, Jason Ye, Eric Lin, Dan Lavery

Abstract: With the rapid development of deep learning models and hardware support for dense computing, the deep learning workload characteristics changed significantly from a few hot spots on compute-intensive operations to a broad range of operations scattered across the models. Accelerating a few compute-intensive operations using the expert-tuned implementation of primitives does not fully exploit the pe… ▽ More With the rapid development of deep learning models and hardware support for dense computing, the deep learning workload characteristics changed significantly from a few hot spots on compute-intensive operations to a broad range of operations scattered across the models. Accelerating a few compute-intensive operations using the expert-tuned implementation of primitives does not fully exploit the performance potential of AI hardware. Various efforts have been made to compile a full deep neural network (DNN) graph. One of the biggest challenges is to achieve high-performance tensor compilation by generating expert level performance code for the dense compute-intensive operations and applying compilation optimization at the scope of DNN computation graph across multiple compute-intensive operations. We present oneDNN Graph Compiler, a tensor compiler that employs a hybrid approach of using techniques from both compiler optimization and expert-tuned kernels for high performance code generation of the deep neural network graph. oneDNN Graph Compiler addresses unique optimization challenges in the deep learning domain, such as low-precision computation, aggressive fusion of graph operations, optimization for static tensor shapes and memory layout, constant weight optimization, and memory buffer reuse. Experimental results demonstrate significant performance gains over existing tensor compiler and primitives library for performance-critical DNN computation graphs and end-to-end models on Intel Xeon Scalable Processors. △ Less

Submitted 11 March, 2024; v1 submitted 3 January, 2023; originally announced January 2023.

Comments: 10 pages excluding reference, 9 figures, 1 table

arXiv:2212.01259 [pdf, other]

Covariance Estimators for the ROOT-SGD Algorithm in Online Learning

Authors: Yiling Luo, Xiaoming Huo, Yajun Mei

Abstract: Online learning naturally arises in many statistical and machine learning problems. The most widely used methods in online learning are stochastic first-order algorithms. Among this family of algorithms, there is a recently developed algorithm, Recursive One-Over-T SGD (ROOT-SGD). ROOT-SGD is advantageous in that it converges at a non-asymptotically fast rate, and its estimator further converges t… ▽ More Online learning naturally arises in many statistical and machine learning problems. The most widely used methods in online learning are stochastic first-order algorithms. Among this family of algorithms, there is a recently developed algorithm, Recursive One-Over-T SGD (ROOT-SGD). ROOT-SGD is advantageous in that it converges at a non-asymptotically fast rate, and its estimator further converges to a normal distribution. However, this normal distribution has unknown asymptotic covariance; thus cannot be directly applied to measure the uncertainty. To fill this gap, we develop two estimators for the asymptotic covariance of ROOT-SGD. Our covariance estimators are useful for statistical inference in ROOT-SGD. Our first estimator adopts the idea of plug-in. For each unknown component in the formula of the asymptotic covariance, we substitute it with its empirical counterpart. The plug-in estimator converges at the rate $\mathcal{O}(1/\sqrt{t})$, where $t$ is the sample size. Despite its quick convergence, the plug-in estimator has the limitation that it relies on the Hessian of the loss function, which might be unavailable in some cases. Our second estimator is a Hessian-free estimator that overcomes the aforementioned limitation. The Hessian-free estimator uses the random-scaling technique, and we show that it is an asymptotically consistent estimator of the true covariance. △ Less

Submitted 2 December, 2022; originally announced December 2022.

arXiv:2211.17207 [pdf, other]

Canal: A Flexible Interconnect Generator for Coarse-Grained Reconfigurable Arrays

Authors: Jackson Melchert, Keyi Zhang, Yuchen Mei, Mark Horowitz, Christopher Torng, Priyanka Raina

Abstract: The architecture of a coarse-grained reconfigurable array (CGRA) interconnect has a significant effect on not only the flexibility of the resulting accelerator, but also its power, performance, and area. Design decisions that have complex trade-offs need to be explored to maintain efficiency and performance across a variety of evolving applications. This paper presents Canal, a Python-embedded dom… ▽ More The architecture of a coarse-grained reconfigurable array (CGRA) interconnect has a significant effect on not only the flexibility of the resulting accelerator, but also its power, performance, and area. Design decisions that have complex trade-offs need to be explored to maintain efficiency and performance across a variety of evolving applications. This paper presents Canal, a Python-embedded domain-specific language (eDSL) and compiler for specifying and generating reconfigurable interconnects for CGRAs. Canal uses a graph-based intermediate representation (IR) that allows for easy hardware generation and tight integration with place and route tools. We evaluate Canal by constructing both a fully static interconnect and a hybrid interconnect with ready-valid signaling, and by conducting design space exploration of the interconnect architecture by modifying the switch box topology, the number of routing tracks, and the interconnect tile connections. Through the use of a graph-based IR for CGRA interconnects, the eDSL, and the interconnect generation system, Canal enables fast design space exploration and creation of CGRA interconnects. △ Less

Submitted 30 November, 2022; originally announced November 2022.

Comments: Preprint version

arXiv:2211.13182 [pdf, other]

Cascade: An Application Pipelining Toolkit for Coarse-Grained Reconfigurable Arrays

Authors: Jackson Melchert, Yuchen Mei, Kalhan Koul, Qiaoyi Liu, Mark Horowitz, Priyanka Raina

Abstract: While coarse-grained reconfigurable arrays (CGRAs) have emerged as promising programmable accelerator architectures, pipelining applications running on CGRAs is required to ensure high maximum clock frequencies. Current CGRA compilers either lack pipelining techniques resulting in low performance or perform exhaustive pipelining resulting in high energy and resource consumption. We introduce Casca… ▽ More While coarse-grained reconfigurable arrays (CGRAs) have emerged as promising programmable accelerator architectures, pipelining applications running on CGRAs is required to ensure high maximum clock frequencies. Current CGRA compilers either lack pipelining techniques resulting in low performance or perform exhaustive pipelining resulting in high energy and resource consumption. We introduce Cascade, an application pipelining toolkit for CGRAs, including a CGRA application frequency model, automated pipelining techniques for CGRA application compilers that work with both dense and sparse applications, and hardware optimizations for improving application frequency. Cascade enables 7 - 34x lower critical path delays and 7 - 190x lower EDP across a variety of dense image processing and machine learning workloads, and 2 - 4.4x lower critical path delays and 1.5 - 4.2x lower EDP on sparse workloads, compared to a compiler without pipelining. △ Less

Submitted 23 November, 2022; originally announced November 2022.

Comments: Preprint version

arXiv:2210.06635 [pdf, other]

A Bayesian Optimization Framework for Finding Local Optima in Expensive Multi-Modal Functions

Authors: Yongsheng Mei, Tian Lan, Mahdi Imani, Suresh Subramaniam

Abstract: Bayesian optimization (BO) is a popular global optimization scheme for sample-efficient optimization in domains with expensive function evaluations. The existing BO techniques are capable of finding a single global optimum solution. However, finding a set of global and local optimum solutions is crucial in a wide range of real-world problems, as implementing some of the optimal solutions might not… ▽ More Bayesian optimization (BO) is a popular global optimization scheme for sample-efficient optimization in domains with expensive function evaluations. The existing BO techniques are capable of finding a single global optimum solution. However, finding a set of global and local optimum solutions is crucial in a wide range of real-world problems, as implementing some of the optimal solutions might not be feasible due to various practical restrictions (e.g., resource limitation, physical constraints, etc.). In such domains, if multiple solutions are known, the implementation can be quickly switched to another solution, and the best possible system performance can still be obtained. This paper develops a multimodal BO framework to effectively find a set of local/global solutions for expensive-to-evaluate multimodal objective functions. We consider the standard BO setting with Gaussian process regression representing the objective function. We analytically derive the joint distribution of the objective function and its first-order derivatives. This joint distribution is used in the body of the BO acquisition functions to search for local optima during the optimization process. We introduce variants of the well-known BO acquisition functions to the multimodal setting and demonstrate the performance of the proposed framework in locating a set of local optimum solutions using multiple optimization problems. △ Less

Submitted 5 August, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

Comments: European Conference on Artificial Intelligence (ECAI) 2023

arXiv:2209.04846 [pdf, ps, other]

Joint Activity Detection and Channel Estimation for Massive IoT Access Based on Millimeter-Wave/Terahertz Multi-Panel Massive MIMO

Authors: Hanlin Xiu, Zhen Gao, Anwen Liao, Yikun Mei, Dezhi Zheng, Shufeng Tan, Marco Di Renzo, Lajos Hanzo

Abstract: The multi-panel array, as a state-of-the-art antenna-in-package technology, is very suitable for millimeter-wave (mmWave)/terahertz (THz) systems, due to its low-cost deployment and scalable configuration. But in the context of nonuniform array structures it leads to intractable signal processing. Based on such an array structure at the base station, this paper investigates a joint active user det… ▽ More The multi-panel array, as a state-of-the-art antenna-in-package technology, is very suitable for millimeter-wave (mmWave)/terahertz (THz) systems, due to its low-cost deployment and scalable configuration. But in the context of nonuniform array structures it leads to intractable signal processing. Based on such an array structure at the base station, this paper investigates a joint active user detection (AUD) and channel estimation (CE) scheme based on compressive sensing (CS) for application to the massive Internet of Things (IoT). Specifically, by exploiting the structured sparsity of mmWave/THz massive IoT access channels, we firstly formulate the multi-panel massive multiple-input multiple-output (mMIMO)-based joint AUD and CE problem as a multiple measurement vector (MMV)-CS problem. Then, we harness the expectation maximization (EM) algorithm to learn the prior parameters (i.e., the noise variance and the sparsity ratio) and an orthogonal approximate message passing (OAMP)-EM-MMV algorithm is developed to solve this problem. Our simulation results verify the improved AUD and CE performance of the proposed scheme compared to conventional CS-based algorithms. △ Less

Submitted 11 September, 2022; originally announced September 2022.

Comments: accepted by IEEE Transactions on Vehicular Technology

arXiv:2208.05045 [pdf, other]

Adaptive Resources Allocation CUSUM for Binomial Count Data Monitoring with Application to COVID-19 Hotspot Detection

Authors: Jiuyun Hu, Yajun Mei, Sarah Holte, Hao Yan

Abstract: In this paper, we present an efficient statistical method (denoted as "Adaptive Resources Allocation CUSUM") to robustly and efficiently detect the hotspot with limited sampling resources. Our main idea is to combine the multi-arm bandit (MAB) and change-point detection methods to balance the exploration and exploitation of resource allocation for hotspot detection. Further, a Bayesian weighted up… ▽ More In this paper, we present an efficient statistical method (denoted as "Adaptive Resources Allocation CUSUM") to robustly and efficiently detect the hotspot with limited sampling resources. Our main idea is to combine the multi-arm bandit (MAB) and change-point detection methods to balance the exploration and exploitation of resource allocation for hotspot detection. Further, a Bayesian weighted update is used to update the posterior distribution of the infection rate. Then, the upper confidence bound (UCB) is used for resource allocation and planning. Finally, CUSUM monitoring statistics to detect the change point as well as the change location. For performance evaluation, we compare the performance of the proposed method with several benchmark methods in the literature and showed the proposed algorithm is able to achieve a lower detection delay and higher detection precision. Finally, this method is applied to hotspot detection in a real case study of county-level daily positive COVID-19 cases in Washington State WA) and demonstrates the effectiveness with very limited distributed samples. △ Less

Submitted 17 August, 2022; v1 submitted 9 August, 2022; originally announced August 2022.

Comments: Accepted in Journal of Applied Statistics

arXiv:2207.01983 [pdf, ps, other]

Massive Access in Extra Large-Scale MIMO with Mixed-ADC over Near Field Channels

Authors: Yikun Mei, Zhen Gao, De Mi, Mingyu Zhou, Dezhi Zheng, Michail Matthaiou, Pei Xiao, Robert Schober

Abstract: Massive connectivity for extra large-scale multi-input multi-output (XL-MIMO) systems is a challenging issue due to the near-field access channels and the prohibitive cost. In this paper, we propose an uplink grant-free massive access scheme for XL-MIMO systems, in which a mixed-analog-to-digital converters (ADC) architecture is adopted to strike the right balance between access performance and po… ▽ More Massive connectivity for extra large-scale multi-input multi-output (XL-MIMO) systems is a challenging issue due to the near-field access channels and the prohibitive cost. In this paper, we propose an uplink grant-free massive access scheme for XL-MIMO systems, in which a mixed-analog-to-digital converters (ADC) architecture is adopted to strike the right balance between access performance and power consumption. By exploiting the spatial-domain structured sparsity and the piecewise angular-domain cluster sparsity of massive access channels, a compressive sensing (CS)-based two-stage orthogonal approximate message passing algorithm is proposed to efficiently solve the joint activity detection and channel estimation problem. Particularly, high-precision quantized measurements are leveraged to perform accurate hyper-parameter estimation, thereby facilitating the activity detection. Moreover, we adopt a subarray-wise estimation strategy to overcome the severe angular-domain energy dispersion problem which is caused by the near-field effect in XL-MIMO channels. Simulation results verify the superiority of our proposed algorithm over state-of-the-art CS algorithms for massive access based on XL-MIMO with mixed-ADC architectures. △ Less

Submitted 3 April, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

Comments: Accepted by IEEE TVT

arXiv:2205.00061 [pdf, other]

doi 10.1109/ISIT50566.2022.9834388

The Directional Bias Helps Stochastic Gradient Descent to Generalize in Kernel Regression Models

Authors: Yiling Luo, Xiaoming Huo, Yajun Mei

Abstract: We study the Stochastic Gradient Descent (SGD) algorithm in nonparametric statistics: kernel regression in particular. The directional bias property of SGD, which is known in the linear regression setting, is generalized to the kernel regression. More specifically, we prove that SGD with moderate and annealing step-size converges along the direction of the eigenvector that corresponds to the large… ▽ More We study the Stochastic Gradient Descent (SGD) algorithm in nonparametric statistics: kernel regression in particular. The directional bias property of SGD, which is known in the linear regression setting, is generalized to the kernel regression. More specifically, we prove that SGD with moderate and annealing step-size converges along the direction of the eigenvector that corresponds to the largest eigenvalue of the Gram matrix. In addition, the Gradient Descent (GD) with a moderate or small step-size converges along the direction that corresponds to the smallest eigenvalue. These facts are referred to as the directional bias properties; they may interpret how an SGD-computed estimator has a potentially smaller generalization error than a GD-computed estimator. The application of our theory is demonstrated by simulation studies and a case study that is based on the FashionMNIST dataset. △ Less

Submitted 29 April, 2022; originally announced May 2022.

Showing 1–50 of 78 results for author: Mei, Y