-
Emergent Crowd Grouping via Heuristic Self-Organization
Authors:
Xiao-Cheng Liao,
Wei-Neng Chen,
Xiang-Ling Chen,
Yi Mei
Abstract:
Modeling crowds has many important applications in games and computer animation. Inspired by the emergent following effect in real-life crowd scenarios, in this work, we develop a method for implicitly grouping moving agents. We achieve this by analyzing local information around each agent and rotating its preferred velocity accordingly. Each agent could automatically form an implicit group with its neighboring agents that have similar directions. In contrast to an explicit group, there are no strict boundaries for an implicit group. If an agent's direction deviates from its group as a result of positional changes, it will autonomously exit the group or join another implicitly formed neighboring group. This implicit grouping is autonomously emergent among agents rather than deliberately controlled by the algorithm. The proposed method is compared with many crowd simulation models, and the experimental results indicate that our approach achieves the lowest congestion levels in some classic scenarios. In addition, we demonstrate that adjusting the preferred velocity of agents can actually reduce the dissimilarity between their actual velocity and the original preferred velocity. Our work is available online.
Submitted 30 June, 2024;
originally announced July 2024.
-
QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds
Authors:
Ye Wang,
Yuting Mei,
Sipeng Zheng,
Qin Jin
Abstract:
While pets offer companionship, their limited intelligence restricts advanced reasoning and autonomous interaction with humans. Considering this, we propose QuadrupedGPT, a versatile agent designed to master a broad range of complex tasks with agility comparable to that of a pet. To achieve this goal, the primary challenges include: i) effectively leveraging multimodal observations for decision-making; ii) mastering agile control of locomotion and path planning; iii) developing advanced cognition to execute long-term objectives. QuadrupedGPT processes human command and environmental contexts using a large multimodal model (LMM). Empowered by its extensive knowledge base, our agent autonomously assigns appropriate parameters for adaptive locomotion policies and guides the agent in planning a safe but efficient path towards the goal, utilizing semantic-aware terrain analysis. Moreover, QuadrupedGPT is equipped with problem-solving capabilities that enable it to decompose long-term goals into a sequence of executable subgoals through high-level reasoning. Extensive experiments across various benchmarks confirm that QuadrupedGPT can adeptly handle multiple tasks with intricate instructions, demonstrating a significant step towards the versatile quadruped agents in open-ended worlds. Our website and codes can be found at https://quadruped-hub.github.io/Quadruped-GPT/.
Submitted 24 June, 2024;
originally announced June 2024.
-
UBiSS: A Unified Framework for Bimodal Semantic Summarization of Videos
Authors:
Yuting Mei,
Linli Yao,
Qin Jin
Abstract:
With the surge in the amount of video data, video summarization techniques, including visual-modal(VM) and textual-modal(TM) summarization, are attracting more and more attention. However, unimodal summarization inevitably loses the rich semantics of the video. In this paper, we focus on a more comprehensive video summarization task named Bimodal Semantic Summarization of Videos (BiSSV). Specifically, we first construct a large-scale dataset, BIDS, in (video, VM-Summary, TM-Summary) triplet format. Unlike traditional processing methods, our construction procedure contains a VM-Summary extraction algorithm aiming to preserve the most salient content within long videos. Based on BIDS, we propose a Unified framework UBiSS for the BiSSV task, which models the saliency information in the video and generates a TM-summary and VM-summary simultaneously. We further optimize our model with a list-wise ranking-based objective to improve its capacity to capture highlights. Lastly, we propose a metric, $NDCG_{MS}$, to provide a joint evaluation of the bimodal summary. Experiments show that our unified framework achieves better performance than multi-stage summarization pipelines. Code and data are available at https://github.com/MeiYutingg/UBiSS.
Submitted 23 June, 2024;
originally announced June 2024.
-
Prompting the E-Brushes: Users as Authors in Generative AI
Authors:
Yiyang Mei
Abstract:
Since its introduction in 2022, Generative AI has significantly impacted the art world, from winning state art fairs to creating complex videos from simple prompts. Amid this renaissance, a pivotal issue emerges: should users of Generative AI be recognized as authors eligible for copyright protection? The Copyright Office, in its March 2023 Guidance, argues against this notion. By comparing the prompts to clients' instructions for commissioned art, the Office denies users authorship due to their limited role in the creative process. This Article challenges this viewpoint and advocates for the recognition of Generative AI users who incorporate these tools into their creative endeavors. It argues that the current policy fails to consider the intricate and dynamic interaction between Generative AI users and the models, where users actively influence the output through a process of adjustment, refinement, selection, and arrangement. Rather than dismissing the contributions generated by AI, this Article suggests a simplified and streamlined registration process that acknowledges the role of AI in creation. This approach not only aligns with the constitutional goal of promoting the progress of science and useful arts but also encourages public engagement in the creative process, which contributes to the pool of training data for AI. Moreover, it advocates for a flexible framework that evolves alongside technological advancements while ensuring safety and public interest. In conclusion, by examining text-to-image generators and addressing misconceptions about Generative AI and user interaction, this Article calls for a regulatory framework that adapts to technological developments and safeguards public interests.
Submitted 24 March, 2024;
originally announced June 2024.
-
Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections
Authors:
Jiacong Xu,
Yiqun Mei,
Vishal M. Patel
Abstract:
Photographs captured in unstructured tourist environments frequently exhibit variable appearances and transient occlusions, challenging accurate scene reconstruction and inducing artifacts in novel view synthesis. Although prior approaches have integrated the Neural Radiance Field (NeRF) with additional learnable modules to handle the dynamic appearances and eliminate transient objects, their extensive training demands and slow rendering speeds limit practical deployments. Recently, 3D Gaussian Splatting (3DGS) has emerged as a promising alternative to NeRF, offering superior training and inference efficiency along with better rendering quality. This paper presents Wild-GS, an innovative adaptation of 3DGS optimized for unconstrained photo collections while preserving its efficiency benefits. Wild-GS determines the appearance of each 3D Gaussian by their inherent material attributes, global illumination and camera properties per image, and point-level local variance of reflectance. Unlike previous methods that model reference features in image space, Wild-GS explicitly aligns the pixel appearance features to the corresponding local Gaussians by sampling the triplane extracted from the reference image. This novel design effectively transfers the high-frequency detailed appearance of the reference view to 3D space and significantly expedites the training process. Furthermore, 2D visibility maps and depth regularization are leveraged to mitigate the transient effects and constrain the geometry, respectively. Extensive experiments demonstrate that Wild-GS achieves state-of-the-art rendering performance and the highest efficiency in both training and inference among all the existing techniques.
Submitted 14 June, 2024;
originally announced June 2024.
-
Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs
Authors:
Yixuan Mei,
Yonghao Zhuang,
Xupeng Miao,
Juncheng Yang,
Zhihao Jia,
Rashmi Vinayak
Abstract:
This paper introduces Helix, a distributed system for high-throughput, low-latency large language model (LLM) serving on heterogeneous GPU clusters. A key idea behind Helix is to formulate inference computation of LLMs over heterogeneous GPUs and network connections as a max-flow problem for a directed, weighted graph, whose nodes represent GPU instances and edges capture both GPU and network heterogeneity through their capacities. Helix then uses a mixed integer linear programming (MILP) algorithm to discover highly optimized strategies to serve LLMs. This approach allows Helix to jointly optimize model placement and request scheduling, two highly entangled tasks in heterogeneous LLM serving. Our evaluation on several heterogeneous cluster settings ranging from 24 to 42 GPU nodes shows that Helix improves serving throughput by up to 2.7$\times$ and reduces prompting and decoding latency by up to 2.8$\times$ and 1.3$\times$, respectively, compared to best existing approaches.
Submitted 3 June, 2024;
originally announced June 2024.
-
Multi-Representation Genetic Programming: A Case Study on Tree-based and Linear Representations
Authors:
Zhixing Huang,
Yi Mei,
Fangfang Zhang,
Mengjie Zhang,
Wolfgang Banzhaf
Abstract:
Existing genetic programming (GP) methods are typically designed based on a certain representation, such as tree-based or linear representations. These representations show various pros and cons in different domains. However, due to the complicated relationships among representation and fitness landscapes of GP, it is hard to intuitively determine which GP representation is the most suitable for solving a certain problem. Evolving programs (or models) with multiple representations simultaneously can alternatively search on different fitness landscapes since representations are highly related to the search space that essentially defines the fitness landscape. Fully using the latent synergies among different GP individual representations might be helpful for GP to search for better solutions. However, existing GP literature rarely investigates the simultaneous effective use of evolving multiple representations. To fill this gap, this paper proposes a multi-representation GP algorithm based on tree-based and linear representations, which are two commonly used GP representations. In addition, we develop a new cross-representation crossover operator to harness the interplay between tree-based and linear representations. Empirical results show that navigating the learned knowledge between basic tree-based and linear representations successfully improves the effectiveness of GP with solely tree-based or linear representation in solving symbolic regression and dynamic job shop scheduling problems.
Submitted 23 May, 2024;
originally announced May 2024.
-
Multi-Label Out-of-Distribution Detection with Spectral Normalized Joint Energy
Authors:
Yihan Mei,
Xinyu Wang,
Dell Zhang,
Xiaoling Wang
Abstract:
In today's interconnected world, achieving reliable out-of-distribution (OOD) detection poses a significant challenge for machine learning models. While numerous studies have introduced improved approaches for multi-class OOD detection tasks, the investigation into multi-label OOD detection tasks has been notably limited. We introduce Spectral Normalized Joint Energy (SNoJoE), a method that consolidates label-specific information across multiple labels through the theoretically justified concept of an energy-based function. Throughout the training process, we employ spectral normalization to manage the model's feature space, thereby enhancing model efficacy and generalization, in addition to bolstering robustness. Our findings indicate that the application of spectral normalization to joint energy scores notably amplifies the model's capability for OOD detection. We perform OOD detection experiments utilizing PASCAL-VOC as the in-distribution dataset and ImageNet-22K or Texture as the out-of-distribution datasets. Our experimental results reveal that, in comparison to prior top performances, SNoJoE achieves 11% and 54% relative reductions in FPR95 on the respective OOD datasets, thereby defining the new state of the art in this field of study.
Submitted 12 May, 2024; v1 submitted 7 May, 2024;
originally announced May 2024.
-
AffirmativeAI: Towards LGBTQ+ Friendly Audit Frameworks for Large Language Models
Authors:
Yinru Long,
Zilin Ma,
Yiyang Mei,
Zhaoyuan Su
Abstract:
LGBTQ+ community face disproportionate mental health challenges, including higher rates of depression, anxiety, and suicidal ideation. Research has shown that LGBTQ+ people have been using large language model-based chatbots, such as ChatGPT, for their mental health needs. Despite the potential for immediate support and anonymity these chatbots offer, concerns regarding their capacity to provide empathetic, accurate, and affirming responses remain. In response to these challenges, we propose a framework for evaluating the affirmativeness of LLMs based on principles of affirmative therapy, emphasizing the need for attitudes, knowledge, and actions that support and validate LGBTQ+ experiences. We propose a combination of qualitative and quantitative analyses, hoping to establish benchmarks for "Affirmative AI," ensuring that LLM-based chatbots can provide safe, supportive, and effective mental health support to LGBTQ+ individuals. We benchmark LLM affirmativeness not as a mental health solution for LGBTQ+ individuals or to claim it resolves their mental health issues, as we highlight the need to consider complex discrimination in the LGBTQ+ community when designing technological aids. Our goal is to evaluate LLMs for LGBTQ+ mental health support since many in the community already use them, aiming to identify potential harms of using general-purpose LLMs in this context.
Submitted 7 May, 2024;
originally announced May 2024.
-
Learning from Offline and Online Experiences: A Hybrid Adaptive Operator Selection Framework
Authors:
Jiyuan Pei,
Jialin Liu,
Yi Mei
Abstract:
In many practical applications, usually, similar optimisation problems or scenarios repeatedly appear. Learning from previous problem-solving experiences can help adjust algorithm components of meta-heuristics, e.g., adaptively selecting promising search operators, to achieve better optimisation performance. However, those experiences obtained from previously solved problems, namely offline experiences, may sometimes provide misleading perceptions when solving a new problem, if the characteristics of previous problems and the new one are relatively different. Learning from online experiences obtained during the ongoing problem-solving process is more instructive but highly restricted by limited computational resources. This paper focuses on the effective combination of offline and online experiences. A novel hybrid framework that learns to dynamically and adaptively select promising search operators is proposed. Two adaptive operator selection modules with complementary paradigms cooperate in the framework to learn from offline and online experiences and make decisions. An adaptive decision policy is maintained to balance the use of those two modules in an online manner. Extensive experiments on 170 widely studied real-value benchmark optimisation problems and a benchmark set with 34 instances for combinatorial optimisation show that the proposed hybrid framework outperforms the state-of-the-art methods. Ablation study verifies the effectiveness of each component of the framework.
Submitted 15 April, 2024;
originally announced April 2024.
-
Learning Traffic Signal Control via Genetic Programming
Authors:
Xiao-Cheng Liao,
Yi Mei,
Mengjie Zhang
Abstract:
The control of traffic signals is crucial for improving transportation efficiency. Recently, learning-based methods, especially Deep Reinforcement Learning (DRL), have garnered substantial success in the quest for more efficient traffic signal control strategies. However, the design of rewards in DRL highly demands domain knowledge to converge to an effective policy, and the final policy also presents difficulties in terms of explainability. In this work, a new learning-based method for signal control in complex intersections is proposed. In our approach, we design a concept of phase urgency for each signal phase. During signal transitions, the traffic light control strategy selects the next phase to be activated based on the phase urgency. We then propose to represent the urgency function as an explainable tree structure. The urgency function can calculate the phase urgency for a specific phase based on the current road conditions. Genetic programming is adopted to perform gradient-free optimization of the urgency function. We test our algorithm on multiple public traffic signal control datasets. The experimental results indicate that the tree-shaped urgency function evolved by genetic programming outperforms the baselines, including a state-of-the-art method in the transportation field and a well-known DRL-based method.
Submitted 25 March, 2024;
originally announced March 2024.
-
RAAMove: A Corpus for Analyzing Moves in Research Article Abstracts
Authors:
Hongzheng Li,
Ruojin Wang,
Ge Shi,
Xing Lv,
Lei Lei,
Chong Feng,
Fang Liu,
Jinkun Lin,
Yangguang Mei,
Lingnan Xu
Abstract:
Move structures have been studied in English for Specific Purposes (ESP) and English for Academic Purposes (EAP) for decades. However, there are few move annotation corpora for Research Article (RA) abstracts. In this paper, we introduce RAAMove, a comprehensive multi-domain corpus dedicated to the annotation of move structures in RA abstracts. The primary objective of RAAMove is to facilitate move analysis and automatic move identification. This paper provides a thorough discussion of the corpus construction process, including the scheme, data collection, annotation guidelines, and annotation procedures. The corpus is constructed through two stages: initially, expert annotators manually annotate high-quality data; subsequently, based on the human-annotated data, a BERT-based model is employed for automatic annotation with the help of experts' modification. The result is a large-scale and high-quality corpus comprising 33,988 annotated instances. We also conduct preliminary move identification experiments using the BERT-based model to verify the effectiveness of the proposed corpus and model. The annotated corpus is available for academic research purposes and can serve as essential resources for move analysis, English language teaching and writing, as well as move/discourse-related tasks in Natural Language Processing (NLP).
Submitted 23 March, 2024;
originally announced March 2024.
-
Holo-Relighting: Controllable Volumetric Portrait Relighting from a Single Image
Authors:
Yiqun Mei,
Yu Zeng,
He Zhang,
Zhixin Shu,
Xuaner Zhang,
Sai Bi,
Jianming Zhang,
HyunJoon Jung,
Vishal M. Patel
Abstract:
At the core of portrait photography is the search for ideal lighting and viewpoint. The process often requires advanced knowledge in photography and an elaborate studio setup. In this work, we propose Holo-Relighting, a volumetric relighting method that is capable of synthesizing novel viewpoints and novel lighting from a single image. Holo-Relighting leverages the pretrained 3D GAN (EG3D) to reconstruct geometry and appearance from an input portrait as a set of 3D-aware features. We design a relighting module conditioned on a given lighting to process these features, and predict a relit 3D representation in the form of a tri-plane, which can render to an arbitrary viewpoint through volume rendering. Besides viewpoint and lighting control, Holo-Relighting also takes the head pose as a condition to enable head-pose-dependent lighting effects. With these novel designs, Holo-Relighting can generate complex non-Lambertian lighting effects (e.g., specular highlights and cast shadows) without using any explicit physical lighting priors. We train Holo-Relighting with data captured with a light stage, and propose two data-rendering techniques to improve the data quality for training the volumetric relighting system. Through quantitative and qualitative experiments, we demonstrate Holo-Relighting can achieve state-of-the-art relighting quality with better photorealism, 3D consistency and controllability.
Submitted 14 March, 2024;
originally announced March 2024.
-
Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions
Authors:
Jiayu Chen,
Bhargav Ganguly,
Yang Xu,
Yongsheng Mei,
Tian Lan,
Vaneet Aggarwal
Abstract:
Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control also necessitate learning a generator function from the offline data to serve as the strategy or policy. In this case, applying deep generative models in offline policy learning exhibits great potential, and numerous studies have explored in this direction. However, this field still lacks a comprehensive review and so developments of different branches are relatively independent. In this paper, we provide the first systematic review on the applications of deep generative models for offline policy learning. In particular, we cover five mainstream deep generative models, including Variational Auto-Encoders, Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion Models, and their applications in both offline reinforcement learning (offline RL) and imitation learning (IL). Offline RL and IL are two main branches of offline policy learning and are widely-adopted techniques for sequential decision-making. Notably, for each type of DGM-based offline policy learning, we distill its fundamental scheme, categorize related works based on the usage of the DGM, and sort out the development process of algorithms in that field. Subsequent to the main content, we provide in-depth discussions on deep generative models and offline policy learning as a summary, based on which we present our perspectives on future research directions. This work offers a hands-on reference for the research progress in deep generative models for offline policy learning, and aims to inspire improved DGM-based offline RL or IL algorithms. For convenience, we maintain a paper list on https://github.com/LucasCJYSDL/DGMs-for-Offline-Policy-Learning.
Submitted 25 May, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Evaluating the Experience of LGBTQ+ People Using Large Language Model Based Chatbots for Mental Health Support
Authors:
Zilin Ma,
Yiyang Mei,
Yinru Long,
Zhaoyuan Su,
Krzysztof Z. Gajos
Abstract:
LGBTQ+ individuals are increasingly turning to chatbots powered by large language models (LLMs) to meet their mental health needs. However, little research has explored whether these chatbots can adequately and safely provide tailored support for this demographic. We interviewed 18 LGBTQ+ and 13 non-LGBTQ+ participants about their experiences with LLM-based chatbots for mental health needs. LGBTQ+ participants relied on these chatbots for mental health support, likely due to an absence of support in real life. Notably, while LLMs offer prompt support, they frequently fall short in grasping the nuances of LGBTQ-specific challenges. Although fine-tuning LLMs to address LGBTQ+ needs can be a step in the right direction, it isn't the panacea. The deeper issue is entrenched in societal discrimination. Consequently, we call on future researchers and designers to look beyond mere technical refinements and advocate for holistic strategies that confront and counteract the societal biases burdening the LGBTQ+ community.
Submitted 14 February, 2024;
originally announced February 2024.
-
From Words to Molecules: A Survey of Large Language Models in Chemistry
Authors:
Chang Liao,
Yemin Yu,
Yu Mei,
Ying Wei
Abstract:
In recent years, Large Language Models (LLMs) have achieved significant success in natural language processing (NLP) and various interdisciplinary areas. However, applying LLMs to chemistry is a complex task that requires specialized domain knowledge. This paper provides a thorough exploration of the nuanced methodologies employed in integrating LLMs into the field of chemistry, delving into the complexities and innovations at this interdisciplinary juncture. Specifically, our analysis begins with examining how molecular information is fed into LLMs through various representation and tokenization methods. We then categorize chemical LLMs into three distinct groups based on the domain and modality of their input data, and discuss approaches for integrating these inputs for LLMs. Furthermore, this paper delves into the pretraining objectives with adaptations to chemical LLMs. After that, we explore the diverse applications of LLMs in chemistry, including novel paradigms for their application in chemistry tasks. Finally, we identify promising research directions, including further integration with chemical knowledge, advancements in continual learning, and improvements in model interpretability, paving the way for groundbreaking developments in the field.
Submitted 2 February, 2024;
originally announced February 2024.
-
Improving Critical Node Detection Using Neural Network-based Initialization in a Genetic Algorithm
Authors:
Chanjuan Liu,
Shike Ge,
Zhihan Chen,
Wenbin Pei,
Enqiang Zhu,
Yi Mei,
Hisao Ishibuchi
Abstract:
The Critical Node Problem (CNP) is concerned with identifying the critical nodes in a complex network. These nodes play a significant role in maintaining the connectivity of the network, and removing them can negatively impact network performance. CNP has been studied extensively due to its numerous real-world applications. Among the different versions of CNP, CNP-1a has gained the most popularity. The primary objective of CNP-1a is to minimize the pair-wise connectivity in the remaining network after deleting a limited number of nodes from a network. Due to the NP-hard nature of CNP-1a, many heuristic/metaheuristic algorithms have been proposed to solve this problem. However, most existing algorithms start with a random initialization, leading to a high cost of obtaining an optimal solution. To improve the efficiency of solving CNP-1a, a knowledge-guided genetic algorithm named K2GA has been proposed. Unlike the standard genetic algorithm framework, K2GA has two main components: a pretrained neural network to obtain prior knowledge on possible critical nodes, and a hybrid genetic algorithm with local search for finding an optimal set of critical nodes based on the knowledge given by the trained neural network. The local search process utilizes a cut node-based greedy strategy. The effectiveness of the proposed knowledge-guided genetic algorithm is verified by experiments on 26 real-world instances of complex networks. Experimental results show that K2GA outperforms the state-of-the-art algorithms regarding the best, median, and average objective values, and improves the best upper bounds on the best objective values for eight real-world instances.
Submitted 1 February, 2024;
originally announced February 2024.
-
FabHacks: Transform Everyday Objects into Functional Fixtures
Authors:
Yuxuan Mei,
Benjamin Jones,
Dan Cascaval,
Jennifer Mankoff,
Etienne Vouga,
Adriana Schulz
Abstract:
Storage, organizing, and decorating are an important part of home design. While one can buy commercial items for many of these tasks, this can be costly, and re-use is more sustainable. An alternative is a "home hack", a functional assembly that can be constructed from existing household items. However, coming up with such hacks requires combining objects to make a physically valid design, which might be difficult to test if they are large, require nailing or screwing something to the wall, or the designer has mobility limitations. In this work, we present a design and visualization system for creating workable functional assemblies, FabHacks, which is based on a solver-aided domain-specific language (S-DSL) FabHaL. By analyzing existing home hacks shared online, we create a design abstraction for connecting household items using predefined types of connections. We provide a UI for FabHaL that can be used to design assemblies that fulfill a given specification. Our system leverages a physics-based solver that takes an assembly design and finds its expected physical configuration. Our validation includes a user study showing that users can create assemblies successfully using our UI and explore a range of designs.
Submitted 26 January, 2024;
originally announced January 2024.
-
Bayesian Optimization through Gaussian Cox Process Models for Spatio-temporal Data
Authors:
Yongsheng Mei,
Mahdi Imani,
Tian Lan
Abstract:
Bayesian optimization (BO) has established itself as a leading strategy for efficiently optimizing expensive-to-evaluate functions. Existing BO methods mostly rely on Gaussian process (GP) surrogate models and are not applicable to (doubly-stochastic) Gaussian Cox processes, where the observation process is modulated by a latent intensity function modeled as a GP. In this paper, we propose a novel maximum a posteriori inference of Gaussian Cox processes. It leverages the Laplace approximation and change of kernel technique to transform the problem into a new reproducing kernel Hilbert space, where it becomes more tractable computationally. It enables us to obtain both a functional posterior of the latent intensity function and the covariance of the posterior, thus extending existing works that often focus on specific link functions or estimating the posterior mean. Using the result, we propose a BO framework based on the Gaussian Cox process model and further develop a Nyström approximation for efficient computation. Extensive evaluations on various synthetic and real-world datasets demonstrate significant improvement over state-of-the-art inference solutions for Gaussian Cox processes, as well as effective BO with a wide range of acquisition functions designed through the underlying Gaussian Cox process model.
Submitted 25 January, 2024;
originally announced January 2024.
-
Distance-aware Attention Reshaping: Enhance Generalization of Neural Solver for Large-scale Vehicle Routing Problems
Authors:
Yang Wang,
Ya-Hui Jia,
Wei-Neng Chen,
Yi Mei
Abstract:
Neural solvers based on attention mechanism have demonstrated remarkable effectiveness in solving vehicle routing problems. However, in the generalization process from small scale to large scale, we find a phenomenon of the dispersion of attention scores in existing neural solvers, which leads to poor performance. To address this issue, this paper proposes a distance-aware attention reshaping method, assisting neural solvers in solving large-scale vehicle routing problems. Specifically, without the need for additional training, we utilize the Euclidean distance information between current nodes to adjust attention scores. This enables a neural solver trained on small-scale instances to make rational choices when solving a large-scale problem. Experimental results show that the proposed method significantly outperforms existing state-of-the-art neural solvers on the large-scale CVRPLib dataset.
Submitted 13 January, 2024;
originally announced January 2024.
-
Design and Nonlinear Modeling of a Modular Cable Driven Soft Robotic Arm
Authors:
Xinda Qi,
Yu Mei,
Dong Chen,
Zhaojian Li,
Xiaobo Tan
Abstract:
We propose a novel multi-section cable-driven soft robotic arm inspired by octopus tentacles along with a new modeling approach. Each section of the modular manipulator is made of a soft tubing backbone, a soft silicon arm body, and two rigid endcaps, which connect adjacent sections and decouple the actuation cables of different sections. The soft robotic arm is made with casting after the rigid endcaps are 3D-printed, achieving low-cost and convenient fabrication. To capture the nonlinear effect of cables pushing into the soft silicon arm body, which results from the absence of intermediate rigid cable guides for higher compliance, an analytical static model is developed to capture the relationship between the bending curvature and the cable lengths. The proposed model shows superior prediction performance in experiments over that of a baseline model, especially under large bending conditions. Based on the nonlinear static model, a kinematic model of a multi-section arm is further developed and used to derive a motion planning algorithm. Experiments show that the proposed soft arm has high flexibility and a large workspace, and the tracking errors under the algorithm based on the proposed modeling approach are up to 52% smaller than those with the algorithm derived from the baseline model. The presented modeling approach is expected to be applicable to a broad range of soft cable-driven actuators and manipulators.
Submitted 15 May, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
-
Near-Space Communications: the Last Piece of 6G Space-Air-Ground-Sea Integrated Network Puzzle
Authors:
Hongshan Liu,
Tong Qin,
Zhen Gao,
Tianqi Mao,
Keke Ying,
Ziwei Wan,
Li Qiao,
Rui Na,
Zhongxiang Li,
Chun Hu,
Yikun Mei,
Tuan Li,
Guanghui Wen,
Lei Chen,
Zhonghuai Wu,
Ruiqi Liu,
Gaojie Chen,
Shuo Wang,
Dezhi Zheng
Abstract:
This article presents a comprehensive study on the emerging near-space communications (NS-COM) within the context of space-air-ground-sea integrated network (SAGSIN). Specifically, we firstly explore the recent technical developments of NS-COM, followed by the discussions about motivations behind integrating NS-COM into SAGSIN. To further demonstrate the necessity of NS-COM, a comparative analysis between the NS-COM network and other counterparts in SAGSIN is conducted, covering aspects of deployment, coverage, channel characteristics and unique problems of NS-COM network. Afterwards, the technical aspects of NS-COM, including channel modeling, random access, channel estimation, array-based beam management and joint network optimization, are examined in detail. Furthermore, we explore the potential applications of NS-COM, such as structural expansion in SAGSIN communication, civil aviation communication, remote and urgent communication, weather monitoring and carbon neutrality. Finally, some promising research avenues are identified, including stratospheric satellite (StratoSat) -to-ground direct links for mobile terminals, reconfigurable multiple-input multiple-output (MIMO) and holographic MIMO, federated learning in NS-COM networks, maritime communication, electromagnetic spectrum sensing and adversarial game, integrated sensing and communications, StratoSat-based radar detection and imaging, NS-COM assisted enhanced global navigation system, NS-COM assisted intelligent unmanned system and free space optical (FSO) communication. Overall, this paper highlights that the NS-COM plays an indispensable role in the SAGSIN puzzle, providing substantial performance and coverage enhancement to the traditional SAGSIN architecture.
Submitted 4 March, 2024; v1 submitted 30 December, 2023;
originally announced January 2024.
-
XC-NAS: A New Cellular Encoding Approach for Neural Architecture Search of Multi-path Convolutional Neural Networks
Authors:
Trevor Londt,
Xiaoying Gao,
Peter Andreae,
Yi Mei
Abstract:
Convolutional Neural Networks (CNNs) continue to achieve great success in classification tasks as innovative techniques and complex multi-path architecture topologies are introduced. Neural Architecture Search (NAS) aims to automate the design of these complex architectures, reducing the need for costly manual design work by human experts. Cellular Encoding (CE) is an evolutionary computation technique which excels in constructing novel multi-path topologies of varying complexity and has recently been applied with NAS to evolve CNN architectures for various classification tasks. However, existing CE approaches have severe limitations. They are restricted to only one domain, only partially implement the theme of CE, or only focus on the micro-architecture search space. This paper introduces a new CE representation and algorithm capable of evolving novel multi-path CNN architectures of varying depth, width, and complexity for image and text classification tasks. The algorithm explicitly focuses on the macro-architecture search space. Furthermore, by using a surrogate model approach, we show that the algorithm can evolve a performant CNN architecture in less than one GPU day, thereby allowing a sufficient number of experiment runs to be conducted to achieve scientific robustness. Experiment results show that the approach is highly competitive, defeating several state-of-the-art methods, and is generalisable to both the image and text domains.
Submitted 12 December, 2023;
originally announced December 2023.
-
Real-time Network Intrusion Detection via Decision Transformers
Authors:
Jingdi Chen,
Hanhan Zhou,
Yongsheng Mei,
Gina Adam,
Nathaniel D. Bastian,
Tian Lan
Abstract:
Many cybersecurity problems that require real-time decision-making based on temporal observations can be abstracted as a sequence modeling problem, e.g., network intrusion detection from a sequence of arriving packets. Existing approaches like reinforcement learning may not be suitable for such cybersecurity decision problems, since the Markovian property may not necessarily hold and the underlying network states are often not observable. In this paper, we cast the problem of real-time network intrusion detection as causal sequence modeling and draw upon the power of the transformer architecture for real-time decision-making. By conditioning a causal decision transformer on past trajectories, consisting of the rewards, network packets, and detection decisions, our proposed framework will generate future detection decisions to achieve the desired return. It enables decision transformers to be applied to real-time network intrusion detection, as well as a novel tradeoff between the accuracy and timeliness of detection. The proposed solution is evaluated on public network intrusion detection datasets and outperforms several baseline algorithms using reinforcement learning and sequence modeling, in terms of detection accuracy and timeliness.
Submitted 16 December, 2023; v1 submitted 12 December, 2023;
originally announced December 2023.
-
Scientific Preparation for CSST: Classification of Galaxy and Nebula/Star Cluster Based on Deep Learning
Authors:
Yuquan Zhang,
Zhong Cao,
Feng Wang,
Man I Lam,
Hui Deng,
Ying Mei,
Lei Tan
Abstract:
The Chinese Space Station Telescope (abbreviated as CSST) is a future advanced space telescope. Real-time identification of galaxy and nebula/star cluster (abbreviated as NSC) images is of great value during CSST survey. While recent research on celestial object recognition has progressed, the rapid and efficient identification of high-resolution local celestial images remains challenging. In this study, we conducted galaxy and NSC image classification research using deep learning methods based on data from the Hubble Space Telescope. We built a Local Celestial Image Dataset and designed a deep learning model named HR-CelestialNet for classifying images of the galaxy and NSC. HR-CelestialNet achieved an accuracy of 89.09% on the testing set, outperforming models such as AlexNet, VGGNet and ResNet, while demonstrating faster recognition speeds. Furthermore, we investigated the factors influencing CSST image quality and evaluated the generalization ability of HR-CelestialNet on the blurry image dataset, demonstrating its robustness to low image quality. The proposed method can enable real-time identification of celestial images during CSST survey mission.
Submitted 8 December, 2023;
originally announced December 2023.
-
ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation
Authors:
Tong Nie,
Guoyang Qin,
Wei Ma,
Yuewen Mei,
Jian Sun
Abstract:
Missing data is a pervasive issue in both scientific and engineering tasks, especially for the modeling of spatiotemporal data. This problem attracts many studies to contribute to data-driven solutions. Existing imputation solutions mainly include low-rank models and deep learning models. The former assumes general structural priors but has limited model capacity. The latter possesses salient features of expressivity but lacks prior knowledge of the underlying spatiotemporal structures. Leveraging the strengths of both paradigms, we demonstrate a low rankness-induced Transformer to achieve a balance between strong inductive bias and high model expressivity. The exploitation of the inherent structures of spatiotemporal data enables our model to learn balanced signal-noise representations, making it generalizable for a variety of imputation problems. We demonstrate its superiority in terms of accuracy, efficiency, and versatility in heterogeneous datasets, including traffic flow, solar energy, smart meters, and air quality. Promising empirical results provide strong conviction that incorporating time series primitives, such as low-rankness, can substantially facilitate the development of a generalizable model to approach a wide range of spatiotemporal imputation problems.
Submitted 28 May, 2024; v1 submitted 4 December, 2023;
originally announced December 2023.
-
A Fully Data-Driven Approach for Realistic Traffic Signal Control Using Offline Reinforcement Learning
Authors:
Jianxiong Li,
Shichao Lin,
Tianyu Shi,
Chujie Tian,
Yu Mei,
Jian Song,
Xianyuan Zhan,
Ruimin Li
Abstract:
The optimization of traffic signal control (TSC) is critical for an efficient transportation system. In recent years, reinforcement learning (RL) techniques have emerged as a popular approach for TSC and show promising results for highly adaptive control. However, existing RL-based methods suffer from notably poor real-world applicability and hardly have any successful deployments. The reasons for such failures are mostly due to the reliance on over-idealized traffic simulators for policy optimization, as well as using unrealistic fine-grained state observations and reward signals that are not directly obtainable from real-world sensors. In this paper, we propose a fully Data-Driven and simulator-free framework for realistic Traffic Signal Control (D2TSC). Specifically, we combine well-established traffic flow theory with machine learning to construct a reward inference model to infer the reward signals from coarse-grained traffic data. With the inferred rewards, we further propose a sample-efficient offline RL method to enable direct signal control policy learning from historical offline datasets of real-world intersections. To evaluate our approach, we collect historical traffic data from a real-world intersection, and develop a highly customized simulation environment that strictly follows real data characteristics. We demonstrate through extensive experiments that our approach achieves superior performance over conventional and offline RL baselines, and also enjoys much better real-world applicability.
Submitted 27 November, 2023;
originally announced November 2023.
-
DeltaLCA: Comparative Life-Cycle Assessment for Electronics Design
Authors:
Zhihan Zhang,
Felix Hähnlein,
Yuxuan Mei,
Zachary Englhardt,
Shwetak Patel,
Adriana Schulz,
Vikram Iyer
Abstract:
Reducing the environmental footprint of electronics and computing devices requires new tools that empower designers to make informed decisions about sustainability during the design process itself. This is not possible with current tools for life cycle assessment (LCA) which require substantial domain expertise and time to evaluate the numerous chips and other components that make up a device. We observe first that informed decision-making does not require absolute metrics and can instead be done by comparing designs. Second, we can use domain-specific heuristics to perform these comparisons. We combine these insights to develop DeltaLCA, an open-source interactive design tool that addresses the dual challenges of automating life cycle inventory generation and data availability by performing comparative analyses of electronics designs. Users can upload standard design files from Electronic Design Automation (EDA) software and the tool will guide them through determining which one has greater carbon footprint. DeltaLCA leverages electronics-specific LCA datasets and heuristics and tries to automatically rank the two designs, prompting users to provide additional information only when necessary. We show through case studies DeltaLCA achieves the same result as evaluating full LCAs, and that it accelerates LCA comparisons from eight expert-hours to a single click for devices with ~30 components, and 15 minutes for more complex devices with ~100 components.
Submitted 16 November, 2023;
originally announced November 2023.
-
Compressive Sensing-Based Grant-Free Massive Access for 6G Massive Communication
Authors:
Zhen Gao,
Malong Ke,
Yikun Mei,
Li Qiao,
Sheng Chen,
Derrick Wing Kwan Ng,
H. Vincent Poor
Abstract:
The advent of the sixth-generation (6G) of wireless communications has given rise to the necessity to connect vast quantities of heterogeneous wireless devices, which requires advanced system capabilities far beyond existing network architectures. In particular, such massive communication has been recognized as a prime driver that can empower the 6G vision of future ubiquitous connectivity, supporting Internet of Human-Machine-Things for which massive access is critical. This paper surveys the most recent advances toward massive access in both academic and industry communities, focusing primarily on the promising compressive sensing-based grant-free massive access paradigm. We first specify the limitations of existing random access schemes and reveal that the practical implementation of massive communication relies on a dramatically different random access paradigm from the current ones mainly designed for human-centric communications. Then, a compressive sensing-based grant-free massive access roadmap is presented, where the evolutions from single-antenna to large-scale antenna array-based base stations, from single-station to cooperative massive multiple-input multiple-output systems, and from unsourced to sourced random access scenarios are detailed. Finally, we discuss the key challenges and open issues to shed light on the potential future research directions of grant-free massive access.
Submitted 12 November, 2023;
originally announced November 2023.
-
Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process with Uncertainty Quantification
Authors:
Zichong Li,
Qunzhi Xu,
Zhenghao Xu,
Yajun Mei,
Tuo Zhao,
Hongyuan Zha
Abstract:
Spatio-temporal point processes (STPPs) are potent mathematical tools for modeling and predicting events with both temporal and spatial features. Despite their versatility, most existing methods for learning STPPs either assume a restricted form of the spatio-temporal distribution, or suffer from inaccurate approximations of the intractable integral in the likelihood training objective. These issues typically arise from the normalization term of the probability density function. Moreover, current techniques fail to provide uncertainty quantification for model predictions, such as confidence intervals for the predicted event's arrival time and confidence regions for the event's location, which is crucial given the considerable randomness of the data. To tackle these challenges, we introduce SMASH: a Score MAtching-based pSeudolikeliHood estimator for learning marked STPPs with uncertainty quantification. Specifically, our framework adopts a normalization-free objective by estimating the pseudolikelihood of marked STPPs through score-matching and offers uncertainty quantification for the predicted event time, location and mark by computing confidence regions over the generated samples. The superior performance of our proposed framework is demonstrated through extensive experiments in both event prediction and uncertainty quantification.
Submitted 24 October, 2023;
originally announced October 2023.
-
Irregular Traffic Time Series Forecasting Based on Asynchronous Spatio-Temporal Graph Convolutional Network
Authors:
Weijia Zhang,
Le Zhang,
Jindong Han,
Hao Liu,
Jingbo Zhou,
Yu Mei,
Hui Xiong
Abstract:
Accurate traffic forecasting at intersections governed by intelligent traffic signals is critical for the advancement of an effective intelligent traffic signal control system. However, due to the irregular traffic time series produced by intelligent intersections, the traffic forecasting task becomes much more intractable and imposes three major new challenges: 1) asynchronous spatial dependency, 2) irregular temporal dependency among traffic data, and 3) variable-length sequence to be predicted, which severely impede the performance of current traffic forecasting methods. To this end, we propose an Asynchronous Spatio-tEmporal graph convolutional nEtwoRk (ASeer) to predict the traffic states of the lanes entering intelligent intersections in a future time window. Specifically, by linking lanes via a traffic diffusion graph, we first propose an Asynchronous Graph Diffusion Network to model the asynchronous spatial dependency between the time-misaligned traffic state measurements of lanes. After that, to capture the temporal dependency within irregular traffic state sequence, a learnable personalized time encoding is devised to embed the continuous time for each lane. Then we propose a Transformable Time-aware Convolution Network that learns meta-filters to derive time-aware convolution filters with transformable filter sizes for efficient temporal convolution on the irregular sequence. Furthermore, a Semi-Autoregressive Prediction Network consisting of a state evolution unit and a semi-autoregressive predictor is designed to effectively and efficiently predict variable-length traffic state sequences. Extensive experiments on two real-world datasets demonstrate the effectiveness of ASeer in six metrics.
Submitted 1 September, 2023; v1 submitted 31 August, 2023;
originally announced August 2023.
-
Understanding the Benefits and Challenges of Using Large Language Model-based Conversational Agents for Mental Well-being Support
Authors:
Zilin Ma,
Yiyang Mei,
Zhaoyuan Su
Abstract:
Conversational agents powered by large language models (LLM) have increasingly been utilized in the realm of mental well-being support. However, the implications and outcomes associated with their usage in such a critical field remain somewhat ambiguous and unexplored. We conducted a qualitative analysis of 120 posts, encompassing 2917 user comments, drawn from the most popular subreddit focused on mental health support applications powered by large language models (u/Replika). This exploration aimed to shed light on the advantages and potential pitfalls associated with the integration of these sophisticated models in conversational agents intended for mental health support. We found the app (Replika) beneficial in offering on-demand, non-judgmental support, boosting user confidence, and aiding self-discovery. Yet, it faced challenges in filtering harmful content, sustaining consistent communication, remembering new information, and mitigating users' overdependence. The stigma attached further risked isolating users socially. We strongly assert that future researchers and designers must thoroughly evaluate the appropriateness of employing LLMs for mental well-being support, ensuring their responsible and effective application.
Submitted 28 July, 2023;
originally announced July 2023.
-
Quarl: A Learning-Based Quantum Circuit Optimizer
Authors:
Zikun Li,
Jinjun Peng,
Yixuan Mei,
Sina Lin,
Yi Wu,
Oded Padon,
Zhihao Jia
Abstract:
Optimizing quantum circuits is challenging due to the very large search space of functionally equivalent circuits and the necessity of applying transformations that temporarily decrease performance to achieve a final performance improvement. This paper presents Quarl, a learning-based quantum circuit optimizer. Applying reinforcement learning (RL) to quantum circuit optimization raises two main challenges: the large and varying action space and the non-uniform state representation. Quarl addresses these issues with a novel neural architecture and RL-training procedure. Our neural architecture decomposes the action space into two parts and leverages graph neural networks in its state representation, both of which are guided by the intuition that optimization decisions can be mostly guided by local reasoning while allowing global circuit-wide reasoning. Our evaluation shows that Quarl significantly outperforms existing circuit optimizers on almost all benchmark circuits. Surprisingly, Quarl can learn to perform rotation merging, a complex, non-local circuit optimization implemented as a separate pass in existing optimizers.
Submitted 17 July, 2023;
originally announced July 2023.
-
Contextualizing MLP-Mixers Spatiotemporally for Urban Data Forecast at Scale
Authors:
Tong Nie,
Guoyang Qin,
Lijun Sun,
Wei Ma,
Yu Mei,
Jian Sun
Abstract:
Spatiotemporal urban data (STUD) displays complex correlational patterns. Extensive advanced techniques have been designed to capture these patterns for effective forecasting. However, because STUD is often massive in scale, practitioners need to strike a balance between effectiveness and efficiency by choosing computationally efficient models. An alternative paradigm called MLP-Mixer has the potential for both simplicity and effectiveness. Taking inspiration from its success in other domains, we propose an adapted version, named NexuSQN, for STUD forecast at scale. We identify the challenges faced when directly applying MLP-Mixers as series- and window-wise multivaluedness and propose the ST-contextualization to distinguish between spatial and temporal patterns. Experimental results surprisingly demonstrate that MLP-Mixers with ST-contextualization can rival SOTA performance when tested on several urban benchmarks. Furthermore, it was deployed in a collaborative urban congestion project with Baidu, specifically evaluating its ability to forecast traffic states in megacities like Beijing and Shanghai. Our findings contribute to the exploration of simple yet effective models for real-world STUD forecasting.
Submitted 7 February, 2024; v1 submitted 4 July, 2023;
originally announced July 2023.
-
Understanding Client Reactions in Online Mental Health Counseling
Authors:
Anqi Li,
Lizhi Ma,
Yaling Mei,
Hongliang He,
Shuai Zhang,
Huachuan Qiu,
Zhenzhong Lan
Abstract:
Communication success relies heavily on reading participants' reactions. Such feedback is especially important for mental health counselors, who must carefully consider the client's progress and adjust their approach accordingly. However, previous NLP research on counseling has mainly focused on studying counselors' intervention strategies rather than their clients' reactions to the intervention. This work aims to fill this gap by developing a theoretically grounded annotation framework that encompasses counselors' strategies and client reaction behaviors. The framework has been tested against a large-scale, high-quality text-based counseling dataset we collected over the past two years from an online welfare counseling platform. Our study shows how clients react to counselors' strategies, how such reactions affect the final counseling outcomes, and how counselors can adjust their strategies in response to these reactions. We also demonstrate that this study can help counselors automatically predict their clients' states.
Submitted 27 June, 2023;
originally announced June 2023.
-
AccMER: Accelerating Multi-Agent Experience Replay with Cache Locality-aware Prioritization
Authors:
Kailash Gogineni,
Yongsheng Mei,
Peng Wei,
Tian Lan,
Guru Venkataramani
Abstract:
Multi-Agent Experience Replay (MER) is a key component of off-policy reinforcement learning (RL) algorithms. By remembering and reusing experiences from the past, experience replay significantly improves the stability of RL algorithms and their learning efficiency. In many scenarios, multiple agents interact in a shared environment during online training under the centralized training and decentralized execution (CTDE) paradigm. Current multi-agent reinforcement learning (MARL) algorithms consider experience replay with uniform sampling or based on priority weights to improve transition data sample efficiency in the sampling phase. However, moving transition data histories for each agent through the processor memory hierarchy is a performance limiter. Also, as the agents' transitions continuously renew every iteration, the finite cache capacity results in increased cache misses.
To this end, we propose AccMER, which repeatedly reuses the transitions (experiences) for a window of $n$ steps in order to improve the cache locality and minimize the transition data movement, instead of sampling new transitions at each step. Specifically, our optimization uses priority weights to select the transitions so that only high-priority transitions will be reused frequently, thereby improving the cache performance. Our experimental results on the Predator-Prey environment demonstrate the effectiveness of reusing the essential transitions based on the priority weights, where we observe an end-to-end training time reduction of 25.4% (for 32 agents) compared to existing prioritized MER algorithms without notable degradation in the mean reward.
Submitted 31 May, 2023;
originally announced June 2023.
-
Local Optima Correlation Assisted Adaptive Operator Selection
Authors:
Jiyuan Pei,
Hao Tong,
Jialin Liu,
Yi Mei,
Xin Yao
Abstract:
For solving combinatorial optimisation problems with metaheuristics, different search operators are applied for sampling new solutions in the neighbourhood of a given solution. It is important to understand the relationship between operators for various purposes, e.g., adaptively deciding when to use which operator to find optimal solutions efficiently. However, it is difficult to theoretically analyse this relationship, especially in the complex solution space of combinatorial optimisation problems. In this paper, we propose to empirically analyse the relationship between operators in terms of the correlation between their local optima and develop a measure for quantifying their relationship. The comprehensive analyses on a wide range of capacitated vehicle routing problem benchmark instances show that there is a consistent pattern in the correlation between commonly used operators. Based on this newly proposed local optima correlation metric, we propose a novel approach for adaptively selecting among the operators during the search process. The core intention is to improve search efficiency by preventing wasting computational resources on exploring neighbourhoods where the local optima have already been reached. Experiments on randomly generated instances and commonly used benchmark datasets are conducted. Results show that the proposed approach outperforms commonly used adaptive operator selection methods.
Submitted 3 May, 2023;
originally announced May 2023.
-
LightPainter: Interactive Portrait Relighting with Freehand Scribble
Authors:
Yiqun Mei,
He Zhang,
Xuaner Zhang,
Jianming Zhang,
Zhixin Shu,
Yilin Wang,
Zijun Wei,
Shi Yan,
HyunJoon Jung,
Vishal M. Patel
Abstract:
Recent portrait relighting methods have achieved realistic results of portrait lighting effects given a desired lighting representation such as an environment map. However, these methods are not intuitive for user interaction and lack precise lighting control. We introduce LightPainter, a scribble-based relighting system that allows users to interactively manipulate portrait lighting effect with ease. This is achieved by two conditional neural networks, a delighting module that recovers geometry and albedo optionally conditioned on skin tone, and a scribble-based module for relighting. To train the relighting module, we propose a novel scribble simulation procedure to mimic real user scribbles, which allows our pipeline to be trained without any human annotations. We demonstrate high-quality and flexible portrait lighting editing capability with both quantitative and qualitative experiments. User study comparisons with commercial lighting editing tools also demonstrate consistent user preference for our method.
Submitted 22 March, 2023;
originally announced March 2023.
-
MAC-PO: Multi-Agent Experience Replay via Collective Priority Optimization
Authors:
Yongsheng Mei,
Hanhan Zhou,
Tian Lan,
Guru Venkataramani,
Peng Wei
Abstract:
Experience replay is crucial for off-policy reinforcement learning (RL) methods. By remembering and reusing the experiences from past different policies, experience replay significantly improves the training efficiency and stability of RL algorithms. Many decision-making problems in practice naturally involve multiple agents and require multi-agent reinforcement learning (MARL) under the centralized training and decentralized execution paradigm. Nevertheless, existing MARL algorithms often adopt standard experience replay where the transitions are uniformly sampled regardless of their importance. Finding prioritized sampling weights that are optimized for MARL experience replay has yet to be explored. To this end, we propose MAC-PO, which formulates optimal prioritized experience replay for multi-agent problems as a regret minimization over the sampling weights of transitions. Such optimization is relaxed and solved using the Lagrangian multiplier approach to obtain the closed-form optimal sampling weights. By minimizing the resulting policy regret, we can narrow the gap between the current policy and a nominal optimal policy, thus acquiring an improved prioritization scheme for multi-agent tasks. Our experimental results on Predator-Prey and StarCraft Multi-Agent Challenge environments demonstrate the effectiveness of our method, having a better ability to replay important transitions and outperforming other state-of-the-art baselines.
Submitted 27 February, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
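To make the contrast between uniform and prioritized replay in the abstract above concrete, here is a minimal sketch of generic proportional prioritized sampling. It is not MAC-PO's closed-form weighting, which the abstract does not spell out. The function name `sample_prioritized` and the exponent `alpha` are illustrative assumptions.

```python
import random


def sample_prioritized(priorities, batch_size, alpha=0.6, rng=None):
    """Sample transition indices with probability proportional to priority**alpha.

    Generic proportional prioritization (an assumption for illustration),
    not the regret-minimizing weights derived in the paper.
    alpha = 0 recovers uniform sampling; larger alpha sharpens the preference.
    """
    rng = rng or random.Random()
    weights = [p ** alpha for p in priorities]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Draw with replacement according to the normalized sampling weights.
    indices = rng.choices(range(len(priorities)), weights=probs, k=batch_size)
    return indices, probs


if __name__ == "__main__":
    idx, probs = sample_prioritized([1.0, 1.0, 2.0], batch_size=4, alpha=1.0,
                                    rng=random.Random(0))
    print(idx, probs)
```

Under this scheme a transition with zero priority is never replayed, which is one reason practical implementations usually add a small constant to every priority.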
-
ReMIX: Regret Minimization for Monotonic Value Function Factorization in Multiagent Reinforcement Learning
Authors:
Yongsheng Mei,
Hanhan Zhou,
Tian Lan
Abstract:
Value function factorization methods have become a dominant approach for cooperative multiagent reinforcement learning under a centralized training and decentralized execution paradigm. By factorizing the optimal joint action-value function using a monotonic mixing function of agents' utilities, these algorithms ensure the consistency between joint and local action selections for decentralized decision-making. Nevertheless, the use of monotonic mixing functions also induces representational limitations. Finding the optimal projection of an unrestricted mixing function onto monotonic function classes is still an open problem. To this end, we propose ReMIX, formulating this optimal projection problem for value function factorization as a regret minimization over the projection weights of different state-action values. Such an optimization problem can be relaxed and solved using the Lagrangian multiplier method to obtain the closed-form optimal projection weights. By minimizing the resulting policy regret, we can narrow the gap between the optimal and the restricted monotonic mixing functions, thus obtaining an improved monotonic value function factorization. Our experimental results on Predator-Prey and StarCraft Multiagent Challenge environments demonstrate the effectiveness of our method, indicating a better capability to handle environments with non-monotonic value functions.
Submitted 10 February, 2023;
originally announced February 2023.
-
Exploiting Partial Common Information Microstructure for Multi-Modal Brain Tumor Segmentation
Authors:
Yongsheng Mei,
Guru Venkataramani,
Tian Lan
Abstract:
Learning with multiple modalities is crucial for automated brain tumor segmentation from magnetic resonance imaging data. Explicitly optimizing the common information shared among all modalities (e.g., by maximizing the total correlation) has been shown to achieve better feature representations and thus enhance the segmentation performance. However, existing approaches are oblivious to partial common information shared by subsets of the modalities. In this paper, we show that identifying such partial common information can significantly boost the discriminative power of image segmentation models. In particular, we introduce a novel concept of partial common information mask (PCI-mask) to provide a fine-grained characterization of what partial common information is shared by which subsets of the modalities. By solving a masked correlation maximization and simultaneously learning an optimal PCI-mask, we identify the latent microstructure of partial common information and leverage it in a self-attention module to selectively weight different feature representations in multi-modal data. We implement our proposed framework on the standard U-Net. Our experimental results on the Multi-modal Brain Tumor Segmentation Challenge (BraTS) datasets outperform those of state-of-the-art segmentation baselines, with validation Dice similarity coefficients of 0.920, 0.897, 0.837 for the whole tumor, tumor core, and enhancing tumor on BraTS-2020.
Submitted 14 July, 2023; v1 submitted 5 February, 2023;
originally announced February 2023.
-
oneDNN Graph Compiler: A Hybrid Approach for High-Performance Deep Learning Compilation
Authors:
Jianhui Li,
Zhennan Qin,
Yijie Mei,
Jingze Cui,
Yunfei Song,
Ciyong Chen,
Yifei Zhang,
Longsheng Du,
Xianhang Cheng,
Baihui Jin,
Yan Zhang,
Jason Ye,
Eric Lin,
Dan Lavery
Abstract:
With the rapid development of deep learning models and hardware support for dense computing, the deep learning workload characteristics changed significantly from a few hot spots on compute-intensive operations to a broad range of operations scattered across the models. Accelerating a few compute-intensive operations using the expert-tuned implementation of primitives does not fully exploit the performance potential of AI hardware. Various efforts have been made to compile a full deep neural network (DNN) graph. One of the biggest challenges is to achieve high-performance tensor compilation by generating expert level performance code for the dense compute-intensive operations and applying compilation optimization at the scope of DNN computation graph across multiple compute-intensive operations.
We present oneDNN Graph Compiler, a tensor compiler that employs a hybrid approach of using techniques from both compiler optimization and expert-tuned kernels for high performance code generation of the deep neural network graph. oneDNN Graph Compiler addresses unique optimization challenges in the deep learning domain, such as low-precision computation, aggressive fusion of graph operations, optimization for static tensor shapes and memory layout, constant weight optimization, and memory buffer reuse. Experimental results demonstrate significant performance gains over existing tensor compiler and primitives library for performance-critical DNN computation graphs and end-to-end models on Intel Xeon Scalable Processors.
Submitted 11 March, 2024; v1 submitted 3 January, 2023;
originally announced January 2023.
-
Covariance Estimators for the ROOT-SGD Algorithm in Online Learning
Authors:
Yiling Luo,
Xiaoming Huo,
Yajun Mei
Abstract:
Online learning naturally arises in many statistical and machine learning problems. The most widely used methods in online learning are stochastic first-order algorithms. Among this family of algorithms, there is a recently developed algorithm, Recursive One-Over-T SGD (ROOT-SGD). ROOT-SGD is advantageous in that it converges at a non-asymptotically fast rate, and its estimator further converges to a normal distribution. However, this normal distribution has unknown asymptotic covariance and thus cannot be directly applied to measure the uncertainty. To fill this gap, we develop two estimators for the asymptotic covariance of ROOT-SGD. Our covariance estimators are useful for statistical inference in ROOT-SGD. Our first estimator adopts the idea of plug-in. For each unknown component in the formula of the asymptotic covariance, we substitute it with its empirical counterpart. The plug-in estimator converges at the rate $\mathcal{O}(1/\sqrt{t})$, where $t$ is the sample size. Despite its quick convergence, the plug-in estimator has the limitation that it relies on the Hessian of the loss function, which might be unavailable in some cases. Our second estimator is a Hessian-free estimator that overcomes the aforementioned limitation. The Hessian-free estimator uses the random-scaling technique, and we show that it is an asymptotically consistent estimator of the true covariance.
Submitted 2 December, 2022;
originally announced December 2022.
-
Canal: A Flexible Interconnect Generator for Coarse-Grained Reconfigurable Arrays
Authors:
Jackson Melchert,
Keyi Zhang,
Yuchen Mei,
Mark Horowitz,
Christopher Torng,
Priyanka Raina
Abstract:
The architecture of a coarse-grained reconfigurable array (CGRA) interconnect has a significant effect on not only the flexibility of the resulting accelerator, but also its power, performance, and area. Design decisions that have complex trade-offs need to be explored to maintain efficiency and performance across a variety of evolving applications. This paper presents Canal, a Python-embedded domain-specific language (eDSL) and compiler for specifying and generating reconfigurable interconnects for CGRAs. Canal uses a graph-based intermediate representation (IR) that allows for easy hardware generation and tight integration with place and route tools. We evaluate Canal by constructing both a fully static interconnect and a hybrid interconnect with ready-valid signaling, and by conducting design space exploration of the interconnect architecture by modifying the switch box topology, the number of routing tracks, and the interconnect tile connections. Through the use of a graph-based IR for CGRA interconnects, the eDSL, and the interconnect generation system, Canal enables fast design space exploration and creation of CGRA interconnects.
Submitted 30 November, 2022;
originally announced November 2022.
-
Cascade: An Application Pipelining Toolkit for Coarse-Grained Reconfigurable Arrays
Authors:
Jackson Melchert,
Yuchen Mei,
Kalhan Koul,
Qiaoyi Liu,
Mark Horowitz,
Priyanka Raina
Abstract:
While coarse-grained reconfigurable arrays (CGRAs) have emerged as promising programmable accelerator architectures, pipelining applications running on CGRAs is required to ensure high maximum clock frequencies. Current CGRA compilers either lack pipelining techniques resulting in low performance or perform exhaustive pipelining resulting in high energy and resource consumption. We introduce Cascade, an application pipelining toolkit for CGRAs, including a CGRA application frequency model, automated pipelining techniques for CGRA application compilers that work with both dense and sparse applications, and hardware optimizations for improving application frequency. Cascade enables 7 - 34x lower critical path delays and 7 - 190x lower EDP across a variety of dense image processing and machine learning workloads, and 2 - 4.4x lower critical path delays and 1.5 - 4.2x lower EDP on sparse workloads, compared to a compiler without pipelining.
Submitted 23 November, 2022;
originally announced November 2022.
-
A Bayesian Optimization Framework for Finding Local Optima in Expensive Multi-Modal Functions
Authors:
Yongsheng Mei,
Tian Lan,
Mahdi Imani,
Suresh Subramaniam
Abstract:
Bayesian optimization (BO) is a popular global optimization scheme for sample-efficient optimization in domains with expensive function evaluations. The existing BO techniques are capable of finding a single global optimum solution. However, finding a set of global and local optimum solutions is crucial in a wide range of real-world problems, as implementing some of the optimal solutions might not be feasible due to various practical restrictions (e.g., resource limitation, physical constraints, etc.). In such domains, if multiple solutions are known, the implementation can be quickly switched to another solution, and the best possible system performance can still be obtained. This paper develops a multimodal BO framework to effectively find a set of local/global solutions for expensive-to-evaluate multimodal objective functions. We consider the standard BO setting with Gaussian process regression representing the objective function. We analytically derive the joint distribution of the objective function and its first-order derivatives. This joint distribution is used in the body of the BO acquisition functions to search for local optima during the optimization process. We introduce variants of the well-known BO acquisition functions to the multimodal setting and demonstrate the performance of the proposed framework in locating a set of local optimum solutions using multiple optimization problems.
Submitted 5 August, 2023; v1 submitted 12 October, 2022;
originally announced October 2022.
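The joint distribution of the objective and its first-order derivatives mentioned in the abstract above rests on a standard Gaussian process identity, sketched here under assumptions the abstract does not state: a zero-mean prior $f \sim \mathcal{GP}(0, k)$ with a twice-differentiable kernel $k$. The paper's own derivation, which presumably conditions on observed data, may differ in form.

```latex
\begin{pmatrix} f(x) \\ \nabla f(x) \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} 0 \\ \mathbf{0} \end{pmatrix},
\begin{pmatrix}
k(x, x) & \nabla_{x'} k(x, x')^{\top}\big|_{x'=x} \\
\nabla_{x} k(x, x')\big|_{x'=x} & \nabla_{x} \nabla_{x'}^{\top} k(x, x')\big|_{x'=x}
\end{pmatrix}
\right)
```

For a squared-exponential kernel the off-diagonal blocks vanish at $x' = x$, so the function value and its gradient at the same point are independent under the prior. They become correlated only after conditioning on observations.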
-
Joint Activity Detection and Channel Estimation for Massive IoT Access Based on Millimeter-Wave/Terahertz Multi-Panel Massive MIMO
Authors:
Hanlin Xiu,
Zhen Gao,
Anwen Liao,
Yikun Mei,
Dezhi Zheng,
Shufeng Tan,
Marco Di Renzo,
Lajos Hanzo
Abstract:
The multi-panel array, as a state-of-the-art antenna-in-package technology, is very suitable for millimeter-wave (mmWave)/terahertz (THz) systems, due to its low-cost deployment and scalable configuration. But in the context of nonuniform array structures it leads to intractable signal processing. Based on such an array structure at the base station, this paper investigates a joint active user detection (AUD) and channel estimation (CE) scheme based on compressive sensing (CS) for application to the massive Internet of Things (IoT). Specifically, by exploiting the structured sparsity of mmWave/THz massive IoT access channels, we firstly formulate the multi-panel massive multiple-input multiple-output (mMIMO)-based joint AUD and CE problem as a multiple measurement vector (MMV)-CS problem. Then, we harness the expectation maximization (EM) algorithm to learn the prior parameters (i.e., the noise variance and the sparsity ratio) and an orthogonal approximate message passing (OAMP)-EM-MMV algorithm is developed to solve this problem. Our simulation results verify the improved AUD and CE performance of the proposed scheme compared to conventional CS-based algorithms.
Submitted 11 September, 2022;
originally announced September 2022.
-
Adaptive Resources Allocation CUSUM for Binomial Count Data Monitoring with Application to COVID-19 Hotspot Detection
Authors:
Jiuyun Hu,
Yajun Mei,
Sarah Holte,
Hao Yan
Abstract:
In this paper, we present an efficient statistical method (denoted as "Adaptive Resources Allocation CUSUM") to robustly and efficiently detect the hotspot with limited sampling resources. Our main idea is to combine the multi-arm bandit (MAB) and change-point detection methods to balance the exploration and exploitation of resource allocation for hotspot detection. Further, a Bayesian weighted update is used to update the posterior distribution of the infection rate. Then, the upper confidence bound (UCB) is used for resource allocation and planning. Finally, CUSUM monitoring statistics are used to detect the change point as well as the change location. For performance evaluation, we compare the performance of the proposed method with several benchmark methods in the literature and show that the proposed algorithm is able to achieve a lower detection delay and higher detection precision. Finally, this method is applied to hotspot detection in a real case study of county-level daily positive COVID-19 cases in Washington State (WA) and demonstrates the effectiveness with very limited distributed samples.
Submitted 17 August, 2022; v1 submitted 9 August, 2022;
originally announced August 2022.
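The CUSUM monitoring step named in the abstract above can be illustrated with a minimal one-sided CUSUM for binomial counts at a single location. This is a textbook sketch, not the paper's adaptive method: it omits the bandit-based resource allocation and the Bayesian update. The function name `binomial_cusum`, the assumed out-of-control rate `p1`, and the threshold value are illustrative assumptions.

```python
import math


def binomial_cusum(counts, trials, p0, p1, threshold):
    """Return the first index where the CUSUM statistic exceeds `threshold`.

    counts[t] positives out of trials[t] samples; p0 is the in-control rate
    and p1 > p0 is an assumed out-of-control rate. Returns None if no alarm
    is raised.
    """
    # Log-likelihood-ratio weights; the binomial coefficient cancels out.
    a = math.log(p1 / p0)
    b = math.log((1.0 - p1) / (1.0 - p0))
    stat = 0.0
    for t, (x, n) in enumerate(zip(counts, trials)):
        llr = x * a + (n - x) * b
        # Reset at zero so evidence against a change does not accumulate.
        stat = max(0.0, stat + llr)
        if stat > threshold:
            return t
    return None


if __name__ == "__main__":
    counts = [5, 4, 6, 5, 15, 16, 14]  # the rate jumps at index 4
    trials = [100] * len(counts)
    print(binomial_cusum(counts, trials, p0=0.05, p1=0.15, threshold=5.0))
```

In the toy data the positive rate shifts from about 5% to about 15% at index 4, and the statistic crosses the threshold on that first shifted observation.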
-
Massive Access in Extra Large-Scale MIMO with Mixed-ADC over Near Field Channels
Authors:
Yikun Mei,
Zhen Gao,
De Mi,
Mingyu Zhou,
Dezhi Zheng,
Michail Matthaiou,
Pei Xiao,
Robert Schober
Abstract:
Massive connectivity for extra large-scale multi-input multi-output (XL-MIMO) systems is a challenging issue due to the near-field access channels and the prohibitive cost. In this paper, we propose an uplink grant-free massive access scheme for XL-MIMO systems, in which a mixed-analog-to-digital converters (ADC) architecture is adopted to strike the right balance between access performance and power consumption. By exploiting the spatial-domain structured sparsity and the piecewise angular-domain cluster sparsity of massive access channels, a compressive sensing (CS)-based two-stage orthogonal approximate message passing algorithm is proposed to efficiently solve the joint activity detection and channel estimation problem. Particularly, high-precision quantized measurements are leveraged to perform accurate hyper-parameter estimation, thereby facilitating the activity detection. Moreover, we adopt a subarray-wise estimation strategy to overcome the severe angular-domain energy dispersion problem which is caused by the near-field effect in XL-MIMO channels. Simulation results verify the superiority of our proposed algorithm over state-of-the-art CS algorithms for massive access based on XL-MIMO with mixed-ADC architectures.
Submitted 3 April, 2023; v1 submitted 5 July, 2022;
originally announced July 2022.
-
The Directional Bias Helps Stochastic Gradient Descent to Generalize in Kernel Regression Models
Authors:
Yiling Luo,
Xiaoming Huo,
Yajun Mei
Abstract:
We study the Stochastic Gradient Descent (SGD) algorithm in nonparametric statistics: kernel regression in particular. The directional bias property of SGD, which is known in the linear regression setting, is generalized to the kernel regression. More specifically, we prove that SGD with moderate and annealing step-size converges along the direction of the eigenvector that corresponds to the largest eigenvalue of the Gram matrix. In addition, the Gradient Descent (GD) with a moderate or small step-size converges along the direction that corresponds to the smallest eigenvalue. These facts are referred to as the directional bias properties; they may interpret how an SGD-computed estimator has a potentially smaller generalization error than a GD-computed estimator. The application of our theory is demonstrated by simulation studies and a case study that is based on the FashionMNIST dataset.
Submitted 29 April, 2022;
originally announced May 2022.