-
Translatotron-V(ision): An End-to-End Model for In-Image Machine Translation
Authors:
Zhibin Lan,
Liqiang Niu,
Fandong Meng,
Jie Zhou,
Min Zhang,
Jinsong Su
Abstract:
In-image machine translation (IIMT) aims to translate an image containing texts in source language into an image containing translations in target language. In this regard, conventional cascaded methods suffer from issues such as error propagation, massive parameters, and difficulties in deployment and retaining visual characteristics of the input image. Thus, constructing end-to-end models has become an option, which, however, faces two main challenges: 1) the huge modeling burden, as it is required to simultaneously learn alignment across languages and preserve the visual characteristics of the input image; 2) the difficulties of directly predicting excessively lengthy pixel sequences. In this paper, we propose Translatotron-V(ision), an end-to-end IIMT model consisting of four modules. In addition to an image encoder and an image decoder, our model contains a target text decoder and an image tokenizer. Among them, the target text decoder is used to alleviate the language alignment burden, and the image tokenizer converts long sequences of pixels into shorter sequences of visual tokens, preventing the model from focusing on low-level visual features. Besides, we present a two-stage training framework for our model to assist the model in learning alignment across modalities and languages. Finally, we propose a location-aware evaluation metric called Structure-BLEU to assess the translation quality of the generated images. Experimental results demonstrate that our model achieves competitive performance compared to cascaded models with only 70.9% of parameters, and significantly outperforms the pixel-level end-to-end IIMT model.
Submitted 3 July, 2024;
originally announced July 2024.
-
Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection
Authors:
Zixing Li,
Chao Yan,
Zhen Lan,
Dengqing Tang,
Xiaojia Xiang,
Han Zhou,
Jun Lai
Abstract:
Advanced cognition can be extracted from the human brain using brain-computer interfaces. Integrating these interfaces with computer vision techniques, which possess efficient feature extraction capabilities, can achieve more robust and accurate detection of dim targets in aerial images. However, existing target detection methods primarily concentrate on homogeneous data, lacking efficient and versatile processing capabilities for heterogeneous multimodal data. In this paper, we first build a brain-eye-computer based object detection system for aerial images under few-shot conditions. This system detects suspicious targets using region proposal networks, evokes the event-related potential (ERP) signal in electroencephalogram (EEG) through the eye-tracking-based slow serial visual presentation (ESSVP) paradigm, and constructs the EEG-image data pairs with eye movement data. Then, an adaptive modality balanced online knowledge distillation (AMBOKD) method is proposed to recognize dim objects with the EEG-image data. AMBOKD fuses EEG and image features using a multi-head attention module, establishing a new modality with comprehensive features. To enhance the performance and robust capability of the fusion modality, simultaneous training and mutual learning between modalities are enabled by end-to-end online knowledge distillation. During the learning process, an adaptive modality balancing module is proposed to ensure multimodal equilibrium by dynamically adjusting the weights of the importance and the training gradients across various modalities. The effectiveness and superiority of our method are demonstrated by comparing it with existing state-of-the-art methods. Additionally, experiments conducted on public datasets and system validations in real-world scenarios demonstrate the reliability and practicality of the proposed system and the designed method.
Submitted 1 July, 2024;
originally announced July 2024.
-
LASSI: An LLM-based Automated Self-Correcting Pipeline for Translating Parallel Scientific Codes
Authors:
Matthew T. Dearing,
Yiheng Tao,
Xingfu Wu,
Zhiling Lan,
Valerie Taylor
Abstract:
This paper addresses the problem of providing a novel approach to sourcing significant training data for LLMs focused on science and engineering. In particular, a crucial challenge is sourcing parallel scientific codes in the ranges of millions to billions of codes. To tackle this problem, we propose an automated pipeline framework, called LASSI, designed to translate between parallel programming languages by bootstrapping existing closed- or open-source LLMs. LASSI incorporates autonomous enhancement through self-correcting loops where errors encountered during compilation and execution of generated code are fed back to the LLM through guided prompting for debugging and refactoring. We highlight the bi-directional translation of existing GPU benchmarks between OpenMP target offload and CUDA to validate LASSI.
The results of evaluating LASSI with different application codes across four LLMs demonstrate the effectiveness of LASSI for generating executable parallel codes, with 80% of OpenMP to CUDA translations and 85% of CUDA to OpenMP translations producing the expected output. We also observe approximately 78% of OpenMP to CUDA translations and 62% of CUDA to OpenMP translations execute within 10% of or at a faster runtime than the original benchmark code in the same language.
Submitted 30 June, 2024;
originally announced July 2024.
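The self-correcting loop described in the LASSI abstract above (generate code, compile and run it, feed any error back to the LLM) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: `self_correcting_translate`, `RunResult`, the `translate` and `build_and_run` callbacks, and the `max_attempts` limit are all hypothetical names introduced here, and the toy stand-ins at the bottom replace a real LLM and compiler.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunResult:
    ok: bool          # True when the code built, ran, and gave the expected output
    error: str = ""   # compiler or runtime message to feed back on failure


def self_correcting_translate(
    source_code: str,
    translate: Callable[[str, Optional[str], Optional[str]], str],
    build_and_run: Callable[[str], RunResult],
    max_attempts: int = 5,  # hypothetical retry budget
) -> Optional[str]:
    """Return a candidate that passes build_and_run, or None if attempts run out."""
    candidate: Optional[str] = None
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        # The previous candidate and its error message guide the next attempt.
        candidate = translate(source_code, candidate, feedback)
        result = build_and_run(candidate)
        if result.ok:
            return candidate
        feedback = result.error
    return None


# Toy stand-ins so the sketch runs without an LLM or a compiler.
def fake_llm(source, previous, feedback):
    # The first attempt omits a semicolon; the retry adds it once an error is seen.
    return source + ";" if feedback else source


def fake_build(code):
    if code.endswith(";"):
        return RunResult(ok=True)
    return RunResult(ok=False, error="expected ';'")


translated = self_correcting_translate("x = 1", fake_llm, fake_build)
```

In a real setting, `build_and_run` would wrap the target compiler and an output comparison, and `translate` would build a prompt from the source code plus the latest error message.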
-
Predicting the Big Five Personality Traits in Chinese Counselling Dialogues Using Large Language Models
Authors:
Yang Yan,
Lizhi Ma,
Anqi Li,
Jingsong Ma,
Zhenzhong Lan
Abstract:
Accurate assessment of personality traits is crucial for effective psycho-counseling, yet traditional methods like self-report questionnaires are time-consuming and biased. This study examines whether Large Language Models (LLMs) can predict the Big Five personality traits directly from counseling dialogues and introduces an innovative framework to perform the task. Our framework applies role-play and questionnaire-based prompting to condition LLMs on counseling sessions, simulating client responses to the Big Five Inventory. We evaluated our framework on 853 real-world counseling sessions, finding a significant correlation between LLM-predicted and actual Big Five traits, proving the validity of the framework. Moreover, ablation studies highlight the importance of role-play simulations and task simplification via questionnaires in enhancing prediction accuracy. Meanwhile, our fine-tuned Llama3-8B model, utilizing Direct Preference Optimization with Supervised Fine-Tuning, achieves a 130.95% improvement, surpassing the state-of-the-art Qwen1.5-110B by 36.94% in personality prediction validity. In conclusion, LLMs can predict personality based on counseling dialogues. Our code and model are publicly available at https://github.com/kuri-leo/BigFive-LLM-Predictor, providing a valuable tool for future research in computational psychometrics.
Submitted 25 June, 2024;
originally announced June 2024.
-
Modeling and Analysis of Application Interference on Dragonfly+
Authors:
Yao Kang,
Xin Wang,
Neil McGlohon,
Misbah Mubarak,
Sudheer Chunduri,
Zhiling Lan
Abstract:
The Dragonfly class of networks is considered a promising interconnect for next-generation supercomputers. While Dragonfly+ networks offer more path diversity than the original Dragonfly design, they are still prone to performance variability due to their hierarchical architecture and resource sharing design. Event-driven network simulators are indispensable tools for navigating complex system design. In this study, we quantitatively evaluate a variety of application communication interactions on a 3,456-node Dragonfly+ system by using the CODES toolkit. This study looks at the impact of communication interference from a user's perspective. Specifically, for a given application submitted by a user, we examine how this application will behave with the existing workload running in the system under different job placement policies. Our simulation study considers hundreds of experiment configurations including four target applications with representative communication patterns under a variety of network traffic conditions. Our study shows that intra-job interference can cause severe performance degradation for communication-intensive applications. Inter-job interference can generally be reduced for applications with one-to-one or one-to-many communication patterns through job isolation. Applications with a one-to-all communication pattern are resilient to network interference.
Submitted 21 June, 2024;
originally announced June 2024.
-
Unveiling the Impact of Multi-Modal Interactions on User Engagement: A Comprehensive Evaluation in AI-driven Conversations
Authors:
Lichao Zhang,
Jia Yu,
Shuai Zhang,
Long Li,
Yangyang Zhong,
Guanbao Liang,
Yuming Yan,
Qing Ma,
Fangsheng Weng,
Fayu Pan,
Jing Li,
Renjun Xu,
Zhenzhong Lan
Abstract:
Large Language Models (LLMs) have significantly advanced user-bot interactions, enabling more complex and coherent dialogues. However, the prevalent text-only modality might not fully exploit the potential for effective user engagement. This paper explores the impact of multi-modal interactions, which incorporate images and audio alongside text, on user engagement in chatbot conversations. We conduct a comprehensive analysis using a diverse set of chatbots and real-user interaction data, employing metrics such as retention rate and conversation length to evaluate user engagement. Our findings reveal a significant enhancement in user engagement with multi-modal interactions compared to text-only dialogues. Notably, the incorporation of a third modality significantly amplifies engagement beyond the benefits observed with just two modalities. These results suggest that multi-modal interactions optimize cognitive processing and facilitate richer information comprehension. This study underscores the importance of multi-modality in chatbot design, offering valuable insights for creating more engaging and immersive AI communication experiences and informing the broader AI community about the benefits of multi-modal interactions in enhancing user engagement.
Submitted 21 June, 2024;
originally announced June 2024.
-
QCDGE database, Quantum Chemistry Database with Ground- and Excited-state Properties of 450 Kilo Molecules
Authors:
Yifei Zhu,
Mengge Li,
Chao Xu,
Zhenggang Lan
Abstract:
Due to rapid advancements in deep learning techniques, the demand for large-volume high-quality databases grows significantly in chemical research. We developed a quantum-chemistry database that includes 443,106 small organic molecules with sizes up to 10 heavy atoms including carbon (C), nitrogen (N), oxygen (O), and fluorine (F). Ground-state geometry optimizations and frequency calculations of all compounds were performed at the B3LYP/6-31G* level with the BJD3 dispersion correction, while the excited-state single-point calculations were conducted at the $ω$B97X-D/6-31G* level. In total, twenty-seven molecular properties, such as geometric, thermodynamic, electronic and energetic properties, were gathered from these calculations. Meanwhile, we also established a comprehensive protocol for the construction of a high-volume quantum-chemistry database. Our QCDGE (Quantum Chemistry Database with Ground- and Excited-State Properties) database contains a substantial volume of data, exhibits high chemical diversity, and most importantly includes excited-state information. This database, along with its construction protocol, is expected to have a significant impact on the broad applications of machine learning studies across different fields of chemistry, especially in the area of excited-state research.
Submitted 4 June, 2024;
originally announced June 2024.
-
Coherent Control of Spontaneous Emission for a giant driven $Λ$-type three-level atom
Authors:
Yang ya,
Sun ge,
Li Jing,
Lu Jing,
Zhou lan
Abstract:
Quantum optics with giant atoms provides a new approach for implementing optical memory devices at the atomic scale. Here, we theoretically study the relaxation dynamics of a single driven three-level atom interacting with a one-dimensional waveguide via two coupling points. Under certain conditions, after the long-time dynamics, we find that the population of the giant atom can either maintain stable values or exhibit regular periodic oscillations, while photons can be trapped in the region of the giant atom. This phenomenon is not achievable using a two-level atom with two legs. It is worth noting that the atomic excitation probability of a stable bound state is a constant value, which is determined by the size of the atom. Crucially, the size of the atom (the distance between the two coupling points) is much larger than the wavelength of the light field, which is a necessary condition for the existence of oscillating bound states.
Submitted 30 May, 2024;
originally announced May 2024.
-
FocSAM: Delving Deeply into Focused Objects in Segmenting Anything
Authors:
You Huang,
Zongyu Lan,
Liujuan Cao,
Xianming Lin,
Shengchuan Zhang,
Guannan Jiang,
Rongrong Ji
Abstract:
The Segment Anything Model (SAM) marks a notable milestone in segmentation models, highlighted by its robust zero-shot capabilities and ability to handle diverse prompts. SAM follows a pipeline that separates interactive segmentation into image preprocessing through a large encoder and interactive inference via a lightweight decoder, ensuring efficient real-time performance. However, SAM faces stability issues in challenging samples upon this pipeline. These issues arise from two main factors. Firstly, the image preprocessing disables SAM from dynamically using image-level zoom-in strategies to refocus on the target object during interaction. Secondly, the lightweight decoder struggles to sufficiently integrate interactive information with image embeddings. To address these two limitations, we propose FocSAM with a pipeline redesigned on two pivotal aspects. First, we propose Dynamic Window Multi-head Self-Attention (Dwin-MSA) to dynamically refocus SAM's image embeddings on the target object. Dwin-MSA localizes attention computations around the target object, enhancing object-related embeddings with minimal computational overhead. Second, we propose Pixel-wise Dynamic ReLU (P-DyReLU) to enable sufficient integration of interactive information from a few initial clicks that have significant impacts on the overall segmentation results. Experimentally, FocSAM augments SAM's interactive segmentation performance to match the existing state-of-the-art method in segmentation quality, requiring only about 5.6% of this method's inference time on CPUs.
Submitted 28 May, 2024;
originally announced May 2024.
-
A Survey on Multi-modal Machine Translation: Tasks, Methods and Challenges
Authors:
Huangjun Shen,
Liangying Shao,
Wenbo Li,
Zhibin Lan,
Zhanyu Liu,
Jinsong Su
Abstract:
In recent years, multi-modal machine translation has attracted significant interest in both academia and industry due to its superior performance. It takes both textual and visual modalities as inputs, leveraging visual context to tackle the ambiguities in source texts. In this paper, we begin by offering an exhaustive overview of 99 prior works, comprehensively summarizing representative studies from the perspectives of dominant models, datasets, and evaluation metrics. Afterwards, we analyze the impact of various factors on model performance and finally discuss the possible research directions for this task in the future. Over time, multi-modal machine translation has developed more types to meet diverse needs. Unlike previous surveys confined to the early stage of multi-modal machine translation, our survey thoroughly summarizes these emerging types from different aspects, so as to provide researchers with a better understanding of its current state.
Submitted 22 May, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
-
Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models
Authors:
Zhengxing Lan,
Hongbo Li,
Lingshan Liu,
Bo Fan,
Yisheng Lv,
Yilong Ren,
Zhiyong Cui
Abstract:
Predicting the future trajectories of dynamic traffic actors is a cornerstone task in autonomous driving. Though existing notable efforts have resulted in impressive performance improvements, a gap persists in scene cognition and understanding of complex traffic semantics. This paper proposes Traj-LLM, the first to investigate the potential of using Large Language Models (LLMs) without explicit prompt engineering to generate future motion from agents' past/observed trajectories and scene semantics. Traj-LLM starts with sparse context joint coding to dissect the agent and scene features into a form that LLMs understand. On this basis, we innovatively explore LLMs' powerful comprehension abilities to capture a spectrum of high-level scene knowledge and interactive information. Emulating the human-like lane focus cognitive function and enhancing Traj-LLM's scene comprehension, we introduce lane-aware probabilistic learning powered by the pioneering Mamba module. Finally, a multi-modal Laplace decoder is designed to achieve scene-compliant multi-modal predictions. Extensive experiments show that Traj-LLM, fortified by LLMs' strong prior knowledge and understanding prowess, together with lane-aware probability learning, outstrips state-of-the-art methods across evaluation metrics. Moreover, the few-shot analysis further substantiates Traj-LLM's performance, wherein with just 50% of the dataset, it outperforms the majority of benchmarks relying on complete data utilization. This study explores equipping the trajectory prediction task with advanced capabilities inherent in LLMs, furnishing a more universal and adaptable solution for forecasting agent motion in a new way.
Submitted 8 May, 2024;
originally announced May 2024.
-
Integrating Mamba and Transformer for Long-Short Range Time Series Forecasting
Authors:
Xiongxiao Xu,
Yueqing Liang,
Baixiang Huang,
Zhiling Lan,
Kai Shu
Abstract:
Time series forecasting is an important problem and plays a key role in a variety of applications including weather forecasting, stock market, and scientific simulations. Although transformers have proven to be effective in capturing dependency, the quadratic complexity of their attention mechanism prevents their further adoption in long-range time series forecasting, thus limiting them to short-range dependencies. Recent progress on state space models (SSMs) has shown impressive performance on modeling long-range dependency due to their subquadratic complexity. Mamba, as a representative SSM, enjoys linear time complexity and has achieved strong scalability on tasks that require scaling to long sequences, such as language, audio, and genomics. In this paper, we propose a hybrid framework, Mambaformer, that internally combines Mamba for long-range dependency and Transformer for short-range dependency, for long-short range forecasting. To the best of our knowledge, this is the first paper to combine Mamba and Transformer architecture in time series data. We investigate possible hybrid architectures to combine Mamba layer and attention layer for long-short range time series forecasting. The comparative study shows that the Mambaformer family can outperform Mamba and Transformer in the long-short range time series forecasting problem. The code is available at https://github.com/XiongxiaoXu/Mambaformerin-Time-Series.
Submitted 23 April, 2024;
originally announced April 2024.
-
No General Code of Ethics for All: Ethical Considerations in Human-bot Psycho-counseling
Authors:
Lizhi Ma,
Tong Zhao,
Huachuan Qiu,
Zhenzhong Lan
Abstract:
The pervasive use of AI applications is increasingly influencing our everyday decisions. However, the ethical challenges associated with AI transcend conventional ethics and single-discipline approaches. In this paper, we propose aspirational ethical principles specifically tailored for human-bot psycho-counseling during an era when AI-powered mental health services are continually emerging. We examined the responses generated by EVA2.0, GPT-3.5, and GPT-4.0 in the context of psycho-counseling and mental health inquiries. Our analysis focused on standard psycho-counseling ethical codes (respect for autonomy, non-maleficence, beneficence, justice, and responsibility) as well as crisis intervention strategies (risk assessment, involvement of emergency services, and referral to human professionals). The results indicate that although there has been progress in adhering to regular ethical codes as large language models (LLMs) evolve, the models' capabilities in handling crisis situations need further improvement. Additionally, we assessed the linguistic quality of the generated responses and found that misleading responses are still produced by the models. Furthermore, the ability of LLMs to encourage individuals to introspect in the psycho-counseling setting remains underdeveloped.
Submitted 22 April, 2024;
originally announced April 2024.
-
Rethink Arbitrary Style Transfer with Transformer and Contrastive Learning
Authors:
Zhanjie Zhang,
Jiakai Sun,
Guangyuan Li,
Lei Zhao,
Quanwei Zhang,
Zehua Lan,
Haolin Yin,
Wei Xing,
Huaizhong Lin,
Zhiwen Zuo
Abstract:
Arbitrary style transfer holds widespread attention in research and boasts numerous practical applications. The existing methods, which either employ cross-attention to incorporate deep style attributes into content attributes or use adaptive normalization to adjust content features, fail to generate high-quality stylized images. In this paper, we introduce an innovative technique to improve the quality of stylized images. Firstly, we propose Style Consistency Instance Normalization (SCIN), a method to refine the alignment between content and style features. In addition, we have developed an Instance-based Contrastive Learning (ICL) approach designed to understand the relationships among various styles, thereby enhancing the quality of the resulting stylized images. Recognizing that VGG networks are more adept at extracting classification features and need to be better suited for capturing style features, we have also introduced the Perception Encoder (PE) to capture style features. Extensive experiments demonstrate that our proposed method generates high-quality stylized images and effectively prevents artifacts compared with the existing state-of-the-art methods.
Submitted 21 April, 2024;
originally announced April 2024.
-
Realizing Laser-driven Deuteron Acceleration with Low Energy Spread via In-situ D$_2$O-deposited Target
Authors:
Tianyun Wei,
Yasunobu Arikawa,
Seyed Reza Mirfayzi,
Yanjun Gu,
Takehito Hayakawa,
Alessio Morace,
Kunioki Mima,
Zechen Lan,
Ryuya Yamada,
Kohei Yamanoi,
Koichi Honda,
Sergei V. Bulanov,
Akifumi Yogo
Abstract:
Generation of quasi-monoenergetic ion pulses by laser-driven acceleration is one of the hot topics in laser plasma physics. In this study, we present a new method for the in-situ deposition of an ultra-thin D$_2$O layer on the surface of an aluminum foil target utilizing a spherical D$_2$O capsule. Employing a 10$^{19}$ W/cm$^2$ laser, we achieve the acceleration of 10.8 MeV deuterons with an energy spread of $Δ$E/E = 4.6% in the most favorable shot. The energy spread depends on the exposure time of the D$_2$O capsule in the vacuum chamber. This method has the potential to extend its applicability to other ion species.
Submitted 1 June, 2024; v1 submitted 11 April, 2024;
originally announced April 2024.
-
Union: An Automatic Workload Manager for Accelerating Network Simulation
Authors:
Xin Wang,
Misbah Mubarak,
Yao Kang,
Robert B. Ross,
Zhiling Lan
Abstract:
With the rapid growth of machine learning applications, the workloads of future HPC systems are anticipated to be a mix of scientific simulation, big data analytics, and machine learning applications. Simulation is a great research vehicle to understand the performance implications of co-running scientific applications with big data and machine learning workloads on large-scale systems. In this paper, we present Union, a workload manager that provides an automatic framework to facilitate hybrid workload simulation in CODES. Furthermore, we use Union, along with CODES, to investigate various hybrid workloads composed of traditional simulation applications and emerging learning applications on two dragonfly systems. The experimental results show that both message latency and communication time are important performance metrics to evaluate network interference. Network interference on HPC applications is more reflected by the message latency variation, whereas ML application performance depends more on the communication time.
Submitted 3 April, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
Q-adaptive: A Multi-Agent Reinforcement Learning Based Routing on Dragonfly Network
Authors:
Yao Kang,
Xin Wang,
Zhiling Lan
Abstract:
High-radix interconnects such as Dragonfly and its variants rely on adaptive routing to balance network traffic for optimum performance. Ideally, adaptive routing attempts to forward packets between minimal and non-minimal paths with the least congestion. In practice, current adaptive routing algorithms estimate routing path congestion based on local information such as output queue occupancy. Using local information to estimate global path congestion is inevitably inaccurate because a router has no precise knowledge of link states a few hops away. This inaccuracy could lead to interconnect congestion. In this study, we present Q-adaptive routing, a multi-agent reinforcement learning routing scheme for Dragonfly systems. Q-adaptive routing enables routers to learn to route autonomously by leveraging advanced reinforcement learning technology. The proposed Q-adaptive routing is highly scalable thanks to its fully distributed nature without using any shared information between routers. Furthermore, a new two-level Q-table is designed for Q-adaptive to make it computationally lightweight and save 50% of router memory usage compared with the previous Q-routing. We implement the proposed Q-adaptive routing in the SST/Merlin simulator. Our evaluation results show that Q-adaptive routing achieves up to 10.5% system throughput improvement and 5.2x average packet latency reduction compared with adaptive routing algorithms. Remarkably, Q-adaptive can even outperform the optimal VALn non-minimal routing under the ADV+1 adversarial traffic pattern with up to 3% system throughput improvement and 75% average packet latency reduction.
Submitted 3 April, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
MRSch: Multi-Resource Scheduling for HPC
Authors:
Boyang Li,
Yuping Fan,
Matthew Dearing,
Zhiling Lan,
Paul Richy,
William Allcocky,
Michael Papka
Abstract:
Emerging workloads in high-performance computing (HPC) are embracing significant changes, such as having diverse resource requirements instead of being CPU-centric. This advancement forces cluster schedulers to consider multiple schedulable resources during decision-making. Existing scheduling studies rely on heuristic or optimization methods, which are limited by an inability to adapt to new scenarios for ensuring long-term scheduling performance. We present an intelligent scheduling agent named MRSch for multi-resource scheduling in HPC that leverages direct future prediction (DFP), an advanced multi-objective reinforcement learning algorithm. While DFP demonstrated outstanding performance in a gaming competition, it has not been previously explored in the context of HPC scheduling. Several key techniques are developed in this study to tackle the challenges involved in multi-resource scheduling. These techniques enable MRSch to learn an appropriate scheduling policy automatically and dynamically adapt its policy in response to workload changes via dynamic resource prioritizing. We compare MRSch with existing scheduling methods through extensive trace-based simulations. Our results demonstrate that MRSch improves scheduling performance by up to 48% compared to the existing scheduling methods.
Submitted 3 April, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling
Authors:
Boyang Li,
Zhiling Lan,
Michael E. Papka
Abstract:
In the field of high-performance computing (HPC), there has been recent exploration into the use of deep reinforcement learning for cluster scheduling (DRL scheduling), which has demonstrated promising outcomes. However, a significant challenge arises from the lack of interpretability in deep neural networks (DNN), rendering them as black-box models to system managers. This lack of model interpretability hinders the practical deployment of DRL scheduling. In this work, we present a framework called IRL (Interpretable Reinforcement Learning) to address the issue of interpretability of DRL scheduling. The core idea is to interpret DNN (i.e., the DRL policy) as a decision tree by utilizing imitation learning. Unlike DNN, decision tree models are non-parametric and easily comprehensible to humans. To extract an effective and efficient decision tree, IRL incorporates the Dataset Aggregation (DAgger) algorithm and introduces the notion of critical state to prune the derived decision tree. Through trace-based experiments, we demonstrate that IRL is capable of converting a black-box DNN policy into an interpretable rule-based decision tree while maintaining comparable scheduling performance. Additionally, IRL can contribute to the setting of rewards in DRL scheduling.
Submitted 24 March, 2024;
originally announced March 2024.
-
Study of Workload Interference with Intelligent Routing on Dragonfly
Authors:
Yao Kang,
Xin Wang,
Zhiling Lan
Abstract:
Dragonfly interconnect is a crucial network technology for supercomputers. To support exascale systems, network resources are shared such that links and routers are not dedicated to any node pair. While link utilization is increased, workload performance is often offset by network contention. Recently, intelligent routing built on reinforcement learning demonstrates higher network throughput with lower packet latency. However, its effectiveness in reducing workload interference is unknown. In this work, we present extensive network simulations to study multi-workload contention under different routing mechanisms, intelligent routing and adaptive routing, on a large-scale Dragonfly system. We develop an enhanced network simulation toolkit, along with a suite of workloads with distinctive communication patterns. We also present two metrics to characterize application communication intensity. Our analysis focuses on examining how different workloads interfere with each other under different routing mechanisms by inspecting both application-level and network-level metrics. Several key insights are made from the analysis.
Submitted 3 April, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models
Authors:
Huachuan Qiu,
Shuai Zhang,
Hongliang He,
Anqi Li,
Zhenzhong Lan
Abstract:
Pornographic content occurring in human-machine interaction dialogues can cause severe side effects for users in open-domain dialogue systems. However, research on detecting pornographic language within human-machine interaction dialogues is an important subject that is rarely studied. To advance in this direction, we introduce CensorChat, a dialogue monitoring dataset aimed at detecting whether the dialogue session contains pornographic content. To this end, we collect real-life human-machine interaction dialogues in the wild and break them down into single utterances and single-turn dialogues, with the last utterance spoken by the chatbot. We propose utilizing knowledge distillation of large language models to annotate the dataset. Specifically, first, the raw dataset is annotated by four open-source large language models, with the majority vote determining the label. Second, we use ChatGPT to update the empty label from the first step. Third, to ensure the quality of the validation and test sets, we utilize GPT-4 for label calibration. If the current label does not match the one generated by GPT-4, we employ a self-criticism strategy to verify its correctness. Finally, to facilitate the detection of pornographic text, we develop a series of text classifiers using a pseudo-labeled dataset. Detailed data analysis demonstrates that leveraging knowledge distillation techniques with large language models provides a practical and cost-efficient method for developing pornographic text detectors.
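The first two annotation steps in this abstract (a majority vote over four annotators, then a fallback annotator for items left without a label) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names `majority_label` and `annotate`, the label strings, and the assumption that an "empty label" means no strict majority (for example a 2-2 tie) are all hypothetical.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def majority_label(votes: Sequence[str]) -> Optional[str]:
    """Return the label backed by a strict majority of votes, else None.

    Assumption: a tie (e.g. 2-2 among four annotators) yields an empty label.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


def annotate(votes: Sequence[str], fallback: Callable[[], str]) -> str:
    """Use the majority label; call the fallback annotator only when it is empty."""
    label = majority_label(votes)
    return label if label is not None else fallback()
```

For example, `annotate(["unsafe", "safe", "unsafe", "safe"], fallback=lambda: "safe")` would defer to the fallback, because neither label has a strict majority. Here the lambda stands in for a call to a stronger model.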
Submitted 19 March, 2024;
originally announced March 2024.
-
Development of neutron beamline for laser-driven neutron resonance spectroscopy
Authors:
Zechen Lan,
Yasunobu Arikawa,
Alessio Morace,
Yuki Abe,
S. Reza Mirfayzi,
Tianyun Wei,
Takehito Hayakawa,
Akifumi Yogo
Abstract:
Recent progress of laser science provides laser-driven neutron source (LDNS), which has remarkable features such as the short pulse width. One of the key techniques to be developed for more efficient use of the LDNS is neutron collimation tubes to increase the number of neutrons arriving at a detector in the time-of-flight method. However, when a tube with a thick wall is used as a collimator the neutron collection efficiency at the detector increases but the time resolution becomes wider because of multiple scattering inside of the tube. In the present study, we have developed a collimation tube made of Ni-0, which is optimized for the increased neutron collection efficiency and a reasonable time resolution. This collimator has been demonstrated experimentally using neutron resonance spectroscopy with neutrons provided from LFEX laser.
Submitted 18 March, 2024;
originally announced March 2024.
-
Spin and Orbital Angular Momenta of Electromagnetic Waves: From Classical to Quantum Forms
Authors:
Wei E. I. Sha,
Zhihao Lan,
Menglin L. N. Chen,
Yongpin P. Chen,
Sheng Sun
Abstract:
Angular momenta of electromagnetic waves are important both in concepts and applications. In this work, we systematically discuss two types of angular momenta, i.e., spin angular momentum and orbital angular momentum in various cases, e.g., with source and without source, in classical and quantum forms. Numerical results demonstrating how to extract the topological charge of a classical vortex beam by spectral method are also presented.
Submitted 3 March, 2024;
originally announced March 2024.
-
Automatic Evaluation for Mental Health Counseling using LLMs
Authors:
Anqi Li,
Yu Lu,
Nirui Song,
Shuai Zhang,
Lizhi Ma,
Zhenzhong Lan
Abstract:
High-quality psychological counseling is crucial for mental health worldwide, and timely evaluation is vital for ensuring its effectiveness. However, obtaining professional evaluation for each counseling session is expensive and challenging. Existing methods that rely on self or third-party manual reports to assess the quality of counseling suffer from subjective biases and are time-consuming.
To address the above challenges, this paper proposes an innovative and efficient automatic approach using large language models (LLMs) to evaluate the working alliance in counseling conversations. We collected a comprehensive counseling dataset and conducted multiple third-party evaluations based on therapeutic relationship theory. Our LLM-based evaluation, combined with our guidelines, shows high agreement with human evaluations and provides valuable insights into counseling scripts. This highlights the potential of LLMs as supervisory tools for psychotherapists. By integrating LLMs into the evaluation process, our approach offers a cost-effective and dependable means of assessing counseling quality, enhancing overall effectiveness.
Submitted 19 February, 2024;
originally announced February 2024.
-
Unveiling the Secrets of Engaging Conversations: Factors that Keep Users Hooked on Role-Playing Dialog Agents
Authors:
Shuai Zhang,
Yu Lu,
Junwen Liu,
Jia Yu,
Huachuan Qiu,
Yuming Yan,
Zhenzhong Lan
Abstract:
With the growing humanlike nature of dialog agents, people are now engaging in extended conversations that can stretch from brief moments to substantial periods of time. Understanding the factors that contribute to sustaining these interactions is crucial, yet existing studies primarily focus on short-term simulations that rarely explore such prolonged and real conversations.
In this paper, we investigate the factors influencing retention rates in real interactions with role-playing models. By analyzing a large dataset of interactions between real users and thousands of characters, we systematically examine multiple factors and assess their impact on user retention rate. Surprisingly, we find that the degree to which the bot embodies the roles it plays has limited influence on retention rates, while the length of each turn it speaks significantly affects retention rates. This study sheds light on the critical aspects of user engagement with role-playing models and provides valuable insights for future improvements in the development of large language models for role-playing purposes.
Submitted 12 March, 2024; v1 submitted 18 February, 2024;
originally announced February 2024.
-
The photodissociation dynamics and ultrafast electron diffraction image of cyclobutanone from the surface hopping dynamics simulation
Authors:
Jiawei Peng,
Hong Liu,
Zhenggang Lan
Abstract:
The comprehension of nonadiabatic dynamics in polyatomic systems relies heavily on the simultaneous advancements in theoretical and experimental domains. The gas-phase electron diffraction (GUED) technique has attracted widespread attention as a promising tool for observing the photochemical and photophysical features at all-atomic level with high temporal and spatial resolutions. In this work, the GUED spectra were predicted to perform a double-blind test of accuracy in excited-state simulation for cyclobutanone based on the trajectory surface hopping method, with respect to the benchmark data obtained by upcoming MeV-GUED experiments at the Stanford Linear Accelerator Laboratory. The results show that the ultrafast nonadiabatic dynamics occurs in the photoinduced dynamics, and two C2 and C3 channels play dominant roles in the nonadiabatic reactions of cyclobutanone. The simulated UED signal can be directly interpreted by atomic movements, providing a unique view to monitor the time-dependent evolution of the molecular structure in the femtosecond dynamics.
Submitted 13 February, 2024;
originally announced February 2024.
-
Retrosynthesis Prediction via Search in (Hyper) Graph
Authors:
Zixun Lan,
Binjie Hong,
Jiajun Zhu,
Zuo Zeng,
Zhenfu Liu,
Limin Yu,
Fei Ma
Abstract:
Predicting reactants from a specified core product stands as a fundamental challenge within organic synthesis, termed retrosynthesis prediction. Recently, semi-template-based methods and graph-edits-based methods have achieved good performance in terms of both interpretability and accuracy. However, due to their mechanisms these methods cannot predict complex reactions, e.g., reactions with multiple reaction centers or attaching the same leaving group to more than one atom. In this study we propose a semi-template-based method, the Retrosynthesis via Search in (Hyper) Graph (RetroSiG) framework, to alleviate these limitations. In the proposed method, we cast the reaction center identification and the leaving group completion tasks as tasks of searching in the product molecular graph and leaving group hypergraph respectively. As a semi-template-based method RetroSiG has several advantages. First, RetroSiG is able to handle the complex reactions mentioned above by its novel search mechanism. Second, RetroSiG naturally exploits the hypergraph to model the implicit dependencies between leaving groups. Third, RetroSiG makes full use of the prior, i.e., one-hop constraint. It reduces the search space and enhances overall performance. Comprehensive experiments demonstrated that RetroSiG achieved competitive results. Furthermore, we conducted experiments to show the capability of RetroSiG in predicting complex reactions. Ablation experiments verified the efficacy of specific elements, such as the one-hop constraint and the leaving group hypergraph.
Submitted 9 February, 2024;
originally announced February 2024.
-
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
Authors:
Hongliang He,
Wenlin Yao,
Kaixin Ma,
Wenhao Yu,
Yong Dai,
Hongming Zhang,
Zhenzhong Lan,
Dong Yu
Abstract:
The rapid advancement of large language models (LLMs) has led to a new era marked by the development of autonomous applications in real-world scenarios, which drives innovation in creating advanced web agents. Existing web agents typically only handle one input modality and are evaluated only in simplified web simulators or static web snapshots, greatly limiting their applicability in real-world scenarios. To bridge this gap, we introduce WebVoyager, an innovative Large Multimodal Model (LMM) powered web agent that can complete user instructions end-to-end by interacting with real-world websites. Moreover, we establish a new benchmark by compiling real-world tasks from 15 popular websites and introduce an automatic evaluation protocol leveraging multimodal understanding abilities of GPT-4V to evaluate open-ended web agents. We show that WebVoyager achieves a 59.1% task success rate on our benchmark, significantly surpassing the performance of both GPT-4 (All Tools) and the WebVoyager (text-only) setups, underscoring the exceptional capability of WebVoyager. The proposed automatic evaluation metric achieves 85.3% agreement with human judgment, indicating its effectiveness in providing reliable and accurate assessments of web agents.
Submitted 6 June, 2024; v1 submitted 24 January, 2024;
originally announced January 2024.
-
AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents
Authors:
Chang Ma,
Junlei Zhang,
Zhihao Zhu,
Cheng Yang,
Yujiu Yang,
Yaohui Jin,
Zhenzhong Lan,
Lingpeng Kong,
Junxian He
Abstract:
Evaluating large language models (LLMs) as general-purpose agents is essential for understanding their capabilities and facilitating their integration into practical applications. However, the evaluation process presents substantial challenges. A primary obstacle is the benchmarking of agent performance across diverse scenarios within a unified framework, especially in maintaining partially-observable environments and ensuring multi-round interactions. Moreover, current evaluation frameworks mostly focus on the final success rate, revealing few insights during the process and failing to provide a deep understanding of the model abilities. To address these challenges, we introduce AgentBoard, a pioneering comprehensive benchmark and accompanied open-source evaluation framework tailored to analytical evaluation of LLM agents. AgentBoard offers a fine-grained progress rate metric that captures incremental advancements as well as a comprehensive evaluation toolkit that features easy assessment of agents for multi-faceted analysis through interactive visualization. This not only sheds light on the capabilities and limitations of LLM agents but also propels the interpretability of their performance to the forefront. Ultimately, AgentBoard serves as a significant step towards demystifying agent behaviors and accelerating the development of stronger LLM agents.
Submitted 23 January, 2024;
originally announced January 2024.
-
Ultrafast Excited-State Energy Transfer in Phenylene Ethynylene Dendrimer: Quantum Dynamics with Tensor Network Method
Authors:
Sisi Liu,
Jiawei Peng,
Peng Bao,
Qiang Shi,
Zhenggang Lan
Abstract:
Photo-induced excited-state energy transfer (EET) processes play an important role in the solar energy conversions. The phenylene ethynylene (PE) dendrimers display great potential in improving the efficiency of solar cells, because of their excellent photo-harvesting and exciton-transport properties. In this work, we investigated the intramolecular EET dynamics in a dendrimer composed of two linear PE units (2-ring and 3-ring) using the full quantum dynamics based on the tensor network method. We first constructed a diabatic model Hamiltonian based on the electronic structure calculations. Using this diabatic vibronic coupling model, we tried to obtain the main features of the EET dynamics in terms of the several diabatic models with different numbers of vibrational modes (from 4 modes to 129 modes) and to explore the corresponding vibronic coupling interactions. The results show that the EET in the current PE dendrimer is an ultrafast process. Four modes with A' symmetry play dominant roles in the dynamics, other 86 modes with A' symmetry can damp the electronic coherence, and the modes of A" symmetry do not show the significant influence on the EET process. Overall, the first-order intrastate vibronic coupling terms show the dominant roles in the EET dynamics, while the second-order intrastate vibronic coupling terms give the visible impact here by damping the electronic coherence and slowing down the overall EET process. This work provides a valuable understanding of the physical insight in the EET dynamics of PE dendrimers.
Submitted 16 January, 2024;
originally announced January 2024.
-
Deep Reinforcement Learning for Quantitative Trading
Authors:
Maochun Xu,
Zixun Lan,
Zheng Tao,
Jiawei Du,
Zongao Ye
Abstract:
Artificial Intelligence (AI) and Machine Learning (ML) are transforming the domain of Quantitative Trading (QT) through the deployment of advanced algorithms capable of sifting through extensive financial datasets to pinpoint lucrative investment openings. AI-driven models, particularly those employing ML techniques such as deep learning and reinforcement learning, have shown great prowess in predicting market trends and executing trades at a speed and accuracy that far surpass human capabilities. Its capacity to automate critical tasks, such as discerning market conditions and executing trading strategies, has been pivotal. However, persistent challenges exist in current QT methods, especially in effectively handling noisy and high-frequency financial data. Striking a balance between exploration and exploitation poses another challenge for AI-driven trading agents. To surmount these hurdles, our proposed solution, QTNet, introduces an adaptive trading model that autonomously formulates QT strategies through an intelligent trading agent. Incorporating deep reinforcement learning (DRL) with imitative learning methodologies, we bolster the proficiency of our model. To tackle the challenges posed by volatile financial datasets, we conceptualize the QT mechanism within the framework of a Partially Observable Markov Decision Process (POMDP). Moreover, by embedding imitative learning, the model can capitalize on traditional trading tactics, nurturing a balanced synergy between discovery and utilization. For a more realistic simulation, our trading agent undergoes training using minute-frequency data sourced from the live financial market. Experimental findings underscore the model's proficiency in extracting robust market features and its adaptability to diverse market conditions.
Submitted 25 December, 2023;
originally announced December 2023.
-
Isotropic gap formation, localization, and waveguiding in mesoscale Yukawa-potential amorphous structures
Authors:
Murat Can Sarihan,
Alperen Govdeli,
Zhihao Lan,
Yildirim Batuhan Yilmaz,
Mertcan Erdil,
Yupei Wang,
Mehmet Sirin Aras,
Cenk Yanik,
Nicolae Coriolan Panoiu,
Chee Wei Wong,
Serdar Kocaman
Abstract:
Amorphous photonic structures are mesoscopic optical structures described by electrical permittivity distributions with underlying spatial randomness. They offer a unique platform for studying a broad set of electromagnetic phenomena, including transverse Anderson localization, enhanced wave transport, and suppressed diffusion in random media. Despite this, at a more practical level, there is insufficient work on both understanding the nature of optical transport and the conditions conducive to vector-wave localization in these planar structures, as well as their potential applications to photonic nanodevices. In this study, we fill this gap by investigating experimentally and theoretically the characteristics of optical transport in a class of amorphous photonic structures and by demonstrating their use to some basic waveguiding nanostructures. We demonstrate that these 2-D structures have unique isotropic and asymmetric band gaps for in-plane propagation, controlled from first principles by varying the scattering strength and whose properties are elucidated by establishing an analogy between photon and carrier transport in amorphous semiconductors. We further observe Urbach band tails in these random structures and uncover their relation to frequency- and disorder-dependent Anderson-like localized modes through the modified Ioffe-Regel criterion and their mean free path - localization length character. Finally, we illustrate that our amorphous structures can serve as a versatile platform in which photonic devices such as disorder-localized waveguides can be readily implemented.
Submitted 12 December, 2023;
originally announced December 2023.
-
ArtBank: Artistic Style Transfer with Pre-trained Diffusion Model and Implicit Style Prompt Bank
Authors:
Zhanjie Zhang,
Quanwei Zhang,
Guangyuan Li,
Wei Xing,
Lei Zhao,
Jiakai Sun,
Zehua Lan,
Junsheng Luan,
Yiling Huang,
Huaizhong Lin
Abstract:
Artistic style transfer aims to repaint the content image with the learned artistic style. Existing artistic style transfer methods can be divided into two categories: small model-based approaches and pre-trained large-scale model-based approaches. Small model-based approaches can preserve the content structure, but fail to produce highly realistic stylized images and introduce artifacts and disharmonious patterns; pre-trained large-scale model-based approaches can generate highly realistic stylized images but struggle with preserving the content structure. To address the above issues, we propose ArtBank, a novel artistic style transfer framework, to generate highly realistic stylized images while preserving the content structure of the content images. Specifically, to sufficiently dig out the knowledge embedded in pre-trained large-scale models, an Implicit Style Prompt Bank (ISPB), a set of trainable parameter matrices, is designed to learn and store knowledge from the collection of artworks and behave as a visual prompt to guide pre-trained large-scale models to generate highly realistic stylized images while preserving content structure. Besides, to accelerate training the above ISPB, we propose a novel Spatial-Statistical-based self-Attention Module (SSAM). The qualitative and quantitative experiments demonstrate the superiority of our proposed method over state-of-the-art artistic style transfer methods.
Submitted 11 December, 2023;
originally announced December 2023.
-
PsyChat: A Client-Centric Dialogue System for Mental Health Support
Authors:
Huachuan Qiu,
Anqi Li,
Lizhi Ma,
Zhenzhong Lan
Abstract:
Dialogue systems are increasingly integrated into mental health support to help clients facilitate exploration, gain insight, take action, and ultimately heal themselves. A practical and user-friendly dialogue system should be client-centric, focusing on the client's behaviors. However, existing dialogue systems publicly available for mental health support often concentrate solely on the counselor's strategies rather than the behaviors expressed by clients. This can lead to unreasonable or inappropriate counseling strategies and corresponding responses generated by the dialogue system. To address this issue, we propose PsyChat, a client-centric dialogue system that provides psychological support through online chat. The client-centric dialogue system comprises five modules: client behavior recognition, counselor strategy selection, input packer, response generator, and response selection. Both automatic and human evaluations demonstrate the effectiveness and practicality of our proposed dialogue system for real-life mental health support. Furthermore, the case study demonstrates that the dialogue system can predict the client's behaviors, select appropriate counselor strategies, and generate accurate and suitable responses.
Submitted 19 March, 2024; v1 submitted 7 December, 2023;
originally announced December 2023.
-
Quality and Quantity: Unveiling a Million High-Quality Images for Text-to-Image Synthesis in Fashion Design
Authors:
Jia Yu,
Lichao Zhang,
Zijie Chen,
Fayu Pan,
MiaoMiao Wen,
Yuming Yan,
Fangsheng Weng,
Shuai Zhang,
Lili Pan,
Zhenzhong Lan
Abstract:
The fusion of AI and fashion design has emerged as a promising research area. However, the lack of extensive, interrelated data on clothing and try-on stages has hindered the full potential of AI in this domain. Addressing this, we present the Fashion-Diffusion dataset, a product of multiple years' rigorous effort. This dataset, the first of its kind, comprises over a million high-quality fashion images, paired with detailed text descriptions. Sourced from a diverse range of geographical locations and cultural backgrounds, the dataset encapsulates global fashion trends. The images have been meticulously annotated with fine-grained attributes related to clothing and humans, simplifying the fashion design process into a Text-to-Image (T2I) task. The Fashion-Diffusion dataset not only provides high-quality text-image pairs and diverse human-garment pairs but also serves as a large-scale resource about humans, thereby facilitating research in T2I generation. Moreover, to foster standardization in the T2I-based fashion design field, we propose a new benchmark comprising multiple datasets for evaluating the performance of fashion design models. This work represents a significant leap forward in the realm of AI-driven fashion design, setting a new standard for future research in this field.
Submitted 18 March, 2024; v1 submitted 19 November, 2023;
originally announced November 2023.
-
ConceptPsy: A Benchmark Suite with Conceptual Comprehensiveness in Psychology
Authors:
Junlei Zhang,
Hongliang He,
Nirui Song,
Zhanchao Zhou,
Shuyuan He,
Shuai Zhang,
Huachuan Qiu,
Anqi Li,
Yong Dai,
Lizhi Ma,
Zhenzhong Lan
Abstract:
The critical field of psychology necessitates a comprehensive benchmark to enhance the evaluation and development of domain-specific Large Language Models (LLMs). Existing MMLU-type benchmarks, such as C-EVAL and CMMLU, include psychology-related subjects, but their limited number of questions and lack of systematic concept sampling strategies mean they cannot cover the concepts required in psychology. Consequently, despite their broad subject coverage, these benchmarks lack the necessary depth in the psychology domain, making them inadequate as a psychology-specific evaluation suite. To address this issue, this paper presents ConceptPsy, designed to evaluate Chinese complex reasoning and knowledge abilities in psychology. ConceptPsy includes 12 core subjects and 1383 manually collected concepts. Specifically, we prompt GPT-4 to generate questions for each concept using carefully designed diverse prompts and hire professional psychologists to review these questions. To help understand fine-grained performance and address weaknesses, we annotate each question with a chapter label and provide chapter-wise accuracy. Based on ConceptPsy, we evaluate a broad range of LLMs. We observe that, although some LLMs achieve similar overall accuracy, they exhibit significant performance variations across different psychology concepts, even when they are models from the same series. We hope our work can facilitate the development of LLMs in the field of psychology.
Submitted 16 June, 2024; v1 submitted 16 November, 2023;
originally announced November 2023.
-
Topological States Decorated by Twig Boundary in Plasma Photonic Crystals
Authors:
Jianfei Li,
Jingfeng Yao,
Ying Wang,
Zhongxiang Zhou,
Zhihao Lan,
Chengxun Yuan
Abstract:
The twig edge states in graphene-like structures are viewed as the fourth states complementary to their zigzag, bearded, and armchair counterparts. In this work, we study a rod-in-plasma system in honeycomb lattice with twig edge truncation under external magnetic fields and lattice scaling and show that twig edge states can exist in different phases of the system, such as quantum Hall phase, quantum spin Hall phase and insulating phase. The twig edge states in the negative permittivity background exhibit robust one-way transmission property immune to backscattering and thus provide a novel avenue for solving the plasma communication blackout problem. Moreover, we demonstrate that corner and edge states can exist within the shrunken structure by modulating the on-site potential of the twig edges. Especially, helical edge states with the unique feature of pseudospin-momentum locking that could be excited by chiral sources are demonstrated at the twig edges. Our results show that the twig edges and interface engineering can bring new opportunities for more flexible manipulation of electromagnetic waves.
Submitted 21 April, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
Dynamics of Instruction Tuning: Each Ability of Large Language Models Has Its Own Growth Pace
Authors:
Chiyu Song,
Zhanchao Zhou,
Jianhao Yan,
Yuejiao Fei,
Zhenzhong Lan,
Yue Zhang
Abstract:
Instruction tuning is a burgeoning method to elicit the general intelligence of Large Language Models (LLMs). However, the creation of instruction data is still largely heuristic, leading to significant variation in quantity and quality across existing datasets. While some research advocates for expanding the number of instructions, others suggest that a small set of well-chosen examples is adequate. To better understand data construction guidelines, our research provides a granular analysis of how data volume, parameter size, and data construction methods influence the development of each underlying ability of LLMs, such as creative writing, code generation, and logical reasoning. We present a meticulously curated dataset with over 40k instances across ten abilities and examine instruction-tuned models with 7b to 33b parameters. Our study reveals three primary findings: (i) Despite the models' overall performance being tied to data and parameter scale, individual abilities have different sensitivities to these factors. (ii) Human-curated data strongly outperforms synthetic data from GPT-4 in efficiency and can constantly enhance model performance as volume increases, which is unachievable with synthetic data. (iii) Instruction data brings powerful cross-ability generalization, as evidenced by out-of-domain evaluations. Furthermore, we demonstrate how these findings can guide more efficient data construction, leading to practical performance improvements on two public benchmarks.
Submitted 22 February, 2024; v1 submitted 30 October, 2023;
originally announced October 2023.
-
Mid-Long Term Daily Electricity Consumption Forecasting Based on Piecewise Linear Regression and Dilated Causal CNN
Authors:
Zhou Lan,
Ben Liu,
Yi Feng,
Danhuang Dong,
Peng Zhang
Abstract:
Daily electricity consumption forecasting is a classical problem. Existing forecasting algorithms tend to have decreased accuracy on special dates like holidays. This study decomposes the daily electricity consumption series into three components: trend, seasonal, and residual, and constructs a two-stage prediction method using piecewise linear regression as a filter and Dilated Causal CNN as a predictor. The specific steps involve setting breakpoints on the time axis and fitting the piecewise linear regression model with one-hot encoded information such as month, weekday, and holidays. For the challenging prediction of the Spring Festival, distance is introduced as a variable using a third-degree polynomial form in the model. The residual sequence obtained in the previous step is modeled using Dilated Causal CNN, and the final prediction of daily electricity consumption is the sum of the two-stage predictions. Experimental results demonstrate that this method achieves higher accuracy compared to existing approaches.
Submitted 23 October, 2023;
originally announced October 2023.
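Illustrative sketch (not the authors' code): the two-stage idea in the abstract above can be outlined with NumPy on synthetic data. Everything here is an assumption made for illustration: the helper names `piecewise_linear_features`, `one_hot`, and `dilated_causal_conv`, the single breakpoint at day 120, the synthetic load series, and the fixed, untrained convolution weights that stand in for a trained Dilated Causal CNN.

```python
import numpy as np


def piecewise_linear_features(t, breakpoints):
    """Design columns: intercept, t, and one hinge max(0, t - b) per breakpoint."""
    cols = [np.ones_like(t, dtype=float), t.astype(float)]
    cols += [np.maximum(0.0, t - b) for b in breakpoints]
    return np.column_stack(cols)


def one_hot(labels, n_classes):
    """One-hot encode integer labels; class 0 is dropped so the columns
    are not collinear with the intercept."""
    return np.eye(n_classes)[labels][:, 1:]


def dilated_causal_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], with zeros before the
    start of the series, so y[t] never depends on future values."""
    y = np.zeros(len(x), dtype=float)
    for k, w in enumerate(weights):
        shift = k * dilation
        if shift == 0:
            y += w * x
        elif shift < len(x):
            y[shift:] += w * x[:-shift]
    return y


# Synthetic daily series (hypothetical): trend with a slope change at day 120
# plus a weekend effect and small noise.
rng = np.random.default_rng(0)
t = np.arange(200)
weekday = t % 7
load = (100 + 0.5 * t + 0.8 * np.maximum(0, t - 120)
        + 5.0 * (weekday >= 5) + rng.normal(0.0, 0.1, size=200))

# Stage 1: piecewise linear regression with one-hot weekday features.
X = np.hstack([piecewise_linear_features(t, [120]), one_hot(weekday, 7)])
coef, *_ = np.linalg.lstsq(X, load, rcond=None)
stage1 = X @ coef
residual = load - stage1

# Stage 2: a single dilated causal convolution over the residual sequence.
# The weights are fixed placeholders; a real model would learn them.
stage2 = dilated_causal_conv(residual, weights=[0.5, 0.3], dilation=7)

# Final output is the sum of the two stage outputs.
prediction = stage1 + stage2
```

With a dilation of 7, each output combines the current residual with the one from a week earlier. The holiday features and the third-degree polynomial in distance to the Spring Festival that the abstract mentions are omitted from this sketch.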
-
Tailored Visions: Enhancing Text-to-Image Generation with Personalized Prompt Rewriting
Authors:
Zijie Chen,
Lichao Zhang,
Fangsheng Weng,
Lili Pan,
Zhenzhong Lan
Abstract:
Despite significant progress in the field, it is still challenging to create personalized visual representations that align closely with the desires and preferences of individual users. This process requires users to articulate their ideas in words that are both comprehensible to the models and accurately capture their vision, posing difficulties for many users. In this paper, we tackle this challenge by leveraging historical user interactions with the system to enhance user prompts. We propose a novel approach that involves rewriting user prompts based on a newly collected large-scale text-to-image dataset with over 300k prompts from 3115 users. Our rewriting model enhances the expressiveness and alignment of user prompts with their intended visual outputs. Experimental results demonstrate the superiority of our methods over baseline approaches, as evidenced in our new offline evaluation method and online tests. Our code and dataset are available at https://github.com/zzjchen/Tailored-Visions.
Submitted 6 April, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Single-Shot Laser-Driven Neutron Resonance Spectroscopy for Temperature Profiling
Authors:
Zechen Lan,
Yasunobu Arikawa,
S. Reza Mirfayzi,
Alessio Morace,
Takehito Hayakawa,
Hirotaka Sato,
Takashi Kamiyama,
Tianyun Wei,
Yuta Tatsumi,
Mitsuo Koizumi,
Yuki Abe,
Shinsuke Fujioka,
Kunioki Mima,
Ryosuke Kodama,
Akifumi Yogo
Abstract:
The temperature measurement of material inside of an object is one of the key technologies for control of dynamical processes. For this purpose, various techniques such as laser-based thermography and phase-contrast imaging thermography have been studied. However, it is, in principle, impossible to measure the temperature of an element inside of an object using these techniques. One of the possible solutions is measurement of the Doppler broadening effect in neutron resonance absorption (NRA). Here we present a method to measure the temperature of an element or an isotope inside of an object using NRA with a single neutron pulse of approximately 100 ns width provided from a high-power laser. We demonstrate temperature measurements of a tantalum (Ta) metallic foil heated from room temperature up to 617 K. Although the neutron energy resolution fluctuates from shot to shot, we obtain the temperature accurately using a reference silver (Ag) foil kept at room temperature. A free gas model reproduces the results well. This method enables element(isotope)-sensitive thermometry to detect the instantaneous temperature rise in dynamical processes.
Submitted 3 October, 2023; v1 submitted 2 October, 2023;
originally announced October 2023.
-
SEPT: Towards Efficient Scene Representation Learning for Motion Prediction
Authors:
Zhiqian Lan,
Yuxuan Jiang,
Yao Mu,
Chen Chen,
Shengbo Eben Li
Abstract:
Motion prediction is crucial for autonomous vehicles to operate safely in complex traffic environments. Extracting effective spatiotemporal relationships among traffic elements is key to accurate forecasting. Inspired by the successful practice of pretrained large language models, this paper presents SEPT, a modeling framework that leverages self-supervised learning to develop powerful spatiotemporal understanding for complex traffic scenes. Specifically, our approach involves three masking-reconstruction modeling tasks on scene inputs including agents' trajectories and road network, pretraining the scene encoder to capture kinematics within trajectory, spatial structure of road network, and interactions among roads and agents. The pretrained encoder is then finetuned on the downstream forecasting task. Extensive experiments demonstrate that SEPT, without elaborate architectural design or manual feature engineering, achieves state-of-the-art performance on the Argoverse 1 and Argoverse 2 motion forecasting benchmarks, outperforming previous methods on all main metrics by a large margin.
Submitted 19 December, 2023; v1 submitted 26 September, 2023;
originally announced September 2023.
-
Facilitating NSFW Text Detection in Open-Domain Dialogue Systems via Knowledge Distillation
Authors:
Huachuan Qiu,
Shuai Zhang,
Hongliang He,
Anqi Li,
Zhenzhong Lan
Abstract:
NSFW (Not Safe for Work) content, in the context of a dialogue, can have severe side effects on users in open-domain dialogue systems. However, research on detecting NSFW language, especially sexually explicit content, within a dialogue context has significantly lagged behind. To address this issue, we introduce CensorChat, a dialogue monitoring dataset aimed at NSFW dialogue detection. Leveraging knowledge distillation techniques involving GPT-4 and ChatGPT, this dataset offers a cost-effective means of constructing NSFW content detectors. The process entails collecting real-life human-machine interaction data and breaking it down into single utterances and single-turn dialogues, with the chatbot delivering the final utterance. ChatGPT is employed to annotate unlabeled data, serving as a training set. Rationale validation and test sets are constructed using ChatGPT and GPT-4 as annotators, with a self-criticism strategy for resolving discrepancies in labeling. A BERT model is fine-tuned as a text classifier on pseudo-labeled data, and its performance is assessed. The study emphasizes the importance of AI systems prioritizing user safety and well-being in digital conversations while respecting freedom of expression. The proposed approach not only advances NSFW content detection but also aligns with evolving user protection needs in AI-driven dialogues.
Submitted 20 March, 2024; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Use neural networks to recognize students' handwritten letters and incorrect symbols
Authors:
JiaJun Zhu,
Zichuan Yang,
Binjie Hong,
Jiacheng Song,
Jiwei Wang,
Tianhao Chen,
Shuilan Yang,
Zixun Lan,
Fei Ma
Abstract:
Correcting students' multiple-choice answers is a repetitive and mechanical task that can be considered an image multi-classification task. Assuming possible options are 'abcd' and the correct option is one of the four, some students may write incorrect symbols or options that do not exist. In this paper, five classifications were set up - four for possible correct options and one for other incorrect writing. This approach takes into account the possibility of non-standard writing options.
Submitted 12 September, 2023;
originally announced September 2023.
-
An Adaptive Spatial-Temporal Local Feature Difference Method for Infrared Small-moving Target Detection
Authors:
Yongkang Zhao,
Chuang Zhu,
Yuan Li,
Shuaishuai Wang,
Zihan Lan,
Yuanyuan Qiao
Abstract:
Detecting small moving targets accurately in infrared (IR) image sequences is a significant challenge. To address this problem, we propose a novel method called spatial-temporal local feature difference (STLFD) with adaptive background suppression (ABS). Our approach utilizes filters in the spatial and temporal domains and performs pixel-level ABS on the output to enhance the contrast between the target and the background. The proposed method comprises three steps. First, we obtain three temporal frame images based on the current frame image and extract two feature maps using the designed spatial domain and temporal domain filters. Next, we fuse the information of the spatial domain and temporal domain to produce the spatial-temporal feature maps and suppress noise using our pixel-level ABS module. Finally, we obtain the segmented binary map by applying a threshold. Our experimental results demonstrate that the proposed method outperforms existing state-of-the-art methods for infrared small-moving target detection.
Submitted 5 September, 2023;
originally announced September 2023.
-
Twisted Equivariant Gromov-Witten Theory of the Classifying Space of a Finite Group
Authors:
Zhuoming Lan,
Zhengyu Zong
Abstract:
For any finite group $G$, the equivariant Gromov-Witten invariants of $[\mathbb{C}^r/G]$ can be viewed as certain twisted Gromov-Witten invariants of the classifying stack $\mathcal{B} G$. In this paper, we use Tseng's orbifold quantum Riemann-Roch theorem to express the equivariant Gromov-Witten invariants of $[\mathbb{C}^r/G]$ as a sum over Feynman graphs, where the weight of each graph is expressed in terms of descendant integrals over moduli spaces of stable curves and representations of $G$.
Submitted 4 September, 2023;
originally announced September 2023.
-
Studies of Nonadiabatic Dynamics in the Singlet Fission Processes of Pentacene Dimer via Tensor Train Decomposition Method
Authors:
Jiawei Peng,
Deping Hu,
Hong Liu,
Qiang Shi,
Peng Bao,
Zhenggang Lan
Abstract:
Singlet fission (SF) is a significant photophysical phenomenon with potential applications. In this work, we give a detailed theoretical investigation of the SF process in the stacked polyacene dimer by combining high-level quantum chemistry calculations with quantum dynamics simulations based on the tensor train decomposition method. Starting from the construction of the linear vibronic coupling model, we explore the pure electronic dynamics and the vibronic dynamics in the SF processes. The role of vibrational modes in nonadiabatic dynamics is addressed. The results show that the super-exchange mechanism mediated by the charge-transfer state is found in both the pure electronic dynamics and the nonadiabatic dynamics. In particular, the vibrational modes whose frequencies are resonant with the adiabatic energy gap play very important roles in the SF dynamics. This work not only provides a deep and detailed understanding of the SF process, but also verifies the efficiency of the tensor train decomposition method, which can serve as a reference dynamics method to explore the dynamical behavior of complex systems.
Submitted 30 August, 2023;
originally announced August 2023.
-
A Benchmark for Understanding Dialogue Safety in Mental Health Support
Authors:
Huachuan Qiu,
Tong Zhao,
Anqi Li,
Shuai Zhang,
Hongliang He,
Zhenzhong Lan
Abstract:
Dialogue safety remains a pervasive challenge in open-domain human-machine interaction. Existing approaches propose distinctive dialogue safety taxonomies and datasets for detecting explicitly harmful responses. However, these taxonomies may not be suitable for analyzing response safety in mental health support. In real-world interactions, a model response deemed acceptable in casual conversations might have a negligible positive impact on users seeking mental health support. To address these limitations, this paper aims to develop a theoretically and factually grounded taxonomy that prioritizes the positive impact on help-seekers. Additionally, we create a benchmark corpus with fine-grained labels for each dialogue session to facilitate further research. We analyze the dataset using popular language models, including BERT-base, RoBERTa-large, and ChatGPT, to detect and understand unsafe responses within the context of mental health support. Our study reveals that ChatGPT struggles to detect safety categories with detailed safety definitions in a zero- and few-shot paradigm, whereas the fine-tuned model proves to be more suitable. The developed dataset and findings serve as valuable benchmarks for advancing research on dialogue safety in mental health support, with significant implications for improving the design and deployment of conversation agents in real-world applications. We release our code and data here: https://github.com/qiuhuachuan/DialogueSafety.
Submitted 31 July, 2023;
originally announced July 2023.
-
SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark
Authors:
Liang Xu,
Anqi Li,
Lei Zhu,
Hang Xue,
Changtai Zhu,
Kangkang Zhao,
Haonan He,
Xuanwei Zhang,
Qiyue Kang,
Zhenzhong Lan
Abstract:
Large language models (LLMs) have shown the potential to be integrated into human daily lives. Therefore, user preference is the most critical criterion for assessing LLMs' performance in real-world scenarios. However, existing benchmarks mainly focus on measuring models' accuracy using multi-choice questions, which limits the understanding of their capabilities in real applications. We fill this gap by proposing a comprehensive Chinese benchmark SuperCLUE, named after another popular Chinese LLM benchmark CLUE. SuperCLUE encompasses three sub-tasks: actual users' queries and ratings derived from an LLM battle platform (CArena), open-ended questions with single and multiple-turn dialogues (OPEN), and closed-ended questions with the same stems as open-ended single-turn ones (CLOSE). Our study shows that accuracy on closed-ended questions is insufficient to reflect human preferences achieved on open-ended ones. At the same time, they can complement each other to predict actual user preferences. We also demonstrate that GPT-4 is a reliable judge to automatically evaluate human preferences on open-ended questions in a Chinese context. Our benchmark will be released at https://www.CLUEbenchmarks.com
Submitted 27 July, 2023;
originally announced July 2023.
-
Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output Robustness of Large Language Models
Authors:
Huachuan Qiu,
Shuai Zhang,
Anqi Li,
Hongliang He,
Zhenzhong Lan
Abstract:
Considerable research efforts have been devoted to ensuring that large language models (LLMs) align with human values and generate safe text. However, an excessive focus on sensitivity to certain topics can compromise the model's robustness in following instructions, thereby impacting its overall performance in completing tasks. Previous benchmarks for jailbreaking LLMs have primarily focused on evaluating the safety of the models without considering their robustness. In this paper, we propose a benchmark that assesses both the safety and robustness of LLMs, emphasizing the need for a balanced approach. To comprehensively study text safety and output robustness, we introduce a latent jailbreak prompt dataset, each involving malicious instruction embedding. Specifically, we instruct the model to complete a regular task, such as translation, with the text to be translated containing malicious instructions. To further analyze safety and robustness, we design a hierarchical annotation framework. We present a systematic analysis of the safety and robustness of LLMs regarding the position of explicit normal instructions, word replacements (verbs in explicit normal instructions, target groups in malicious instructions, cue words for explicit normal instructions), and instruction replacements (different explicit normal instructions). Our results demonstrate that current LLMs not only prioritize certain instruction verbs but also exhibit varying jailbreak rates for different instruction verbs in explicit normal instructions. Code and data are available at https://github.com/qiuhuachuan/latent-jailbreak.
Submitted 28 August, 2023; v1 submitted 17 July, 2023;
originally announced July 2023.