-
Optimization Techniques for Sentiment Analysis Based on LLM (GPT-3)
Authors:
Tong Zhan,
Chenxi Shi,
Yadong Shi,
Huixiang Li,
Yiyu Lin
Abstract:
With the rapid development of natural language processing (NLP) technology, large-scale pre-trained language models such as GPT-3 have become a popular research subject in the NLP field. This paper explores sentiment analysis optimization techniques based on large pre-trained language models such as GPT-3, with the aim of improving model performance and further promoting the development of NLP. After introducing the importance of sentiment analysis and the limitations of traditional methods, the paper presents GPT-3 and fine-tuning techniques and explains their applications in sentiment analysis in detail. The experimental results show that fine-tuning can optimize the GPT-3 model and achieve good performance on the sentiment analysis task. This study provides an important reference for future sentiment analysis using large-scale language models.
Submitted 15 May, 2024;
originally announced May 2024.
-
RNA Secondary Structure Prediction Using Transformer-Based Deep Learning Models
Authors:
Yanlin Zhou,
Tong Zhan,
Yichao Wu,
Bo Song,
Chenxi Shi
Abstract:
The Human Genome Project has led to an exponential increase in data related to the sequence, structure, and function of biomolecules. Bioinformatics is an interdisciplinary research field that primarily uses computational methods to analyze large amounts of biological macromolecule data. Its goal is to discover hidden biological patterns and related information. Furthermore, analyzing additional relevant information can enhance the study of biological operating mechanisms. This paper discusses the fundamental concepts of RNA, RNA secondary structure, and its prediction. Subsequently, the application of machine learning technologies in predicting the structure of biological macromolecules is explored. This chapter describes the relevant knowledge of algorithms and computational complexity and presents an RNA tertiary structure prediction algorithm based on ResNet. To address the issue of the current scoring function's unsuitability for long RNA, a scoring model based on ResNet is proposed, and a structure prediction algorithm is designed. The chapter concludes by presenting some open and interesting challenges in the field of RNA tertiary structure prediction.
Submitted 14 April, 2024;
originally announced May 2024.
-
Time Evidence Fusion Network: Multi-source View in Long-Term Time Series Forecasting
Authors:
Tianxiang Zhan,
Yuanpeng He,
Zhen Li,
Yong Deng
Abstract:
In real-world scenarios, time series forecasting often demands timeliness, making research on model backbones a perennially hot topic. To meet these performance demands, we propose a novel backbone from the perspective of information fusion. Introducing the Basic Probability Assignment (BPA) Module and the Time Evidence Fusion Network (TEFN), based on evidence theory, allows us to achieve superior performance. In addition, the perspective of multi-source information fusion effectively improves forecasting accuracy. Because BPA is generated by fuzzy theory, TEFN also has considerable interpretability. In experiments on real data, TEFN partially achieved state-of-the-art results, with low errors comparable to PatchTST and operating efficiency surpassing performance-oriented models such as Dlinear. Meanwhile, TEFN shows high robustness and small error fluctuations under random hyperparameter selection. TEFN is not a model that achieves the ultimate in any single aspect, but one that balances performance, accuracy, stability, and interpretability.
Submitted 10 May, 2024;
originally announced May 2024.
-
Isopignistic Canonical Decomposition via Belief Evolution Network
Authors:
Qianli Zhou,
Tianxiang Zhan,
Yong Deng
Abstract:
Developing a general information processing model in uncertain environments is fundamental for the advancement of explainable artificial intelligence. Dempster-Shafer theory of evidence is a well-known and effective reasoning method for representing epistemic uncertainty, which is closely related to subjective probability theory and possibility theory. Although they can be transformed into each other under some particular belief structures, there remains a lack of a clear and interpretable transformation process, as well as a unified approach for information processing. In this paper, we aim to address these issues from the perspectives of isopignistic belief functions and the hyper-cautious transferable belief model. Firstly, we propose an isopignistic transformation based on the belief evolution network. This transformation allows for the adjustment of the information granule while retaining the potential decision outcome. The isopignistic transformation is integrated with a hyper-cautious transferable belief model to establish a new canonical decomposition. This decomposition offers a reverse path between the possibility distribution and its isopignistic mass functions. The result of the canonical decomposition, called isopignistic function, is an identical information content distribution to reflect the propensity and relative commitment degree of the BPA. Furthermore, this paper introduces a method to reconstruct the basic belief assignment by adjusting the isopignistic function. It explores the advantages of this approach in modeling and handling uncertainty within the hyper-cautious transferable belief model. More generally, this paper establishes a theoretical basis for building general models of artificial intelligence based on probability theory, Dempster-Shafer theory, and possibility theory.
Submitted 4 May, 2024;
originally announced May 2024.
-
MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities
Authors:
Kunxi Li,
Tianyu Zhan,
Kairui Fu,
Shengyu Zhang,
Kun Kuang,
Jiwei Li,
Zhou Zhao,
Fei Wu
Abstract:
In this study, we focus on heterogeneous knowledge transfer across entirely different model architectures, tasks, and modalities. Existing knowledge transfer methods (e.g., backbone sharing, knowledge distillation) often hinge on shared elements within model structures or task-specific features/labels, limiting transfers to complex model types or tasks. To overcome these challenges, we present MergeNet, which learns to bridge the gap of parameter spaces of heterogeneous models, facilitating the direct interaction, extraction, and application of knowledge within these parameter spaces. The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters and adeptly learning to identify and map parameters into the target model. MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage, including the training trajectory knowledge of the source model. Extensive experiments on heterogeneous knowledge transfer demonstrate significant improvements in challenging settings, where representative approaches may falter or prove less applicable.
Submitted 17 June, 2024; v1 submitted 20 April, 2024;
originally announced April 2024.
-
Maximizing User Experience with LLMOps-Driven Personalized Recommendation Systems
Authors:
Chenxi Shi,
Penghao Liang,
Yichao Wu,
Tong Zhan,
Zhengyu **
Abstract:
The integration of LLMOps into personalized recommendation systems marks a significant advancement in managing LLM-driven applications. This innovation presents both opportunities and challenges for enterprises, requiring specialized teams to navigate the complexity of engineering technology while prioritizing data security and model interpretability. By leveraging LLMOps, enterprises can enhance the efficiency and reliability of large-scale machine learning models, driving personalized recommendations aligned with user preferences. Despite ethical considerations, LLMOps is poised for widespread adoption, promising more efficient and secure machine learning services that elevate user experience and shape the future of personalized recommendation systems.
Submitted 1 April, 2024;
originally announced April 2024.
-
Dynamic Viscosity of the ABC-stacked Multilayer Graphene in the Collisionless Regime
Authors:
Weiwei Chen,
Yedi Shen,
Tianle Zhan,
W. Zhu
Abstract:
We explore the dynamic shear viscosity of the undoped ABC-stacked multilayer graphene based on the chiral-$N$ effective Hamiltonian, where the chirality $N$ is equivalent to the layer number. We investigate the dependence of the dynamic shear viscosity on the frequency in the collisionless regime and calculate Coulomb interaction corrections using three leading-order Feynman diagrams: self-energy diagram, vertex diagram, and honey diagram. We propose that the dynamic shear viscosity is generated by the relaxation of momentum flux polarization through electron-hole excitations, and that the interaction can amplify this effect. Furthermore, our research indicates that the dynamic shear viscosity exhibits a robust linear positive dependence on $N$. This finding suggests that the electron viscous effects can be finely tuned by modifying the number of layers in graphene.
Submitted 30 March, 2024;
originally announced April 2024.
-
Learnable WSN Deployment of Evidential Collaborative Sensing Model
Authors:
Ruijie Liu,
Tianxiang Zhan,
Zhen Li,
Yong Deng
Abstract:
In wireless sensor networks (WSNs), coverage and deployment are the two most crucial issues when conducting detection tasks. However, the detection information collected from sensors is often not fully utilized or efficiently integrated. Such sensing models and deployment strategies therefore cannot reach the maximum quality of coverage, particularly when the number of sensors within WSNs expands significantly. In this article, we aim to achieve the optimal coverage quality of WSN deployment. We develop a collaborative sensing model of sensors to enhance the detection capabilities of WSNs by leveraging the collaborative information derived from the combination rule under the framework of evidence theory. In this model, the performance evaluation of evidential fusion systems is adopted as the criterion for sensor selection. A learnable sensor deployment network (LSDNet), which considers both sensor contribution and detection capability, is proposed for achieving the optimal deployment of WSNs. Moreover, we thoroughly investigate the algorithm for finding the minimum number of sensors required to realize full coverage of WSNs. A series of numerical examples, along with an application to forest area monitoring, are employed to demonstrate the effectiveness and robustness of the proposed algorithms.
Submitted 23 March, 2024;
originally announced March 2024.
-
Research on the Application of Deep Learning-based BERT Model in Sentiment Analysis
Authors:
Yichao Wu,
Zhengyu **,
Chenxi Shi,
Penghao Liang,
Tong Zhan
Abstract:
This paper explores the application of deep learning techniques, particularly focusing on BERT models, in sentiment analysis. It begins by introducing the fundamental concept of sentiment analysis and how deep learning methods are utilized in this domain. Subsequently, it delves into the architecture and characteristics of BERT models. Through detailed explanation, it elucidates the application effects and optimization strategies of BERT models in sentiment analysis, supported by experimental validation. The experimental findings indicate that BERT models exhibit robust performance in sentiment analysis tasks, with notable enhancements post fine-tuning. Lastly, the paper concludes by summarizing the potential applications of BERT models in sentiment analysis and suggests directions for future research and practical implementations.
Submitted 12 March, 2024;
originally announced March 2024.
-
Random Graph Set and Evidence Pattern Reasoning Model
Authors:
Tianxiang Zhan,
Zhen Li,
Yong Deng
Abstract:
Evidence theory is widely used in decision-making and reasoning systems. In previous research, the Transferable Belief Model (TBM) is a commonly used evidential decision-making model, but TBM is a non-preference model. In order to better fit decision-making goals, the Evidence Pattern Reasoning Model (EPRM) is proposed. By defining pattern operators and decision-making operators, corresponding preferences can be set for different tasks. Random Permutation Set (RPS) extends evidence theory with order information. However, it is hard for RPS to characterize complex relationships between samples, such as cyclic and parallel relationships. Therefore, Random Graph Set (RGS) is proposed to model complex relationships and represent more event types. In order to illustrate the significance of RGS and EPRM, an experiment on aircraft velocity ranking was designed and 10,000 cases were simulated. The implementation of EPRM, called Conflict Resolution Decision, optimized 18.17% of the cases compared to Mean Velocity Decision, effectively improving the aircraft velocity ranking. EPRM provides a unified solution for evidence-based decision making.
Submitted 9 March, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
A Class of Computational Methods to Reduce Selection Bias when Designing Phase 3 Clinical Trials
Authors:
Tianyu Zhan
Abstract:
When designing confirmatory Phase 3 studies, one usually evaluates one or more efficacious and safe treatment option(s) based on data from previous studies. However, several retrospective research articles reported the phenomenon of "diminished treatment effect in Phase 3" based on many case studies. Even under basic assumptions, it was shown that the commonly used estimator could substantially overestimate the efficacy of selected group(s). As alternatives, we propose a class of computational methods to reduce estimation bias and mean squared error (MSE) with a broader scope of multiple treatment groups and flexibility to accommodate summary results by group as input. Based on simulation studies and a real data example, we provide practical implementation guidance for this class of methods under different scenarios. For more complicated problems, our framework can serve as a starting point with additional layers built in. Proposed methods can also be widely applied to other selection problems.
Submitted 12 December, 2023;
originally announced December 2023.
-
On-chip light-scattering enhancement enables high performance single-particle tracking under conventional bright-field microscope
Authors:
Pengcheng Zhang,
Tingting Zhan,
Guoqiang Gu,
Changle Li,
Mengting Lyu,
Yi Zhang,
Hui Yang
Abstract:
Scattering-based single-particle tracking (S-SPT) has opened new avenues for highly sensitive label-free detection and characterization of nanoscopic objects, making it particularly attractive for various analytical applications. However, a long-standing issue hindering its widespread applicability is its high technical demands on optical systems. The most promising solution entails implementing on-chip light-scattering enhancement, but existing field-enhancement technologies fail because their highly localized fields are insufficient to cover the three-dimensional trajectory of particles within the interrogation time. Here, we present a straightforward and robust on-chip microlens-based strategy for light-scattering enhancement, providing an enhancement range ten times greater than that of near-field optical techniques. These properties are attributed to the increased long-range optical fields and complex composite interactions between two closely spaced structures. Thanks to this strategy, we demonstrate that high-performance S-SPT can be achieved, for the first time, under a conventional bright-field microscope with illumination powers over 1,000 times lower than typically required. This significantly reduces the technical demands of S-SPT, representing a significant step forward in facilitating its practical application in biophotonics, biosensors, diagnostics, and other fields.
Submitted 19 November, 2023;
originally announced November 2023.
-
Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding
Authors:
Shuwei Feng,
Tianyang Zhan,
Zhanming Jie,
Trung Quoc Luong,
Xiaoran **
Abstract:
This paper presents GenDoc, a general sequence-to-sequence document understanding model pre-trained with unified masking across three modalities: text, image, and layout. The proposed model utilizes an encoder-decoder architecture, which allows for increased adaptability to a wide range of downstream tasks with diverse output formats, in contrast to the encoder-only models commonly employed in document understanding. In addition to the traditional text infilling task used in previous encoder-decoder models, our pre-training extends to include tasks of masked image token prediction and masked layout prediction. We also design modality-specific instruction and adopt both disentangled attention and the mixture-of-modality-experts strategy to effectively capture the information leveraged by each modality. Evaluation of the proposed model through extensive experiments on several downstream tasks in document understanding demonstrates its ability to achieve superior or competitive performance compared to state-of-the-art approaches. Our analysis further suggests that GenDoc is more robust than the encoder-only models in scenarios where the OCR quality is imperfect.
Submitted 16 May, 2023;
originally announced May 2023.
-
Differential Convolutional Fuzzy Time Series Forecasting
Authors:
Tianxiang Zhan,
Yuanpeng He,
Yong Deng,
Zhen Li
Abstract:
Fuzzy time series forecasting (FTSF) is a typical forecasting method with wide application. Traditional FTSF is regarded as an expert system, which leaves it unable to recognize undefined features; this is the main reason for its poor forecasting performance. To solve the problem, the proposed Differential Fuzzy Convolutional Neural Network (DFCNN) uses a convolutional neural network to re-implement FTSF with learnable ability. DFCNN is capable of recognizing potential information and improving forecasting accuracy. Thanks to the learnable ability of the neural network, the length of the fuzzy rules established in FTSF is extended to an arbitrary length that an expert could not handle in an expert system. At the same time, FTSF usually cannot achieve satisfactory performance on non-stationary time series, because the trend invalidates the fuzzy sets established by FTSF and causes the forecast to fail. DFCNN uses the Difference algorithm to weaken the non-stationarity of the time series, so that it can forecast, with low error, non-stationary time series that FTSF cannot forecast satisfactorily. In extensive experiments, DFCNN achieves excellent prediction results, ahead of existing FTSF and common time series forecasting algorithms. Finally, DFCNN provides further ideas for improving FTSF and holds continued research value.
Submitted 27 July, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
Experimental violation of Leggett-Garg inequality in a three-level trapped-ion system
Authors:
Tianxiang Zhan,
Chunwang Wu,
Manchao Zhang,
Qingqing Qin,
Xueying Yang,
Han Hu,
Wenbo Su,
Jie Zhang,
Ting Chen,
Yi Xie,
Wei Wu,
**xing Chen
Abstract:
The Leggett-Garg inequality (LGI) concerns temporal correlations in the evolution of physical systems. Classical systems obey the LGI, but quantum systems may violate it. The extent of the violation depends on the dimension of the quantum system and the state update rule. In this work, we experimentally test the LGI in a three-level trapped-ion system under the model of a large spin precessing in a magnetic field. The Von Neumann and Lüders state update rules are employed in our system for direct comparative analysis. The maximum observed value of the Leggett-Garg correlator under the Von Neumann state update rule is $K_3 = 1.739 \pm 0.014$, which demonstrates a violation of the Lüders bound by 17 standard deviations and is by far the most significant violation in natural three-level systems.
Submitted 15 September, 2022;
originally announced September 2022.
-
Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation
Authors:
Zhuang Li,
Lizhen Qu,
Qiongkai Xu,
Tongtong Wu,
Tianyang Zhan,
Gholamreza Haffari
Abstract:
In this paper, we propose a variational autoencoder with disentanglement priors, VAE-DPRIOR, for task-specific natural language generation with no or only a handful of task-specific labeled examples. In order to tackle compositional generalization across tasks, our model performs disentangled representation learning by introducing a conditional prior for the latent content space and another conditional prior for the latent label space. Both types of priors satisfy a novel property called $ε$-disentangled. We show both empirically and theoretically that the novel priors can disentangle representations even without specific regularizations as in prior work. The content prior enables directly sampling diverse content representations from the content space learned from the seen tasks, and fusing them with the representations of novel tasks to generate semantically diverse texts in low-resource settings. Our extensive experiments demonstrate the superior performance of our model over competitive baselines in terms of i) data augmentation in continuous zero/few-shot learning, and ii) text style transfer in the few-shot setting.
Submitted 29 October, 2022; v1 submitted 27 February, 2022;
originally announced February 2022.
-
DVS: Deep Visibility Series and its Application in Construction Cost Index Forecasting
Authors:
Tianxiang Zhan,
Yuanpeng He,
Hanwen Li,
Fuyuan Xiao
Abstract:
Time series forecasting has been a hot topic in recent years. The Visibility Graph (VG) algorithm has been used for time series forecasting in previous research, but its forecasting performance is not as good as that of deep learning methods such as those based on the Artificial Neural Network (ANN), Convolutional Neural Network (CNN), and Long Short-Term Memory Network (LSTM). The visibility graph generated from a specific time series contains abundant network information, but previous forecasting methods did not use this information effectively, resulting in relatively large prediction errors. To optimize VG-based forecasting, this article proposes the Deep Visibility Series (DVS) module through a bionic design of VG and an extension of past research. By applying the bionic design of biological vision to VG, DVS obtains superior forecasting accuracy. This paper also applies the DVS forecasting method to construction cost index forecasting, which has practical significance.
Submitted 14 May, 2022; v1 submitted 7 November, 2021;
originally announced November 2021.
-
A Comparative Study of Transformer-Based Language Models on Extractive Question Answering
Authors:
Kate Pearce,
Tiffany Zhan,
Aneesh Komanduri,
Justin Zhan
Abstract:
Question Answering (QA) is a task in natural language processing that has seen considerable growth after the advent of transformers. There has been a surge in QA datasets that have been proposed to challenge natural language processing models to improve human and existing model performance. Many pre-trained language models have proven to be incredibly effective at the task of extractive question answering. However, generalizability remains a challenge for the majority of these models. That is, some datasets require models to reason more than others. In this paper, we train various pre-trained language models and fine-tune them on multiple question answering datasets of varying levels of difficulty to determine which of the models are capable of generalizing the most comprehensively across different datasets. Further, we propose a new architecture, BERT-BiLSTM, and compare it with other language models to determine if adding more bidirectionality can improve model performance. Using the F1-score as our metric, we find that the RoBERTa and BART pre-trained models perform the best across all datasets and that our BERT-BiLSTM model outperforms the baseline BERT model.
Submitted 6 October, 2021;
originally announced October 2021.
-
A General Machine Learning-based Approach for Inverse Design of One-dimensional Photonic Crystals Toward Targeted Visible Light Reflection Spectrum
Authors:
Tao Zhan,
Quan-Shan Liu,
Lu Qiu,
Yuan-Jie Sun,
Tao Wen,
Rui Zhang
Abstract:
Data-driven methods have increasingly been applied to the development of optical systems as inexpensive and effective inverse design approaches. Optical properties (e.g., band-gap properties) of photonic crystals (PCs) are closely associated with characteristics of their light reflection spectra. Finding optimal PC constructions (within a pre-specified parameter space) that generate reflection spectra closest to a targeted spectrum is thus an interesting and meaningful inverse design problem, although relevant studies are still limited. Here we report a generally effective machine learning-based inverse design approach for one-dimensional photonic crystals (1DPCs), focusing on visible light spectra which are of high practical relevance. For a given class of 1DPC system, a deep neural network (DNN) in a unified structure is first trained over data from sizeable forward calculations (from layer thicknesses to spectrum). An iterative optimization scheme is then developed based on a coherent integration of DNN backward predictions (from spectrum to layer thicknesses), forward calculations, and Monte Carlo moves. We employ this new approach to four representative 1DPC systems including periodic structures with two-, three-, and four-layer repeating units and a heterostructure. The approach successfully converges to solutions of optimal 1DPC constructions for various targeted spectra regardless of their exact achievability. As two demonstrating examples, inverse designs toward a specially constructed "rectangle-shaped" green-light or red-light reflection spectrum are presented and discussed in detail. Remarkably, the results show that the approach can efficiently find out optimal layer thicknesses even when they are far outside the range covered by the original training data of DNN.
Submitted 30 September, 2021;
originally announced September 2021.
-
Construction Cost Index Forecasting: A Multi-feature Fusion Approach
Authors:
Tianxiang Zhan,
Yuanpeng He,
Fuyuan Xiao
Abstract:
The construction cost index (CCI) is an important indicator of the construction industry, and predicting it has important practical significance. This paper combines information fusion with machine learning and proposes a multi-feature fusion (MFF) module for time series forecasting. The main contributions of MFF are to improve the prediction accuracy of CCI and to provide a feature fusion framework for time series. Compared with the convolution module, the MFF module is a module that extracts certain features. Experiments show that the combination of the MFF module and a multi-layer perceptron has a relatively good prediction effect. The MFF neural network model has high prediction accuracy and prediction efficiency, making it a subject worthy of continued study.
Submitted 26 February, 2022; v1 submitted 18 August, 2021;
originally announced August 2021.
-
Uncertainty Measurement of Basic Probability Assignment Integrity Based on Approximate Entropy in Evidence Theory
Authors:
Tianxiang Zhan,
Yuanpeng He,
Hanwen Li,
Fuyuan Xiao
Abstract:
Evidence theory is an extension of probability theory that can better deal with unknown and inaccurate information. Uncertainty measurement plays a vital role in both evidence theory and probability theory. Approximate Entropy (ApEn) was proposed by Pincus to describe the irregularity of complex systems. The more irregular the time series, the greater the approximate entropy. The ApEn of a network represents the ability of the network to generate new nodes, or the possibility of undiscovered nodes. Through the association of network characteristics and basic probability assignment (BPA), a measure of the uncertainty of BPA regarding completeness can be obtained. The main contribution of this paper is to define the integrity of the basic probability assignment; the approximate entropy of the BPA is then proposed to measure the uncertainty of the integrity of the BPA. The proposed method calculates the uncertainty of BPA in evidence theory based on the logical network structure. The resulting uncertainty represents the uncertainty of the integrity of the BPA and contributes to identifying the credibility of the BPA.
Submitted 17 May, 2021; v1 submitted 16 May, 2021;
originally announced May 2021.
-
Deep Neural Networks Guided Ensemble Learning for Point Estimation
Authors:
Tianyu Zhan,
Haoda Fu,
Jian Kang
Abstract:
In modern statistics, interest has shifted from pursuing the uniformly minimum variance unbiased estimator to reducing mean squared error (MSE) or residual squared error. Shrinkage-based estimation and regression methods offer better prediction accuracy and improved interpretation. However, the characterization of such optimal statistics in terms of minimizing MSE remains open and challenging in many problems, for example estimating treatment effect in adaptive clinical trials with pre-planned modifications to design aspects based on accumulated data. From an alternative perspective, we propose a deep neural network based automatic method to construct an improved estimator from existing ones. Theoretical properties are studied to provide guidance on the applicability of our estimator to seek potential improvement. Simulation studies demonstrate that the proposed method has considerable finite-sample efficiency gain as compared with several common estimators. In the Adaptive COVID-19 Treatment Trial (ACTT) as an important application, our ensemble estimator essentially contributes to a more ethical and efficient adaptive clinical trial with fewer patients enrolled. The proposed framework can be generally applied to various statistical problems, and can serve as a reference measure to guide statistical research.
Submitted 2 October, 2023; v1 submitted 13 May, 2021;
originally announced May 2021.
-
Quantum soft likelihood function based on ordered weighted average operator
Authors:
Tianxiang Zhan,
Yuanpeng He,
Fuyuan Xiao
Abstract:
Quantum theory is the focus of current research. Likelihood functions are widely used in many fields. Because classical likelihood functions are too strict for extreme data in practical applications, Yager proposed the soft ordered weighted average (OWA) operator. In the quantum method, probability is represented by Euler's function. How to establish a connection between quantum theory and OWA is also an open question. This article proposes an OWA operator under quantum theory and discusses the relationship between the quantum soft OWA operator and the classical soft OWA operator through some examples. Similar to other quantum models, this research has more extensive applications in quantum information.
Submitted 1 September, 2021; v1 submitted 13 April, 2021;
originally announced May 2021.
-
A Fast Evidential Approach for Stock Forecasting
Authors:
Tianxiang Zhan,
Fuyuan Xiao
Abstract:
Within the framework of evidence theory, the confidence functions of different information can be combined into a combined confidence function to solve uncertain problems. The Dempster combination rule is a classic method of fusing different information. This paper proposes a similar confidence function for the time point in the time series. The Dempster combination rule can be used to fuse the growth rate of the last time point, and finally relatively accurate forecast data can be obtained. Stock price forecasting is a concern of economics. Stock price data are large in volume, and accurate forecasts are required at the same time. Classic time series methods, such as ARIMA, cannot balance forecasting efficiency and forecasting accuracy. In this paper, the fusion method of evidence theory is applied to stock price prediction. Evidence theory deals with the uncertainty of stock price prediction and improves the accuracy of prediction. At the same time, the fusion method of evidence theory has low time complexity and fast prediction processing speed.
Submitted 17 July, 2021; v1 submitted 12 April, 2021;
originally announced April 2021.
-
A novel weighted approach for time series forecasting based on visibility graph
Authors:
Tianxiang Zhan,
Fuyuan Xiao
Abstract:
Time series have attracted a lot of attention in many fields. Time series forecasting based on complex network analysis is a research hotspot, and how to use time series information to achieve more accurate forecasting remains an open problem. To address it, this paper proposes a weighted network forecasting method to improve forecasting accuracy. First, the time series is transformed into a complex network, and the similarity between nodes is computed. Then, the similarity is used as a weight to make a weighted forecast from the predicted values produced by different nodes. To verify the proposed method, experiments are conducted on the M1 and M3 datasets and the Construction Cost Index (CCI) dataset, and they show that the proposed method forecasts more accurately than the previous method.
Submitted 20 August, 2022; v1 submitted 13 March, 2021;
originally announced March 2021.
-
A Matrix-based Distance of Pythagorean Fuzzy Set and its Application in Medical Diagnosis
Authors:
Yuanpeng He,
Lijian Li,
Tianxiang Zhan
Abstract:
The Pythagorean fuzzy set (PFS), which is developed from the intuitionistic fuzzy set, is more efficient in expressing and handling uncertainties in indeterminate situations, which is one reason why PFS is applied in various fields. How to measure the distance between two Pythagorean fuzzy sets is still an open issue. Many methods have been proposed to address this question in prior research. However, not all existing methods can accurately manifest differences among Pythagorean fuzzy sets and satisfy the property of similarity, and some neglect the relationship among the three variables of a Pythagorean fuzzy set. To address the problem, a new method of measuring distance is proposed that meets the axioms of distance measurement and indicates the degree of distinction between PFSs well. Numerical examples are then offered to verify that the method avoids counter-intuitive and irrational results and is more effective, reasonable, and advanced than other similar methods. In addition, the proposed method is applied to a real application, medical diagnosis, and compared with previous methods to demonstrate its superiority and efficiency. The feasibility of the proposed method in handling uncertainties in practice is also demonstrated.
Submitted 23 May, 2024; v1 submitted 31 January, 2021;
originally announced February 2021.
-
A Framework for Deep Constrained Clustering
Authors:
Hongjing Zhang,
Tianyang Zhan,
Sugato Basu,
Ian Davidson
Abstract:
The area of constrained clustering has been extensively explored by researchers and used by practitioners. Constrained clustering formulations exist for popular algorithms such as k-means, mixture models, and spectral clustering but have several limitations. A fundamental strength of deep learning is its flexibility, and here we explore a deep learning framework for constrained clustering and in particular explore how it can extend the field of constrained clustering. We show that our framework can not only handle standard together/apart constraints (without the well documented negative effects reported earlier) generated from labeled side information but more complex constraints generated from new types of side information such as continuous values and high-level domain knowledge. Furthermore, we propose an efficient training paradigm that is generally applicable to these four types of constraints. We validate the effectiveness of our approach by empirical results on both image and text datasets. We also study the robustness of our framework when learning with noisy constraints and show how different components of our framework contribute to the final performance. Our source code is available at https://github.com/blueocean92/deep_constrained_clustering.
Submitted 7 January, 2021;
originally announced January 2021.
-
Deep Historical Borrowing Framework to Prospectively and Simultaneously Synthesize Control Information in Confirmatory Clinical Trials with Multiple Endpoints
Authors:
Tianyu Zhan,
Yiwang Zhou,
Ziqian Geng,
Yihua Gu,
Jian Kang,
Li Wang,
Xiaohong Huang,
Elizabeth H. Slate
Abstract:
In current clinical trial development, historical information is receiving more attention as it provides utility beyond sample size calculation. Meta-analytic-predictive (MAP) priors and robust MAP priors have been proposed for prospectively borrowing historical data on a single endpoint. To simultaneously synthesize control information from multiple endpoints in confirmatory clinical trials, we propose to approximate posterior probabilities from a Bayesian hierarchical model and estimate critical values by deep learning to construct pre-specified strategies for hypothesis testing. This feature is important to ensure study integrity by establishing prospective decision functions before the trial conduct. Simulations are performed to show that our method properly controls family-wise error rate (FWER) and preserves power as compared with a typical practice of choosing constant critical values given a subset of null space. Satisfactory performance under prior-data conflict is also demonstrated. We further illustrate our method using a case study in Immunology.
Submitted 1 August, 2022; v1 submitted 28 August, 2020;
originally announced August 2020.
-
A practical Response Adaptive Block Randomization (RABR) design with analytic type I error protection
Authors:
Tianyu Zhan,
Lu Cui,
Ziqian Geng,
Lanju Zhang,
Yihua Gu,
Ivan S. F. Chan
Abstract:
Response adaptive randomization (RAR) is appealing from methodological, ethical, and pragmatic perspectives in the sense that subjects are more likely to be randomized to better performing treatment groups based on accumulating data. However, applications of RAR in confirmatory drug clinical trials with multiple active arms are limited largely due to its complexity, and lack of control of randomization ratios to different treatment groups. To address the aforementioned issues, we propose a Response Adaptive Block Randomization (RABR) design allowing arbitrarily pre-specified randomization ratios for the control and high-performing groups to meet clinical trial objectives. We show the validity of the conventional unweighted test in RABR with a controlled type I error rate based on the weighted combination test for sample size adaptive design invoking no large sample approximation. The advantages of the proposed RABR in terms of robustly reaching target final sample size to meet regulatory requirements and increasing statistical power as compared with the popular Doubly Adaptive Biased Coin Design (DBCD) are demonstrated by statistical simulations and a practical clinical trial design example.
Submitted 1 August, 2022; v1 submitted 15 April, 2020;
originally announced April 2020.
-
Line Drawings of Natural Scenes Guide Visual Attention
Authors:
Kai-Fu Yang,
Wen-Wen Jiang,
Teng-Fei Zhan,
Yong-Jie Li
Abstract:
Visual search is an important strategy of the human visual system for fast scene perception. The guided search theory suggests that the global layout or other top-down sources of scenes play a crucial role in guiding object searching. In order to verify the specific roles of scene layout and regional cues in guiding visual attention, we executed a psychophysical experiment to record the human fixations on line drawings of natural scenes with an eye-tracking system in this work. We collected the human fixations of ten subjects from 498 natural images and of another ten subjects from the corresponding 996 human-marked line drawings of boundaries (two boundary maps per image) under free-viewing condition. The experimental results show that with the absence of some basic features like color and luminance, the distribution of the fixations on the line drawings has a high correlation with that on the natural images. Moreover, compared to the basic cues of regions, subjects pay more attention to the closed regions of line drawings which are usually related to the dominant objects of the scenes. Finally, we built a computational model to demonstrate that the fixation information on the line drawings can be used to significantly improve the performances of classical bottom-up models for fixation prediction in natural scenes. These results support that Gestalt features and scene layout are important cues for guiding fast visual object searching.
Submitted 19 December, 2019;
originally announced December 2019.
-
Finite-Sample Two-Group Composite Hypothesis Testing via Machine Learning
Authors:
Tianyu Zhan,
Jian Kang
Abstract:
In the problem of composite hypothesis testing, identifying the potential uniformly most powerful (UMP) unbiased test is of great interest. Beyond typical hypothesis settings with exponential family, it is usually challenging to prove the existence and further construct such UMP unbiased tests with finite sample size. For example in the COVID-19 pandemic with limited previous assumptions on the treatment for investigation and the standard of care, adaptive clinical trials are appealing due to ethical considerations, and the ability to accommodate uncertainty while conducting the trial. Although several methods have been proposed to control type I error rates, how to find a more powerful hypothesis testing strategy is still an open question. Motivated by this problem, we propose an automatic framework of constructing test statistics and corresponding critical values via machine learning methods to enhance power in a finite sample. In this article, we particularly illustrate the performance using Deep Neural Networks (DNN) and discuss its advantages. Simulations and two case studies of adaptive designs demonstrate that our method is automatic, general and pre-specified to construct statistics with satisfactory power in finite-sample. Supplemental materials are available online including R code and an R shiny app.
Submitted 1 August, 2022; v1 submitted 16 December, 2019;
originally announced December 2019.
-
Optimizing Graphical Procedures for Multiplicity Control in a Confirmatory Clinical Trial via Deep Learning
Authors:
Tianyu Zhan,
Alan H Hartford,
Jian Kang,
Walter W Offen
Abstract:
In confirmatory clinical trials, it has been proposed to use a simple iterative graphical approach to construct and perform intersection hypotheses tests with a weighted Bonferroni-type procedure to control type I errors in the strong sense. Given Phase II study results or other prior knowledge, it is usually of main interest to find the optimal graph that maximizes a certain objective function in a future Phase III study. In this article, we evaluate the performance of two existing derivative-free constrained methods, and further propose a deep learning enhanced optimization framework. Our method numerically approximates the objective function via feedforward neural networks (FNNs) and then performs optimization with available gradient information. It can be constrained so that some features of the testing procedure are held fixed while optimizing over other features. Simulation studies show that our FNN-based approach has a better balance between robustness and time efficiency than some existing derivative-free constrained optimization algorithms. Compared to the traditional stochastic search method, our optimizer has moderate multiplicity adjusted power gain when the number of hypotheses is relatively large. We further apply it to a case study to illustrate how to optimize a multiple testing procedure with respect to a specific study objective.
Submitted 1 August, 2022; v1 submitted 27 August, 2019;
originally announced August 2019.
-
Unit-free and robust detection of differential expression from RNA-Seq data
Authors:
Hui Jiang,
Tianyu Zhan
Abstract:
Ultra high-throughput sequencing of transcriptomes (RNA-Seq) is a widely used method for quantifying gene expression levels due to its low cost, high accuracy and wide dynamic range for detection. However, the nature of RNA-Seq makes it nearly impossible to provide absolute measurements of transcript abundances. Several units or data summarization methods for transcript quantification have been proposed in the past to account for differences in transcript lengths and sequencing depths across different genes and different samples. Nevertheless, further between-sample normalization is still needed for reliable detection of differentially expressed genes. In this paper we propose a unified statistical model for joint detection of differential gene expression and between-sample normalization. Our method is independent of the unit in which gene expression levels are summarized. We also introduce an efficient algorithm for model fitting. Due to the L0-penalized likelihood used in our model, it is able to reliably normalize the data and detect differential gene expression in some cases when more than $50\%$ of the genes are differentially expressed in an asymmetric manner. We compare our method with existing methods using simulated and real data sets.
Submitted 26 August, 2016; v1 submitted 18 May, 2014;
originally announced May 2014.
-
Determination of the quantized topological magneto-electric effect in topological insulators from Rayleigh scattering
Authors:
Lixin Ge,
Tianrong Zhan,
Dezhuan Han,
Xiaohan Liu,
Jian Zi
Abstract:
Topological insulators (TIs) exhibit many exotic properties. In particular, a topological magneto-electric (TME) effect, quantized in units of the fine structure constant, exists in TIs. In this Letter, we study theoretically the scattering properties of electromagnetic waves by TI circular cylinders particularly in the Rayleigh scattering limit. Compared with ordinary dielectric cylinders, the scattering by TI cylinders shows many unusual features due to the TME effect. Two proposals are suggested to determine the TME effect of TIs simply based on measuring the electric-field components of scattered waves in the far field at one or two scattering angles. Our results could also offer a way to measure the fine structure constant.
Submitted 9 April, 2014;
originally announced April 2014.
-
Tunable terahertz radiation from graphene induced by moving electrons
Authors:
T. R. Zhan,
D. Z. Han,
X. H. Hu,
X. H. Liu,
S. T. Chui,
J. Zi
Abstract:
Based on a structure consisting of a single graphene layer situated on a periodic dielectric grating, we show theoretically that intense terahertz (THz) radiations can be generated by an electron bunch moving atop the graphene layer. The underlying physics lies in the fact that a moving electron bunch with rather low electron energy ($\sim$1 keV) can efficiently excite graphene plasmons (GPs) of THz frequencies with a strong confinement of near-fields. GPs can be further scattered into free space by the grating for those satisfying the phase matching condition. The radiation patterns can be controlled by varying the velocity of the moving electrons. Importantly, the radiation frequencies can be tuned by varying the Fermi level of the graphene layer, offering tunable THz radiations that can cover a wide frequency range. Our results could pave the way toward developing tunable and miniature THz radiation sources based on graphene.
Submitted 12 February, 2014;
originally announced February 2014.
-
Transfer matrix method for optics in graphene layers
Authors:
Tianrong Zhan,
Xi Shi,
Yunyun Dai,
Xiaohan Liu,
Jian Zi
Abstract:
A transfer matrix method is developed for optical calculations of non-interacting graphene layers. Within the framework of this method, optical properties such as reflection, transmission and absorption for single-, double- and multi-layer graphene are studied. We also apply the method to structures consisting of periodically arranged graphene layers, revealing well-defined photonic band structures and even photonic bandgaps. Finally, we discuss graphene plasmons and introduce a simple way to tune the plasmon dispersion.
Submitted 23 December, 2012;
originally announced December 2012.