-
Oralytics Reinforcement Learning Algorithm
Authors:
Anna L. Trella,
Kelly W. Zhang,
Stephanie M. Carpenter,
David Elashoff,
Zara M. Greer,
Inbal Nahum-Shani,
Dennis Ruenger,
Vivek Shetty,
Susan A. Murphy
Abstract:
Dental disease is still one of the most common chronic diseases in the United States. While dental disease is preventable through healthy oral self-care behaviors (OSCB), this basic behavior is not consistently practiced. We have developed Oralytics, an online, reinforcement learning (RL) algorithm that optimizes the delivery of personalized intervention prompts to improve OSCB. In this paper, we offer a full overview of algorithm design decisions made using prior data, domain expertise, and experiments in a simulation test bed. The finalized RL algorithm was deployed in the Oralytics clinical trial, conducted from fall 2023 to summer 2024.
Submitted 18 June, 2024;
originally announced June 2024.
-
Posterior Sampling via Autoregressive Generation
Authors:
Kelly W Zhang,
Tiffany Cai,
Hongseok Namkoong,
Daniel Russo
Abstract:
Real-world decision-making requires grappling with a perpetual lack of data as environments change; intelligent agents must comprehend uncertainty and actively gather information to resolve it. We propose a new framework for learning bandit algorithms from massive historical data, which we demonstrate in a cold-start recommendation problem. First, we use historical data to pretrain an autoregressive model to predict a sequence of repeated feedback/rewards (e.g., responses to news articles shown to different users over time). In learning to make accurate predictions, the model implicitly learns an informed prior based on rich action features (e.g., article headlines) and how to sharpen beliefs as more rewards are gathered (e.g., clicks as each article is recommended). At decision-time, we autoregressively sample (impute) an imagined sequence of rewards for each action, and choose the action with the largest average imputed reward. Far from a heuristic, our approach is an implementation of Thompson sampling (with a learned prior), a prominent active exploration algorithm. We prove our pretraining loss directly controls online decision-making performance, and we demonstrate our framework on a news recommendation task where we integrate end-to-end fine-tuning of a pretrained language model to process news article headline text to improve performance.
Submitted 29 May, 2024;
originally announced May 2024.
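The decision rule described in the abstract above (impute an imagined reward sequence for each action, then pick the action with the largest average imputed reward) can be sketched in a toy setting. The sketch below is not the authors' implementation: it swaps the pretrained autoregressive sequence model for a Beta-Bernoulli posterior predictive (a Pólya urn), and the names impute_mean_reward, choose_action, horizon, and the uniform Beta(1, 1) prior are all assumptions made for illustration.

```python
import random


def impute_mean_reward(successes, failures, horizon, rng,
                       prior_a=1.0, prior_b=1.0):
    """Autoregressively sample `horizon` imagined binary rewards for one
    action and return their average.

    Assumption: a Beta-Bernoulli predictive stands in for the pretrained
    sequence model; `horizon` must be a positive integer.
    """
    a = prior_a + successes
    b = prior_b + failures
    imputed_total = 0
    for _ in range(horizon):
        # One-step-ahead predictive probability, given every reward
        # observed or imputed so far.
        p_next = a / (a + b)
        reward = 1 if rng.random() < p_next else 0
        # Condition on the imputed reward before generating the next one.
        a += reward
        b += 1 - reward
        imputed_total += reward
    return imputed_total / horizon


def choose_action(counts, horizon, rng):
    """Pick the action with the largest average imputed reward.

    `counts` is a list of (successes, failures) pairs, one per action
    (a hypothetical data layout chosen for this sketch).
    """
    scores = [impute_mean_reward(s, f, horizon, rng) for s, f in counts]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    # Three actions with different amounts of observed feedback.
    history = [(3, 1), (10, 10), (0, 0)]
    print(choose_action(history, horizon=500, rng=rng))
```

For a long horizon, the average imputed reward in this toy model behaves like a draw from the Beta posterior over the action's mean reward, which is why the rule acts as Thompson sampling here; an action with little feedback keeps a wide spread of imputed averages and is therefore still explored.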
-
The Fallacy of Minimizing Local Regret in the Sequential Task Setting
Authors:
Ziping Xu,
Kelly W. Zhang,
Susan A. Murphy
Abstract:
In the realm of Reinforcement Learning (RL), online RL is often conceptualized as an optimization problem, where an algorithm interacts with an unknown environment to minimize cumulative regret. In a stationary setting, strong theoretical guarantees, like a sublinear ($\sqrt{T}$) regret bound, can be obtained, which typically implies the convergence to an optimal policy and the cessation of exploration. However, these theoretical setups often oversimplify the complexities encountered in real-world RL implementations, where tasks arrive sequentially with substantial changes between tasks and the algorithm may not be allowed to adaptively learn within certain tasks. We study the changes beyond the outcome distributions, encompassing changes in the reward designs (mappings from outcomes to rewards) and the permissible policy spaces. Our results reveal the fallacy of myopically minimizing regret within each task: obtaining optimal regret rates in the early tasks may lead to worse rates in the subsequent ones, even when the outcome distributions stay the same. To realize the optimal cumulative regret bound across all the tasks, the algorithm has to overly explore in the earlier tasks. This theoretical insight is practically significant, suggesting that due to unanticipated changes (e.g., rapid technological development or human-in-the-loop involvement) between tasks, the algorithm needs to explore more than it would in the usual stationary setting within each task. This implication resonates with the common practice of using clipped policies in mobile health clinical trials and maintaining a fixed rate of $ε$-greedy exploration in robotic learning.
Submitted 16 March, 2024;
originally announced March 2024.
-
Monitoring Fidelity of Online Reinforcement Learning Algorithms in Clinical Trials
Authors:
Anna L. Trella,
Kelly W. Zhang,
Inbal Nahum-Shani,
Vivek Shetty,
Iris Yan,
Finale Doshi-Velez,
Susan A. Murphy
Abstract:
Online reinforcement learning (RL) algorithms offer great potential for personalizing treatment for participants in clinical trials. However, deploying an online, autonomous algorithm in the high-stakes healthcare setting makes quality control and data quality especially difficult to achieve. This paper proposes algorithm fidelity as a critical requirement for deploying online RL algorithms in clinical trials. It emphasizes the responsibility of the algorithm to (1) safeguard participants and (2) preserve the scientific utility of the data for post-trial analyses. We also present a framework for pre-deployment planning and real-time monitoring to help algorithm developers and clinical researchers ensure algorithm fidelity. To illustrate our framework's practical application, we present real-world examples from the Oralytics clinical trial. Since Spring 2023, this trial successfully deployed an autonomous, online RL algorithm to personalize behavioral interventions for participants at risk for dental disease.
Submitted 26 February, 2024;
originally announced February 2024.
-
Reward Design For An Online Reinforcement Learning Algorithm Supporting Oral Self-Care
Authors:
Anna L. Trella,
Kelly W. Zhang,
Inbal Nahum-Shani,
Vivek Shetty,
Finale Doshi-Velez,
Susan A. Murphy
Abstract:
Dental disease is one of the most common chronic diseases despite being largely preventable. However, professional advice on optimal oral hygiene practices is often forgotten or abandoned by patients. Therefore, patients may benefit from timely and personalized encouragement to engage in oral self-care behaviors. In this paper, we develop an online reinforcement learning (RL) algorithm for use in optimizing the delivery of mobile-based prompts to encourage oral hygiene behaviors. One of the main challenges in developing such an algorithm is ensuring that the algorithm considers the impact of the current action on the effectiveness of future actions (i.e., delayed effects), especially when the algorithm has been made simple in order to run stably and autonomously in a constrained, real-world setting (i.e., highly noisy, sparse data). We address this challenge by designing a quality reward which maximizes the desired health outcome (i.e., high-quality brushing) while minimizing user burden. We also highlight a procedure for optimizing the hyperparameters of the reward by building a simulation environment test bed and evaluating candidates using the test bed. The RL algorithm discussed in this paper will be deployed in Oralytics, an oral self-care app that provides behavioral strategies to boost patient engagement in oral hygiene practices.
Submitted 14 September, 2022; v1 submitted 15 August, 2022;
originally announced August 2022.
-
A Bayesian Approach to Learning Bandit Structure in Markov Decision Processes
Authors:
Kelly W. Zhang,
Omer Gottesman,
Finale Doshi-Velez
Abstract:
In the reinforcement learning literature, there are many algorithms developed for either Contextual Bandit (CB) or Markov Decision Processes (MDP) environments. However, when deploying reinforcement learning algorithms in the real world, even with domain expertise, it is often difficult to know whether it is appropriate to treat a sequential decision making problem as a CB or an MDP. In other words, do actions affect future states, or only the immediate rewards? Making the wrong assumption regarding the nature of the environment can lead to inefficient learning, or even prevent the algorithm from ever learning an optimal policy, even with infinite data. In this work we develop an online algorithm that uses a Bayesian hypothesis testing approach to learn the nature of the environment. Our algorithm allows practitioners to incorporate prior knowledge about whether the environment is that of a CB or an MDP, and effectively interpolate between classical CB and MDP-based algorithms to mitigate against the effects of misspecifying the environment. We perform simulations and demonstrate that in CB settings our algorithm achieves lower regret than MDP-based algorithms, while in non-bandit MDP settings our algorithm is able to learn the optimal policy, often achieving comparable regret to MDP-based algorithms.
Submitted 30 July, 2022;
originally announced August 2022.
-
Designing Reinforcement Learning Algorithms for Digital Interventions: Pre-implementation Guidelines
Authors:
Anna L. Trella,
Kelly W. Zhang,
Inbal Nahum-Shani,
Vivek Shetty,
Finale Doshi-Velez,
Susan A. Murphy
Abstract:
Online reinforcement learning (RL) algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education. Common challenges in designing and testing an RL algorithm in these settings include ensuring the RL algorithm can learn and run stably under real-time constraints, and accounting for the complexity of the environment, e.g., a lack of accurate mechanistic models for the user dynamics. To guide how one can tackle these challenges, we extend the PCS (Predictability, Computability, Stability) framework, a data science framework that incorporates best practices from machine learning and statistics in supervised learning (Yu and Kumbier, 2020), to the design of RL algorithms for the digital interventions setting. Further, we provide guidelines on how to design simulation environments, a crucial tool for evaluating RL candidate algorithms using the PCS framework. We illustrate the use of the PCS framework for designing an RL algorithm for Oralytics, a mobile health study aiming to improve users' tooth-brushing behaviors through the personalized delivery of intervention messages. Oralytics will go into the field in late 2022.
Submitted 18 August, 2022; v1 submitted 8 June, 2022;
originally announced June 2022.
-
Statistical Inference After Adaptive Sampling for Longitudinal Data
Authors:
Kelly W. Zhang,
Lucas Janson,
Susan A. Murphy
Abstract:
Online reinforcement learning and other adaptive sampling algorithms are increasingly used in digital intervention experiments to optimize treatment delivery for users over time. In this work, we focus on longitudinal user data collected by a large class of adaptive sampling algorithms that are designed to optimize treatment decisions online using accruing data from multiple users. Combining or "pooling" data across users allows adaptive sampling algorithms to potentially learn faster. However, by pooling, these algorithms induce dependence between the sampled user data trajectories; we show that this can cause standard variance estimators for i.i.d. data to underestimate the true variance of common estimators on this data type. We develop novel methods to perform a variety of statistical analyses on such adaptively sampled data via Z-estimation. Specifically, we introduce the \textit{adaptive} sandwich variance estimator, a corrected sandwich estimator that leads to consistent variance estimates under adaptive sampling. Additionally, to prove our results we develop novel theoretical tools for empirical processes on non-i.i.d., adaptively sampled longitudinal data which may be of independent interest. This work is motivated by our efforts in designing experiments in which online reinforcement learning algorithms optimize treatment decisions, yet statistical inference is essential for conducting analyses after experiments conclude.
Submitted 19 April, 2023; v1 submitted 14 February, 2022;
originally announced February 2022.
-
Statistical Inference with M-Estimators on Adaptively Collected Data
Authors:
Kelly W. Zhang,
Lucas Janson,
Susan A. Murphy
Abstract:
Bandit algorithms are increasingly used in real-world sequential decision-making problems. Associated with this is an increased desire to be able to use the resulting datasets to answer scientific questions like: Did one type of ad lead to more purchases? In which contexts is a mobile health intervention effective? However, classical statistical approaches fail to provide valid confidence intervals when used with data collected with bandit algorithms. Alternative methods have recently been developed for simple models (e.g., comparison of means). Yet there is a lack of general methods for conducting statistical inference using more complex models on data collected with (contextual) bandit algorithms; for example, current methods cannot be used for valid inference on parameters in a logistic regression model for a binary reward. In this work, we develop theory justifying the use of M-estimators -- which includes estimators based on empirical risk minimization as well as maximum likelihood -- on data collected with adaptive algorithms, including (contextual) bandit algorithms. Specifically, we show that M-estimators, modified with particular adaptive weights, can be used to construct asymptotically valid confidence regions for a variety of inferential targets.
Submitted 19 November, 2021; v1 submitted 28 April, 2021;
originally announced April 2021.
-
Inference for Batched Bandits
Authors:
Kelly W. Zhang,
Lucas Janson,
Susan A. Murphy
Abstract:
As bandit algorithms are increasingly utilized in scientific studies and industrial applications, there is an associated increasing need for reliable inference methods based on the resulting adaptively-collected data. In this work, we develop methods for inference on data collected in batches using a bandit algorithm. We first prove that the ordinary least squares estimator (OLS), which is asymptotically normal on independently sampled data, is not asymptotically normal on data collected using standard bandit algorithms when there is no unique optimal arm. This asymptotic non-normality result implies that the naive assumption that the OLS estimator is approximately normal can lead to Type-1 error inflation and confidence intervals with below-nominal coverage probabilities. Second, we introduce the Batched OLS estimator (BOLS) that we prove is (1) asymptotically normal on data collected from both multi-arm and contextual bandits and (2) robust to non-stationarity in the baseline reward.
Submitted 8 January, 2021; v1 submitted 8 February, 2020;
originally announced February 2020.
-
Language Modeling Teaches You More Syntax than Translation Does: Lessons Learned Through Auxiliary Task Analysis
Authors:
Kelly W. Zhang,
Samuel R. Bowman
Abstract:
Recent work using auxiliary prediction task classifiers to investigate the properties of LSTM representations has begun to shed light on why pretrained representations, like ELMo (Peters et al., 2018) and CoVe (McCann et al., 2017), are so beneficial for neural language understanding models. We still, though, do not yet have a clear understanding of how the choice of pretraining objective affects the type of linguistic information that models learn. With this in mind, we compare four objectives---language modeling, translation, skip-thought, and autoencoding---on their ability to induce syntactic and part-of-speech information. We make a fair comparison between the tasks by holding constant the quantity and genre of the training data, as well as the LSTM architecture. We find that representations from language models consistently perform best on our syntactic auxiliary prediction tasks, even when trained on relatively small amounts of data. These results suggest that language modeling may be the best data-rich pretraining task for transfer learning applications requiring syntactic information. We also find that the representations from randomly-initialized, frozen LSTMs perform strikingly well on our syntactic auxiliary tasks, but this effect disappears when the amount of training data for the auxiliary tasks is reduced.
Submitted 7 January, 2019; v1 submitted 26 September, 2018;
originally announced September 2018.
-
Two-Dimensional PN Monolayer Sheets with Fantastic Structures and Properties
Authors:
ShuangYing Ma,
Chaoyu He,
L. Z. Sun,
Haiping Lin,
Youyong Li,
K. W. Zhang
Abstract:
Three two-dimensional phosphorus nitride (PN) monolayer sheets (named $α$-, $β$-, and $γ$-PN, respectively) with fantastic structures and properties are predicted based on first-principles calculations. The $α$-PN and $γ$-PN are buckled structures, whereas $β$-PN shows puckered characteristics. Their unique structures endow these atomic PN sheets with high dynamic stabilities and anisotropic mechanical properties. They are all indirect semiconductors and their band gaps depend sensitively on the in-plane strain. Moreover, the nanoribbons patterned from these three PN monolayers demonstrate a remarkable quantum size effect. In particular, the zigzag $α$-PN nanoribbon shows size-dependent ferromagnetism. Their significant properties show potential in nano-electronics. The synthesis of the three phases of PN monolayer sheets is proposed theoretically, which deserves further study in experiments.
Submitted 1 October, 2015;
originally announced October 2015.
-
Novel Two-dimensional SiC2 Sheet with Full Pentagon Network
Authors:
J. Liu,
C. Y. He,
N. Jiao,
H. P. Xiao,
K. W. Zhang,
R. Z. Wang,
L. Z. Sun
Abstract:
We propose a promising two-dimensional nano-sheet of SiC2 (SiC2-pentagon) consisting of tetrahedral silicon atoms and triple-linked carbon atoms in a fully-pentagon network. The SiC2-pentagon with buckled configuration is more favorable than its planar counterpart and the previously proposed SiC2-silagraphene with tetra-coordinate silicon atoms, and its dynamical stability is confirmed through phonon analysis. Buckled SiC2-pentagon is an indirect-band-gap semiconductor with a gap of 1.388 eV. However, its one-dimensional nanoribbons can be metals or semiconductors depending on the edge type, shape, and decoration. Finally, we propose a method to produce the buckled SiC2-pentagon through chemical exfoliation on the beta-SiC(001)-c(2*2) SDB surface.
Submitted 24 July, 2013;
originally announced July 2013.
-
Magnetic Exchange Coupling and Anisotropy of 3d Transition-Metal Nanowire on the Surface of Graphyne Sheet
Authors:
Junjie He,
Pan Zhou,
N. Jiao,
S. Y. Ma,
K. W. Zhang,
R. Z. Wang,
L. Z. Sun
Abstract:
Using the density functional theory plus Hubbard-U (DFT+U) approach, we find that quasi-one-dimensional (1D) 3d transition-metal (TM) zigzag nanowires can be constructed by TM atoms adsorbed on the surface of a graphyne sheet. The results show that the TM exchange coupling of the zigzag nanowire mediated by sp-hybridized carbon atoms gives rise to long-range ferromagnetic order, except for Cr, which shows anti-ferromagnetic order. The magnetic exchange interaction of the TM chains follows a Zener-like p_z-d exchange mechanism: the coexistence of out-of-plane p_z-d and in-plane p_x-y-d exchange. Finally, by including spin-orbit interactions within spin-DFT, we calculate the magnetic anisotropy energy of the TM chain on graphyne. We find that the Fe and Co chains show considerable magnetic anisotropy energy (MAE) and orbital magnetic moment. The easy axis of the V, Cr, Mn and Fe chains is perpendicular to the surface, whereas the easy axis of Co lies in the surface. Moreover, only the V chain shows relatively large in-plane anisotropy. Our results open a new route to realize the applications of graphyne in spintronics.
Submitted 21 May, 2013; v1 submitted 8 May, 2013;
originally announced May 2013.
-
Structure, stability and electronic properties of tricycle type graphane
Authors:
Chaoyu He,
L. Z. Sun,
C. X. Zhang,
N. Jiao,
K. W. Zhang,
Jianxin Zhong
Abstract:
We propose a new allotrope of graphane, named tricycle graphane, with a 4up/2down UUUDUD hydrogenation in each hexagonal carbon ring, which is different from previously proposed allotropes with UUDUUD (boat-1) and UUUUDD (boat-2) types of hydrogenation. Its stability and electronic structures are systematically studied using first-principles methods. We find that the tricycle graphane is a stable phase in between the previously proposed chair and stirrup allotropes. Its electronic properties are very similar to those of chair, stirrup, boat-1, boat-2, and twist-boat allotropes. The negative Gibbs free energy of tricycle graphane is -91 meV/atom, which is very close to that of the most stable chair one (-103 meV/atom). Thus, this new two-dimensional hydrocarbon may be produced in the process of graphene hydrogenation with a relatively high probability compared to other conformers.
Submitted 30 April, 2012;
originally announced April 2012.
-
First-principles study of a novel superhard boron nitride phase
Authors:
Chaoyu He,
L. Z. Sun,
C. X. Zhang,
Xiangyang Peng,
K. W. Zhang,
Jianxin Zhong
Abstract:
A superhard boron nitride phase dubbed Z-BN is proposed as a possible intermediate phase between h-BN and zinc blende BN (c-BN), and investigated using first-principles calculations within the framework of density functional theory. Although the structure of Z-BN is similar to that of bct-BN containing four-eight BN rings, it is energetically more favorable than bct-BN. Our study reveals that Z-BN, with considerable structural stability and a high density comparable to c-BN, is a transparent insulator with an indirect band gap of about 5.27 eV. Amazingly, its Vickers hardness is 55.88 GPa, which is comparable to that of c-BN. This new BN phase may be produced in experiments through cold compression of AB-stacked h-BN due to its low transition pressure point of 3.3 GPa.
Submitted 10 June, 2012; v1 submitted 10 April, 2012;
originally announced April 2012.
-
Four superhard carbon allotropes: First-principle study
Authors:
Chaoyu He,
L. Z. Sun,
C. X. Zhang,
K. W. Zhang,
Xiangyang Peng,
Jianxin Zhong
Abstract:
Using a generalized genetic algorithm, we propose four new sp3 carbon allotropes with 5-6-7 (5-6-7-type Z-ACA and Z-CACB) or 4-6-8 (4-6-8-type Z4-A3B1 and A4-A2B2) carbon rings. Their stability, mechanical and electronic properties are systematically studied using first-principles methods. We find that all four carbon allotropes show amazing stability in comparison with recently proposed carbon phases. Both Z-ACA and Z-CACB are direct-band-gap semiconductors with band gaps of 2.261 eV and 4.196 eV, respectively, whereas Z4-A3B1 and A4-A2B2 are indirect-band-gap semiconductors with band gaps of 3.105 eV and 3.271 eV, respectively. Their mechanical properties reveal that all four carbon allotropes are superhard materials comparable to diamond.
Submitted 19 April, 2012; v1 submitted 27 March, 2012;
originally announced March 2012.
-
New Superhard Carbon Phases Between Graphite and Diamond
Authors:
Chaoyu He,
L. Z. Sun,
C. X. Zhang,
K. W. Zhang,
Xiangyang Peng,
Jianxin Zhong
Abstract:
Two new carbon allotropes (H-carbon and S-carbon) are proposed, as possible candidates for the intermediate superhard phases between graphite and diamond obtained in the process of cold compressing graphite, based on the results of first-principles calculations. Both H-carbon and S-carbon are more stable than the previously proposed M-carbon and W-carbon, and their bulk moduli are comparable to that of diamond. H-carbon is an indirect-band-gap semiconductor with a gap of 4.459 eV and S-carbon is a direct-band-gap semiconductor with a gap of 4.343 eV. The transition pressures from cold compressing graphite are 10.08 GPa and 5.93 GPa for H-carbon and S-carbon, respectively, which is consistent with the recent experimental report.
Submitted 11 June, 2012; v1 submitted 25 March, 2012;
originally announced March 2012.