-
Double-Barreled Question Detection at Momentive
Authors:
Peng Jiang,
Krishna Sumanth Muppalla,
Qing Wei,
Chidambara Natarajan Gopal,
Chun Wang
Abstract:
Momentive offers solutions in market research, customer experience, and enterprise feedback. The technology is gleaned from the billions of real responses to questions asked on the platform. However, people may create biased questions. A double-barreled question (DBQ) is a common type of biased question that asks about two aspects in one question, for example, "Do you agree with the statement: The food is yummy, and the service is great." This DBQ confuses survey respondents because the question has two parts. DBQs impact both the survey respondents and the survey owners. Momentive aims to detect DBQs and recommend that survey creators revise them in order to gather high-quality, unbiased survey data. Previous research has suggested detecting DBQs by checking for the existence of a grammatical conjunction. Although this rule-based approach is simple, it is error-prone because conjunctions can also exist in properly constructed questions. In this work, we present an end-to-end machine learning approach for DBQ classification. We handled the imbalanced data using active learning, and compared state-of-the-art embedding algorithms for transforming text data into vectors. Furthermore, we proposed a model interpretation technique that propagates the vector-level SHAP values to a SHAP value for each word in the questions. We concluded that the word2vec subword embedding with maximum pooling is the optimal word embedding representation in terms of precision and running time in the offline experiments using the survey data at Momentive. The A/B test and production metrics indicate that this model brings a positive change to the business. To the best of our knowledge, this is the first machine learning framework for DBQ detection, and it successfully differentiates Momentive from its competitors. We hope our work sheds light on machine learning approaches for biased question detection.
Submitted 11 February, 2022;
originally announced March 2022.
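The conjunction-based rule that the abstract above calls error-prone can be illustrated with a minimal sketch. The conjunction list, the function name `has_conjunction`, and the second example question are assumptions made for illustration; none of them come from the paper.

```python
import re

# Hypothetical conjunction list; the abstract does not specify one.
CONJUNCTIONS = {"and", "or"}


def has_conjunction(question: str) -> bool:
    """Flag a question as a possible DBQ if any token is a conjunction."""
    tokens = re.findall(r"[a-z']+", question.lower())
    return any(tok in CONJUNCTIONS for tok in tokens)


# Example from the abstract: a genuine double-barreled question.
dbq = "Do you agree with the statement: The food is yummy, and the service is great."
# Hypothetical single-topic question that still contains "and".
single = "How satisfied are you with our research and development team?"

print(has_conjunction(dbq))     # flagged, as intended
print(has_conjunction(single))  # also flagged, although it asks about one thing
```

Both questions are flagged, although only the first asks about two aspects. That false positive is the weakness the abstract attributes to rule-based detection.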
-
Li$_x$CoO$_2$ phase stability studied by machine learning-enabled scale bridging between electronic structure, statistical mechanics and phase field theories
Authors:
Gregory H. Teichert,
Sambit Das,
Muratahan Aykol,
Chirranjeevi Gopal,
Vikram Gavini,
Krishna Garikipati
Abstract:
Li$_xTM$O$_2$ (TM={Ni, Co, Mn}) are promising cathodes for Li-ion batteries, whose electrochemical cycling performance is strongly governed by crystal structure and phase stability as a function of Li content at the atomistic scale. Here, we use Li$_x$CoO$_2$ (LCO) as a model system to benchmark a scale-bridging framework that combines density functional theory (DFT) calculations at the atomistic scale with phase field modeling at the continuum scale to understand the impact of phase stability on microstructure evolution. This scale bridging is accomplished by incorporating traditional statistical mechanics methods with integrable deep neural networks, which allows formation energies for specific atomic configurations to be coarse-grained and incorporated in a neural network description of the free energy of the material. The resulting realistic free energy functions enable atomistically informed phase-field simulations. These computational results allow us to make connections to experimental work on LCO cathode degradation as a function of temperature, morphology and particle size.
Submitted 22 April, 2021; v1 submitted 16 April, 2021;
originally announced April 2021.
-
A user-centered approach to designing an experimental laboratory data platform
Authors:
Ha-Kyung Kwon,
Chirranjeevi Balaji Gopal,
Jared Kirschner,
Santiago Caicedo,
Brian D. Storey
Abstract:
While automated experiments and high-throughput methods are becoming more mainstream in the age of data, empowering individual researchers to capture, collate, and contextualize their data faster and more reproducibly still remains a challenge in science. Despite the abundance of software products to help digitize and organize scientific information, their broader adoption in the scientific community has been hindered by the lack of a holistic understanding of the diverse needs of researchers and their experimental processes. In this work, we take a user-centered approach to understand what essential elements of design and functionality researchers (in chemical and materials science) want in an experimental data platform to address the problem of data capture in their experimental processes. We found that having the capability to contextualize rich, complex experimental datasets is the primary user requirement. We synthesize this and other key findings into design criteria for a potential solution.
Submitted 28 July, 2020;
originally announced July 2020.
-
Dynamics of Content Quality in Collaborative Knowledge Production
Authors:
Emilio Ferrara,
Nazanin Alipourfard,
Keith Burghardt,
Chiranth Gopal,
Kristina Lerman
Abstract:
We explore the dynamics of user performance in collaborative knowledge production by studying the quality of answers to questions posted on Stack Exchange. We propose four indicators of answer quality: answer length, the number of code lines and hyperlinks to external web content it contains, and whether it is accepted by the asker as the most helpful answer to the question. Analyzing millions of answers posted over the period from 2008 to 2014, we uncover regular short-term and long-term changes in quality. In the short term, quality deteriorates over the course of a single session, with each successive answer becoming shorter, with fewer code lines and links, and less likely to be accepted. In contrast, performance improves over the long term, with more experienced users producing higher-quality answers. These trends are not a consequence of data heterogeneity, but rather have a behavioral origin. Our findings highlight the complex interplay between short-term deterioration in performance, potentially due to mental fatigue or attention depletion, and long-term performance improvement due to learning and skill acquisition, and its impact on the quality of user-generated content.
Submitted 10 June, 2017;
originally announced June 2017.