-
Designing an Evaluation Framework for Large Language Models in Astronomy Research
Authors:
John F. Wu,
Alina Hyk,
Kiera McCormick,
Christine Ye,
Simone Astarita,
Elina Baral,
Jo Ciuca,
Jesse Cranney,
Anjalie Field,
Kartheik Iyer,
Philipp Koehn,
Jenn Kotler,
Sandor Kruk,
Michelle Ntampaka,
Charles O'Neill,
Joshua E. G. Peek,
Sanjib Sharma,
Mikaeel Yunus
Abstract:
Large Language Models (LLMs) are shifting how scientific research is done. It is imperative to understand how researchers interact with these models and how scientific sub-communities like astronomy might benefit from them. However, there is currently no standard for evaluating the use of LLMs in astronomy. Therefore, we present the experimental design for an evaluation study on how astronomy researchers interact with LLMs. We deploy a Slack chatbot that can answer queries from users via Retrieval-Augmented Generation (RAG); these responses are grounded in astronomy papers from arXiv. We record and anonymize user questions and chatbot answers, user upvotes and downvotes to LLM responses, user feedback to the LLM, and retrieved documents and similarity scores with the query. Our data collection method will enable future dynamic evaluations of LLM tools for astronomy.
Submitted 30 May, 2024;
originally announced May 2024.
-
Asymptotics of $p$-torsion subgroup sizes in class groups of monogenized cubic fields
Authors:
Mikaeel Yunus
Abstract:
Bhargava, Hanke, and Shankar have recently shown that the asymptotic average $2$-torsion subgroup size of the family of class groups of monogenized cubic fields with positive and negative discriminants is $3/2$ and $2$, respectively. In this paper, we provide strong computational evidence for these asymptotes. We then develop a pair of novel conjectures that predicts, for $p$ prime, the asymptotic average $p$-torsion subgroup size in class groups of monogenized cubic fields.
Submitted 28 January, 2021;
originally announced January 2021.
-
The LHC Olympics 2020: A Community Challenge for Anomaly Detection in High Energy Physics
Authors:
Gregor Kasieczka,
Benjamin Nachman,
David Shih,
Oz Amram,
Anders Andreassen,
Kees Benkendorfer,
Blaz Bortolato,
Gustaaf Brooijmans,
Florencia Canelli,
Jack H. Collins,
Biwei Dai,
Felipe F. De Freitas,
Barry M. Dillon,
Ioan-Mihail Dinu,
Zhongtian Dong,
Julien Donini,
Javier Duarte,
D. A. Faroughy,
Julia Gonski,
Philip Harris,
Alan Kahn,
Jernej F. Kamenik,
Charanjit K. Khosa,
Patrick Komiske,
Luc Le Pottier, et al. (22 additional authors not shown)
Abstract:
A new paradigm for data-driven, model-agnostic new physics searches at colliders is emerging, and aims to leverage recent breakthroughs in anomaly detection and machine learning. In order to develop and benchmark new anomaly detection methods within this framework, it is essential to have standard datasets. To this end, we have created the LHC Olympics 2020, a community challenge accompanied by a set of simulated collider events. Participants in these Olympics have developed their methods using an R&D dataset and then tested them on black boxes: datasets with an unknown anomaly (or not). This paper will review the LHC Olympics 2020 challenge, including an overview of the competition, a description of methods deployed in the competition, lessons learned from the experience, and implications for data analyses with future datasets as well as future colliders.
Submitted 20 January, 2021;
originally announced January 2021.
-
Quasi Anomalous Knowledge: Searching for new physics with embedded knowledge
Authors:
Sang Eon Park,
Dylan Rankin,
Silviu-Marian Udrescu,
Mikaeel Yunus,
Philip Harris
Abstract:
Discoveries of new phenomena often involve a dedicated search for a hypothetical physics signature. Recently, novel deep learning techniques have emerged for anomaly detection in the absence of a signal prior. However, by ignoring signal priors, the sensitivity of these approaches is significantly reduced. We present a new strategy dubbed Quasi Anomalous Knowledge (QUAK), whereby we introduce alternative signal priors that capture some of the salient features of new physics signatures, allowing for the recovery of sensitivity even when the alternative signal is incorrect. This approach can be applied to a broad range of physics models and neural network architectures. In this paper, we apply QUAK to anomaly detection of new physics events at the CERN Large Hadron Collider utilizing variational autoencoders with normalizing flow.
Submitted 11 June, 2021; v1 submitted 6 November, 2020;
originally announced November 2020.
-
HalalNet: A Deep Neural Network that Classifies the Halalness Slaughtered Chicken from their Images
Authors:
A. Elfakharany,
R. Yusof,
N. Ismail,
R. Arfa,
M. Yunus
Abstract:
The halal requirement in food is important for millions of Muslims worldwide, especially for meat and chicken products. Ensuring that slaughterhouses adhere to this requirement is a challenging task to do manually. In this paper, a method is proposed in which a camera takes images of slaughtered chicken on the conveyor in a slaughterhouse; the images are then analyzed by a deep neural network to classify whether each image shows a halal slaughtered chicken or not. However, traditional deep learning models require large amounts of training data, which in this case were challenging to collect, especially images of non-halal slaughtered chicken. Hence this paper shows how the use of one-shot learning [1] and transfer learning [2] can reach high accuracy on the small amount of data that was available. The architecture is based on the Siamese neural network architecture, which ranks the similarity between two inputs [3], and uses the Xception network [4] as the twin networks. We call it HalalNet. This work was done as part of SYCUT (syriah compliant slaughtering system), a monitoring system that monitors the halalness of the slaughtered chicken in a slaughterhouse. The data used to train and validate HalalNet were collected from the Azain slaughtering site (Semenyih, Selangor, Malaysia) and contain images of both halal and non-halal slaughtered chicken.
Submitted 10 June, 2019;
originally announced June 2019.
-
Cholera forecast for Dhaka, Bangladesh, with the 2016 El Niño
Authors:
Pamela P. Martinez,
Robert C. Reiner Jr.,
Manojit Roy,
Benjamin A. Cash,
Md. Yunus,
A. S. G. Faruque,
Sayeeda Huq,
Aaron A. King,
Mercedes Pascual
Abstract:
A substantial body of work supports a teleconnection between the El Niño Southern Oscillation (ENSO) and cholera incidence in Bangladesh. In particular, high positive anomalies during the winter (Dec-Feb) in Sea Surface Temperatures (SST) in the Tropical Pacific have been shown to exacerbate the seasonal outbreak of cholera following the monsoons from Aug to Nov, and climate studies have indicated a role of regional precipitation over Bangladesh in mediating this long-distance effect. Thus, the current strong El Niño has the potential to significantly increase cholera risk this year in Dhaka, Bangladesh, where the last five years have experienced low seasons of the disease. To examine this possibility and produce a forecast for the city, we considered two models for the transmission dynamics of cholera: a statistical model previously developed for the disease in this region, and a process-based model presented here that includes the effect of SST anomalies in the force of infection and is fitted to an extensive cholera surveillance record between 1995 and 2010. Prediction accuracy was evaluated with 'out-of-fit' data from the same surveillance efforts, by comparing the total number of cholera cases observed for the season to those predicted by model simulations 8 to 12 months ahead, starting in January each year. Encouraged by accurate forecasts for the low risk of cholera for this period, we then generated a prediction for this coming season. An increase above the third quantile in cholera cases is expected for the period of Aug - Dec 2016 with 92% and 87% probability respectively for the two models. This alert warrants the preparedness of the public health system. We discuss the possible limitations of our approach, including variations in the impact of El Niño events, and the importance of this large, warm event for further informing an early-warning system for cholera in Dhaka.
Submitted 4 July, 2016;
originally announced July 2016.
-
Does it Matter Which Citation Tool is Used to Compare the h-index of a Group of Highly Cited Researchers?
Authors:
Nader Ale Ebrahim,
Hadi Farhadi,
Hadi Salehi,
Melor Md Yunus,
Arezoo Aghaei Chadegani,
Maryam Farhadi,
Masood Fooladi
Abstract:
The h-index retrieved from citation indexes (Scopus, Google Scholar, and Web of Science) is used to measure the scientific performance and research impact of a scientist based on his or her number of publications and citations. It is also easily available and may be used for performance measures of scientists and for recruitment decisions. The aim of this study is to investigate the difference between the outputs and results from these three citation databases, namely Scopus, Google Scholar, and Web of Science, based upon the h-index of a group of highly cited researchers (Nobel Prize-winning scientists). The purposive sampling method was adopted to collect the required data. The results showed that there is a significant difference in the h-index between the three citation indexes of Scopus, Google Scholar, and Web of Science; the Google Scholar h-index was higher than the h-index in the two other databases. It was also concluded that there is a significant positive relationship between h-indices based on Google Scholar and Scopus. The citation indexes of Scopus, Google Scholar, and Web of Science may be useful for evaluating the h-index of scientists, but they have some limitations as well.
Submitted 4 June, 2013;
originally announced June 2013.
-
Using Visual Aids as a Motivational Tool in Enhancing Students Interest in Reading Literary Texts
Authors:
Melor Md Yunus,
Hadi Salehi,
Dexter Sigan Anak John
Abstract:
This study aims to investigate teachers' perceptions of the use of visual aids (e.g., animation videos, pictures, films and projectors) as a motivational tool in enhancing students' interest in reading literary texts. To achieve the aim of the study, a mixed-method approach was used to collect the required data. Fifty-two English teachers from seven national secondary schools in Kapit, Sarawak, Malaysia were selected. Five of the respondents were also randomly selected for interview. The analysis of the data indicated that the majority of the teachers had positive perceptions of the use of visual aids. The use of visual aids enables the teachers to engage their students closely with the literary texts and to help students of different English proficiency levels read the texts with interest. This aspect is vital, as literature helps to develop students' creative and critical thinking skills. Although the teachers had positive attitudes towards the use of visual aids, the study suggests that it would be more interesting and precise if it included students' perceptions as well.
Submitted 28 May, 2013;
originally announced May 2013.
-
Using Blogs to Promote Writing Skill in ESL Classroom
Authors:
Melor Md Yunus,
Julian Lau Kiing Tuan,
Hadi Salehi
Abstract:
This study provides details on the motivational factors for using blogs as an essential tool to promote students' writing skills in ESL classrooms. The study aims to discuss how blogs may be integrated into classroom activities to promote and polish students' writing skills. It also illustrates the features offered by blogs as well as the motivational essence attached to them. To achieve the aim of the study, a semi-structured interview protocol was used to collect the required qualitative data. The findings of the study serve as an insistent reminder that blogs, which have been clearly underlined in the curriculum, should be re-orchestrated more effectively by teachers of English as a Second Language (ESL).
Submitted 27 May, 2013;
originally announced May 2013.
-
Does Criticisms Overcome the Praises of Journal Impact Factor?
Authors:
Masood Fooladi,
Hadi Salehi,
Melor Md Yunus,
Maryam Farhadi,
Arezoo Aghaei Chadegani,
Hadi Farhadi,
Nader Ale Ebrahim
Abstract:
Journal impact factor (IF), a gauge of the influence and impact of a particular journal compared with other journals in the same area of research, reports the mean number of citations to the articles published in that journal. Although IF attracts more attention and is used more frequently than other measures, it has been subjected to criticisms that outweigh its advantages. Critically, extensive use of IF may damage the behaviour of editors and researchers, which could compromise the quality of scientific articles. Therefore, it is time to develop new journal ranking techniques beyond the journal impact factor.
Submitted 2 May, 2013;
originally announced May 2013.
-
A Comparison between Two Main Academic Literature Collections: Web of Science and Scopus Databases
Authors:
Arezoo Aghaei Chadegani,
Hadi Salehi,
Melor Md Yunus,
Hadi Farhadi,
Masood Fooladi,
Maryam Farhadi,
Nader Ale Ebrahim
Abstract:
Nowadays, the world's scientific community publishes an enormous number of papers in different scientific fields. In such an environment, it is essential to know which databases are equally efficient and objective for literature searches. It seems that the two most extensive databases are Web of Science and Scopus. Besides searching the literature, these two databases are used to rank journals in terms of their productivity and the total citations received, to indicate the journals' impact, prestige or influence. This article attempts to provide a comprehensive comparison of these databases to answer frequent questions that researchers ask, such as: How are Web of Science and Scopus different? In which aspects are these two databases similar? Or, if researchers are forced to choose one of them, which one should they prefer? To answer these questions, the two databases are compared based on their qualitative and quantitative characteristics.
Submitted 2 May, 2013;
originally announced May 2013.
-
Increasing power of the test through pre-test - a robust method
Authors:
Rossita M Yunus,
Shahjahan Khan
Abstract:
This paper develops robust test procedures for testing the intercept of a simple regression model when it is \textit{a priori} suspected that the slope has a specified value. Defining the unrestricted test (UT), restricted test (RT) and pre-test test (PTT) corresponding to the unrestricted (UE), restricted (RE), and preliminary test estimators (PTE) in the estimation case, the M-estimation methodology is used to formulate the M-tests and derive their asymptotic power functions. Analytical and graphical comparisons of the three tests are obtained by studying the power functions with respect to size and power of the tests. It is shown that PTT achieves a reasonable dominance over the others asymptotically.
Submitted 9 October, 2007;
originally announced October 2007.