Search | arXiv e-print repository

Are large language models superhuman chemists?

Authors: Adrian Mirza, Nawaf Alampara, Sreekanth Kunchapu, Benedict Emoekabu, Aswanth Krishnan, Mara Wilhelmi, Macjonathan Okereke, Juliane Eberhardt, Amir Mohammad Elahi, Maximilian Greiner, Caroline T. Holick, Tanya Gupta, Mehrdad Asgari, Christina Glaubitz, Lea C. Klepsch, Yannik Köster, Jakob Meyer, Santiago Miret, Tim Hoffmann, Fabian Alexander Kreth, Michael Ringleb, Nicole Roesner, Ulrich S. Schubert, Leanne M. Stafast, Dinga Wonanke , et al. (3 additional authors not shown)

Abstract: Large language models (LLMs) have gained widespread interest due to their ability to process human language and perform tasks on which they have not been explicitly trained. This is relevant for the chemical sciences, which face the problem of small and diverse datasets that are frequently in the form of text. LLMs have shown promise in addressing these issues and are increasingly being harnessed… ▽ More Large language models (LLMs) have gained widespread interest due to their ability to process human language and perform tasks on which they have not been explicitly trained. This is relevant for the chemical sciences, which face the problem of small and diverse datasets that are frequently in the form of text. LLMs have shown promise in addressing these issues and are increasingly being harnessed to predict chemical properties, optimize reactions, and even design and conduct experiments autonomously. However, we still have only a very limited systematic understanding of the chemical reasoning capabilities of LLMs, which would be required to improve models and mitigate potential harms. Here, we introduce "ChemBench," an automated framework designed to rigorously evaluate the chemical knowledge and reasoning abilities of state-of-the-art LLMs against the expertise of human chemists. We curated more than 7,000 question-answer pairs for a wide array of subfields of the chemical sciences, evaluated leading open and closed-source LLMs, and found that the best models outperformed the best human chemists in our study on average. The models, however, struggle with some chemical reasoning tasks that are easy for human experts and provide overconfident, misleading predictions, such as about chemicals' safety profiles. These findings underscore the dual reality that, although LLMs demonstrate remarkable proficiency in chemical tasks, further research is critical to enhancing their safety and utility in chemical sciences. Our findings also indicate a need for adaptations to chemistry curricula and highlight the importance of continuing to develop evaluation frameworks to improve safe and useful LLMs. △ Less

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2301.03164 [pdf, other]

Cursive Caption Text Detection in Videos

Authors: Ali Mirza, Imran Siddiqi

Abstract: Textual content appearing in videos represents an interesting index for semantic retrieval of videos (from archives), generation of alerts (live streams) as well as high level applications like opinion mining and content summarization. One of the key components of such systems is the detection of textual content in video frames and the same makes the subject of our present study. This paper presen… ▽ More Textual content appearing in videos represents an interesting index for semantic retrieval of videos (from archives), generation of alerts (live streams) as well as high level applications like opinion mining and content summarization. One of the key components of such systems is the detection of textual content in video frames and the same makes the subject of our present study. This paper presents a robust technique for detection of textual content appearing in video frames. More specifically we target text in cursive script taking Urdu text as a case study. Detection of textual regions in video frames is carried out by fine-tuning object detectors based on deep convolutional neural networks for the specific case of text detection. Since it is common to have videos with caption text in multiple-scripts, cursive text is distinguished from Latin text using a script-identification module. Finally, detection and script identification are combined in a single end-to-end trainable system. Experiments on a comprehensive dataset of around 11,000 video frames report an F-measure of 0.91. △ Less

Submitted 8 January, 2023; originally announced January 2023.

Comments: 19 pages, 16 figures

arXiv:2204.09153 [pdf, other]

doi 10.1109/TNANO.2022.3223915

Characterization and Optimization of Integrated Silicon-Photonic Neural Networks under Fabrication-Process Variations

Authors: Asif Mirza, Amin Shafiee, Sanmitra Banerjee, Krishnendu Chakrabarty, Sudeep Pasricha, Mahdi Nikdast

Abstract: Silicon-photonic neural networks (SPNNs) have emerged as promising successors to electronic artificial intelligence (AI) accelerators by offering orders of magnitude lower latency and higher energy efficiency. Nevertheless, the underlying silicon photonic devices in SPNNs are sensitive to inevitable fabrication-process variations (FPVs) stemming from optical lithography imperfections. Consequently… ▽ More Silicon-photonic neural networks (SPNNs) have emerged as promising successors to electronic artificial intelligence (AI) accelerators by offering orders of magnitude lower latency and higher energy efficiency. Nevertheless, the underlying silicon photonic devices in SPNNs are sensitive to inevitable fabrication-process variations (FPVs) stemming from optical lithography imperfections. Consequently, the inferencing accuracy in an SPNN can be highly impacted by FPVs -- e.g., can drop to below 10% -- the impact of which is yet to be fully studied. In this paper, we, for the first time, model and explore the impact of FPVs in the waveguide width and silicon-on-insulator (SOI) thickness in coherent SPNNs that use Mach-Zehnder Interferometers (MZIs). Leveraging such models, we propose a novel variation-aware, design-time optimization solution to improve MZI tolerance to different FPVs in SPNNs. Simulation results for two example SPNNs of different scales under realistic and correlated FPVs indicate that the optimized MZIs can improve the inferencing accuracy by up to 93.95% for the MNIST handwritten digit dataset -- considered as an example in this paper -- which corresponds to a <0.5% accuracy loss compared to the variation-free case. The proposed one-time optimization method imposes low area overhead, and hence is applicable even to resource-constrained designs △ Less

Submitted 19 April, 2022; originally announced April 2022.

arXiv:2107.05530 [pdf]

ROBIN: A Robust Optical Binary Neural Network Accelerator

Authors: Febin P. Sunny, Asif Mirza, Mahdi Nikdast, Sudeep Pasricha

Abstract: Domain specific neural network accelerators have garnered attention because of their improved energy efficiency and inference performance compared to CPUs and GPUs. Such accelerators are thus well suited for resource-constrained embedded systems. However, map** sophisticated neural network models on these accelerators still entails significant energy and memory consumption, along with high infer… ▽ More Domain specific neural network accelerators have garnered attention because of their improved energy efficiency and inference performance compared to CPUs and GPUs. Such accelerators are thus well suited for resource-constrained embedded systems. However, map** sophisticated neural network models on these accelerators still entails significant energy and memory consumption, along with high inference time overhead. Binarized neural networks (BNNs), which utilize single-bit weights, represent an efficient way to implement and deploy neural network models on accelerators. In this paper, we present a novel optical-domain BNN accelerator, named ROBIN, which intelligently integrates heterogeneous microring resonator optical devices with complementary capabilities to efficiently implement the key functionalities in BNNs. We perform detailed fabrication-process variation analyses at the optical device level, explore efficient corrective tuning for these devices, and integrate circuit-level optimization to counter thermal variations. As a result, our proposed ROBIN architecture possesses the desirable traits of being robust, energy-efficient, low latency, and high throughput, when executing BNN models. Our analysis shows that ROBIN can outperform the best-known optical BNN accelerators and also many electronic accelerators. Specifically, our energy-efficient ROBIN design exhibits energy-per-bit values that are ~4x lower than electronic BNN accelerators and ~933x lower than a recently proposed photonic BNN accelerator, while a performance-efficient ROBIN design shows ~3x and ~25x better performance than electronic and photonic BNN accelerators, respectively. △ Less

Submitted 12 July, 2021; originally announced July 2021.

arXiv:2103.08828 [pdf]

ARXON: A Framework for Approximate Communication over Photonic Networks-on-Chip

Authors: Febin Sunny, Asif Mirza, Ishan Thakkar, Mahdi Nikdast, Sudeep Pasricha

Abstract: The approximate computing paradigm advocates for relaxing accuracy goals in applications to improve energy-efficiency and performance. Recently, this paradigm has been explored to improve the energy-efficiency of silicon photonic networks-on-chip (PNoCs). Silicon photonic interconnects suffer from high power dissipation because of laser sources, which generate carrier wavelengths, and tuning power… ▽ More The approximate computing paradigm advocates for relaxing accuracy goals in applications to improve energy-efficiency and performance. Recently, this paradigm has been explored to improve the energy-efficiency of silicon photonic networks-on-chip (PNoCs). Silicon photonic interconnects suffer from high power dissipation because of laser sources, which generate carrier wavelengths, and tuning power required for regulating photonic devices under different uncertainties. In this paper, we propose a framework called ARXON to reduce such power dissipation overhead by enabling intelligent and aggressive approximation during communication over silicon photonic links in PNoCs. Our framework reduces laser and tuning-power overhead while intelligently approximating communication, such that application output quality is not distorted beyond an acceptable limit. Simulation results show that our framework can achieve up to 56.4% lower laser power consumption and up to 23.8% better energy-efficiency than the best-known prior work on approximate communication with silicon photonic interconnects and for the same application output quality. △ Less

Submitted 15 March, 2021; originally announced March 2021.

Comments: arXiv admin note: text overlap with arXiv:2002.11289

arXiv:2102.06960 [pdf]

CrossLight: A Cross-Layer Optimized Silicon Photonic Neural Network Accelerator

Authors: Febin Sunny, Asif Mirza, Mahdi Nikdast, Sudeep Pasricha

Abstract: Domain-specific neural network accelerators have seen growing interest in recent years due to their improved energy efficiency and inference performance compared to CPUs and GPUs. In this paper, we propose a novel cross-layer optimized neural network accelerator called CrossLight that leverages silicon photonics. CrossLight includes device-level engineering for resilience to process variations and… ▽ More Domain-specific neural network accelerators have seen growing interest in recent years due to their improved energy efficiency and inference performance compared to CPUs and GPUs. In this paper, we propose a novel cross-layer optimized neural network accelerator called CrossLight that leverages silicon photonics. CrossLight includes device-level engineering for resilience to process variations and thermal crosstalk, circuit-level tuning enhancements for inference latency reduction, and architecture-level optimization to enable higher resolution, better energy-efficiency, and improved throughput. On average, CrossLight offers 9.5x lower energy-per-bit and 15.9x higher performance-per-watt at 16-bit resolution than state-of-the-art photonic deep learning accelerators. △ Less

Submitted 13 February, 2021; originally announced February 2021.

arXiv:2002.11289 [pdf]

LORAX: Loss-Aware Approximations for Energy-Efficient Silicon Photonic Networks-on-Chip

Authors: Febin Sunny, Asif Mirza, Ishan Thakkar, Sudeep Pasricha, Nikdast Mahdi

Abstract: The approximate computing paradigm advocates for relaxing accuracy goals in applications to improve energy-efficiency and performance. Recently, this paradigm has been explored to improve the energy efficiency of silicon photonic networks-on-chip (PNoCs). In this paper, we propose a novel framework (LORAX) to enable more aggressive approximation during communication over silicon photonic links in… ▽ More The approximate computing paradigm advocates for relaxing accuracy goals in applications to improve energy-efficiency and performance. Recently, this paradigm has been explored to improve the energy efficiency of silicon photonic networks-on-chip (PNoCs). In this paper, we propose a novel framework (LORAX) to enable more aggressive approximation during communication over silicon photonic links in PNoCs. Given that silicon photonic interconnects have significant power dissipation due to the laser sources that generate the wavelengths for photonic communication, our framework attempts to reduce laser power overheads while intelligently approximating communication such that application output quality is not distorted beyond an acceptable limit. To the best of our knowledge, this is the first work that considers loss-aware laser power management and multilevel signaling to enable effective data approximation and energy-efficiency in PNoCs. Simulation results show that our framework can achieve up to 31.4% lower laser power consumption and up to 12.2% better energy efficiency than the best known prior work on approximate communication with silicon photonic interconnects, for the same application output quality △ Less

Submitted 25 February, 2020; originally announced February 2020.

Comments: Submitted and accepted at GLSVLSI 2020

arXiv:2001.08155 [pdf, other]

An Intelligent and Time-Efficient DDoS Identification Framework for Real-Time Enterprise Networks SAD-F: Spark Based Anomaly Detection Framework

Authors: Awais Ahmed, Sufian Hameed, Muhammad Rafi, Qublai Khan Ali Mirza

Abstract: Anomaly detection is a crucial step for preventing malicious activities in the network and kee** resources available all the time for legitimate users. It is noticed from various studies that classical anomaly detectors work well with small and sampled data, but the chances of failures increase with real-time (non-sampled data) traffic data. In this paper, we will be exploring security analytic… ▽ More Anomaly detection is a crucial step for preventing malicious activities in the network and kee** resources available all the time for legitimate users. It is noticed from various studies that classical anomaly detectors work well with small and sampled data, but the chances of failures increase with real-time (non-sampled data) traffic data. In this paper, we will be exploring security analytic techniques for DDoS anomaly detection using different machine learning techniques. In this paper, we are proposing a novel approach which deals with real traffic as input to the system. Further, we study and compare the performance factor of our proposed framework on three different testbeds including normal commodity hardware, low-end system, and high-end system. Hardware details of testbeds are discussed in the respective section. Further in this paper, we investigate the performance of the classifiers in (near) real-time detection of anomalies attacks. This study also focused on the feature selection process that is as important for the anomaly detection process as it is for general modeling problems. Several techniques have been studied for feature selection and it is observed that proper feature selection can increase performance in terms of model's execution time - which totally depends upon the traffic file or traffic capturing process. △ Less

Submitted 14 February, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

arXiv:1910.10958 [pdf]

Malware Classification using Deep Learning based Feature Extraction and Wrapper based Feature Selection Technique

Authors: Muhammad Furqan Rafique, Muhammad Ali, Aqsa Saeed Qureshi, Asifullah Khan, Anwar Majid Mirza

Abstract: In the case of malware analysis, categorization of malicious files is an essential part after malware detection. Numerous static and dynamic techniques have been reported so far for categorizing malware. This research presents a deep learning-based malware detection (DLMD) technique based on static methods for classifying different malware families. The proposed DLMD technique uses both the byte a… ▽ More In the case of malware analysis, categorization of malicious files is an essential part after malware detection. Numerous static and dynamic techniques have been reported so far for categorizing malware. This research presents a deep learning-based malware detection (DLMD) technique based on static methods for classifying different malware families. The proposed DLMD technique uses both the byte and ASM files for feature engineering, thus classifying malware families. First, features are extracted from byte files using two different Deep Convolutional Neural Networks (CNN). After that, essential and discriminative opcode features are selected using a wrapper-based mechanism, where Support Vector Machine (SVM) is used as a classifier. The idea is to construct a hybrid feature space by combining the different feature spaces to overcome the shortcoming of particular feature space and thus, reduce the chances of missing a malware. Finally, the hybrid feature space is used to train a Multilayer Perceptron, which classifies all nine different malware families. Experimental results show that proposed DLMD technique achieves log-loss of 0.09 for ten independent runs. Moreover, the proposed DLMD technique's performance is compared against different classifiers and shows its effectiveness in categorizing malware. The relevant code and database can be found at https://github.com/cyberhunters/Malware-Detection-Using-Machine-Learning. △ Less

Submitted 26 December, 2020; v1 submitted 24 October, 2019; originally announced October 2019.

Comments: 21 pages, 8 figures, 11 tables

arXiv:1710.09207 [pdf, other]

doi 10.1109/TNNLS.2019.2935975

Unsupervised and Semi-supervised Anomaly Detection with LSTM Neural Networks

Authors: Tolga Ergen, Ali Hassan Mirza, Suleyman Serdar Kozat

Abstract: We investigate anomaly detection in an unsupervised framework and introduce Long Short Term Memory (LSTM) neural network based algorithms. In particular, given variable length data sequences, we first pass these sequences through our LSTM based structure and obtain fixed length sequences. We then find a decision function for our anomaly detectors based on the One Class Support Vector Machines (OC-… ▽ More We investigate anomaly detection in an unsupervised framework and introduce Long Short Term Memory (LSTM) neural network based algorithms. In particular, given variable length data sequences, we first pass these sequences through our LSTM based structure and obtain fixed length sequences. We then find a decision function for our anomaly detectors based on the One Class Support Vector Machines (OC-SVM) and Support Vector Data Description (SVDD) algorithms. As the first time in the literature, we jointly train and optimize the parameters of the LSTM architecture and the OC-SVM (or SVDD) algorithm using highly effective gradient and quadratic programming based training methods. To apply the gradient based training method, we modify the original objective criteria of the OC-SVM and SVDD algorithms, where we prove the convergence of the modified objective criteria to the original criteria. We also provide extensions of our unsupervised formulation to the semi-supervised and fully supervised frameworks. Thus, we obtain anomaly detection algorithms that can process variable length data sequences while providing high performance, especially for time series data. Our approach is generic so that we also apply this approach to the Gated Recurrent Unit (GRU) architecture by directly replacing our LSTM based structure with the GRU based structure. In our experiments, we illustrate significant performance gains achieved by our algorithms with respect to the conventional methods. △ Less

Submitted 25 October, 2017; originally announced October 2017.

arXiv:1105.2002 [pdf]

Security Risks and Modern Cyber Security Technologies for Corporate Networks

Authors: Wajeb Gharibi, Abdulrahman Mirza

Abstract: This article aims to highlight current trends on the market of corporate antivirus solutions. Brief overview of modern security threats that can destroy IT environment is provided as well as a typical structure and features of antivirus suits for corporate users presented on the market. The general requirements for corporate products are determined according to the last report from av-comparatives… ▽ More This article aims to highlight current trends on the market of corporate antivirus solutions. Brief overview of modern security threats that can destroy IT environment is provided as well as a typical structure and features of antivirus suits for corporate users presented on the market. The general requirements for corporate products are determined according to the last report from av-comparatives.org [1]. The detailed analysis of new features is provided based on an overview of products available on the market nowadays. At the end, an enumeration of modern trends in antivirus industry for corporate users completes this article. Finally, the main goal of this article is to stress an attention about new trends suggested by AV vendors in their solutions in order to protect customers against newest security threats. △ Less

Submitted 10 May, 2011; originally announced May 2011.

Comments: 5 pages

Journal ref: (IJCSIS) International Journal of Computer Science and Information Security, Vol. 9, No. 1, January 2011

arXiv:1105.1720 [pdf]

Software Vulnerabilities, Banking Threats, Botnets and Malware Self-Protection Technologies

Authors: Wajeb Gharibi, Abdulrahman Mirza

Abstract: Information security is the protection of information from a wide range of threats in order to ensure success business continuity by minimizing risks and maximizing the return of investments and business opportunities. In this paper, we study and discuss the software vulnerabilities, banking threats, botnets and propose the malware self-protection technologies. Information security is the protection of information from a wide range of threats in order to ensure success business continuity by minimizing risks and maximizing the return of investments and business opportunities. In this paper, we study and discuss the software vulnerabilities, banking threats, botnets and propose the malware self-protection technologies. △ Less

Submitted 9 May, 2011; originally announced May 2011.

Comments: 5 pages

MSC Class: www.IJCSI.org

Journal ref: IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 1, January 2011, 236-241

arXiv:1003.1796 [pdf]

Content based Zero-Watermarking Algorithm for Authentication of Text Documents

Authors: Zunera Jalil, Anwar M. Mirza, Maria Sabir

Abstract: Copyright protection and authentication of digital contents has become a significant issue in the current digital epoch with efficient communication mediums such as internet. Plain text is the rampantly used medium used over the internet for information exchange and it is very crucial to verify the authenticity of information. There are very limited techniques available for plain text watermarking… ▽ More Copyright protection and authentication of digital contents has become a significant issue in the current digital epoch with efficient communication mediums such as internet. Plain text is the rampantly used medium used over the internet for information exchange and it is very crucial to verify the authenticity of information. There are very limited techniques available for plain text watermarking and authentication. This paper presents a novel zero-watermarking algorithm for authentication of plain text. The algorithm generates a watermark based on the text contents and this watermark can later be extracted using extraction algorithm to prove the authenticity of text document. Experimental results demonstrate the effectiveness of the algorithm against tampering attacks identifying watermark accuracy and distortion rate on 10 different text samples of varying length and attacks. △ Less

Submitted 9 March, 2010; originally announced March 2010.

Comments: Pages IEEE format, International Journal of Computer Science and Information Security, IJCSIS, Vol. 7 No. 2, February 2010, USA. ISSN 1947 5500, http://sites.google.com/site/ijcsis/

Showing 1–13 of 13 results for author: Mirza, A