Harnessing the Power of LLMs: Automating Unit Test Generation for High-Performance Computing

Authors: Rabimba Karanjai, Aftab Hussain, Md Rafiqul Islam Rabin, Lei Xu, Weidong Shi, Mohammad Amin Alipour

Abstract: Unit testing is crucial in software engineering for ensuring quality. However, it's not widely used in parallel and high-performance computing software, particularly scientific applications, due to their smaller, diverse user base and complex logic. These factors make unit testing challenging and expensive, as it requires specialized knowledge and existing automated tools are often ineffective. To address this, we propose an automated method for generating unit tests for such software, considering their unique features like complex logic and parallel processing. Recently, large language models (LLMs) have shown promise in coding and testing. We explored the capabilities of Davinci (text-davinci-002) and ChatGPT (gpt-3.5-turbo) in creating unit tests for C++ parallel programs. Our results show that LLMs can generate mostly correct and comprehensive unit tests, although they have some limitations, such as repetitive assertions and blank test cases.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2404.08151 [pdf, other]

doi 10.1145/3605098.3636029

Decentralized FaaS over Multi-Clouds with Blockchain based Management for Supporting Emerging Applications

Authors: Rabimba Karanjai, Lei Xu, Lin Chen, Nour Diallo, Weidong Shi

Abstract: Function-as-a-Service (FaaS) offers a streamlined cloud computing paradigm, but existing centralized systems suffer from vendor lock-in and single points of failure. We propose DeFaaS, a decentralized FaaS system leveraging blockchain technology and decentralized API management. DeFaaS addresses these limitations by establishing a secure, transparent registry of functions on a blockchain and enabling applications to discover and invoke them. This approach fosters scalability, flexibility, enhanced security, and improved reliability. Furthermore, DeFaaS's architecture extends beyond decentralized FaaS, supporting other distributed computing scenarios like dApps, volunteer computing, and multi-cloud service meshes. DeFaaS represents a significant advancement in decentralized computing with the potential to unlock a multitude of novel applications and use cases.

Submitted 11 April, 2024; originally announced April 2024.

Comments: The 39th ACM/SIGAPP Symposium On Applied Computing

arXiv:2403.10824 [pdf, other]

LookALike: Human Mimicry based collaborative decision making

Authors: Rabimba Karanjai, Weidong Shi

Abstract: Artificial General Intelligence falls short when communicating role-specific nuances to other systems. This is more pronounced when building autonomous LLM agents capable of, and designed for, communicating with each other for real-world problem solving. Humans can communicate context- and domain-specific nuances along with knowledge, and that has led to refinement of skills. In this work we propose and evaluate a novel method that leads to knowledge distillation among LLM agents, leading to real-time human role play that preserves unique contexts without relying on any stored data or pretraining. We also evaluate how our system performs better in simulated real-world tasks compared to the state of the art.

Submitted 16 March, 2024; originally announced March 2024.

arXiv:2403.09798 [pdf, other]

Comparing Rationality Between Large Language Models and Humans: Insights and Open Questions

Authors: Dana Alsagheer, Rabimba Karanjai, Nour Diallo, Weidong Shi, Yang Lu, Suha Beydoun, Qiaoning Zhang

Abstract: This paper delves into the dynamic landscape of artificial intelligence, specifically focusing on the burgeoning prominence of large language models (LLMs). We underscore the pivotal role of Reinforcement Learning from Human Feedback (RLHF) in augmenting LLMs' rationality and decision-making prowess. By meticulously examining the intricate relationship between human interaction and LLM behavior, we explore questions surrounding rationality and performance disparities between humans and LLMs, with particular attention to the Chat Generative Pre-trained Transformer. Our research employs comprehensive comparative analysis and delves into the inherent challenges of irrationality in LLMs, offering valuable insights and actionable strategies for enhancing their rationality. These findings hold significant implications for the widespread adoption of LLMs across diverse domains and applications, underscoring their potential to catalyze advancements in artificial intelligence.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.09740 [pdf, other]

doi 10.1145/3664646.3664771

Teaching Machines to Code: Smart Contract Translation with LLMs

Authors: Rabimba Karanjai, Lei Xu, Weidong Shi

Abstract: The advent of large language models (LLMs) has marked a significant milestone in the realm of artificial intelligence, with their capabilities often matching or surpassing human expertise in various domains. Among these achievements, their adeptness in translation tasks stands out, closely mimicking the intricate and preliminary processes undertaken by human translators to ensure the fidelity and quality of the translated content. Despite the advancements in utilizing LLMs for translating programming code across different languages, the domain of smart contract translation, particularly into languages not previously encountered by the LLM, remains largely unexplored. In our research, we present a pioneering approach, SolMover, which harnesses the synergy of two distinct LLMs within a unified framework. This framework is designed to grasp coding principles and apply this understanding to the translation of code into an unfamiliar language. Our study delves into the capacity of LLMs to mimic human learning processes, offering an in-depth evaluation of our methodology for converting smart contracts written in Solidity to Move, a language with limited resources. The framework employs one LLM to decipher coding conventions for the new language, creating a blueprint for the second LLM, which, lacking planning abilities, possesses coding expertise.
The empirical evidence from our experiments suggests that SolMover substantially enhances performance compared to gpt-3.5-turbo-1106, and achieves superior results over competitors such as Palm2 and Mixtral-8x7B-Instruct. Additionally, our analysis highlights the efficacy of our bug mitigation strategy in elevating code quality across all models, even outside the SolMover framework.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2308.02955 [pdf, other]

An Empirical Study of AI-based Smart Contract Creation

Authors: Rabimba Karanjai, Edward Li, Lei Xu, Weidong Shi

Abstract: The introduction of large language models (LLMs) like ChatGPT and Google Palm2 for smart contract generation seems to be the first well-established instance of an AI pair programmer. LLMs have access to a large number of open-source smart contracts, enabling them to utilize more extensive code in Solidity than other code generation tools. Although the initial and informal assessments of LLMs for smart contract generation are promising, a systematic evaluation is needed to explore the limits and benefits of these models. The main objective of this study is to assess the quality of generated code provided by LLMs for smart contracts. We also aim to evaluate the impact of the quality and variety of input parameters fed to LLMs. To achieve this aim, we created an experimental setup for evaluating the generated code in terms of validity, correctness, and efficiency. Our study finds crucial evidence that security bugs are introduced in the generated smart contracts and that the overall quality and correctness of the code are affected. However, we also identified the areas where it can be improved. The paper also proposes several potential research directions to improve the process, quality, and safety of generated smart contract code.

Submitted 19 August, 2023; v1 submitted 5 August, 2023; originally announced August 2023.

Comments: Updated to address issues

arXiv:2308.01474 [pdf, other]

doi 10.1145/3594556.3594626

Decentralized Translator of Trust: Supporting Heterogeneous TEE for Critical Infrastructure Protection

Authors: Rabimba Karanjai, Rowan Collier, Zhimin Gao, Lin Chen, Xinxin Fan, Taeweon Suh, Weidong Shi, Lei Xu

Abstract: Trusted execution environment (TEE) technology has found many applications in mitigating various security risks in an efficient manner, which is attractive for critical infrastructure protection. First, the nature of critical infrastructure requires it to be well protected from various cyber attacks. Second, performance is usually important for critical infrastructure and it cannot afford an expensive protection mechanism. While a large number of TEE-based critical infrastructure protection systems have been proposed to address various security challenges (e.g., secure sensing and reliable control), most existing works ignore one important feature, i.e., devices comprising the critical infrastructure may be equipped with multiple incompatible TEE technologies and belong to different owners. This feature makes it hard for these devices to establish mutual trust and form a unified TEE environment. To address these challenges and fully unleash the potential of TEE technology for critical infrastructure protection, we propose DHTee, a decentralized coordination mechanism. DHTee uses blockchain technology to support key TEE functions in a heterogeneous TEE environment, especially the attestation service. A device equipped with one TEE can interact securely with the blockchain to verify whether another potential collaborating device claiming to have a different TEE meets the security requirements. DHTee is also flexible and can support new TEE schemes without affecting devices using existing TEEs that have been supported by the system.

Submitted 2 August, 2023; originally announced August 2023.

Comments: Appeared in ACM BSCI'23

Journal ref: 12 September 2023

arXiv:2307.06554 [pdf, other]

TPU as Cryptographic Accelerator

Authors: Rabimba Karanjai, Sangwon Shin, Xinxin Fan, Lin Chen, Tianwei Zhang, Taeweon Suh, Weidong Shi, Lei Xu

Abstract: Polynomials defined on specific rings are heavily involved in various cryptographic schemes, and the corresponding operations are usually the computation bottleneck of the whole scheme. We propose to utilize the TPU, an emerging class of hardware designed for AI applications, to speed up polynomial operations and convert the TPU into a cryptographic accelerator. We also conduct a preliminary evaluation and discuss the limitations of the current work and future plans.

Submitted 13 July, 2023; originally announced July 2023.

arXiv:2306.07984 [pdf, other]

Cross Chain Bribery Contracts: Majority vs Mighty Minority

Authors: Quang Tran, Lin Chen, Lei Xu, Yang Lu, Rabimba Karanjai, Weidong Shi

Abstract: Bribery is a perilous issue in the real world, especially in its economic aspect. This kind of fraud is unavoidable, and more importantly, it is more difficult to trace when smart contracts are utilized for bribing on a distributed public blockchain. In our paper, we propose a new threat to the security of a blockchain system: cross-chain bribery using smart contracts. An arbitrary wealthy briber can utilize cross-chain smart contracts to manipulate a consensus mechanism on a victim's blockchain or to discredit a victim's blockchain. To better understand this threat, our paper proposes a framework to analyze bribery using cross-chain smart contracts. We analyze the amount of incentive needed to bribe rational miners in a victim's blockchain and also the full cost of conducting a cross-chain bribery attack. The result is that such attacks can be carried out with a reasonable amount of money or cryptocurrencies.

Submitted 8 June, 2023; originally announced June 2023.

arXiv:2301.00665 [pdf, ps, other]

Targeted Phishing Campaigns using Large Scale Language Models

Authors: Rabimba Karanjai

Abstract: In this research, we aim to explore the potential of natural language models (NLMs) such as GPT-3 and GPT-2 to generate effective phishing emails. Phishing emails are fraudulent messages that aim to trick individuals into revealing sensitive information or taking actions that benefit the attackers. We propose a framework for evaluating the performance of NLMs in generating these types of emails based on various criteria, including the quality of the generated text, the ability to bypass spam filters, and the success rate of tricking individuals. Our evaluations show that NLMs are capable of generating phishing emails that are difficult to detect and that have a high success rate in tricking individuals, but their effectiveness varies based on the specific NLM and training data used. Our research indicates that NLMs could have a significant impact on the prevalence of phishing attacks and emphasizes the need for further study on the ethical and security implications of using NLMs for malicious purposes.

Submitted 29 December, 2022; originally announced January 2023.

arXiv:2203.12724 [pdf, other]

doi 10.1145/3505253.3505259

Lessons Learned from Blockchain Applications of Trusted Execution Environments and Implications for Future Research

Authors: Rabimba Karanjai, Lei Xu, Lin Chen, Fengwei Zhang, Zhimin Gao, Weidong Shi

Abstract: Modern computer systems tend to rely on large trusted computing bases (TCBs) for operations. To address the TCB bloating problem, hardware vendors have developed mechanisms to enable or facilitate the creation of a trusted execution environment (TEE) in which critical software applications can execute securely in an isolated environment. Even when a host OS is compromised by an adversary, key security properties such as confidentiality and integrity of the software inside the TEEs can be guaranteed. The promise of integrity and security has driven developers to adopt TEEs for use cases involving access control, PKS, and IoT, among other things. These applications include blockchain-related use cases. The usage of TEEs doesn't come without its own implementation challenges and potential pitfalls. In this paper, we examine the assumptions, security models, and operational environments of the proposed TEE use cases of blockchain-based applications. The exercise and analysis help the hardware TEE research community to identify some open challenges and opportunities for research and rethink the design of hardware TEEs in general.

Submitted 23 March, 2022; originally announced March 2022.

arXiv:2101.05475 [pdf, other]

EDSC: An Event-Driven Smart Contract Platform

Authors: Mudabbir Kaleem, Keshav Kasichainula, Rabimba Karanjai, Lei Xu, Zhimin Gao, Lin Chen, Weidong Shi

Abstract: This paper presents EDSC, a novel smart contract platform design based on the event-driven execution model as opposed to the traditionally employed transaction-driven execution model. We reason that such a design is a better fit for many emerging smart contract applications and is better positioned to address the scalability and performance challenges plaguing the smart contract ecosystem. We propose EDSC's design under the Ethereum framework, and the design can be easily adapted for other existing smart contract platforms. We implemented the design using an Ethereum client and ran experiments; performance modeling results show, on average, 2.2 to 4.6 times lower total latency for event-triggered smart contracts, which demonstrates its effectiveness for supporting contracts that demand timely execution based on events. In addition, we discuss example use cases to demonstrate the design's utility and comment on its potential security dynamics.

Submitted 14 January, 2021; originally announced January 2021.

Comments: 11 pages

Showing 1–12 of 12 results for author: Karanjai, R