-
Nurgle: Exacerbating Resource Consumption in Blockchain State Storage via MPT Manipulation
Authors:
Zheyuan He,
Zihao Li,
Ao Qiao,
Xiapu Luo,
Xiaosong Zhang,
Ting Chen,
Shuwei Song,
Dijun Liu,
Weina Niu
Abstract:
Blockchains, with intricate architectures, encompass various components, e.g., consensus network, smart contracts, decentralized applications, and auxiliary services. While offering numerous advantages, these components expose various attack surfaces, leading to severe threats to blockchains. In this study, we unveil a novel attack surface, i.e., the state storage, in blockchains. The state storage, based on the Merkle Patricia Trie, plays a crucial role in maintaining blockchain state. We also design Nurgle, the first Denial-of-Service attack targeting the state storage. By proliferating intermediate nodes within the state storage, Nurgle forces blockchains to expend additional resources on state maintenance and verification, impairing their performance. We conduct a comprehensive and systematic evaluation of Nurgle, covering the factors affecting it, its impact on blockchains, and its financial cost, and we practically demonstrate the resulting damage to blockchains. The implications of Nurgle extend beyond the performance degradation of blockchains, potentially reducing trust in them and the value of their cryptocurrencies. We further discuss three feasible mitigations against Nurgle. At the time of writing, the vulnerability exploited by Nurgle has been confirmed by six mainstream blockchains, and we have received bounties totaling thousands of USD from them.
Submitted 15 June, 2024;
originally announced June 2024.
-
Large Language Models for Blockchain Security: A Systematic Literature Review
Authors:
Zheyuan He,
Zihao Li,
Sen Yang,
Ao Qiao,
Xiaosong Zhang,
Xiapu Luo,
Ting Chen
Abstract:
Large Language Models (LLMs) have emerged as powerful tools across various domains within cyber security. Notably, recent studies are increasingly exploring LLMs applied to the context of blockchain security (BS). However, there remains a gap in a comprehensive understanding regarding the full scope of applications, impacts, and potential constraints of LLMs on blockchain security. To fill this gap, we undertake a literature review focusing on the studies that apply LLMs in blockchain security (LLM4BS).
Our study aims to comprehensively analyze and understand existing research, and elucidate how LLMs contribute to enhancing the security of blockchain systems. Through a thorough examination of existing literature, we delve into the integration of LLMs into various aspects of blockchain security. We explore the mechanisms through which LLMs can bolster blockchain security, including their applications in smart contract auditing, transaction anomaly detection, vulnerability repair, program analysis of smart contracts, and serving as participants in the cryptocurrency community. Furthermore, we assess the challenges and limitations associated with leveraging LLMs for enhancing blockchain security, considering factors such as scalability, privacy concerns, and ethical concerns. Our thorough review sheds light on the opportunities and potential risks of tasks on LLM4BS, providing valuable insights for researchers, practitioners, and policymakers alike.
Submitted 11 May, 2024; v1 submitted 21 March, 2024;
originally announced March 2024.
-
LLM360: Towards Fully Transparent Open-Source LLMs
Authors:
Zhengzhong Liu,
Aurick Qiao,
Willie Neiswanger,
Hongyi Wang,
Bowen Tan,
Tianhua Tao,
Junbo Li,
Yuqi Wang,
Suqi Sun,
Omkar Pangarkar,
Richard Fan,
Yi Gu,
Victor Miller,
Yonghao Zhuang,
Guowei He,
Haonan Li,
Fajri Koto,
Liping Tang,
Nikhil Ranjan,
Zhiqiang Shen,
Xuguang Ren,
Roberto Iriondo,
Cun Mu,
Zhiting Hu,
Mark Schulze
, et al. (3 additional authors not shown)
Abstract:
The recent surge in open-source Large Language Models (LLMs), such as LLaMA, Falcon, and Mistral, provides diverse options for AI practitioners and researchers. However, most LLMs have only released partial artifacts, such as the final model weights or inference code, and technical reports increasingly limit their scope to high-level design choices and surface statistics. These choices hinder progress in the field by degrading transparency into the training of LLMs and forcing teams to rediscover many details in the training process. We present LLM360, an initiative to fully open-source LLMs, which advocates for all training code and data, model checkpoints, and intermediate results to be made available to the community. The goal of LLM360 is to support open and collaborative AI research by making the end-to-end LLM training process transparent and reproducible by everyone. As a first step of LLM360, we release two 7B parameter LLMs pre-trained from scratch, Amber and CrystalCoder, including their training code, data, intermediate checkpoints, and analyses (at https://www.llm360.ai). We are committed to continually pushing the boundaries of LLMs through this open-source effort. More large-scale and stronger models are underway and will be released in the future.
Submitted 11 December, 2023;
originally announced December 2023.
-
Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning
Authors:
Aurick Qiao,
Sang Keun Choe,
Suhas Jayaram Subramanya,
Willie Neiswanger,
Qirong Ho,
Hao Zhang,
Gregory R. Ganger,
Eric P. Xing
Abstract:
Pollux improves scheduling performance in deep learning (DL) clusters by adaptively co-optimizing inter-dependent factors both at the per-job level and at the cluster-wide level. Most existing schedulers expect users to specify the number of resources for each job, often leading to inefficient resource use. Some recent schedulers choose job resources for users, but do so without awareness of how DL training can be re-optimized to better utilize the provided resources.
Pollux simultaneously considers both aspects. By monitoring the status of each job during training, Pollux models how their goodput (a novel metric we introduce that combines system throughput with statistical efficiency) would change by adding or removing resources. Leveraging this information, Pollux dynamically (re-)assigns resources to improve cluster-wide goodput, while respecting fairness and continually optimizing each DL job to better utilize those resources.
In experiments with real DL jobs and with trace-driven simulations, Pollux reduces average job completion times by 37-50% relative to state-of-the-art DL schedulers, even when they are provided with ideal resource and training configurations for every job. Pollux promotes fairness among DL jobs competing for resources based on a more meaningful measure of useful job progress, and reveals a new opportunity for reducing DL cost in cloud environments. Pollux is implemented and publicly available as part of an open-source project at https://github.com/petuum/adaptdl.
Submitted 26 May, 2021; v1 submitted 27 August, 2020;
originally announced August 2020.
-
Who Gets the Job and How are They Paid? Machine Learning Application on H-1B Case Data
Authors:
Barry Ke,
Angela Qiao
Abstract:
In this paper, we use machine learning techniques to explore the H-1B application dataset disclosed by the Department of Labor (DOL), from 2008 to 2018, in order to provide more stylized facts about international workers in the US labor market. We train a LASSO Regression model to analyze the impact of different features on the applicant's wage, and a Logistic Regression with L1-Penalty as a classifier to study each feature's impact on the likelihood of the case being certified. Our analysis shows that working in the healthcare industry, working in California, and a higher job level contribute to higher salaries. Meanwhile, a lower job level, working in the education services industry, and Philippine nationality are negatively correlated with salaries. In terms of application status, a Ph.D. degree, working in retail or finance, or majoring in computer science gives applicants a better chance of being certified. Applicants with no degree or an associate degree, working in the education services industry, or majoring in education are more likely to be rejected.
Submitted 23 April, 2019;
originally announced April 2019.
-
Fault Tolerance in Iterative-Convergent Machine Learning
Authors:
Aurick Qiao,
Bryon Aragam,
Bingjing Zhang,
Eric P. Xing
Abstract:
Machine learning (ML) training algorithms often possess an inherent self-correcting behavior due to their iterative-convergent nature. Recent systems exploit this property to achieve adaptability and efficiency in unreliable computing environments by relaxing the consistency of execution and allowing calculation errors to be self-corrected during training. However, the behavior of such systems is only well understood for specific types of calculation errors, such as those caused by staleness, reduced precision, or asynchronicity, and for specific types of training algorithms, such as stochastic gradient descent. In this paper, we develop a general framework to quantify the effects of calculation errors on iterative-convergent algorithms and use this framework to design new strategies for checkpoint-based fault tolerance. Our framework yields a worst-case upper bound on the iteration cost of arbitrary perturbations to model parameters during training. Our system, SCAR, employs strategies which reduce the iteration cost upper bound due to perturbations incurred when recovering from checkpoints. We show that SCAR can reduce the iteration cost of partial failures by 78% - 95% when compared with traditional checkpoint-based fault tolerance across a variety of ML models and training algorithms.
Submitted 16 October, 2018;
originally announced October 2018.
-
Dynamics, Structure and Glass Formation of Calcium Aluminate Liquids
Authors:
Hao Liu,
Ruikun Pan,
Wenlin Chen,
Zhitao Shan,
Ang Qiao,
Sandro Jahn,
James W. E. Drewitt,
Louis Hennet,
David P. Langstaff,
Haizheng Tao,
G. Neville Greaves,
Yuanzheng Yue
Abstract:
Crystalline calcium-aluminates include the phases essential in the setting of Portland cements developed over the last century. It is only within recent decades, however, that calcium-aluminate melts and glasses have begun to attract attention, bringing new functionalities in photonic and electronic applications. These studies, though, have been limited to compositions close to a deep eutectic from where glasses easily form. With the development of contactless levitation furnaces the glass forming region can now be hugely extended. We have taken advantage of these developments to rationalise, for the first time, melt rheology with structural properties across this expanded compositional range, substantiating this with atomistic simulation. In the process, we have discovered that supercooled calcium-aluminates comprise a new system where fragile-to-strong phase transitions are ubiquitous. Taking this holistic approach, we have quantified the common basis of thermo-physical and structural diversity in this novel glass-forming system, together with its inherent polyamorphism.
Submitted 22 December, 2017;
originally announced January 2018.
-
WiSPA: A new approach for dealing with widespread parasitism
Authors:
Benjamin Drinkwater,
Angela Qiao,
Michael A. Charleston
Abstract:
Traditionally, studies of coevolving systems have considered cases where a parasite may inhabit only a single host. The case where a parasite may infect many hosts, widespread parasitism, has until recently gained little traction. This is due in part to the computational complexity involved in reconstructing the coevolutionary histories where parasites may infect only a single host, which is NP-Hard. Allowing parasites to inhabit more than one host has been seen to only further compound this computationally intractable problem. Recently however, well-established algorithms for estimating the problem instance where a parasite may infect only a single host have been extended to handle widespread parasites. Although this has offered significant progress, it has been noted that these algorithms poorly handle parasites that inhabit phylogenetically distant hosts.
In this work we extend these previous algorithms to handle cases where parasites inhabit phylogenetically distant hosts using an additional evolutionary event which we call spread. Our new framework is shown to infer significantly more congruent coevolutionary histories compared to existing methods over both synthetic and biological data sets. We then apply the newly proposed algorithm, which we call WiSPA (WideSpread Parasitism Analyser), to the well studied coevolutionary system of Primates and Enterobius (pinworms), where existing methods have been unable to reconcile the widespread parasitism present without permitting additional divergence events. Using WiSPA and the new biological event, spread, we provide the first statistically significant coevolutionary hypothesis for this system.
Submitted 30 March, 2016;
originally announced March 2016.
-
Advances in the Development of Mid-Infrared Integrated Devices for Interferometric Arrays
Authors:
L. Labadie,
Guillermo Martin,
Airan Rodenas,
Norman C. Anheier,
Brahim Arezki,
Robert R. Thomson,
Hong A. Qiao,
Pierre Kern,
Ajoy K. Kar,
Bruce E. Bernacki
Abstract:
This article reports the advances on the development of mid-infrared integrated optics for stellar interferometry. The devices are fabricated by laser writing techniques on chalcogenide glasses. Laboratory characterization is reported and analyzed.
Submitted 19 July, 2012;
originally announced July 2012.
-
First fringes with an integrated-optics beam combiner at 10 um - A new step towards instrument miniaturization for mid-infrared interferometry
Authors:
Lucas Labadie,
Guillermo Martin,
Norman C. Anheier,
Brahim Arezki,
H. A. Qiao,
Bruce Bernacki,
Pierre Kern
Abstract:
Observations at mas-resolution scales and high dynamic range hold a central place in achieving, for instance, the spectroscopic characterization of exo-Earths or the detailed mapping of their protoplanetary disc birthplace. Ground or space-based multi-aperture infrared interferometry is a promising technique to tackle these goals. But significant efforts still need to be undertaken to achieve a simplification of these instruments if we want to combine the light from a large number of telescopes. Integrated optics appears as an alternative to the current conventional designs, especially if its use can be extended to a higher number of astronomical bands. This article reports for the first time the experimental demonstration of the feasibility of an integrated-optics approach to mid-infrared beam combination for single-mode stellar interferometry. We have fabricated a 2-telescope beam combiner prototype integrated on a substrate of chalcogenide glasses, a material transparent from 1 to 14 um. We have developed laboratory tools to characterize the modal properties and the interferometric capabilities of our device. We obtain fringes at 10 um and measure a mean contrast V=0.981 \pm 0.001 with high repeatability over one week and high stability over 5h. We show experimentally - as well as on the basis of modeling considerations - that the component has a single-mode behavior at this wavelength, which is essential to achieve high-accuracy interferometry. From previous studies, the propagation losses are estimated at 0.5 dB/cm for such components. We also discuss possible issues that may impact the interferometric contrast. The IO beam combiner performs well at 10 um. We also anticipate the requirement of a better matching between the numerical apertures of the component and the (de)coupling optics to optimize the total throughput. The next step foreseen is the achievement of wide-band interferograms.
Submitted 14 April, 2011;
originally announced April 2011.