-
Flood Prediction Using Classical and Quantum Machine Learning Models
Authors:
Marek Grzesiak,
Param Thakkar
Abstract:
This study investigates the potential of quantum machine learning (QML) to improve flood forecasting. We focus on daily flood events along Germany's Wupper River in 2023. Our approach combines classical machine learning techniques with QML techniques. This hybrid model leverages quantum properties like superposition and entanglement to achieve better accuracy and efficiency. Classical and QML models are compared based on training time, accuracy, and scalability. Results show that QML models offer competitive training times and improved prediction accuracy. This research signifies a step towards utilizing quantum technologies for climate change adaptation. We emphasize collaboration and continuous innovation to implement this model in real-world flood management, ultimately enhancing global resilience against floods.
Submitted 1 July, 2024;
originally announced July 2024.
-
Multi-line AI-assisted Code Authoring
Authors:
Omer Dunay,
Daniel Cheng,
Adam Tait,
Parth Thakkar,
Peter C Rigby,
Andy Chiu,
Imad Ahmad,
Arun Ganesan,
Chandra Maddila,
Vijayaraghavan Murali,
Ali Tayyebi,
Nachiappan Nagappan
Abstract:
CodeCompose is an AI-assisted code authoring tool powered by large language models (LLMs) that provides inline suggestions to tens of thousands of developers at Meta. In this paper, we present how we scaled the product from displaying single-line suggestions to multi-line suggestions. This evolution required us to overcome several unique challenges in improving the usability of these suggestions for developers.
First, we discuss how multi-line suggestions can have a 'jarring' effect, as the LLM's suggestions constantly move around the developer's existing code, which would otherwise result in decreased productivity and satisfaction.
Second, multi-line suggestions take significantly longer to generate; hence we present several innovative investments we made to reduce the perceived latency for users. These model-hosting optimizations sped up multi-line suggestion latency by 2.5x.
Finally, we conduct experiments on tens of thousands of engineers to understand how multi-line suggestions impact the user experience and contrast this with single-line suggestions. Our experiments reveal that (i) multi-line suggestions account for 42% of total characters accepted (despite accounting for only 16% of displayed suggestions) and (ii) multi-line suggestions almost doubled the percentage of keystrokes saved for users, from 9% to 17%. Multi-line CodeCompose has been rolled out to all engineers at Meta, and less than 1% of engineers have opted out of multi-line suggestions.
Submitted 6 February, 2024;
originally announced February 2024.
-
Configuration Validation with Large Language Models
Authors:
Xinyu Lian,
Yinfang Chen,
Runxiang Cheng,
Jie Huang,
Parth Thakkar,
Minjia Zhang,
Tianyin Xu
Abstract:
Misconfigurations are major causes of software failures. Existing practices rely on developer-written rules or test cases to validate configurations, which are expensive. Machine learning (ML) for configuration validation is considered a promising direction, but has been facing challenges such as the need for large-scale field data and system-specific models. Recent advances in Large Language Models (LLMs) show promise in addressing some of the long-lasting limitations of ML-based configuration validation. We present a first analysis on the feasibility and effectiveness of using LLMs for configuration validation. We empirically evaluate LLMs as configuration validators by developing a generic LLM-based configuration validation framework, named Ciri. Ciri employs effective prompt engineering with few-shot learning based on both valid configuration and misconfiguration data. Ciri checks outputs from LLMs when producing results, addressing hallucination and nondeterminism of LLMs. We evaluate Ciri's validation effectiveness on eight popular LLMs using configuration data of ten widely deployed open-source systems. Our analysis (1) confirms the potential of using LLMs for configuration validation, (2) explores the design space of LLM-based validators like Ciri, and (3) reveals open challenges such as ineffectiveness in detecting certain types of misconfigurations and biases towards popular configuration parameters.
Submitted 2 April, 2024; v1 submitted 14 October, 2023;
originally announced October 2023.
-
Scaling Hyperledger Fabric Using Pipelined Execution and Sparse Peers
Authors:
Parth Thakkar,
Senthilnathan Natarajan
Abstract:
Permissioned blockchains are becoming popular as data management systems in the enterprise setting. Compared to traditional distributed databases, blockchain platforms provide increased security guarantees but significantly lower performance. Further, these platforms are quite expensive to run for the low throughput they provide. The following are two ways to improve performance and reduce cost: (1) make the system utilize allocated resources efficiently; (2) allow rapid and dynamic scaling of allocated resources based on load. We explore both of these in this work.
We first investigate the reasons for the poor performance and scalability of the dominant permissioned blockchain flavor called Execute-Order-Validate (EOV). We do this by studying the scaling characteristics of Hyperledger Fabric, a popular EOV platform, using vertical scaling and horizontal scaling. We find that the transaction throughput scales very poorly with these techniques. At least in the permissioned setting, the real bottleneck is transaction processing, not the consensus protocol. With vertical scaling, the allocated vCPUs go under-utilized. In contrast, with horizontal scaling, the allocated resources get wasted due to redundant work across nodes within an organization.
To mitigate the above concerns, we first improve resource efficiency by (a) improving CPU utilization with a pipelined execution of the validation and commit phases; and (b) avoiding redundant work across nodes by introducing a new type of peer node, called a sparse peer, that selectively commits transactions. We further propose a technique that enables the rapid scaling of resources. Our implementation, SmartFabric, built on top of Hyperledger Fabric, demonstrates 3x higher throughput and 12-26x faster scale-up time, and provides Fabric's throughput at 50% to 87% lower cost.
Submitted 1 March, 2021; v1 submitted 11 March, 2020;
originally announced March 2020.
-
Performance Benchmarking and Optimizing Hyperledger Fabric Blockchain Platform
Authors:
Parth Thakkar,
Senthil Nathan,
Balaji Vishwanathan
Abstract:
The rise in popularity of permissioned blockchain platforms in recent times is significant. Hyperledger Fabric is one such permissioned blockchain platform and one of the Hyperledger projects hosted by the Linux Foundation. Fabric comprises various components such as smart contracts, endorsers, committers, validators, and orderers. As the performance of a blockchain platform is a major concern for enterprise applications, in this work, we perform a comprehensive empirical study to characterize the performance of Hyperledger Fabric and identify potential performance bottlenecks to gain a better understanding of the system. We follow a two-phased approach. In the first phase, our goal is to understand the impact of various configuration parameters, such as block size, endorsement policy, channels, resource allocation, and state database choice, on the transaction throughput and latency, and to provide guidelines on configuring these parameters. In addition, we also aim to identify performance bottlenecks and hotspots. We observed that (1) endorsement policy verification, (2) sequential policy validation of transactions in a block, and (3) state validation and commit (with CouchDB) were the three major bottlenecks. In the second phase, we focus on optimizing Hyperledger Fabric v1.0 based on our observations. We introduced and studied various simple optimizations such as aggressive caching for endorsement policy verification in the cryptography component (3x improvement in performance) and parallelizing endorsement policy verification (7x improvement). Further, we enhanced and measured the effect of an existing bulk read/write optimization for CouchDB during the state validation and commit phase (2.5x improvement). By combining all three optimizations, we improved the overall throughput by 16x (i.e., from 140 tps to 2250 tps).
Submitted 29 May, 2018;
originally announced May 2018.