Search | arXiv e-print repository

DAGER: Exact Gradient Inversion for Large Language Models

Authors: Ivo Petrov, Dimitar I. Dimitrov, Maximilian Baader, Mark Niklas Müller, Martin Vechev

Abstract: Federated learning works by aggregating locally computed gradients from multiple clients, thus enabling collaborative training without sharing private client data. However, prior work has shown that the data can actually be recovered by the server using so-called gradient inversion attacks. While these attacks perform well when applied on images, they are limited in the text domain and only permit approximate reconstruction of small batches and short input sequences. In this work, we propose DAGER, the first algorithm to recover whole batches of input text exactly. DAGER leverages the low-rank structure of self-attention layer gradients and the discrete nature of token embeddings to efficiently check if a given token sequence is part of the client data. We use this check to exactly recover full batches in the honest-but-curious setting without any prior on the data for both encoder- and decoder-based architectures using exhaustive heuristic search and a greedy approach, respectively. We provide an efficient GPU implementation of DAGER and show experimentally that it recovers full batches of size up to 128 on large language models (LLMs), beating prior attacks in speed (20x at same batch size), scalability (10x larger batches), and reconstruction quality (ROUGE-1/2 > 0.99).

Submitted 24 May, 2024; originally announced May 2024.

ACM Class: I.2.7; I.2.11

arXiv:2403.11237 [pdf, other]

FORCE: Dataset and Method for Intuitive Physics Guided Human-object Interaction

Authors: Xiaohan Zhang, Bharat Lal Bhatnagar, Sebastian Starke, Ilya Petrov, Vladimir Guzov, Helisa Dhamo, Eduardo Pérez-Pellitero, Gerard Pons-Moll

Abstract: Interactions between humans and objects are influenced not only by the object's pose and shape, but also by physical attributes such as object mass and surface friction. They introduce important motion nuances that are essential for diversity and realism. Despite advancements in recent kinematics-based methods, this aspect has been overlooked. Generating nuanced human motion presents two challenges. First, it is non-trivial to learn from multi-modal human and object information derived from both the physical and non-physical attributes. Second, there exists no dataset capturing nuanced human interactions with objects of varying physical properties, hampering model development. This work addresses the gap by introducing the FORCE model, a kinematic approach for synthesizing diverse, nuanced human-object interactions by modeling physical attributes. Our key insight is that human motion is dictated by the interrelation between the force exerted by the human and the perceived resistance. Guided by a novel intuitive physics encoding, the model captures the interplay between human force and resistance. Experiments also demonstrate incorporating human force facilitates learning multi-class motion. Accompanying our model, we contribute the FORCE dataset. It features diverse, different-styled motion through interactions with varying resistances.

Submitted 17 March, 2024; originally announced March 2024.

Comments: 24 pages, 9 figures

arXiv:2311.03056 [pdf]

LitSumm: Large language models for literature summarisation of non-coding RNAs

Authors: Andrew Green, Carlos Ribas, Nancy Ontiveros-Palacios, Sam Griffiths-Jones, Anton I. Petrov, Alex Bateman, Blake Sweeney

Abstract: Motivation: Curation of literature in life sciences is a growing challenge. The continued increase in the rate of publication, coupled with the relatively fixed number of curators worldwide presents a major challenge to developers of biomedical knowledgebases. Very few knowledgebases have resources to scale to the whole relevant literature and all have to prioritise their efforts. Results: In this work, we take a first step to alleviating the lack of curator time in RNA science by generating summaries of literature for non-coding RNAs using large language models (LLMs). We demonstrate that high-quality, factually accurate summaries with accurate references can be automatically generated from the literature using a commercial LLM and a chain of prompts and checks. Manual assessment was carried out for a subset of summaries, with the majority being rated extremely high quality. We also applied the most commonly used automated evaluation approaches, finding that they do not correlate with human assessment. Finally, we apply our tool to a selection of over 4,600 ncRNAs and make the generated summaries available via the RNAcentral resource. We conclude that automated literature summarization is feasible with the current generation of LLMs, provided careful prompting and automated checking are applied. Availability: Code used to produce these summaries can be found here: https://github.com/RNAcentral/litscan-summarization and the dataset of contexts and summaries can be found here: https://huggingface.co/datasets/RNAcentral/litsumm-v1. Summaries are also displayed on the RNA report pages in RNAcentral (https://rnacentral.org/)

Submitted 19 April, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

arXiv:2306.00777 [pdf, other]

Object pop-up: Can we infer 3D objects and their poses from human interactions alone?

Authors: Ilya A. Petrov, Riccardo Marin, Julian Chibane, Gerard Pons-Moll

Abstract: The intimate entanglement between object affordances and human poses is of large interest, among others, for behavioural sciences, cognitive psychology, and Computer Vision communities. In recent years, the latter has developed several object-centric approaches: starting from items, learning pipelines synthesizing human poses and dynamics in a realistic way, satisfying both geometrical and functional expectations. However, the inverse perspective is significantly less explored: Can we infer 3D objects and their poses from human interactions alone? Our investigation follows this direction, showing that a generic 3D human point cloud is enough to pop up an unobserved object, even when the user is just imitating a functionality (e.g., looking through a binocular) without involving a tangible counterpart. We validate our method qualitatively and quantitatively, with synthetic data and sequences acquired for the task, showing applicability for XR/VR. The code is available at https://github.com/ptrvilya/object-popup.

Submitted 27 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: Accepted at CVPR'23

arXiv:2209.09726 [pdf, other]

Storage Management with Multi-Version Partitioned B-Trees

Authors: Christian Riegger, Ilia Petrov

Abstract: Database Management Systems and K/V-Stores operate on updatable datasets -- massively exceeding the size of available main memory. Tree-based K/V storage management structures became particularly popular in storage engines. B+ Trees allow constant search performance, however write-heavy workloads result in inefficient write patterns to secondary storage devices and poor performance characteristics. LSM-Trees overcome this issue by horizontal partitioning fractions of data - small enough to fully reside in main memory, but require frequent maintenance to sustain search performance. Firstly, we propose Multi-Version Partitioned BTrees (MV-PBT) as sole storage and index management structure in key-sorted storage engines like K/V-Stores. Secondly, we compare MV-PBT against LSM-Trees. The logical horizontal partitioning in MV-PBT allows leveraging recent advances in modern B$^+$-Tree techniques in a small transparent and memory resident portion of the structure. Structural properties sustain steady read performance, yielding efficient write patterns and reducing write amplification. We integrated MV-PBT in the WiredTiger KV storage engine. MV-PBT offers an up to 2x increased steady throughput in comparison to LSM-Trees and several orders of magnitude in comparison to B+ Trees in a YCSB workload.

Submitted 20 September, 2022; originally announced September 2022.

Comments: Extended Version, ADBIS 2022

arXiv:2207.04789 [pdf, other]

bloomRF: On Performing Range-Queries in Bloom-Filters with Piecewise-Monotone Hash Functions and Prefix Hashing

Authors: Bernhard Mößner, Christian Riegger, Arthur Bernhardt, Ilia Petrov

Abstract: We introduce bloomRF as a unified method for approximate membership testing that supports both point- and range-queries. As a first core idea, bloomRF introduces novel prefix hashing to efficiently encode range information in the hash-code of the key itself. As a second key concept, bloomRF proposes novel piecewise-monotone hash-functions that preserve local order and support fast range-lookups with fewer memory accesses. bloomRF has near-optimal space complexity and constant query complexity. Although bloomRF is designed for integer domains, it supports floating-points, and can serve as a multi-attribute filter. The evaluation in RocksDB and in a standalone library shows that it is more efficient and outperforms existing point-range-filters by up to 4x across a range of settings and distributions, while keeping the false-positive rate low.

Submitted 22 July, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: Extended version. Original accepted at EDBT 2023

arXiv:2204.06950 [pdf, other]

BEHAVE: Dataset and Method for Tracking Human Object Interactions

Authors: Bharat Lal Bhatnagar, Xianghui Xie, Ilya A. Petrov, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll

Abstract: Modelling interactions between humans and objects in natural environments is central to many applications including gaming, virtual and mixed reality, as well as human behavior analysis and human-robot collaboration. This challenging operation scenario requires generalization to a vast number of objects, scenes, and human actions. Unfortunately, there exists no such dataset. Moreover, this data needs to be acquired in diverse natural environments, which rules out 4D scanners and marker based capture systems. We present the BEHAVE dataset, the first full body human-object interaction dataset with multi-view RGBD frames and corresponding 3D SMPL and object fits along with the annotated contacts between them. We record around 15k frames at 5 locations with 8 subjects performing a wide range of interactions with 20 common objects. We use this data to learn a model that can jointly track humans and objects in natural environments with an easy-to-use portable multi-camera setup. Our key insight is to predict correspondences from the human and the object to a statistical body model to obtain human-object contacts during interactions. Our approach can record and track not just the humans and objects but also their interactions, modeled as surface contacts, in 3D. Our code and data can be found at: http://virtualhumans.mpi-inf.mpg.de/behave

Submitted 14 April, 2022; originally announced April 2022.

Comments: Accepted at CVPR'22

Journal ref: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

arXiv:2102.06583 [pdf, other]

Reviving Iterative Training with Mask Guidance for Interactive Segmentation

Authors: Konstantin Sofiiuk, Ilia A. Petrov, Anton Konushin

Abstract: Recent works on click-based interactive segmentation have demonstrated state-of-the-art results by using various inference-time optimization schemes. These methods are considerably more computationally expensive compared to feedforward approaches, as they require performing backward passes through a network during inference and are hard to deploy on mobile frameworks that usually support only forward passes. In this paper, we extensively evaluate various design choices for interactive segmentation and discover that new state-of-the-art results can be obtained without any additional optimization schemes. Thus, we propose a simple feedforward model for click-based interactive segmentation that employs the segmentation masks from previous steps. It allows not only to segment an entirely new object, but also to start with an external mask and correct it. When analyzing the performance of models trained on different datasets, we observe that the choice of a training dataset greatly impacts the quality of interactive segmentation. We find that the models trained on a combination of COCO and LVIS with diverse and high-quality annotations show performance superior to all existing models. The code and trained models are available at https://github.com/saic-vul/ritm_interactive_segmentation.

Submitted 12 February, 2021; originally announced February 2021.

arXiv:2012.15596 [pdf, other]

bloomRF: On Performing Range-Queries with Bloom-Filters based on Piecewise-Monotone Hash Functions and Dyadic Trace-Trees

Authors: Christian Riegger, Arthur Bernhardt, Bernhard Moessner, Ilia Petrov

Abstract: We introduce bloomRF as a unified method for approximate membership testing that supports both point- and range-queries on a single data structure. bloomRF extends Bloom-Filters with range query support and may replace them. The core idea is to employ a dyadic interval scheme to determine the set of dyadic intervals covering a data point, which are then encoded and inserted. bloomRF introduces Dyadic Trace-Trees as a novel data structure that represents those covering intervals implicitly. A Trace-Tree encoding scheme represents the set of covering intervals efficiently, in a compact bit representation. Furthermore, bloomRF introduces novel piecewise-monotone hash functions that are locally order-preserving and thus support range querying. We present an efficient membership computation method for range-queries. Although bloomRF is designed for integers, it also supports string and floating-point data types. It can also handle multiple attributes and serve as a multi-attribute filter. We evaluate bloomRF in RocksDB and in a standalone library. bloomRF is more efficient and outperforms existing point-range-filters by up to 4x across a range of settings.

Submitted 31 December, 2020; originally announced December 2020.

arXiv:2009.02258 [pdf, other]

AnyDB: An Architecture-less DBMS for Any Workload

Authors: Tiemo Bang, Norman May, Ilia Petrov, Carsten Binnig

Abstract: In this paper, we propose a radical new approach for scale-out distributed DBMSs. Instead of hard-baking an architectural model, such as a shared-nothing architecture, into the distributed DBMS design, we aim for a new class of so-called architecture-less DBMSs. The main idea is that an architecture-less DBMS can mimic any architecture on a per-query basis on-the-fly without any additional overhead for reconfiguration. Our initial results show that our architecture-less DBMS AnyDB can provide significant speed-ups across varying workloads compared to a traditional DBMS implementing a static architecture.

Submitted 4 September, 2020; originally announced September 2020.

Comments: Submitted to 11th Annual Conference on Innovative Data Systems Research (CIDR 21)

arXiv:2006.10861 [pdf, other]

CoinPolice: Detecting Hidden Cryptojacking Attacks with Neural Networks

Authors: Ivan Petrov, Luca Invernizzi, Elie Bursztein

Abstract: Traffic monetization is a crucial component of running most for-profit online businesses. One of its latest incarnations is cryptocurrency mining, where a website instructs the visitor's browser to participate in building a cryptocurrency ledger (e.g., Bitcoin, Monero) in exchange for a small reward in the same currency. In its essence, this practice trades the user's electric bill (or battery level) for cryptocurrency. With user consent, this exchange can be a legitimate funding source - for example, UNICEF has collected over 27k charity donations on a website dedicated to this purpose, thehopepage.org. Regrettably, this practice also easily lends itself to abuse: in this form, called cryptojacking, attacks surreptitiously mine in the user's browser, and profits are collected either by website owners or by hackers that planted the mining script into a vulnerable page. Cryptojackers have been bettering their evasion techniques, incorporating in their toolkits domain fluxing, content obfuscation, the use of WebAssembly, and throttling. Whereas most state-of-the-art defenses address multiple of these evasion techniques, none is resistant against all. In this paper, we offer a novel detection method, CoinPolice, that is robust against all of the aforementioned evasion techniques. CoinPolice flips throttling against cryptojackers, artificially varying the browser's CPU power to observe the presence of throttling. Based on a deep neural network classifier, CoinPolice can detect 97.87% of hidden miners with a low false positive rate (0.74%). We compare CoinPolice performance with the current state of the art and show our approach outperforms it when detecting aggressively throttled miners. Finally, we deploy CoinPolice to perform the largest-scale cryptomining investigation to date, identifying 6700 sites that monetize traffic in this fashion.

Submitted 23 June, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

arXiv:2001.10331 [pdf, other]

f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation

Authors: Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, Anton Konushin

Abstract: Deep neural networks have become a mainstream approach to interactive segmentation. As we show in our experiments, while for some images a trained network provides accurate segmentation result with just a few clicks, for some unknown objects it cannot achieve satisfactory result even with a large amount of user input. Recently proposed backpropagating refinement (BRS) scheme introduces an optimization problem for interactive segmentation that results in significantly better performance for the hard cases. At the same time, BRS requires running forward and backward pass through a deep network several times that leads to significantly increased computational budget per click compared to other methods. We propose f-BRS (feature backpropagating refinement scheme) that solves an optimization problem with respect to auxiliary variables instead of the network inputs, and requires running forward and backward pass just for a small part of a network. Experiments on GrabCut, Berkeley, DAVIS and SBD datasets set new state-of-the-art at an order of magnitude lower time per click compared to original BRS. The code and trained models are available at https://github.com/saic-vul/fbrs_interactive_segmentation .

Submitted 25 August, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

arXiv:1910.08023 [pdf, other]

MV-PBT: Multi-Version Index for Large Datasets and HTAP Workloads

Authors: Christian Riegger, Tobias Vincon, Robert Gottstein, Ilia Petrov

Abstract: Modern mixed (HTAP) workloads execute fast update-transactions and long-running analytical queries on the same dataset and system. In multi-version (MVCC) systems, such workloads result in many short-lived versions and long version-chains as well as in increased and frequent maintenance overhead. Consequently, the index pressure increases significantly. Firstly, the frequent modifications cause frequent creation of new versions, yielding a surge in index maintenance overhead. Secondly and more importantly, index-scans incur extra I/O overhead to determine, which of the resulting tuple-versions are visible to the executing transaction (visibility-check) as current designs only store version/timestamp information in the base table -- not in the index. Such index-only visibility-check is critical for HTAP workloads on large datasets. In this paper we propose the Multi-Version Partitioned B-Tree (MV-PBT) as a version-aware index structure, supporting index-only visibility checks and flash-friendly I/O patterns. The experimental evaluation indicates a 2x improvement for analytical queries and 15% higher transactional throughput under HTAP workloads (CH-Benchmark). MV-PBT offers 40% higher transactional throughput compared to WiredTiger's LSM-Tree implementation under YCSB.

Submitted 17 October, 2019; originally announced October 2019.

arXiv:1905.04767 [pdf, other]

Moving Processing to Data: On the Influence of Processing in Memory on Data Management

Authors: Tobias Vincon, Andreas Koch, Ilia Petrov

Abstract: Near-Data Processing refers to an architectural hardware and software paradigm, based on the co-location of storage and compute units. Ideally, it will allow to execute application-defined data- or compute-intensive operations in-situ, i.e. within (or close to) the physical data storage. Thus, Near-Data Processing seeks to minimize expensive data movement, improving performance, scalability, and resource-efficiency. Processing-in-Memory is a sub-class of Near-Data processing that targets data processing directly within memory (DRAM) chips. The effective use of Near-Data Processing mandates new architectures, algorithms, interfaces, and development toolchains.

Submitted 12 May, 2019; originally announced May 2019.

Showing 1–14 of 14 results for author: Petrov, I