Search | arXiv e-print repository

Sampling From Autoencoders' Latent Space via Quantization And Probability Mass Function Concepts

Authors: Aymene Mohammed Bouayed, Adrian Iaccovelli, David Naccache

Abstract: In this study, we focus on sampling from the latent space of generative models built upon autoencoders so that the reconstructed samples are lifelike images. To do so, we introduce a novel post-training sampling algorithm rooted in the concept of probability mass functions, coupled with a quantization process. Our proposed algorithm establishes a vicinity around each latent vector from the input data and then proceeds to draw samples from these defined neighborhoods. This strategic approach ensures that the sampled latent vectors predominantly inhabit high-probability regions, which, in turn, can be effectively transformed into authentic real-world images. A noteworthy point of comparison for our sampling algorithm is the sampling technique based on Gaussian mixture models (GMM), owing to its inherent capability to represent clusters. Remarkably, we manage to improve the time complexity from the previous $\mathcal{O}(n\times d \times k \times i)$ associated with GMM sampling to a much more streamlined $\mathcal{O}(n\times d)$, thereby resulting in substantial speedup during runtime. Moreover, our experimental results, gauged through the Fréchet inception distance (FID) for image generation, underscore the superior performance of our sampling algorithm across a diverse range of models and datasets. On the MNIST benchmark dataset, our approach outperforms GMM sampling by yielding a noteworthy improvement of up to $0.89$ in FID value. Furthermore, when it comes to generating images of faces and ocular images, our approach showcases substantial enhancements with FID improvements of $1.69$ and $0.87$ respectively, as compared to GMM sampling, as evidenced on the CelebA and MOBIUS datasets. Lastly, we substantiate our methodology's efficacy in estimating latent space distributions in contrast to GMM sampling, particularly through the lens of the Wasserstein distance.

Submitted 21 August, 2023; originally announced August 2023.
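
The abstract above describes sampling from neighborhoods around the training latent vectors by means of quantization and a probability mass function. A minimal sketch of one possible reading follows. It is not the authors' algorithm: the grid quantization, the cell width `step`, and the function names `build_pmf` and `sample_latents` are all assumptions made for illustration.

```python
import random
from collections import Counter


def build_pmf(latents, step):
    """Quantize each latent vector onto a grid of width `step` (assumed scheme)
    and return the empirical probability mass of every occupied cell."""
    counts = Counter(tuple(round(x / step) for x in z) for z in latents)
    total = sum(counts.values())
    return {cell: count / total for cell, count in counts.items()}


def sample_latents(pmf, step, num_samples, seed=None):
    """Draw cells according to the PMF, then draw a point uniformly inside each cell."""
    rng = random.Random(seed)
    cells = list(pmf)
    weights = [pmf[cell] for cell in cells]
    chosen = rng.choices(cells, weights=weights, k=num_samples)
    # Cell index i covers the interval [(i - 0.5) * step, (i + 0.5) * step].
    return [[(i + rng.uniform(-0.5, 0.5)) * step for i in cell] for cell in chosen]


if __name__ == "__main__":
    toy_latents = [[0.0, 0.0], [0.1, 0.1], [5.0, 5.0]]  # hypothetical 2-D latents
    pmf = build_pmf(toy_latents, step=1.0)
    print(pmf)
    print(sample_latents(pmf, step=1.0, num_samples=3, seed=0))
```

Building the PMF touches each of the n latent vectors and each of their d coordinates once. Samples can only fall in cells that already contain training latents, which is the sense in which they stay in high-probability regions in this sketch.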

arXiv:2308.06291 [pdf]

The Balkans Continued Fraction

Authors: David Naccache, Ofer Yifrach-Stav

Abstract: In a previous escapade we gave a collection of continued fractions involving Catalan's constant. This paper provides more general formulae governing those continued fractions. Having distinguished different cases associated to regions in the plane, we nickname those continued fractions "The Balkans" as they divide into areas which are related but still different in nature. Because we do not provide formal proofs of those machine-constructed formulae, we do not claim them to be theorems. Still, each and every proposed formula was extensively tested numerically.

Submitted 18 April, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

arXiv:2301.06489 [pdf, ps, other]

Simplex Autoencoders

Authors: Aymene Mohammed Bouayed, David Naccache

Abstract: Synthetic data generation is increasingly important due to privacy concerns. While Autoencoder-based approaches have been widely used for this purpose, sampling from their latent spaces can be challenging. Mixture models are currently the most efficient way to sample from these spaces. In this work, we propose a new approach that models the latent space of an Autoencoder as a simplex, allowing for a novel heuristic for determining the number of components in the mixture model. This heuristic is independent of the number of classes and produces comparable results. We also introduce a sampling method based on probability mass functions, taking advantage of the compactness of the latent space. We evaluate our approaches on a synthetic dataset and demonstrate their performance on three benchmark datasets: MNIST, CIFAR-10, and CelebA. Our approach achieves an image generation FID of 4.29, 13.55, and 11.90 on the MNIST, CIFAR-10, and CelebA datasets, respectively. The best AE FID results to date on those datasets are respectively 6.3, 85.3 and 35.6; we hence substantially improve those figures (the lower the FID, the better). However, AEs are not the best performing algorithms on the concerned datasets and all FID records are currently held by GANs. While we do not perform better than GANs on CIFAR-10 and CelebA, we do manage to squeeze out a non-negligible improvement (of 0.21) over the current GAN-held record for the MNIST dataset.

Submitted 16 January, 2023; originally announced January 2023.

arXiv:2301.01624 [pdf, ps, other]

Pattern Recognition Experiments on Mathematical Expressions

Authors: David Naccache, Ofer Yifrach-Stav

Abstract: We provide the results of pattern recognition experiments on mathematical expressions. We give a few examples of conjectured results, none of which was thoroughly checked for novelty. We did not attempt to prove all the relations found and focused on their generation.

Submitted 21 December, 2022; originally announced January 2023.

arXiv:2211.01058 [pdf, ps, other]

A Note on the Ramanujan Machine

Authors: Eric Brier, David Naccache, Ofer Yifrach-Stav

Abstract: The Ramanujan Machine project detects new expressions related to constants of interest, such as $ζ$ function values, $γ$ and algebraic numbers (to name a few). In particular the project lists a number of conjectures involving even and odd $ζ$ function values, logarithms etc. We show that many relations detected by the Ramanujan Machine Project stem from a specific algebraic observation and show how to generate infinitely many. This provides an automated proof and/or an explanation of many of the relations listed as conjectures by the project (although not all of them).

Submitted 3 November, 2022; v1 submitted 2 November, 2022; originally announced November 2022.

arXiv:2210.15669 [pdf, ps, other]

On Catalan Constant Continued Fractions

Authors: David Naccache, Ofer Yifrach-Stav

Abstract: The Ramanujan Machine project detects new expressions related to constants of interest, such as $ζ$ function values, $γ$ and algebraic numbers (to name a few). In particular the project lists a number of conjectures concerning the Catalan constant $G = 0.91596559\ldots$. We show how to generate infinitely many. We used an ad hoc software toolchain and rather tedious mathematical developments. Because we do not provide a proper peer-reviewed proof of the relations given here, we do not claim them to be theorems.

Submitted 18 November, 2022; v1 submitted 30 October, 2022; originally announced October 2022.

arXiv:2210.00856 [pdf, other]

doi 10.1016/j.fsidi.2022.301437

A forensic analysis of the Google Home: repairing compressed data without error correction

Authors: Hadrien Barral, Georges-Axel Jaloyan, Fabien Thomas-Brans, Matthieu Regnery, Rémi Géraud-Stewart, Thibaut Heckmann, Thomas Souvignet, David Naccache

Abstract: This paper provides a detailed explanation of the steps taken to extract and repair a Google Home's internal data. Starting with reverse engineering the hardware of a commercial off-the-shelf Google Home, internal data is then extracted by desoldering and dumping the flash memory. As error correction is performed by the CPU using an undisclosed method, a new alternative method is shown to repair a corrupted SquashFS filesystem, under the assumption of a single or double bitflip per gzip-compressed fragment. Finally, a new method to handle multiple possible repairs using three-valued logic is presented.

Submitted 29 September, 2022; originally announced October 2022.

Comments: 28 pages, modified version of paper that appeared originally at Forensic Science International: Digital Investigation

Journal ref: Forensic Science International: Digital Investigation, Volume 42, 2022, 301437, ISSN 2666-2817

arXiv:2206.03604 [pdf, other]

Automated Discovery of New $L$-Function Relations

Authors: Hadrien Barral, Rémi Géraud-Stewart, Arthur Léonard, David Naccache, Quentin Vermande, Samuel Vivien

Abstract: $L$-functions typically encode interesting information about mathematical objects. This paper reports 29 identities between such functions that hitherto never appeared in the literature. Of these we have a complete proof for 9; all others are extensively numerically checked and we welcome proofs of their (in)validity. The method we devised to obtain these identities is a two-step process whereby a list of candidate identities is automatically generated, obtained, tested, and ultimately formally proven. The approach is however only \emph{semi-}automated as human intervention is necessary for the post-processing phase, to determine the most general form of a conjectured identity and to provide a proof for it. This work complements other instances in the literature where automated symbolic computation has served as a productive step toward theorem proving and can be extended in several directions to further explore the algebraic landscape of $L$-functions and similar constructions.

Submitted 9 June, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

arXiv:2205.14236 [pdf, other]

FedControl: When Control Theory Meets Federated Learning

Authors: Adnan Ben Mansour, Gaia Carenini, Alexandre Duplessis, David Naccache

Abstract: To date, the most popular federated learning algorithms use coordinate-wise averaging of the model parameters. We depart from this approach by differentiating client contributions according to the performance of local learning and its evolution. The technique is inspired by control theory and its classification performance is evaluated extensively in the IID framework and compared with FedAvg.

Submitted 27 May, 2022; originally announced May 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2205.10864

arXiv:2205.10864 [pdf, other]

Federated Learning Aggregation: New Robust Algorithms with Guarantees

Authors: Adnan Ben Mansour, Gaia Carenini, Alexandre Duplessis, David Naccache

Abstract: Federated Learning has been recently proposed for distributed model training at the edge. The principle of this approach is to aggregate models learned on distributed clients to obtain a new more general "average" model (FedAvg). The resulting model is then redistributed to clients for further training. To date, the most popular federated learning algorithm uses coordinate-wise averaging of the model parameters for aggregation. In this paper, we carry out a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework. From this, we derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses. Moreover, we go beyond the assumptions introduced in theory, by evaluating the performance of these strategies and by comparing them with that of FedAvg in classification tasks in both the IID and the Non-IID framework without additional hypotheses.

Submitted 18 July, 2022; v1 submitted 22 May, 2022; originally announced May 2022.
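
The coordinate-wise averaging mentioned in the abstract above can be illustrated with a short sketch. It is an assumption-laden illustration, not the paper's algorithm: models are represented as flat lists of floats, and the optional `weights` argument (for instance, per-client sample counts) is a hypothetical parameter added here.

```python
def coordinate_wise_average(client_params, weights=None):
    """Average client parameter vectors coordinate by coordinate.

    client_params: list of equal-length lists of floats, one per client.
    weights: optional non-negative weight per client (hypothetical parameter);
             uniform weights are used when omitted.
    """
    if not client_params:
        raise ValueError("at least one client is required")
    dim = len(client_params[0])
    if any(len(p) != dim for p in client_params):
        raise ValueError("all clients must share the same parameter dimension")
    if weights is None:
        weights = [1.0] * len(client_params)
    if len(weights) != len(client_params):
        raise ValueError("one weight per client is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Each output coordinate j is the weighted mean of coordinate j across clients.
    return [
        sum(w * p[j] for w, p in zip(weights, client_params)) / total
        for j in range(dim)
    ]


if __name__ == "__main__":
    clients = [[1.0, 2.0], [3.0, 4.0]]
    print(coordinate_wise_average(clients))          # uniform weights
    print(coordinate_wise_average(clients, [1, 3]))  # second client counts three times as much
```

Schemes that differentiate client contributions, such as the loss-based ones the abstract refers to, would correspond in this sketch to choosing non-uniform `weights`. How the paper actually computes them is not stated in the abstract.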

arXiv:2110.11079 [pdf, other]

Tagged Documents Co-Clustering

Authors: Gaëlle Candel, David Naccache

Abstract: Tags are short sequences of words that describe textual and non-textual resources such as music, images or books. Tags could be used by machine information retrieval systems to quickly access a document. These tags can be used to build recommender systems to suggest similar items to a user. However, the number of tags per document is limited, and often distributed according to a Zipf law. In this paper, we propose a methodology to cluster tags into conceptual groups. Data are preprocessed to remove power-law effects and enhance the context of low-frequency words. Then, a hierarchical agglomerative co-clustering algorithm is proposed to group together the most related tags into clusters. The capabilities were evaluated on a sparse synthetic dataset and a real-world tag collection associated with scientific papers. The task being unsupervised, we propose some stopping criteria for selecting an optimal partitioning.

Submitted 14 October, 2021; originally announced October 2021.

Comments: 15 pages, submitted and accepted to the 2021 World Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE'21) - track ICAI21

MSC Class: 68T99 ACM Class: I.2.m

arXiv:2110.09212 [pdf, other]

Noise-Resilient Ensemble Learning using Evidence Accumulation Clustering

Authors: Gaëlle Candel, David Naccache

Abstract: Ensemble Learning methods combine multiple algorithms performing the same task to build a group with superior quality. These systems are well adapted to the distributed setup, where each peer or machine of the network hosts one algorithm and communicates its results to its peers. Ensemble learning methods are naturally resilient to the absence of several peers thanks to the ensemble redundancy. However, the network can be corrupted, altering the prediction accuracy of a peer, which has a deleterious effect on the ensemble quality. In this paper, we propose a noise-resilient ensemble classification method, which helps to improve accuracy and correct random errors. The approach is inspired by Evidence Accumulation Clustering, adapted to classification ensembles. We compared it to the naive voter model over four multi-class datasets. Our model showed a greater resilience, allowing us to recover predictions under a very high noise level. In addition, as the method is based on evidence accumulation clustering, it is highly flexible and can combine classifiers with different label definitions.

Submitted 18 October, 2021; originally announced October 2021.

Comments: 12 pages, submitted and accepted to ANTIC-2021 (International Conference on Advanced Network Technologies and Intelligent Computing)

MSC Class: 68T99 ACM Class: I.2.m

arXiv:2110.04263 [pdf, ps, other]

The Multiplicative Persistence Conjecture Is True for Odd Targets

Authors: Eric Brier, Christophe Clavier, Linda Gutsche, David Naccache

Abstract: In 1973, Neil Sloane published a very short paper introducing an intriguing problem: Pick a decimal integer $n$ and multiply all its digits by each other. Repeat the process until a single digit $Δ(n)$ is obtained. $Δ(n)$ is called the \textsl{multiplicative digital root of $n$} or \textsl{the target of $n$}. The number of steps $Ξ(n)$ needed to reach $Δ(n)$, called the multiplicative persistence of $n$ or \textsl{the height of $n$} is conjectured to always be at most $11$. Like many other very simple to state number-theoretic conjectures, the multiplicative persistence mystery resisted numerous explanation attempts. This paper proves that the conjecture holds for all odd target values: Namely that if $Δ(n)\in\{1,3,7,9\}$, then $Ξ(n) \leq 1$ and that if $Δ(n)=5$, then $Ξ(n) \leq 5$. Naturally, we overview the difficulties currently preventing us from extending the approach to (nonzero) even targets.

Submitted 8 October, 2021; originally announced October 2021.
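
The digit-multiplication process defined in the abstract above is simple to compute. The sketch below follows that definition directly. The function names are illustrative choices, not taken from the paper.

```python
from math import prod


def digit_product(n: int) -> int:
    """Multiply all decimal digits of a non-negative integer n."""
    return prod(int(ch) for ch in str(n))


def height_and_target(n: int) -> tuple[int, int]:
    """Return (Ξ(n), Δ(n)): the number of digit-product steps needed to reach
    a single digit, and that final digit."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    steps = 0
    while n >= 10:
        n = digit_product(n)
        steps += 1
    return steps, n


if __name__ == "__main__":
    print(height_and_target(39))               # 39 -> 27 -> 14 -> 4, i.e. (3, 4)
    print(height_and_target(15))               # 15 -> 5, an odd target: (1, 5)
    print(height_and_target(277777788888899))  # a well-known example of height 11
```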

arXiv:2109.14925 [pdf, other]

Genealogical Population-Based Training for Hyperparameter Optimization

Authors: Antoine Scardigli, Paul Fournier, Matteo Vilucchio, David Naccache

Abstract: HyperParameter Optimization (HPO) aims at finding the best HyperParameters (HPs) of learning models, such as neural networks, in the fastest and most efficient way possible. Most recent HPO algorithms try to optimize HPs regardless of the model that obtained them, assuming that for different models, the same HPs will produce very similar results. We break free from this paradigm and propose a new take on preexisting methods that we call Genealogical Population Based Training (GPBT). GPBT, via the shared histories of "genealogically"-related models, exploits the coupling of HPs and models in an efficient way. We experimentally demonstrate that our method cuts the required computational cost by a factor of 2 to 3, generally allows a 1% accuracy improvement on computer vision tasks, and reduces the variance of the results by an order of magnitude, compared to the current algorithms. Our method is search-algorithm agnostic so that the inner search routine can be any search algorithm like TPE, GP, CMA or random search.

Submitted 9 April, 2023; v1 submitted 30 September, 2021; originally announced September 2021.

arXiv:2109.10538 [pdf, other]

Index $t$-SNE: Tracking Dynamics of High-Dimensional Datasets with Coherent Embeddings

Authors: Gaëlle Candel, David Naccache

Abstract: $t$-SNE is an embedding method that the data science community has widely used. Two interesting characteristics of $t$-SNE are the structure preservation property and the answer to the crowding problem, where all neighbors in high dimensional space cannot be represented correctly in low dimensional space. $t$-SNE preserves the local neighborhood, and similar items are nicely spaced by adjusting to the local density. These two characteristics produce a meaningful representation, where the cluster area is proportional to its size in number, and relationships between clusters are materialized by closeness on the embedding. This algorithm is non-parametric, therefore two initializations of the algorithm would lead to two different embeddings. In a forensic approach, analysts would like to compare two or more datasets using their embedding. An approach would be to learn a parametric model over an embedding built with a subset of data. While this approach is highly scalable, points could be mapped at the same exact position, making them indistinguishable. This type of model would be unable to adapt to new outliers or concept drift. This paper presents a methodology to reuse an embedding to create a new one, where cluster positions are preserved. The optimization process minimizes two costs, one relative to the embedding shape and the second relative to the support embedding's match. The proposed algorithm has the same complexity as the original $t$-SNE to embed new items, and a lower one when considering the embedding of a dataset sliced into sub-pieces. The method showed promising results on a real-world dataset, allowing observation of the birth, evolution and death of clusters. The proposed approach facilitates identifying significant trends and changes, which empowers the monitoring of high-dimensional datasets' dynamics.

Submitted 22 September, 2021; originally announced September 2021.

Comments: International Conference on Big Data Visual Analytics (ICBDVA), Venice, Italy, August 12-13 2021 https://publications.waset.org/pdf/10012177 Best paper award

MSC Class: 90-08 ACM Class: I.m; J.m

Journal ref: International Journal of Computer and Systems Engineering (2021), 15(8), 500 - 512

arXiv:2109.10007 [pdf, other]

Generating Local Maps of Science using Deep Bibliographic Coupling

Authors: Gaëlle Candel, David Naccache

Abstract: Bibliographic and co-citation coupling are two analytical methods widely used to measure the degree of similarity between scientific papers. These approaches are intuitive, easy to put into practice, and computationally cheap. Moreover, they have been used to generate a map of science, allowing visualization of research field interactions. Nonetheless, these methods do not work unless two papers share a common reference, limiting their usability for papers with no direct connection. In this work, we propose to extend bibliographic coupling to the deep neighborhood, by using graph diffusion methods. This method allows defining similarity between any two papers, making it possible to generate a local map of science, highlighting field organization.

Submitted 21 September, 2021; originally announced September 2021.

Comments: Submitted to the International Conference on Scientometrics and Informetrics. Accepted as a poster. Here, long version

MSC Class: 01A85 ACM Class: H.5; I.m; J.m

Journal ref: July 12-15, 2021 18th International Conference of the International Society for Scientometrics and Informetrics Leuven, Belgium

arXiv:2109.07135 [pdf, other]

Co-Embedding: Discovering Communities on Bipartite Graphs through Projection

Authors: Gaëlle Candel, David Naccache

Abstract: Many datasets take the form of a bipartite graph where two types of nodes are connected by relationships, like the movies watched by a user or the tags associated with a file. The partitioning of the bipartite graph could be used to speed up recommender systems, or reduce the information retrieval system's index size, by identifying groups of items with similar properties. This type of graph is often processed by algorithms using the Vector Space Model representation, where a binary vector represents an item with 0 and 1. The main problem with this representation is the dimension relatedness, like words' synonymity, which is not considered. This article proposes a co-clustering algorithm using item projection, allowing the measurement of feature similarity. We evaluated our algorithm on a cluster retrieval task. Over various datasets, our algorithm produced well-balanced clusters containing coherent items, leading to high retrieval scores on this task.

Submitted 30 September, 2021; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: Submitted and accepted to FICC 2022 (Future of Information and Communication Conference). Long version

MSC Class: 68T99 ACM Class: I.2.6

arXiv:2106.10971 [pdf, ps, other]

Near-Optimal Pool Testing under Urgency Constraints

Authors: Éric Brier, Megi Dervishi, Rémi Géraud-Stewart, David Naccache, Ofer Yifrach-Stav

Abstract: Detection of rare traits or diseases in a large population is challenging. Pool testing allows covering larger swathes of population at a reduced cost, while simplifying logistics. However, testing precision decreases as it becomes unclear which member of a pool made the global test positive. In this paper we discuss testing strategies that provably approach the best-possible strategy, optimal in the sense that no other strategy can give exact results with fewer tests. Our algorithms guarantee that they provide a complete and exact result for every individual, without exceeding $1/0.99$ times the number of tests the optimal strategy would require. This threshold is arbitrary: algorithms closer to the optimal bound can be described, however their complexity increases, making them less practical. Moreover, the way the algorithms process input samples leads to some individuals' status being known sooner, thus allowing urgency to be taken into account when assigning individuals to tests.

Submitted 21 June, 2021; originally announced June 2021.

arXiv:2105.04454 [pdf, other]

doi 10.1016/j.cose.2021.102471

Physical Fault Injection and Side-Channel Attacks on Mobile Devices: A Comprehensive Analysis

Authors: Carlton Shepherd, Konstantinos Markantonakis, Nico van Heijningen, Driss Aboulkassimi, Clément Gaine, Thibaut Heckmann, David Naccache

Abstract: Today's mobile devices contain densely packaged system-on-chips (SoCs) with multi-core, high-frequency CPUs and complex pipelines. In parallel, sophisticated SoC-assisted security mechanisms have become commonplace for protecting device data, such as trusted execution environments, full-disk and file-based encryption. Both advancements have dramatically complicated the use of conventional physical attacks, requiring the development of specialised attacks. In this survey, we consolidate recent developments in physical fault injections and side-channel attacks on modern mobile devices. In total, we comprehensively survey over 50 fault injection and side-channel attack papers published between 2009 and 2021. We evaluate the prevailing methods, compare existing attacks using a common set of criteria, identify several challenges and shortcomings, and suggest future directions of research.

Submitted 22 March, 2022; v1 submitted 10 May, 2021; originally announced May 2021.

Journal ref: Computers & Security. 111 (2021) 102471

arXiv:2104.09982 [pdf, other]

Explaining the Entombed Algorithm

Authors: Leon Mächler, David Naccache

Abstract: In \cite{entombed}, John Aycock and Tara Copplestone pose an open question, namely the explanation of the mysterious lookup table used in the Entombed Game's Algorithm for two dimensional maze generation. The question attracted media attention (BBC etc.) and was open until today. This paper answers this question, explains the algorithm and even extends it to three dimensions.

Submitted 21 April, 2021; v1 submitted 18 April, 2021; originally announced April 2021.

arXiv:2103.09031 [pdf]

Evaluation of a Bi-Directional Methodology for Automated Assessment of Compliance to Continuous Application of Clinical Guidelines, in the Type 2 Diabetes-Management Domain

Authors: Avner Hatsek, Irit Hochberg, Deeb Daoud Naccache, Aya Biderman, Yuval Shahar

Abstract: We evaluated the DiscovErr system, in which we had previously implemented a new methodology for assessment of compliance to continuous application of evidence-based clinical guidelines, based on a bidirectional search from the guideline objectives to the patient's longitudinal data, and vice versa. We compared the system comments on 1584 transactions regarding the management, over a mean of 5.23 years, of 10 randomly selected Type 2 diabetes patients, to those of two diabetes experts and a senior family practitioner. After providing their own comments, the experts assessed both the correctness (precision) and the importance of each of the DiscovErr system comments. The completeness (recall or coverage) of the system was computed by comparing its comments to those made by the experts. The system made 279 comments. The experts made 181 unique comments. The completeness of the system was 91% compared to comments made by at least two experts, and 98% when compared to comments made by all three. 172 comments were evaluated by the experts for correctness and importance: All 114 medication-related comments, and a random 35% of the 165 monitoring-related comments. The system's correctness was 81% compared to comments judged as correct by both diabetes experts, and 91% compared to comments judged as correct by a diabetes expert and at least as partially correct by the other. 89% of the comments were judged as important by both diabetes experts, 8% were judged as important by one expert, 3% were judged as less important by both experts.
The completeness scores of the three experts (compared to the comments of all experts plus the validated system comments) were 75%, 60%, and 55%; the experts' correctness scores (compared to their majority) were respectively 99%, 91%, and 88%. Conclusion: Systems such as DiscovErr can assess the quality of continuous guideline-based care.

Submitted 16 March, 2021; originally announced March 2021.

Comments: 25 pages; 4 figures, 6 tables

ACM Class: I.2.1

arXiv:2103.08229 [pdf, other]

doi 10.1145/3320269.3384738

Return-Oriented Programming on RISC-V

Authors: Georges-Axel Jaloyan, Konstantinos Markantonakis, Raja Naeem Akram, David Robin, Keith Mayes, David Naccache

Abstract: This paper provides the first analysis on the feasibility of Return-Oriented Programming (ROP) on RISC-V, a new instruction set architecture targeting embedded systems. We show the existence of a new class of gadgets, using several Linear Code Sequences And Jumps (LCSAJ), undetected by current Galileo-based ROP gadget searching tools. We argue that this class of gadgets is rich enough on RISC-V to mount complex ROP attacks, bypassing traditional mitigations such as DEP, ASLR, stack canaries, G-Free, as well as some compiler-based backward-edge CFI, by jumping over any guard inserted by a compiler to protect indirect jump instructions. We provide examples of such gadgets, as well as a proof-of-concept ROP chain, using C code injection to leverage a privilege escalation attack on two standard Linux operating systems. Additionally, we discuss some of the required mitigations to prevent such attacks and provide a new ROP gadget finder algorithm that handles this new class of gadgets.

Submitted 15 March, 2021; originally announced March 2021.

Comments: 27 pages, 8 figures, originally published at AsiaCCS 2020

arXiv:2102.01995 [pdf, ps, other]

Convergence Voting: From Pairwise Comparisons to Consensus

Authors: Gergei Bana, Wojciech Jamroga, David Naccache, Peter Y. A. Ryan

Abstract: An important aspect of AI design and ethics is to create systems that reflect aggregate preferences of the society. To this end, the techniques of social choice theory are often utilized. We propose a new social choice function motivated by the PageRank algorithm. The function ranks voting options based on the Condorcet graph of pairwise comparisons. To this end, we transform the Condorcet graph into a Markov chain whose stationary distribution provides the scores of the options. We show how the values in the stationary distribution can be interpreted as quantified aggregate support for the voting options, to which the community of voters converges through an imaginary sequence of negotiating steps. Because of that, we suggest the name "convergence voting" for the new voting scheme, and "negotiated community support" for the resulting stationary allocation of scores. Our social choice function can be viewed as a consensus voting method, sitting somewhere between Copeland and Borda. On the one hand, it does not necessarily choose the Condorcet winner, as strong support from a part of the society can outweigh mediocre uniform support. On the other hand, the influence of unpopular candidates on the outcome is smaller than in the primary technique of consensus voting, i.e., the Borda count. We achieve that without having to introduce an ad hoc weighting that some other methods do.

Submitted 1 March, 2021; v1 submitted 3 February, 2021; originally announced February 2021.

arXiv:2007.09085 [pdf, other]

Preservation of DNA Privacy During the Large Scale Detection of COVID-19

Authors: Marcel Hollenstein, David Naccache, Peter B. Rønne, Peter Y A Ryan, Robert Weil, Ofer Yifrach-Stav

Abstract: As humanity struggles to contain the global COVID-19 pandemic, privacy concerns are emerging regarding confinement, tracing and testing. The scientific debate concerning privacy of the COVID-19 tracing efforts has been intense, especially focusing on the choice between centralised and decentralised tracing apps. The privacy concerns regarding COVID-19 testing, however, have not received as much attention even though the privacy at stake is arguably even higher. COVID-19 tests require the collection of samples. Those samples possibly contain viral material but inevitably also human DNA. Patient DNA is not necessary for the test but it is technically impossible to avoid collecting it. The unlawful preservation, or misuse, of such samples at a massive scale may hence disclose patient DNA information with far-reaching privacy consequences. Inspired by the cryptographic concept of "Indistinguishability under Chosen Plaintext Attack", this paper lays out a blueprint for novel types of tests that detect viral presence without leaving persistent traces of the patient's DNA. Authors are listed in alphabetical order.

Submitted 1 August, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

Comments: 10 pages, 1 figure

arXiv:2006.11634 [pdf, other]

A Fractional $3n+1$ Conjecture

Authors: Éric Brier, Rémi Géraud-Stewart, David Naccache

Abstract: In this paper we introduce and discuss the sequence of \emph{real numbers} defined as $u_0 \in \mathbb R$ and $u_{n+1} = Δ(u_n)$ where \begin{equation*} Δ(x) = \begin{cases} \frac{x}{2} &\text{if } \operatorname{frac}(x)<\frac{1}{2} \\[4px] \frac{3x+1}{2} & \text{if } \operatorname{frac}(x)\geq\frac{1}{2} \end{cases} \end{equation*} This sequence is reminiscent of the famous Collatz sequence, and seems to exhibit an interesting behaviour. Indeed, we conjecture that iterating $Δ$ will eventually either converge to zero, or loop over sequences of real numbers with integer parts $1,2,4,7,11,18,9,4,7,3,5,9,4,7,11,18,9,4,7,3,6,3,1,2,4,7,3,6,3$. We prove this conjecture for $u_0 \in [0, 100]$. Extending the proof to larger fixed values seems to be a matter of computing power. The authors pledge to offer a reward to the first person who proves or refutes the conjecture completely -- with a proof published in a serious refereed mathematical conference or journal.

Submitted 20 June, 2020; originally announced June 2020.
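
The map $Δ$ defined in the abstract above can be iterated directly. The following is a minimal sketch, not the authors' code: the function names `delta` and `integer_parts` are hypothetical, exact rational arithmetic is an assumption made here to avoid floating-point drift, and $\operatorname{frac}(x)$ is taken to be $x - \lfloor x \rfloor$.

```python
from fractions import Fraction
import math


def delta(x: Fraction) -> Fraction:
    """One step of the map: x/2 if frac(x) < 1/2, else (3x+1)/2."""
    frac = x - math.floor(x)  # assumed definition of frac(x)
    if frac < Fraction(1, 2):
        return x / 2
    return (3 * x + 1) / 2


def integer_parts(u0, steps: int) -> list:
    """Integer parts of u_0, u_1, ..., u_{steps-1} for a rational start u0."""
    u = Fraction(u0)
    parts = []
    for _ in range(steps):
        parts.append(math.floor(u))
        u = delta(u)
    return parts
```

Calling `integer_parts(Fraction(7, 3), 50)` lists the integer parts along one orbit. Denominators grow with each step, so long runs with exact fractions become slow; that trade-off is a choice of this sketch.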

arXiv:2006.07246 [pdf, ps, other]

The Look-and-Say The Biggest Sequence Eventually Cycles

Authors: Éric Brier, Rémi Géraud-Stewart, David Naccache, Alessandro Pacco, Emanuele Troiani

Abstract: In this paper we consider a variant of Conway's sequence (OEIS A005150, A006715) defined as follows: the next term in the sequence is obtained by considering contiguous runs of digits, and rewriting them as $ab$ where $b$ is the digit and $a$ is the maximum of $b$ and the run's length. We dub this the "look-and-say the biggest" (LSB) sequence. Conway's sequence is very similar ($a$ is just the run's length). For any starting value except 22, Conway's sequence grows exponentially: the ratio of lengths converges to a known constant $λ$. We show that LSB does not: for every starting value, LSB eventually reaches a cycle. Furthermore, all cycles have a period of at most 9.

Submitted 12 June, 2020; originally announced June 2020.
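
The LSB rewriting rule stated in the abstract above is short enough to try out. This is an illustrative sketch only: `look_and_say`, `lsb` and `cycle_period` are hypothetical names, terms are represented as decimal strings, and run lengths are assumed to stay below 10 so that each run rewrites to exactly two digits.

```python
from itertools import groupby


def look_and_say(term: str) -> str:
    """Classical rule: each run becomes (run length, digit)."""
    return "".join(f"{len(list(run))}{digit}" for digit, run in groupby(term))


def lsb(term: str) -> str:
    """LSB rule: each run becomes (max(digit, run length), digit)."""
    out = []
    for digit, run in groupby(term):
        length = len(list(run))
        out.append(f"{max(int(digit), length)}{digit}")
    return "".join(out)


def cycle_period(start: str, max_steps: int = 1000):
    """Iterate lsb from start; return the period of the cycle reached, or None."""
    seen = {}
    term = start
    for step in range(max_steps):
        if term in seen:
            return step - seen[term]
        seen[term] = step
        term = lsb(term)
    return None
```

Running `cycle_period` over a range of starting strings is a quick way to observe the cycling behaviour the paper proves; the `max_steps` cutoff is an arbitrary safeguard, not a bound taken from the paper.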

arXiv:2006.06837 [pdf, ps, other]

Stuttering Conway Sequences Are Still Conway Sequences

Authors: Éric Brier, Rémi Géraud-Stewart, David Naccache, Alessandro Pacco, Emanuele Troiani

Abstract: A look-and-say sequence is obtained iteratively by reading off the digits of the current value, grouping identical digits together: starting with 1, the sequence reads: 1, 11, 21, 1211, 111221, 312211, etc. (OEIS A005150). Starting with any digit $d \neq 1$ gives Conway's sequence: $d$, $1d$, $111d$, $311d$, $13211d$, etc. (OEIS A006715). Conway popularised these sequences and studied some of their properties. In this paper we consider a variant dubbed "look-and-say again" where digits are repeated twice. We prove that the look-and-say again sequence contains only the digits $1, 2, 4, 6, d$, where $d$ represents the starting digit. Such sequences decompose and the ratio of successive lengths converges to Conway's constant. In fact, these properties result from a commuting diagram between look-and-say again sequences and "classical" look-and-say sequences. Similar results apply to the "look-and-say three times" sequence.

Submitted 11 June, 2020; originally announced June 2020.
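
One possible reading of the "look-and-say again" rule from the abstract above is: apply the classical look-and-say step, then write every digit of the result twice. The sketch below implements that reading only; it is an assumption, not a definition taken from the paper, and the names `look_and_say` and `look_and_say_again` are hypothetical.

```python
from itertools import groupby


def look_and_say(term: str) -> str:
    """Classical rule: each run of equal digits becomes (run length, digit)."""
    return "".join(f"{len(list(run))}{digit}" for digit, run in groupby(term))


def look_and_say_again(term: str) -> str:
    """Assumed stuttering variant: say the term, then repeat each digit twice."""
    return "".join(ch * 2 for ch in look_and_say(term))


def iterate(start: str, steps: int) -> list:
    """Return the first `steps` + 1 terms starting from `start`."""
    terms = [start]
    for _ in range(steps):
        terms.append(look_and_say_again(terms[-1]))
    return terms
```

Under this reading, the terms starting from 1 use only the digits 1, 2, 4 and 6, which agrees with the digit set stated in the abstract.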

arXiv:2006.02353 [pdf, ps, other]

At Most 43 Moves, At Least 29: Optimal Strategies and Bounds for Ultimate Tic-Tac-Toe

Authors: Guillaume Bertholon, Rémi Géraud-Stewart, Axel Kugelmann, Théo Lenoir, David Naccache

Abstract: Ultimate Tic-Tac-Toe is a variant of the well-known tic-tac-toe (noughts and crosses) board game. Two players compete to win three aligned "fields", each of them being a tic-tac-toe game. Each move determines which field the next player must play in. We show that there exists a winning strategy for the first player, and therefore that there exists an optimal winning strategy taking at most 43 moves; that the second player can hold on for at least 29 rounds; and identify any optimal strategy's first two moves.

Submitted 6 June, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

arXiv:2005.12227 [pdf, other]

doi 10.1007/978-3-030-36938-5_39

Keyed Non-Parametric Hypothesis Tests

Authors: Yao Cheng, Cheng-Kang Chu, Hsiao-Ying Lin, Marius Lombard-Platet, David Naccache

Abstract: The recent popularity of machine learning calls for a deeper understanding of AI security. Amongst the numerous AI threats published so far, poisoning attacks currently attract considerable attention. In a poisoning attack the opponent partially tampers with the dataset used for learning to mislead the classifier during the testing phase. This paper proposes a new protection strategy against poisoning attacks. The technique relies on a new primitive called keyed non-parametric hypothesis tests, which make it possible to evaluate, under adversarial conditions, the training input's conformance with a previously learned distribution $\mathfrak{D}$. To do so we use a secret key $κ$ unknown to the opponent. Keyed non-parametric hypothesis tests differ from classical tests in that the secrecy of $κ$ prevents the opponent from misleading the keyed test into concluding that a (significantly) tampered dataset belongs to $\mathfrak{D}$.

Submitted 25 May, 2020; originally announced May 2020.

Comments: Paper published in NSS 2019

arXiv:2005.04740 [pdf, ps, other]

Approaching Optimal Duplicate Detection in a Sliding Window

Authors: Rémi Géraud-Stewart, Marius Lombard-Platet, David Naccache

Abstract: Duplicate detection is the problem of identifying whether a given item has previously appeared in a (possibly infinite) stream of data, when only a limited amount of memory is available. Unfortunately the infinite stream setting is ill-posed, and error rates of duplicate detection filters turn out to be heavily constrained: consequently they appear to provide no advantage, asymptotically, over a biased coin toss [8]. In this paper we formalize the sliding window setting introduced by [13,16], and show that a perfect (zero error) solution can be used up to a maximal window size $w_\text{max}$. Above this threshold we show that some existing duplicate detection filters (designed for the $\textit{non-windowed}$ setting) perform better than those targeting the windowed problem. Finally, we introduce a "queuing construction" that improves on the performance of some duplicate detection filters in the windowed setting. We also analyse the security of our filters in an adversarial setting.

Submitted 10 May, 2020; originally announced May 2020.
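
To make the sliding window setting of the abstract above concrete, here is a naive exact detector that simply remembers the last $w$ items. It is a sketch under stated assumptions, not the paper's construction: the class name `ExactWindowDetector` is hypothetical, memory grows linearly with the window size, and an item counts as a duplicate when it equals one of the previous $w$ items.

```python
from collections import Counter, deque


class ExactWindowDetector:
    """Zero-error duplicate detection over the last `window` stream items."""

    def __init__(self, window: int):
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        self.recent = deque()    # items currently inside the window, oldest first
        self.counts = Counter()  # multiplicity of each item inside the window

    def seen(self, item) -> bool:
        """Report whether `item` is in the current window, then record it."""
        duplicate = self.counts[item] > 0
        self.recent.append(item)
        self.counts[item] += 1
        if len(self.recent) > self.window:
            oldest = self.recent.popleft()
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]  # keep memory bounded by the window
        return duplicate
```

Storing whole items is only practical while the window fits in memory, which is in line with the abstract's point that an exact solution is usable only up to some maximal window size; beyond that, approximate filters take over.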

arXiv:2005.02940 [pdf, other]

Optimal Covid-19 Pool Testing with a priori Information

Authors: Marc Beunardeau, Éric Brier, Noémie Cartier, Aisling Connolly, Nathanaël Courant, Rémi Géraud-Stewart, David Naccache, Ofer Yifrach-Stav

Abstract: As humanity struggles to contain the global Covid-19 infection, prophylactic actions are greatly slowed down by the shortage of testing kits. Governments have taken several measures to work around this shortage: the FDA has become more liberal on the approval of Covid-19 tests in the US. In the UK, emergency measures made it possible to increase the daily number of locally produced test kits to 100,000. China has recently launched a massive test manufacturing program. However, all those efforts are insufficient and many poor countries are still under threat. A popular method for reducing the number of tests consists in pooling samples, i.e. mixing patient samples and testing the mixed samples once. If all the samples are negative, pooling succeeds at a unitary cost. However, if a single sample is positive, failure does not indicate which patient is infected. This paper describes how to optimally detect infected patients in pools, i.e. using a minimal number of tests to precisely identify them, given the a priori probabilities that each of the patients is healthy. Those probabilities can be estimated using questionnaires, supervised machine learning or clinical examinations. The resulting algorithms, which can be interpreted as informed divide-and-conquer strategies, are non-intuitive and quite surprising. They are patent-free. Co-authors are listed in alphabetical order.

Submitted 11 May, 2020; v1 submitted 6 May, 2020; originally announced May 2020.
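
The cost trade-off behind pooling, as described in the abstract above, can be illustrated with the simplest two-stage scheme: test the pool once, and if it is positive, retest every patient individually. This is a baseline sketch for intuition, not the optimal strategy of the paper; the function name `expected_tests_two_stage` is hypothetical and patients are assumed to be independent.

```python
from math import prod


def expected_tests_two_stage(p_healthy: list) -> float:
    """Expected number of tests for one pool followed by individual retests.

    p_healthy[i] is the a priori probability that patient i is healthy.
    """
    n = len(p_healthy)
    if n == 0:
        return 0.0
    if n == 1:
        return 1.0  # a pool of one is just an individual test
    p_all_negative = prod(p_healthy)  # independence assumption
    # One pooled test always; n more tests whenever the pool is positive.
    return 1.0 + n * (1.0 - p_all_negative)
```

For four patients who are each healthy with probability 0.9, the expected cost under this baseline is about 2.38 tests instead of 4. Strategies that exploit the individual probabilities, as the paper proposes, aim to do better than this baseline.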

arXiv:1908.03819 [pdf, other]

RISC-V: #AlphanumericShellcoding

Authors: Hadrien Barral, Rémi Géraud-Stewart, Georges-Axel Jaloyan, David Naccache

Abstract: We explain how to design RISC-V shellcodes capable of running arbitrary code, whose ASCII binary representation uses only letters a-zA-Z, digits 0-9, and any one of the three characters: #, /, '.

Submitted 10 August, 2019; originally announced August 2019.

Comments: 25 pages, originally published at WOOT'19

arXiv:1901.04358 [pdf, ps, other]

doi 10.1145/3297280.3297335

Quotient Hash Tables - Efficiently Detecting Duplicates in Streaming Data

Authors: Rémi Géraud, Marius Lombard-Platet, David Naccache

Abstract: This article presents the Quotient Hash Table (QHT), a new data structure for duplicate detection in unbounded streams. QHTs stem from a corrected analysis of streaming quotient filters (SQFs), resulting in a 33% reduction in memory usage for equal performance. We provide a new and thorough analysis of both algorithms, with results of interest to other existing constructions. We also introduce an optimised version of our new data structure dubbed Queued QHT with Duplicates (QQHTD). Finally we discuss the effect of adversarial inputs for hash-based duplicate filters similar to QHT.

Submitted 14 January, 2019; originally announced January 2019.

Comments: Shorter version was accepted at SIGAPP SAC '19

arXiv:1709.08357 [pdf, ps, other]

Generating Functionally Equivalent Programs Having Non-Isomorphic Control-Flow Graphs

Authors: Rémi Géraud, Mirko Koscina, Paul Lenczner, David Naccache, David Saulpic

Abstract: One of the big challenges in program obfuscation consists in modifying not only the program's straight-line code (SLC) but also the program's control flow graph (CFG). Indeed, if only SLC is modified, the program's CFG can be extracted and analyzed. Usually, the CFG leaks a considerable amount of information on the program's structure. In this work we propose a method for rewriting a code P into a functionally equivalent code P' such that CFG{P} and CFG{P'} are radically different.

Submitted 25 September, 2017; originally announced September 2017.

Comments: 16 pages paper, published in NordSec 2017 (conference), Proceedings of the Nordic Conference on Secure IT Systems (Nordic 2017)

arXiv:1608.03415 [pdf, other]

ARMv8 Shellcodes from 'A' to 'Z'

Authors: Hadrien Barral, Houda Ferradi, Rémi Géraud, Georges-Axel Jaloyan, David Naccache

Abstract: We describe a methodology to automatically turn arbitrary ARMv8 programs into alphanumeric executable polymorphic shellcodes. Shellcodes generated in this way can evade detection and bypass filters, broadening the attack surface of ARM-powered devices such as smartphones.

Submitted 22 June, 2019; v1 submitted 11 August, 2016; originally announced August 2016.

Comments: 28 pages, 2 figures, source code in ARMv8, haskell, php, M4

arXiv:1512.06578 [pdf, other]

doi 10.1007/s10207-014-0272-7

Flexible Attribute-Based Encryption Applicable to Secure E-Healthcare Records

Authors: Bo Qin, Hua Deng, Qianhong Wu, Josep Domingo-Ferrer, David Naccache, Yunya Zhou

Abstract: In e-healthcare record systems (EHRS), attribute-based encryption (ABE) appears as a natural way to achieve fine-grained access control on health records. Some proposals exploit key-policy ABE (KP-ABE) to protect privacy in such a way that all users are associated with specific access policies and only the ciphertexts matching the users' access policies can be decrypted. An issue with KP-ABE is that it requires an a priori formulation of access policies during key generation, which is not always practicable in EHRS because the policies to access health records are sometimes determined after key generation. In this paper, we revisit KP-ABE and propose a dynamic ABE paradigm, referred to as access policy redefinable ABE (APR-ABE). To address the above issue, APR-ABE allows users to redefine their access policies and delegate keys for the redefined ones; hence a priori precise policies are no longer mandatory. We construct an APR-ABE scheme with short ciphertexts and prove its full security in the standard model under several static assumptions.

Submitted 21 December, 2015; originally announced December 2015.

Journal ref: International Journal of Information Security, Vol. 14, no. 6, pp. 499-511, 2015

arXiv:1509.00378 [pdf, ps, other]

A Number-Theoretic Error-Correcting Code

Authors: Eric Brier, Jean-Sébastien Coron, Rémi Géraud, Diana Maimut, David Naccache

Abstract: In this paper we describe a new error-correcting code (ECC) inspired by the Naccache-Stern cryptosystem. While far less efficient than Turbo codes, the proposed ECC happens to be more efficient than some established ECCs for certain sets of parameters. The new ECC adds an appendix to the message. The appendix is the modular product of small primes representing the message bits. The receiver recomputes the product and detects transmission errors using modular division and lattice reduction.

Submitted 1 September, 2015; originally announced September 2015.

arXiv:1405.1402 [pdf, ps, other]

New Algorithmic Approaches to Point Constellation Recognition

Authors: Thomas Bourgeat, Julien Bringer, Herve Chabanne, Robin Champenois, Jeremie Clement, Houda Ferradi, Marc Heinrich, Paul Melotti, David Naccache, Antoine Voizard

Abstract: Point constellation recognition is a common problem with many pattern matching applications. Whilst useful in many contexts, this work is mainly motivated by fingerprint matching. Fingerprints are traditionally modelled as constellations of oriented points called minutiae. The fingerprint verifier's task consists in comparing two point constellations. The compared constellations may differ by rotation and translation or by much more involved transforms such as distortion or occlusion. This paper presents three new constellation matching algorithms. The first two methods generalize an algorithm by Bringer and Despiegel. Our third proposal creates a very interesting analogy between mechanical system simulation and the constellation recognition problem.

Submitted 24 March, 2014; originally announced May 2014.

Comments: 14 pages, short version submitted to SEC 2014

arXiv:1104.1546 [pdf, other]

Physical Simulation of Inarticulate Robots

Authors: Guillaume Claret, Michaël Mathieu, David Naccache, Guillaume Seguin

Abstract: In this note we study the structure and the behavior of inarticulate robots. We introduce a robot that moves by successive revolvings. The robot's structure is analyzed, simulated and discussed in detail.

Submitted 8 April, 2011; originally announced April 2011.

arXiv:1104.1533 [pdf, ps, other]

Operand Folding Hardware Multipliers

Authors: Byungchun Chung, Sandra Marcello, Amir-Pasha Mirbaha, David Naccache, Karim Sabeg

Abstract: This paper describes a new accumulate-and-add multiplication algorithm. The method partitions one of the operands and re-combines the results of computations done with each of the partitions. The resulting design turns out to be both compact and fast. When the operands' bit-length $m$ is 1024, the new algorithm requires only $0.194m+56$ additions (on average); this is about half the number of additions required by the classical accumulate-and-add multiplication algorithm ($\frac{m}2$).

Submitted 8 April, 2011; originally announced April 2011.

arXiv:0810.2067 [pdf, ps, other]

Divisibility, Smoothness and Cryptographic Applications

Authors: David Naccache, Igor E. Shparlinski

Abstract: This paper deals with products of moderate-size primes, familiarly known as smooth numbers. Smooth numbers play a crucial role in information theory, signal processing and cryptography. We present various properties of smooth numbers relating to their enumeration, distribution and occurrence in various integer sequences. We then turn our attention to cryptographic applications in which smooth numbers play a pivotal role.

Submitted 1 January, 2009; v1 submitted 11 October, 2008; originally announced October 2008.

MSC Class: 11N25; 11Y16; 9460

arXiv:cs/0510042 [pdf, ps, other]

Secure and Practical Identity-Based Encryption

Authors: David Naccache

Abstract: In this paper, we present a variant of Waters' Identity-Based Encryption scheme with a much smaller public-key size (only a few kilobytes). We show that this variant is semantically secure against passive adversaries in the standard model. In essence, the new scheme divides Waters' public key size by a factor $\ell$ at the cost of (negligibly) reducing security by $\ell$ bits. Therefore, our construction settles an open question asked by Waters and constitutes the first fully secure practical Identity-Based Encryption scheme.

Submitted 15 October, 2005; originally announced October 2005.

ACM Class: K.6.5

Showing 1–42 of 42 results for author: Naccache, D