Search | arXiv e-print repository

SECOMP: Formally Secure Compilation of Compartmentalized C Programs

Authors: Jérémy Thibault, Roberto Blanco, Dongjae Lee, Sven Argo, Arthur Azevedo de Amorim, Aïna Linn Georges, Catalin Hritcu, Andrew Tolmach

Abstract: Undefined behavior in C often causes devastating security vulnerabilities. One practical mitigation is compartmentalization, which allows developers to structure large programs into mutually distrustful compartments with clearly specified privileges and interactions. In this paper we introduce SECOMP, a compiler for compartmentalized C code that comes with machine-checked proofs guaranteeing that… ▽ More Undefined behavior in C often causes devastating security vulnerabilities. One practical mitigation is compartmentalization, which allows developers to structure large programs into mutually distrustful compartments with clearly specified privileges and interactions. In this paper we introduce SECOMP, a compiler for compartmentalized C code that comes with machine-checked proofs guaranteeing that the scope of undefined behavior is restricted to the compartments that encounter it and become dynamically compromised. These guarantees are formalized as the preservation of safety properties against adversarial contexts, a secure compilation criterion similar to full abstraction, and this is the first time such a strong criterion is proven for a mainstream programming language. To achieve this we extend the languages of the CompCert verified C compiler with isolated compartments that can only interact via procedure calls and returns, as specified by cross-compartment interfaces. We adapt the passes and optimizations of CompCert as well as their correctness proofs to this compartment-aware setting. We then use compiler correctness as an ingredient in a larger secure compilation proof that involves several proof engineering novelties, needed to scale formally secure compilation up to a C compiler. △ Less

Submitted 1 July, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: CCS'24 camera ready version, extended with appendices and a few more references

arXiv:2312.02172 [pdf, other]

doi 10.1016/j.simpat.2019.102037

Mercury: A modeling, simulation, and optimization framework for data stream-oriented IoT applications

Authors: Román Cárdenas, Patricia Arroba, Roberto Blanco, Pedro Malagón, José L. Risco-Martín, José M. Moya

Abstract: The Internet of Things is transforming our society by monitoring users and infrastructures' behavior to enable new services that will improve life quality and resource management. These applications require a vast amount of localized information to be processed in real-time so, the deployment of new fog computing infrastructures that bring computing closer to the data sources is a major concern. I… ▽ More The Internet of Things is transforming our society by monitoring users and infrastructures' behavior to enable new services that will improve life quality and resource management. These applications require a vast amount of localized information to be processed in real-time so, the deployment of new fog computing infrastructures that bring computing closer to the data sources is a major concern. In this context, we present Mercury, a Modeling, Simulation, and Optimization (M&S&O) framework to analyze the dimensioning and the dynamic operation of real-time fog computing scenarios. Our research proposes a location-aware solution that supports data stream analytics applications including FaaS-based computation offloading. Mercury implements a detailed structural and behavioral simulation model, providing fine-grained simulation outputs, and is described using the Discrete Event System Specification (DEVS) mathematical formalism, hel** to validate the model's implementation. Finally, we present a case study using real traces from a driver assistance scenario, offering a detailed comparison with other state-of-the-art simulators. △ Less

Submitted 2 November, 2023; originally announced December 2023.

Journal ref: Simulation Modelling Practice and Theory, 101, 2019

arXiv:2305.11673 [pdf, other]

Bias Beyond English: Counterfactual Tests for Bias in Sentiment Analysis in Four Languages

Authors: Seraphina Goldfarb-Tarrant, Adam Lopez, Roi Blanco, Diego Marcheggiani

Abstract: Sentiment analysis (SA) systems are used in many products and hundreds of languages. Gender and racial biases are well-studied in English SA systems, but understudied in other languages, with few resources for such studies. To remedy this, we build a counterfactual evaluation corpus for gender and racial/migrant bias in four languages. We demonstrate its usefulness by answering a simple but import… ▽ More Sentiment analysis (SA) systems are used in many products and hundreds of languages. Gender and racial biases are well-studied in English SA systems, but understudied in other languages, with few resources for such studies. To remedy this, we build a counterfactual evaluation corpus for gender and racial/migrant bias in four languages. We demonstrate its usefulness by answering a simple but important question that an engineer might need to answer when deploying a system: What biases do systems import from pre-trained models when compared to a baseline with no pre-training? Our evaluation corpus, by virtue of being counterfactual, not only reveals which models have less bias, but also pinpoints changes in model bias behaviour, which enables more targeted mitigation strategies. We release our code and evaluation corpora to facilitate future research. △ Less

Submitted 19 May, 2023; originally announced May 2023.

Comments: 5 pages, accepted to Findings of ACL 2023

arXiv:2304.12671 [pdf]

doi 10.1109/ICST.2012.87

Test adequacy evaluation for the user-database interaction: a specification-based approach

Authors: Raquel Blanco, Javier Tuya, Ruben V. Seco

Abstract: Testing a database application is a challenging process where both the database and the user interaction have to be considered in the design of test cases. This paper describes a specification-based approach to guide the design of test inputs (both the test database and the user inputs) for a database application and to automatically evaluate the test adequacy. First, the system specification of t… ▽ More Testing a database application is a challenging process where both the database and the user interaction have to be considered in the design of test cases. This paper describes a specification-based approach to guide the design of test inputs (both the test database and the user inputs) for a database application and to automatically evaluate the test adequacy. First, the system specification of the application is modelled: (1) the structure of the database and the user interface are represented in a single model, called Integrated Data Model (IDM), (2) the functional requirements are expressed as a set of business rules, written in terms of the IDM. Then, a MCDC-based criterion is applied over the business rules to automatically derive the situations of interest to be tested (test requirements), which guide the design of the test inputs. Finally, the adequacy of these test inputs is automatically evaluated to determine whether the test requirements are covered. The approach has been applied to the TPC-C benchmark. The results show that it allows designing test cases that are able to detect interesting faults which were located in the procedural code of the implementation. △ Less

Submitted 25 April, 2023; originally announced April 2023.

Comments: IEEE Fifth International Conference on Software Testing, Verification and Validation (ICST 2012), Montreal, QC, Canada, 2012

arXiv:2304.12196 [pdf]

doi 10.1016/j.jss.2023.111647

Can gamification help in software testing education? Findings from an empirical study

Authors: Raquel Blanco, Manuel Trinidad, Maria Jose Suarez-Cabal, Alejandro Calderon, Mercedes Ruiz, Javier Tuya

Abstract: Software testing is an essential knowledge area required by industry for software engineers. However, software engineering students often consider testing less appealing than designing or coding. Consequently, it is difficult to engage students to create effective tests. To encourage students, we explored the use of gamification and investigated whether this technique can help to improve the engag… ▽ More Software testing is an essential knowledge area required by industry for software engineers. However, software engineering students often consider testing less appealing than designing or coding. Consequently, it is difficult to engage students to create effective tests. To encourage students, we explored the use of gamification and investigated whether this technique can help to improve the engagement and performance of software testing students. We conducted a controlled experiment to compare the engagement and performance of two groups of students that took an undergraduate software testing course in different academic years. The experimental group is formed by 135 students from the gamified course whereas the control group is formed by 100 students from the non-gamified course. The data collected were statistically analyzed to answer the research questions of this study. The results show that the students that participated in the gamification experience were more engaged and achieved a better performance. As an additional finding, the analysis of the results reveals that a key aspect to succeed is the gamification experience design. It is important to distribute the motivating stimulus provided by the gamification throughout the whole experience to engage students until the end. Given these results, we plan to readjust the gamification experience design to increase student engagement in the last stage of the experience, as well as to conduct a longitudinal study to evaluate the effects of gamification. △ Less

Submitted 24 April, 2023; originally announced April 2023.

Journal ref: The Journal of Systems & Software, 200, 2023

arXiv:2302.03616 [pdf]

Can gamification reduce the burden of self-reporting in mHealth applications? A feasibility study using machine learning from smartwatch data to estimate cognitive load

Authors: Michal K. Grzeszczyk, Paulina Adamczyk, Sylwia Marek, Ryszard Pręcikowski, Maciej Kuś, M. Patrycja Lelujko, Rosmary Blanco, Tomasz Trzciński, Arkadiusz Sitek, Maciej Malawski, Aneta Lisowska

Abstract: The effectiveness of digital treatments can be measured by requiring patients to self-report their state through applications, however, it can be overwhelming and causes disengagement. We conduct a study to explore the impact of gamification on self-reporting. Our approach involves the creation of a system to assess cognitive load (CL) through the analysis of photoplethysmography (PPG) signals. Th… ▽ More The effectiveness of digital treatments can be measured by requiring patients to self-report their state through applications, however, it can be overwhelming and causes disengagement. We conduct a study to explore the impact of gamification on self-reporting. Our approach involves the creation of a system to assess cognitive load (CL) through the analysis of photoplethysmography (PPG) signals. The data from 11 participants is utilized to train a machine learning model to detect CL. Subsequently, we create two versions of surveys: a gamified and a traditional one. We estimate the CL experienced by other participants (13) while completing surveys. We find that CL detector performance can be enhanced via pre-training on stress detection tasks. For 10 out of 13 participants, a personalized CL detector can achieve an F1 score above 0.7. We find no difference between the gamified and non-gamified surveys in terms of CL but participants prefer the gamified version. △ Less

Submitted 21 December, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

Comments: Accepted for AMIA 2023

arXiv:2210.12022 [pdf, other]

Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks

Authors: Laura Aina, Nikos Voskarides, Roi Blanco

Abstract: Pre-trained language models (LMs) obtain state-of-the-art performance when adapted to text classification tasks. However, when using such models in real-world applications, efficiency considerations are paramount. In this paper, we study how different training procedures that adapt LMs to text classification perform, as we vary model and train set size. More specifically, we compare standard fine-… ▽ More Pre-trained language models (LMs) obtain state-of-the-art performance when adapted to text classification tasks. However, when using such models in real-world applications, efficiency considerations are paramount. In this paper, we study how different training procedures that adapt LMs to text classification perform, as we vary model and train set size. More specifically, we compare standard fine-tuning, prompting, and knowledge distillation (KD) when the teacher was trained with either fine-tuning or prompting. Our findings suggest that even though fine-tuning and prompting work well to train large LMs on large train sets, there are more efficient alternatives that can reduce compute or data cost. Interestingly, we find that prompting combined with KD can reduce compute and data cost at the same time. △ Less

Submitted 21 October, 2022; originally announced October 2022.

Comments: AACL-IJCNLP 2022

arXiv:2110.01439 [pdf, other]

SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Turn-Taking Simulation

Authors: Akram El-Korashy, Roberto Blanco, Jérémy Thibault, Adrien Durier, Deepak Garg, Catalin Hritcu

Abstract: Proving secure compilation of partial programs typically requires back-translating an attack against the compiled program to an attack against the source program. To prove back-translation, one can syntactically translate the target attacker to a source one -- i.e., syntax-directed back-translation -- or show that the interaction traces of the target attacker can also be emitted by source attacker… ▽ More Proving secure compilation of partial programs typically requires back-translating an attack against the compiled program to an attack against the source program. To prove back-translation, one can syntactically translate the target attacker to a source one -- i.e., syntax-directed back-translation -- or show that the interaction traces of the target attacker can also be emitted by source attackers -- i.e., trace-directed back-translation. Syntax-directed back-translation is not suitable when the target attacker may use unstructured control flow that the source language cannot directly represent. Trace-directed back-translation works with such syntactic dissimilarity because only the external interactions of the target attacker have to be mimicked in the source, not its internal control flow. Revealing only external interactions is, however, inconvenient when sharing memory via unforgeable pointers, since information about shared pointers stashed in private memory is not present on the trace. This made prior proofs unnecessarily complex, since the generated attacker had to instead stash all reachable pointers. In this work, we introduce more informative *data-flow traces*, combining the best of syntax- and trace-directed back-translation in a simpler technique that handles both syntactic dissimilarity and memory sharing well, and that is proved correct in Coq. Additionally, we develop a novel *turn-taking simulation* relation and use it to prove a recomposition lemma, which is key to reusing compiler correctness in such secure compilation proofs. We are the first to mechanize such a recomposition lemma in the presence of memory sharing. We use these two innovations in a secure compilation proof for a code generation compiler pass between a source language with structured control flow and a target language with unstructured control flow, both with safe pointers and components. △ Less

Submitted 3 June, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

Comments: CSF 2022 pre-print with extra appendices

arXiv:2105.00417 [pdf, ps, other]

doi 10.1109/CSF57540.2023.00037

Formalizing Stack Safety as a Security Property

Authors: Sean Noble Anderson, Roberto Blanco, Leonidas Lampropoulos, Benjamin C. Pierce, Andrew Tolmach

Abstract: The term stack safety is used to describe a variety of compiler, run-time, and hardware mechanisms for protecting stack memory. Unlike "the heap," the ISA-level stack does not correspond to a single high-level language concept: different compilers use it in different ways to support procedural and functional abstraction mechanisms from a wide range of languages. This protean nature makes it diffic… ▽ More The term stack safety is used to describe a variety of compiler, run-time, and hardware mechanisms for protecting stack memory. Unlike "the heap," the ISA-level stack does not correspond to a single high-level language concept: different compilers use it in different ways to support procedural and functional abstraction mechanisms from a wide range of languages. This protean nature makes it difficult to nail down what it means to correctly enforce stack safety. We propose a new formal characterization of stack safety using concepts from language-based security. Rather than treating stack safety as a monolithic property, we decompose it into an integrity property and a confidentiality property for each of the caller and the callee, plus a control-flow property: five properties in all. This formulation is motivated by a particular class of enforcement mechanisms, the "lazy" stack safety micro-policies studied by Roessler and DeHon, which permit functions to write into one another's frames but taint the changed locations so that the frame's owner cannot access them. No existing characterization of stack safety captures this style of safety; we capture it here by stating our properties in terms of the observable behavior of the system. Our properties go further than previous formal definitions of stack safety, supporting caller- and callee-saved registers, arguments passed on the stack, and tail-call elimination. We validate the properties by using them to distinguish between correct and incorrect implementations of Roessler and DeHon's micro-policies using property-based random testing. Our test harness successfully identifies several broken variants, including Roessler and DeHon's lazy policy; a repaired version of their policy passes our tests. △ Less

Submitted 21 March, 2024; v1 submitted 2 May, 2021; originally announced May 2021.

Journal ref: Proceedings of the 2023 IEEE 36th Computer Security Foundations Symposium (CSF)

arXiv:1910.00856 [pdf, other]

BookQA: Stories of Challenges and Opportunities

Authors: Stefanos Angelidis, Lea Frermann, Diego Marcheggiani, Roi Blanco, Lluís Màrquez

Abstract: We present a system for answering questions based on the full text of books (BookQA), which first selects book passages given a question at hand, and then uses a memory network to reason and predict an answer. To improve generalization, we pretrain our memory network using artificial questions generated from book sentences. We experiment with the recently published NarrativeQA corpus, on the subse… ▽ More We present a system for answering questions based on the full text of books (BookQA), which first selects book passages given a question at hand, and then uses a memory network to reason and predict an answer. To improve generalization, we pretrain our memory network using artificial questions generated from book sentences. We experiment with the recently published NarrativeQA corpus, on the subset of Who questions, which expect book characters as answers. We experimentally show that BERT-based retrieval and pretraining improve over baseline results significantly. At the same time, we confirm that NarrativeQA is a highly challenging data set, and that there is need for novel research in order to achieve high-precision BookQA results. We analyze some of the bottlenecks of the current approach, and we argue that more research is needed on text representation, retrieval of relevant passages, and reasoning, including commonsense knowledge. △ Less

Submitted 2 October, 2019; originally announced October 2019.

Comments: Accepted at 2nd Workshop on Machine Reading for Question Answering (MRQA), EMNLP 2019

arXiv:1907.05320 [pdf, other]

Trace-Relating Compiler Correctness and Secure Compilation

Authors: Carmine Abate, Roberto Blanco, Stefan Ciobaca, Adrien Durier, Deepak Garg, Catalin Hritcu, Marco Patrignani, Éric Tanter, Jérémy Thibault

Abstract: Compiler correctness is, in its simplest form, defined as the inclusion of the set of traces of the compiled program into the set of traces of the original program, which is equivalent to the preservation of all trace properties. Here traces collect, for instance, the externally observable events of each execution. This definition requires, however, the set of traces of the source and target langu… ▽ More Compiler correctness is, in its simplest form, defined as the inclusion of the set of traces of the compiled program into the set of traces of the original program, which is equivalent to the preservation of all trace properties. Here traces collect, for instance, the externally observable events of each execution. This definition requires, however, the set of traces of the source and target languages to be exactly the same, which is not the case when the languages are far apart or when observations are fine-grained. To overcome this issue, we study a generalized compiler correctness definition, which uses source and target traces drawn from potentially different sets and connected by an arbitrary relation. We set out to understand what guarantees this generalized compiler correctness definition gives us when instantiated with a non-trivial relation on traces. When this trace relation is not equality, it is no longer possible to preserve the trace properties of the source program unchanged. Instead, we provide a generic characterization of the target trace property ensured by correctly compiling a program that satisfies a given source property, and dually, of the source trace property one is required to show in order to obtain a certain target property for the compiled code. We show that this view on compiler correctness can naturally account for undefined behavior, resource exhaustion, different source and target values, side-channels, and various abstraction mismatches. Finally, we show that the same generalization also applies to many secure compilation definitions, which characterize the protection of a compiled program against linked adversarial code. △ Less

Submitted 23 February, 2020; v1 submitted 11 July, 2019; originally announced July 2019.

Comments: ESOP'20 camera ready version together with online appendix

arXiv:1810.05421 [pdf, ps, other]

The use of blogs in the education field: A qualitative systematic review

Authors: Carlos R. del Blanco, Ivan García-Magariño

Abstract: Blogs have become one of the most successful tools of the Web 2.0 because of their ease of use and the availability of open platforms. They have quickly spread in the education field thanks to the many attractive qualities that have been attributed to them, such as collaboration, communication, enhancing of professional writing, and the improvement of information-gathering skills. However, many of… ▽ More Blogs have become one of the most successful tools of the Web 2.0 because of their ease of use and the availability of open platforms. They have quickly spread in the education field thanks to the many attractive qualities that have been attributed to them, such as collaboration, communication, enhancing of professional writing, and the improvement of information-gathering skills. However, many of the studies that have addressed this issue were not based on an empirical research, and therefore they are unreliable. On the other hand, the studies that do have conducted an empirical research have usually relied on participant self-reported data (surveys, interviews, and contents of blogs), which can significantly bias the positive results usually reported on the use of blogs. Another source of bias and inaccuracy in the reported results is that most of the studies lacked control group, i.e they do not follow an experimental design. The purpose of this review is to examine the current state of the studies related to the evaluation of the blog effects in the education field. The methods to select the studies and perform the corresponding analysis have followed a qualitative systematic approach. The selection has been restricted to empirical and peer-reviewed studies published between January 2011 and June 2013. The findings have been integrated and compared using the Grounded Theory, giving rise to a set of categories that have structured the results of the review. △ Less

Submitted 12 October, 2018; originally announced October 2018.

arXiv:1807.04603 [pdf, other]

doi 10.1109/CSF.2019.00025

Journey Beyond Full Abstraction: Exploring Robust Property Preservation for Secure Compilation

Authors: Carmine Abate, Roberto Blanco, Deepak Garg, Catalin Hritcu, Marco Patrignani, Jérémy Thibault

Abstract: (CROPPED TO FIT IN ARXIV'S SILLY LIMIT. SEE PDF FOR COMPLETE ABSTRACT.) We are the first to thoroughly explore a large space of formal secure compilation criteria based on robust property preservation, i.e., the preservation of properties satisfied against arbitrary adversarial contexts. We study robustly preserving various classes of trace properties such as safety, of hyperproperties such as n… ▽ More (CROPPED TO FIT IN ARXIV'S SILLY LIMIT. SEE PDF FOR COMPLETE ABSTRACT.) We are the first to thoroughly explore a large space of formal secure compilation criteria based on robust property preservation, i.e., the preservation of properties satisfied against arbitrary adversarial contexts. We study robustly preserving various classes of trace properties such as safety, of hyperproperties such as noninterference, and of relational hyperproperties such as trace equivalence. This leads to many new secure compilation criteria, some of which are easier to practically achieve and prove than full abstraction, and some of which provide strictly stronger security guarantees. For each of the studied criteria we propose an equivalent "property-free" characterization that clarifies which proof techniques apply. For relational properties and hyperproperties, which relate the behaviors of multiple programs, our formal definitions of the property classes themselves are novel. We order our criteria by their relative strength and show several collapses and separation results. Finally, we adapt existing proof techniques to show that even the strongest of our secure compilation criteria, the robust preservation of all relational hyperproperties, is achievable for a simple translation from a statically typed to a dynamically typed language. △ Less

Submitted 17 May, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

Comments: Long version of CSF'19 paper, including online appendix

arXiv:1802.00588 [pdf, other]

When Good Components Go Bad: Formally Secure Compilation Despite Dynamic Compromise

Authors: Carmine Abate, Arthur Azevedo de Amorim, Roberto Blanco, Ana Nora Evans, Guglielmo Fachini, Catalin Hritcu, Théo Laurent, Benjamin C. Pierce, Marco Stronati, Jérémy Thibault, Andrew Tolmach

Abstract: We propose a new formal criterion for evaluating secure compilation schemes for unsafe languages, expressing end-to-end security guarantees for software components that may become compromised after encountering undefined behavior---for example, by accessing an array out of bounds. Our criterion is the first to model dynamic compromise in a system of mutually distrustful components with clearly s… ▽ More We propose a new formal criterion for evaluating secure compilation schemes for unsafe languages, expressing end-to-end security guarantees for software components that may become compromised after encountering undefined behavior---for example, by accessing an array out of bounds. Our criterion is the first to model dynamic compromise in a system of mutually distrustful components with clearly specified privileges. It articulates how each component should be protected from all the others---in particular, from components that have encountered undefined behavior and become compromised. Each component receives secure compilation guarantees---in particular, its internal invariants are protected from compromised components---up to the point when this component itself becomes compromised, after which we assume an attacker can take complete control and use this component's privileges to attack other components. More precisely, a secure compilation chain must ensure that a dynamically compromised component cannot break the safety properties of the system at the target level any more than an arbitrary attacker-controlled component (with the same interface and privileges, but without undefined behaviors) already could at the source level. To illustrate the model, we construct a secure compilation chain for a small unsafe language with buffers, procedures, and components, targeting a simple abstract machine with built-in compartmentalization. We give a machine-checked proof in Coq that this compiler satisfies our secure compilation criterion. Finally, we show that the protection guarantees offered by the compartmentalized abstract machine can be achieved at the machine-code level using either software fault isolation or a tag-based reference monitor. △ Less

Submitted 29 November, 2019; v1 submitted 2 February, 2018; originally announced February 2018.

Comments: CCS paper with significant improvement of the proofs, first step towards a journal version

arXiv:1704.03970 [pdf, other]

doi 10.1145/3159652.3159676

Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems

Authors: Joel Mackenzie, J. Shane Culpepper, Roi Blanco, Matt Crane, Charles L. A. Clarke, Jimmy Lin

Abstract: Scalable web search systems typically employ multi-stage retrieval architectures, where an initial stage generates a set of candidate documents that are then pruned and re-ranked. Since subsequent stages typically exploit a multitude of features of varying costs using machine-learned models, reducing the number of documents that are considered at each stage improves latency. In this work, we propo… ▽ More Scalable web search systems typically employ multi-stage retrieval architectures, where an initial stage generates a set of candidate documents that are then pruned and re-ranked. Since subsequent stages typically exploit a multitude of features of varying costs using machine-learned models, reducing the number of documents that are considered at each stage improves latency. In this work, we propose and validate a unified framework that can be used to predict a wide range of performance-sensitive parameters which minimize effectiveness loss, while simultaneously minimizing query latency, across all stages of a multi-stage search architecture. Furthermore, our framework can be easily applied in large-scale IR systems, can be trained without explicitly requiring relevance judgments, and can target a variety of different efficiency-effectiveness trade-offs, making it well suited to a wide range of search scenarios. Our results show that we can reliably predict a number of different parameters on a per-query basis, while simultaneously detecting and minimizing the likelihood of tail-latency queries that exceed a pre-specified performance budget. As a proof of concept, we use the prediction framework to help alleviate the problem of tail-latency queries in early stage retrieval. On the standard ClueWeb09B collection and 31k queries, we show that our new hybrid system can reliably achieve a maximum query time of 200 ms with a 99.99% response time guarantee without a significant loss in overall effectiveness. The solutions presented are practical, and can easily be used in large-scale distributed search engine deployments with a small amount of additional overhead. △ Less

Submitted 20 April, 2017; v1 submitted 12 April, 2017; originally announced April 2017.

Comments: Update 1: Edited email address

arXiv:1704.01617 [pdf, ps, other]

Part of Speech Based Term Weighting for Information Retrieval

Authors: Christina Lioma, Roi Blanco

Abstract: Automatic language processing tools typically assign to terms so-called weights corresponding to the contribution of terms to information content. Traditionally, term weights are computed from lexical statistics, e.g., term frequencies. We propose a new type of term weight that is computed from part of speech (POS) n-gram statistics. The proposed POS-based term weight represents how informative a… ▽ More Automatic language processing tools typically assign to terms so-called weights corresponding to the contribution of terms to information content. Traditionally, term weights are computed from lexical statistics, e.g., term frequencies. We propose a new type of term weight that is computed from part of speech (POS) n-gram statistics. The proposed POS-based term weight represents how informative a term is in general, based on the POS contexts in which it generally occurs in language. We suggest five different computations of POS-based term weights by extending existing statistical approximations of term information measures. We apply these POS-based term weights to information retrieval, by integrating them into the model that matches documents to queries. Experiments with two TREC collections and 300 queries, using TF-IDF & BM25 as baselines, show that integrating our POS-based term weights to retrieval always leads to gains (up to +33.7% from the baseline). Additional experiments with a different retrieval model as baseline (Language Model with Dirichlet priors smoothing) and our best performing POS-based term weight, show retrieval gains always and consistently across the whole smoothing range of the baseline. △ Less

Submitted 5 April, 2017; originally announced April 2017.

arXiv:1511.04178 [pdf, ps, other]

doi 10.4204/EPTCS.197.2

Proof Outlines as Proof Certificates: A System Description

Authors: Roberto Blanco, Dale Miller

Abstract: We apply the foundational proof certificate (FPC) framework to the problem of designing high-level outlines of proofs. The FPC framework provides a means to formally define and check a wide range of proof evidence. A focused proof system is central to this framework and such a proof system provides an interesting approach to proof reconstruction during the process of proof checking (relying on a… ▽ More We apply the foundational proof certificate (FPC) framework to the problem of designing high-level outlines of proofs. The FPC framework provides a means to formally define and check a wide range of proof evidence. A focused proof system is central to this framework and such a proof system provides an interesting approach to proof reconstruction during the process of proof checking (relying on an underlying logic programming implementation). Here, we illustrate how the FPC framework can be used to design proof outlines and then to exploit proof checkers as a means for expanding outlines into fully detailed proofs. In order to validate this approach to proof outlines, we have built the ACheck system that allows us to take a sequence of theorems and apply the proof outline "do the obvious induction and close the proof using previously proved lemmas". △ Less

Submitted 13 November, 2015; originally announced November 2015.

Comments: In Proceedings WoF'15, arXiv:1511.02529

Journal ref: EPTCS 197, 2015, pp. 7-14

arXiv:1505.06386 [pdf, other]

doi 10.1145/2766462.2767704

Local Ranking Problem on the BrowseGraph

Authors: Michele Trevisiol, Luca Maria Aiello, Paolo Boldi, Roi Blanco

Abstract: The "Local Ranking Problem" (LRP) is related to the computation of a centrality-like rank on a local graph, where the scores of the nodes could significantly differ from the ones computed on the global graph. Previous work has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a graph where nodes are webpages and edges are browsing transitions. Recently, this graph has receive… ▽ More The "Local Ranking Problem" (LRP) is related to the computation of a centrality-like rank on a local graph, where the scores of the nodes could significantly differ from the ones computed on the global graph. Previous work has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a graph where nodes are webpages and edges are browsing transitions. Recently, this graph has received more and more attention in many different tasks such as ranking, prediction and recommendation. However, a web-server has only the browsing traffic performed on its pages (local BrowseGraph) and, as a consequence, the local computation can lead to estimation errors, which hinders the increasing number of applications in the state of the art. Also, although the divergence between the local and global ranks has been measured, the possibility of estimating such divergence using only local knowledge has been mainly overlooked. These aspects are of great interest for online service providers who want to: (i) gauge their ability to correctly assess the importance of their resources only based on their local knowledge, and (ii) take into account real user browsing fluxes that better capture the actual user interest than the static hyperlink network. We study the LRP problem on a BrowseGraph from a large news provider, considering as subgraphs the aggregations of browsing traces of users coming from different domains. We show that the distance between rankings can be accurately predicted based only on structural information of the local graph, being able to achieve an average rank correlation as high as 0.8. △ Less

Submitted 23 May, 2015; originally announced May 2015.

ACM Class: H.4; E.1

arXiv:1407.7930 [pdf, ps, other]

doi 10.4204/EPTCS.159.4

Entity-Linking via Graph-Distance Minimization

Authors: Roi Blanco, Paolo Boldi, Andrea Marino

Abstract: Entity-linking is a natural-language-processing task that consists in identifying the entities mentioned in a piece of text, linking each to an appropriate item in some knowledge base; when the knowledge base is Wikipedia, the problem comes to be known as wikification (in this case, items are wikipedia articles). One instance of entity-linking can be formalized as an optimization problem on the un… ▽ More Entity-linking is a natural-language-processing task that consists in identifying the entities mentioned in a piece of text, linking each to an appropriate item in some knowledge base; when the knowledge base is Wikipedia, the problem comes to be known as wikification (in this case, items are wikipedia articles). One instance of entity-linking can be formalized as an optimization problem on the underlying concept graph, where the quantity to be optimized is the average distance between chosen items. Inspired by this application, we define a new graph problem which is a natural variant of the Maximum Capacity Representative Set. We prove that our problem is NP-hard for general graphs; nonetheless, under some restrictive assumptions, it turns out to be solvable in linear time. For the general case, we propose two heuristics: one tries to enforce the above assumptions and another one is based on the notion of hitting distance; we show experimentally how these approaches perform with respect to some baselines on a real-world dataset. △ Less

Submitted 29 July, 2014; originally announced July 2014.

Comments: In Proceedings GRAPHITE 2014, arXiv:1407.7671. The second and third authors were supported by the EU-FET grant NADINE (GA 288956)

ACM Class: G.2.2; G.2.3; F.2.m

Journal ref: EPTCS 159, 2014, pp. 30-43

arXiv:1211.6496 [pdf, other]

TwitterPaul: Extracting and Aggregating Twitter Predictions

Authors: Naushad UzZaman, Roi Blanco, Michael Matthews

Abstract: This paper introduces TwitterPaul, a system designed to make use of Social Media data to help to predict game outcomes for the 2010 FIFA World Cup tournament. To this end, we extracted over 538K mentions to football games from a large sample of tweets that occurred during the World Cup, and we classified into different types with a precision of up to 88%. The different mentions were aggregated in… ▽ More This paper introduces TwitterPaul, a system designed to make use of Social Media data to help to predict game outcomes for the 2010 FIFA World Cup tournament. To this end, we extracted over 538K mentions to football games from a large sample of tweets that occurred during the World Cup, and we classified into different types with a precision of up to 88%. The different mentions were aggregated in order to make predictions about the outcomes of the actual games. We attempt to learn which Twitter users are accurate predictors and explore several techniques in order to exploit this information to make more accurate predictions. We compare our results to strong baselines and against the betting line (prediction market) and found that the quality of extractions is more important than the quantity, suggesting that high precision methods working on a medium-sized dataset are preferable over low precision methods that use a larger amount of data. Finally, by aggregating some classes of predictions, the system performance is close to the one of the betting line. Furthermore, we believe that this domain independent framework can help to predict other sports, elections, product release dates and other future events that people talk about in social media. △ Less

Submitted 30 November, 2012; v1 submitted 27 November, 2012; originally announced November 2012.

Comments: Check out the blog post with a summary and Prediction Retrieval information here: http://bitly.com/TwitterPaul

Showing 1–20 of 20 results for author: Blanco, R