-
SECOMP: Formally Secure Compilation of Compartmentalized C Programs
Authors:
Jérémy Thibault,
Roberto Blanco,
Dongjae Lee,
Sven Argo,
Arthur Azevedo de Amorim,
Aïna Linn Georges,
Catalin Hritcu,
Andrew Tolmach
Abstract:
Undefined behavior in C often causes devastating security vulnerabilities. One practical mitigation is compartmentalization, which allows developers to structure large programs into mutually distrustful compartments with clearly specified privileges and interactions. In this paper we introduce SECOMP, a compiler for compartmentalized C code that comes with machine-checked proofs guaranteeing that…
▽ More
Undefined behavior in C often causes devastating security vulnerabilities. One practical mitigation is compartmentalization, which allows developers to structure large programs into mutually distrustful compartments with clearly specified privileges and interactions. In this paper we introduce SECOMP, a compiler for compartmentalized C code that comes with machine-checked proofs guaranteeing that the scope of undefined behavior is restricted to the compartments that encounter it and become dynamically compromised. These guarantees are formalized as the preservation of safety properties against adversarial contexts, a secure compilation criterion similar to full abstraction, and this is the first time such a strong criterion is proven for a mainstream programming language. To achieve this we extend the languages of the CompCert verified C compiler with isolated compartments that can only interact via procedure calls and returns, as specified by cross-compartment interfaces. We adapt the passes and optimizations of CompCert as well as their correctness proofs to this compartment-aware setting. We then use compiler correctness as an ingredient in a larger secure compilation proof that involves several proof engineering novelties, needed to scale formally secure compilation up to a C compiler.
△ Less
Submitted 1 July, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Mercury: A modeling, simulation, and optimization framework for data stream-oriented IoT applications
Authors:
Román Cárdenas,
Patricia Arroba,
Roberto Blanco,
Pedro Malagón,
José L. Risco-Martín,
José M. Moya
Abstract:
The Internet of Things is transforming our society by monitoring users and infrastructures' behavior to enable new services that will improve life quality and resource management. These applications require a vast amount of localized information to be processed in real-time so, the deployment of new fog computing infrastructures that bring computing closer to the data sources is a major concern. I…
▽ More
The Internet of Things is transforming our society by monitoring users and infrastructures' behavior to enable new services that will improve life quality and resource management. These applications require a vast amount of localized information to be processed in real-time so, the deployment of new fog computing infrastructures that bring computing closer to the data sources is a major concern. In this context, we present Mercury, a Modeling, Simulation, and Optimization (M&S&O) framework to analyze the dimensioning and the dynamic operation of real-time fog computing scenarios. Our research proposes a location-aware solution that supports data stream analytics applications including FaaS-based computation offloading. Mercury implements a detailed structural and behavioral simulation model, providing fine-grained simulation outputs, and is described using the Discrete Event System Specification (DEVS) mathematical formalism, hel** to validate the model's implementation. Finally, we present a case study using real traces from a driver assistance scenario, offering a detailed comparison with other state-of-the-art simulators.
△ Less
Submitted 2 November, 2023;
originally announced December 2023.
-
Bias Beyond English: Counterfactual Tests for Bias in Sentiment Analysis in Four Languages
Authors:
Seraphina Goldfarb-Tarrant,
Adam Lopez,
Roi Blanco,
Diego Marcheggiani
Abstract:
Sentiment analysis (SA) systems are used in many products and hundreds of languages. Gender and racial biases are well-studied in English SA systems, but understudied in other languages, with few resources for such studies. To remedy this, we build a counterfactual evaluation corpus for gender and racial/migrant bias in four languages. We demonstrate its usefulness by answering a simple but import…
▽ More
Sentiment analysis (SA) systems are used in many products and hundreds of languages. Gender and racial biases are well-studied in English SA systems, but understudied in other languages, with few resources for such studies. To remedy this, we build a counterfactual evaluation corpus for gender and racial/migrant bias in four languages. We demonstrate its usefulness by answering a simple but important question that an engineer might need to answer when deploying a system: What biases do systems import from pre-trained models when compared to a baseline with no pre-training? Our evaluation corpus, by virtue of being counterfactual, not only reveals which models have less bias, but also pinpoints changes in model bias behaviour, which enables more targeted mitigation strategies. We release our code and evaluation corpora to facilitate future research.
△ Less
Submitted 19 May, 2023;
originally announced May 2023.
-
Test adequacy evaluation for the user-database interaction: a specification-based approach
Authors:
Raquel Blanco,
Javier Tuya,
Ruben V. Seco
Abstract:
Testing a database application is a challenging process where both the database and the user interaction have to be considered in the design of test cases. This paper describes a specification-based approach to guide the design of test inputs (both the test database and the user inputs) for a database application and to automatically evaluate the test adequacy. First, the system specification of t…
▽ More
Testing a database application is a challenging process where both the database and the user interaction have to be considered in the design of test cases. This paper describes a specification-based approach to guide the design of test inputs (both the test database and the user inputs) for a database application and to automatically evaluate the test adequacy. First, the system specification of the application is modelled: (1) the structure of the database and the user interface are represented in a single model, called Integrated Data Model (IDM), (2) the functional requirements are expressed as a set of business rules, written in terms of the IDM. Then, a MCDC-based criterion is applied over the business rules to automatically derive the situations of interest to be tested (test requirements), which guide the design of the test inputs. Finally, the adequacy of these test inputs is automatically evaluated to determine whether the test requirements are covered. The approach has been applied to the TPC-C benchmark. The results show that it allows designing test cases that are able to detect interesting faults which were located in the procedural code of the implementation.
△ Less
Submitted 25 April, 2023;
originally announced April 2023.
-
Can gamification help in software testing education? Findings from an empirical study
Authors:
Raquel Blanco,
Manuel Trinidad,
Maria Jose Suarez-Cabal,
Alejandro Calderon,
Mercedes Ruiz,
Javier Tuya
Abstract:
Software testing is an essential knowledge area required by industry for software engineers. However, software engineering students often consider testing less appealing than designing or coding. Consequently, it is difficult to engage students to create effective tests. To encourage students, we explored the use of gamification and investigated whether this technique can help to improve the engag…
▽ More
Software testing is an essential knowledge area required by industry for software engineers. However, software engineering students often consider testing less appealing than designing or coding. Consequently, it is difficult to engage students to create effective tests. To encourage students, we explored the use of gamification and investigated whether this technique can help to improve the engagement and performance of software testing students. We conducted a controlled experiment to compare the engagement and performance of two groups of students that took an undergraduate software testing course in different academic years. The experimental group is formed by 135 students from the gamified course whereas the control group is formed by 100 students from the non-gamified course. The data collected were statistically analyzed to answer the research questions of this study. The results show that the students that participated in the gamification experience were more engaged and achieved a better performance. As an additional finding, the analysis of the results reveals that a key aspect to succeed is the gamification experience design. It is important to distribute the motivating stimulus provided by the gamification throughout the whole experience to engage students until the end. Given these results, we plan to readjust the gamification experience design to increase student engagement in the last stage of the experience, as well as to conduct a longitudinal study to evaluate the effects of gamification.
△ Less
Submitted 24 April, 2023;
originally announced April 2023.
-
Can gamification reduce the burden of self-reporting in mHealth applications? A feasibility study using machine learning from smartwatch data to estimate cognitive load
Authors:
Michal K. Grzeszczyk,
Paulina Adamczyk,
Sylwia Marek,
Ryszard Pręcikowski,
Maciej Kuś,
M. Patrycja Lelujko,
Rosmary Blanco,
Tomasz Trzciński,
Arkadiusz Sitek,
Maciej Malawski,
Aneta Lisowska
Abstract:
The effectiveness of digital treatments can be measured by requiring patients to self-report their state through applications, however, it can be overwhelming and causes disengagement. We conduct a study to explore the impact of gamification on self-reporting. Our approach involves the creation of a system to assess cognitive load (CL) through the analysis of photoplethysmography (PPG) signals. Th…
▽ More
The effectiveness of digital treatments can be measured by requiring patients to self-report their state through applications, however, it can be overwhelming and causes disengagement. We conduct a study to explore the impact of gamification on self-reporting. Our approach involves the creation of a system to assess cognitive load (CL) through the analysis of photoplethysmography (PPG) signals. The data from 11 participants is utilized to train a machine learning model to detect CL. Subsequently, we create two versions of surveys: a gamified and a traditional one. We estimate the CL experienced by other participants (13) while completing surveys. We find that CL detector performance can be enhanced via pre-training on stress detection tasks. For 10 out of 13 participants, a personalized CL detector can achieve an F1 score above 0.7. We find no difference between the gamified and non-gamified surveys in terms of CL but participants prefer the gamified version.
△ Less
Submitted 21 December, 2023; v1 submitted 7 February, 2023;
originally announced February 2023.
-
Performance-Efficiency Trade-Offs in Adapting Language Models to Text Classification Tasks
Authors:
Laura Aina,
Nikos Voskarides,
Roi Blanco
Abstract:
Pre-trained language models (LMs) obtain state-of-the-art performance when adapted to text classification tasks. However, when using such models in real-world applications, efficiency considerations are paramount. In this paper, we study how different training procedures that adapt LMs to text classification perform, as we vary model and train set size. More specifically, we compare standard fine-…
▽ More
Pre-trained language models (LMs) obtain state-of-the-art performance when adapted to text classification tasks. However, when using such models in real-world applications, efficiency considerations are paramount. In this paper, we study how different training procedures that adapt LMs to text classification perform, as we vary model and train set size. More specifically, we compare standard fine-tuning, prompting, and knowledge distillation (KD) when the teacher was trained with either fine-tuning or prompting. Our findings suggest that even though fine-tuning and prompting work well to train large LMs on large train sets, there are more efficient alternatives that can reduce compute or data cost. Interestingly, we find that prompting combined with KD can reduce compute and data cost at the same time.
△ Less
Submitted 21 October, 2022;
originally announced October 2022.
-
SecurePtrs: Proving Secure Compilation with Data-Flow Back-Translation and Turn-Taking Simulation
Authors:
Akram El-Korashy,
Roberto Blanco,
Jérémy Thibault,
Adrien Durier,
Deepak Garg,
Catalin Hritcu
Abstract:
Proving secure compilation of partial programs typically requires back-translating an attack against the compiled program to an attack against the source program. To prove back-translation, one can syntactically translate the target attacker to a source one -- i.e., syntax-directed back-translation -- or show that the interaction traces of the target attacker can also be emitted by source attacker…
▽ More
Proving secure compilation of partial programs typically requires back-translating an attack against the compiled program to an attack against the source program. To prove back-translation, one can syntactically translate the target attacker to a source one -- i.e., syntax-directed back-translation -- or show that the interaction traces of the target attacker can also be emitted by source attackers -- i.e., trace-directed back-translation.
Syntax-directed back-translation is not suitable when the target attacker may use unstructured control flow that the source language cannot directly represent. Trace-directed back-translation works with such syntactic dissimilarity because only the external interactions of the target attacker have to be mimicked in the source, not its internal control flow. Revealing only external interactions is, however, inconvenient when sharing memory via unforgeable pointers, since information about shared pointers stashed in private memory is not present on the trace. This made prior proofs unnecessarily complex, since the generated attacker had to instead stash all reachable pointers.
In this work, we introduce more informative *data-flow traces*, combining the best of syntax- and trace-directed back-translation in a simpler technique that handles both syntactic dissimilarity and memory sharing well, and that is proved correct in Coq. Additionally, we develop a novel *turn-taking simulation* relation and use it to prove a recomposition lemma, which is key to reusing compiler correctness in such secure compilation proofs. We are the first to mechanize such a recomposition lemma in the presence of memory sharing.
We use these two innovations in a secure compilation proof for a code generation compiler pass between a source language with structured control flow and a target language with unstructured control flow, both with safe pointers and components.
△ Less
Submitted 3 June, 2022; v1 submitted 4 October, 2021;
originally announced October 2021.
-
Formalizing Stack Safety as a Security Property
Authors:
Sean Noble Anderson,
Roberto Blanco,
Leonidas Lampropoulos,
Benjamin C. Pierce,
Andrew Tolmach
Abstract:
The term stack safety is used to describe a variety of compiler, run-time, and hardware mechanisms for protecting stack memory. Unlike "the heap," the ISA-level stack does not correspond to a single high-level language concept: different compilers use it in different ways to support procedural and functional abstraction mechanisms from a wide range of languages. This protean nature makes it diffic…
▽ More
The term stack safety is used to describe a variety of compiler, run-time, and hardware mechanisms for protecting stack memory. Unlike "the heap," the ISA-level stack does not correspond to a single high-level language concept: different compilers use it in different ways to support procedural and functional abstraction mechanisms from a wide range of languages. This protean nature makes it difficult to nail down what it means to correctly enforce stack safety.
We propose a new formal characterization of stack safety using concepts from language-based security. Rather than treating stack safety as a monolithic property, we decompose it into an integrity property and a confidentiality property for each of the caller and the callee, plus a control-flow property: five properties in all. This formulation is motivated by a particular class of enforcement mechanisms, the "lazy" stack safety micro-policies studied by Roessler and DeHon, which permit functions to write into one another's frames but taint the changed locations so that the frame's owner cannot access them. No existing characterization of stack safety captures this style of safety; we capture it here by stating our properties in terms of the observable behavior of the system.
Our properties go further than previous formal definitions of stack safety, supporting caller- and callee-saved registers, arguments passed on the stack, and tail-call elimination. We validate the properties by using them to distinguish between correct and incorrect implementations of Roessler and DeHon's micro-policies using property-based random testing. Our test harness successfully identifies several broken variants, including Roessler and DeHon's lazy policy; a repaired version of their policy passes our tests.
△ Less
Submitted 21 March, 2024; v1 submitted 2 May, 2021;
originally announced May 2021.
-
BookQA: Stories of Challenges and Opportunities
Authors:
Stefanos Angelidis,
Lea Frermann,
Diego Marcheggiani,
Roi Blanco,
Lluís Màrquez
Abstract:
We present a system for answering questions based on the full text of books (BookQA), which first selects book passages given a question at hand, and then uses a memory network to reason and predict an answer. To improve generalization, we pretrain our memory network using artificial questions generated from book sentences. We experiment with the recently published NarrativeQA corpus, on the subse…
▽ More
We present a system for answering questions based on the full text of books (BookQA), which first selects book passages given a question at hand, and then uses a memory network to reason and predict an answer. To improve generalization, we pretrain our memory network using artificial questions generated from book sentences. We experiment with the recently published NarrativeQA corpus, on the subset of Who questions, which expect book characters as answers. We experimentally show that BERT-based retrieval and pretraining improve over baseline results significantly. At the same time, we confirm that NarrativeQA is a highly challenging data set, and that there is need for novel research in order to achieve high-precision BookQA results. We analyze some of the bottlenecks of the current approach, and we argue that more research is needed on text representation, retrieval of relevant passages, and reasoning, including commonsense knowledge.
△ Less
Submitted 2 October, 2019;
originally announced October 2019.
-
Trace-Relating Compiler Correctness and Secure Compilation
Authors:
Carmine Abate,
Roberto Blanco,
Stefan Ciobaca,
Adrien Durier,
Deepak Garg,
Catalin Hritcu,
Marco Patrignani,
Éric Tanter,
Jérémy Thibault
Abstract:
Compiler correctness is, in its simplest form, defined as the inclusion of the set of traces of the compiled program into the set of traces of the original program, which is equivalent to the preservation of all trace properties. Here traces collect, for instance, the externally observable events of each execution. This definition requires, however, the set of traces of the source and target langu…
▽ More
Compiler correctness is, in its simplest form, defined as the inclusion of the set of traces of the compiled program into the set of traces of the original program, which is equivalent to the preservation of all trace properties. Here traces collect, for instance, the externally observable events of each execution. This definition requires, however, the set of traces of the source and target languages to be exactly the same, which is not the case when the languages are far apart or when observations are fine-grained. To overcome this issue, we study a generalized compiler correctness definition, which uses source and target traces drawn from potentially different sets and connected by an arbitrary relation. We set out to understand what guarantees this generalized compiler correctness definition gives us when instantiated with a non-trivial relation on traces. When this trace relation is not equality, it is no longer possible to preserve the trace properties of the source program unchanged. Instead, we provide a generic characterization of the target trace property ensured by correctly compiling a program that satisfies a given source property, and dually, of the source trace property one is required to show in order to obtain a certain target property for the compiled code. We show that this view on compiler correctness can naturally account for undefined behavior, resource exhaustion, different source and target values, side-channels, and various abstraction mismatches. Finally, we show that the same generalization also applies to many secure compilation definitions, which characterize the protection of a compiled program against linked adversarial code.
△ Less
Submitted 23 February, 2020; v1 submitted 11 July, 2019;
originally announced July 2019.
-
The use of blogs in the education field: A qualitative systematic review
Authors:
Carlos R. del Blanco,
Ivan García-Magariño
Abstract:
Blogs have become one of the most successful tools of the Web 2.0 because of their ease of use and the availability of open platforms. They have quickly spread in the education field thanks to the many attractive qualities that have been attributed to them, such as collaboration, communication, enhancing of professional writing, and the improvement of information-gathering skills. However, many of…
▽ More
Blogs have become one of the most successful tools of the Web 2.0 because of their ease of use and the availability of open platforms. They have quickly spread in the education field thanks to the many attractive qualities that have been attributed to them, such as collaboration, communication, enhancing of professional writing, and the improvement of information-gathering skills. However, many of the studies that have addressed this issue were not based on an empirical research, and therefore they are unreliable. On the other hand, the studies that do have conducted an empirical research have usually relied on participant self-reported data (surveys, interviews, and contents of blogs), which can significantly bias the positive results usually reported on the use of blogs. Another source of bias and inaccuracy in the reported results is that most of the studies lacked control group, i.e they do not follow an experimental design. The purpose of this review is to examine the current state of the studies related to the evaluation of the blog effects in the education field. The methods to select the studies and perform the corresponding analysis have followed a qualitative systematic approach. The selection has been restricted to empirical and peer-reviewed studies published between January 2011 and June 2013. The findings have been integrated and compared using the Grounded Theory, giving rise to a set of categories that have structured the results of the review.
△ Less
Submitted 12 October, 2018;
originally announced October 2018.
-
Journey Beyond Full Abstraction: Exploring Robust Property Preservation for Secure Compilation
Authors:
Carmine Abate,
Roberto Blanco,
Deepak Garg,
Catalin Hritcu,
Marco Patrignani,
Jérémy Thibault
Abstract:
(CROPPED TO FIT IN ARXIV'S SILLY LIMIT. SEE PDF FOR COMPLETE ABSTRACT.)
We are the first to thoroughly explore a large space of formal secure compilation criteria based on robust property preservation, i.e., the preservation of properties satisfied against arbitrary adversarial contexts. We study robustly preserving various classes of trace properties such as safety, of hyperproperties such as n…
▽ More
(CROPPED TO FIT IN ARXIV'S SILLY LIMIT. SEE PDF FOR COMPLETE ABSTRACT.)
We are the first to thoroughly explore a large space of formal secure compilation criteria based on robust property preservation, i.e., the preservation of properties satisfied against arbitrary adversarial contexts. We study robustly preserving various classes of trace properties such as safety, of hyperproperties such as noninterference, and of relational hyperproperties such as trace equivalence. This leads to many new secure compilation criteria, some of which are easier to practically achieve and prove than full abstraction, and some of which provide strictly stronger security guarantees. For each of the studied criteria we propose an equivalent "property-free" characterization that clarifies which proof techniques apply. For relational properties and hyperproperties, which relate the behaviors of multiple programs, our formal definitions of the property classes themselves are novel. We order our criteria by their relative strength and show several collapses and separation results. Finally, we adapt existing proof techniques to show that even the strongest of our secure compilation criteria, the robust preservation of all relational hyperproperties, is achievable for a simple translation from a statically typed to a dynamically typed language.
△ Less
Submitted 17 May, 2019; v1 submitted 12 July, 2018;
originally announced July 2018.
-
When Good Components Go Bad: Formally Secure Compilation Despite Dynamic Compromise
Authors:
Carmine Abate,
Arthur Azevedo de Amorim,
Roberto Blanco,
Ana Nora Evans,
Guglielmo Fachini,
Catalin Hritcu,
Théo Laurent,
Benjamin C. Pierce,
Marco Stronati,
Jérémy Thibault,
Andrew Tolmach
Abstract:
We propose a new formal criterion for evaluating secure compilation schemes for unsafe languages, expressing end-to-end security guarantees for software components that may become compromised after encountering undefined behavior---for example, by accessing an array out of bounds.
Our criterion is the first to model dynamic compromise in a system of mutually distrustful components with clearly s…
▽ More
We propose a new formal criterion for evaluating secure compilation schemes for unsafe languages, expressing end-to-end security guarantees for software components that may become compromised after encountering undefined behavior---for example, by accessing an array out of bounds.
Our criterion is the first to model dynamic compromise in a system of mutually distrustful components with clearly specified privileges. It articulates how each component should be protected from all the others---in particular, from components that have encountered undefined behavior and become compromised. Each component receives secure compilation guarantees---in particular, its internal invariants are protected from compromised components---up to the point when this component itself becomes compromised, after which we assume an attacker can take complete control and use this component's privileges to attack other components. More precisely, a secure compilation chain must ensure that a dynamically compromised component cannot break the safety properties of the system at the target level any more than an arbitrary attacker-controlled component (with the same interface and privileges, but without undefined behaviors) already could at the source level.
To illustrate the model, we construct a secure compilation chain for a small unsafe language with buffers, procedures, and components, targeting a simple abstract machine with built-in compartmentalization. We give a machine-checked proof in Coq that this compiler satisfies our secure compilation criterion. Finally, we show that the protection guarantees offered by the compartmentalized abstract machine can be achieved at the machine-code level using either software fault isolation or a tag-based reference monitor.
△ Less
Submitted 29 November, 2019; v1 submitted 2 February, 2018;
originally announced February 2018.
-
Efficient and Effective Tail Latency Minimization in Multi-Stage Retrieval Systems
Authors:
Joel Mackenzie,
J. Shane Culpepper,
Roi Blanco,
Matt Crane,
Charles L. A. Clarke,
Jimmy Lin
Abstract:
Scalable web search systems typically employ multi-stage retrieval architectures, where an initial stage generates a set of candidate documents that are then pruned and re-ranked. Since subsequent stages typically exploit a multitude of features of varying costs using machine-learned models, reducing the number of documents that are considered at each stage improves latency. In this work, we propo…
▽ More
Scalable web search systems typically employ multi-stage retrieval architectures, where an initial stage generates a set of candidate documents that are then pruned and re-ranked. Since subsequent stages typically exploit a multitude of features of varying costs using machine-learned models, reducing the number of documents that are considered at each stage improves latency. In this work, we propose and validate a unified framework that can be used to predict a wide range of performance-sensitive parameters which minimize effectiveness loss, while simultaneously minimizing query latency, across all stages of a multi-stage search architecture. Furthermore, our framework can be easily applied in large-scale IR systems, can be trained without explicitly requiring relevance judgments, and can target a variety of different efficiency-effectiveness trade-offs, making it well suited to a wide range of search scenarios. Our results show that we can reliably predict a number of different parameters on a per-query basis, while simultaneously detecting and minimizing the likelihood of tail-latency queries that exceed a pre-specified performance budget. As a proof of concept, we use the prediction framework to help alleviate the problem of tail-latency queries in early stage retrieval. On the standard ClueWeb09B collection and 31k queries, we show that our new hybrid system can reliably achieve a maximum query time of 200 ms with a 99.99% response time guarantee without a significant loss in overall effectiveness. The solutions presented are practical, and can easily be used in large-scale distributed search engine deployments with a small amount of additional overhead.
△ Less
Submitted 20 April, 2017; v1 submitted 12 April, 2017;
originally announced April 2017.
-
Part of Speech Based Term Weighting for Information Retrieval
Authors:
Christina Lioma,
Roi Blanco
Abstract:
Automatic language processing tools typically assign to terms so-called weights corresponding to the contribution of terms to information content. Traditionally, term weights are computed from lexical statistics, e.g., term frequencies. We propose a new type of term weight that is computed from part of speech (POS) n-gram statistics. The proposed POS-based term weight represents how informative a…
▽ More
Automatic language processing tools typically assign to terms so-called weights corresponding to the contribution of terms to information content. Traditionally, term weights are computed from lexical statistics, e.g., term frequencies. We propose a new type of term weight that is computed from part of speech (POS) n-gram statistics. The proposed POS-based term weight represents how informative a term is in general, based on the POS contexts in which it generally occurs in language. We suggest five different computations of POS-based term weights by extending existing statistical approximations of term information measures. We apply these POS-based term weights to information retrieval, by integrating them into the model that matches documents to queries. Experiments with two TREC collections and 300 queries, using TF-IDF & BM25 as baselines, show that integrating our POS-based term weights to retrieval always leads to gains (up to +33.7% from the baseline). Additional experiments with a different retrieval model as baseline (Language Model with Dirichlet priors smoothing) and our best performing POS-based term weight, show retrieval gains always and consistently across the whole smoothing range of the baseline.
△ Less
Submitted 5 April, 2017;
originally announced April 2017.
-
Proof Outlines as Proof Certificates: A System Description
Authors:
Roberto Blanco,
Dale Miller
Abstract:
We apply the foundational proof certificate (FPC) framework to the problem of designing high-level outlines of proofs. The FPC framework provides a means to formally define and check a wide range of proof evidence. A focused proof system is central to this framework and such a proof system provides an interesting approach to proof reconstruction during the process of proof checking (relying on a…
▽ More
We apply the foundational proof certificate (FPC) framework to the problem of designing high-level outlines of proofs. The FPC framework provides a means to formally define and check a wide range of proof evidence. A focused proof system is central to this framework and such a proof system provides an interesting approach to proof reconstruction during the process of proof checking (relying on an underlying logic programming implementation). Here, we illustrate how the FPC framework can be used to design proof outlines and then to exploit proof checkers as a means for expanding outlines into fully detailed proofs. In order to validate this approach to proof outlines, we have built the ACheck system that allows us to take a sequence of theorems and apply the proof outline "do the obvious induction and close the proof using previously proved lemmas".
△ Less
Submitted 13 November, 2015;
originally announced November 2015.
-
Local Ranking Problem on the BrowseGraph
Authors:
Michele Trevisiol,
Luca Maria Aiello,
Paolo Boldi,
Roi Blanco
Abstract:
The "Local Ranking Problem" (LRP) is related to the computation of a centrality-like rank on a local graph, where the scores of the nodes could significantly differ from the ones computed on the global graph. Previous work has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a graph where nodes are webpages and edges are browsing transitions. Recently, this graph has receive…
▽ More
The "Local Ranking Problem" (LRP) is related to the computation of a centrality-like rank on a local graph, where the scores of the nodes could significantly differ from the ones computed on the global graph. Previous work has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a graph where nodes are webpages and edges are browsing transitions. Recently, this graph has received more and more attention in many different tasks such as ranking, prediction and recommendation. However, a web-server has only the browsing traffic performed on its pages (local BrowseGraph) and, as a consequence, the local computation can lead to estimation errors, which hinders the increasing number of applications in the state of the art. Also, although the divergence between the local and global ranks has been measured, the possibility of estimating such divergence using only local knowledge has been mainly overlooked. These aspects are of great interest for online service providers who want to: (i) gauge their ability to correctly assess the importance of their resources only based on their local knowledge, and (ii) take into account real user browsing fluxes that better capture the actual user interest than the static hyperlink network. We study the LRP problem on a BrowseGraph from a large news provider, considering as subgraphs the aggregations of browsing traces of users coming from different domains. We show that the distance between rankings can be accurately predicted based only on structural information of the local graph, being able to achieve an average rank correlation as high as 0.8.
△ Less
Submitted 23 May, 2015;
originally announced May 2015.
-
Entity-Linking via Graph-Distance Minimization
Authors:
Roi Blanco,
Paolo Boldi,
Andrea Marino
Abstract:
Entity-linking is a natural-language-processing task that consists in identifying the entities mentioned in a piece of text, linking each to an appropriate item in some knowledge base; when the knowledge base is Wikipedia, the problem comes to be known as wikification (in this case, items are wikipedia articles). One instance of entity-linking can be formalized as an optimization problem on the un…
▽ More
Entity-linking is a natural-language-processing task that consists in identifying the entities mentioned in a piece of text, linking each to an appropriate item in some knowledge base; when the knowledge base is Wikipedia, the problem comes to be known as wikification (in this case, items are wikipedia articles). One instance of entity-linking can be formalized as an optimization problem on the underlying concept graph, where the quantity to be optimized is the average distance between chosen items. Inspired by this application, we define a new graph problem which is a natural variant of the Maximum Capacity Representative Set. We prove that our problem is NP-hard for general graphs; nonetheless, under some restrictive assumptions, it turns out to be solvable in linear time. For the general case, we propose two heuristics: one tries to enforce the above assumptions and another one is based on the notion of hitting distance; we show experimentally how these approaches perform with respect to some baselines on a real-world dataset.
△ Less
Submitted 29 July, 2014;
originally announced July 2014.
-
TwitterPaul: Extracting and Aggregating Twitter Predictions
Authors:
Naushad UzZaman,
Roi Blanco,
Michael Matthews
Abstract:
This paper introduces TwitterPaul, a system designed to make use of Social Media data to help to predict game outcomes for the 2010 FIFA World Cup tournament. To this end, we extracted over 538K mentions to football games from a large sample of tweets that occurred during the World Cup, and we classified into different types with a precision of up to 88%. The different mentions were aggregated in…
▽ More
This paper introduces TwitterPaul, a system designed to make use of Social Media data to help to predict game outcomes for the 2010 FIFA World Cup tournament. To this end, we extracted over 538K mentions to football games from a large sample of tweets that occurred during the World Cup, and we classified into different types with a precision of up to 88%. The different mentions were aggregated in order to make predictions about the outcomes of the actual games. We attempt to learn which Twitter users are accurate predictors and explore several techniques in order to exploit this information to make more accurate predictions. We compare our results to strong baselines and against the betting line (prediction market) and found that the quality of extractions is more important than the quantity, suggesting that high precision methods working on a medium-sized dataset are preferable over low precision methods that use a larger amount of data. Finally, by aggregating some classes of predictions, the system performance is close to the one of the betting line. Furthermore, we believe that this domain independent framework can help to predict other sports, elections, product release dates and other future events that people talk about in social media.
△ Less
Submitted 30 November, 2012; v1 submitted 27 November, 2012;
originally announced November 2012.