Search | arXiv e-print repository

HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments

Authors: Akari Asai, Sara Evensen, Behzad Golshan, Alon Halevy, Vivian Li, Andrei Lopatenko, Daniela Stepanov, Yoshihiko Suhara, Wang-Chiew Tan, Yinzhan Xu

Abstract: The science of happiness is an area of positive psychology concerned with understanding what behaviors make people happy in a sustainable fashion. Recently, there has been interest in developing technologies that help incorporate the findings of the science of happiness into users' daily lives by steering them towards behaviors that increase happiness. With the goal of building technology that can understand how people express their happy moments in text, we crowd-sourced HappyDB, a corpus of 100,000 happy moments that we make publicly available. This paper describes HappyDB and its properties, and outlines several important NLP problems that can be studied with the help of the corpus. We also apply several state-of-the-art analysis techniques to analyze HappyDB. Our results demonstrate the need for deeper NLP techniques to be developed, which makes HappyDB an exciting resource for follow-on research.

Submitted 25 January, 2018; v1 submitted 23 January, 2018; originally announced January 2018.

Comments: Typos fixed

arXiv:1708.00481 [pdf, other]

A Lightweight Front-end Tool for Interactive Entity Population

Authors: Hidekazu Oiwa, Yoshihiko Suhara, Jiyu Komiya, Andrei Lopatenko

Abstract: Entity population, a task of collecting entities that belong to a particular category, has attracted attention from vertical domains. There is still a high demand for creating entity dictionaries in vertical domains, which are not covered by existing knowledge bases. We develop a lightweight front-end tool for facilitating interactive entity population. We implement key components necessary for effective interactive entity population: 1) GUI-based dashboards to quickly modify an entity dictionary, and 2) entity highlighting on documents for quickly viewing the current progress. We aim to reduce user cost from beginning to end, including package installation and maintenance. The implementation enables users to use this tool on their web browsers without any additional packages; users can focus on their missions to create entity dictionaries. Moreover, an entity expansion module is implemented as external APIs. This design makes it easy to continuously improve interactive entity population pipelines. We are making our demo publicly available (http://bit.ly/luwak-demo).

Submitted 1 August, 2017; originally announced August 2017.

Comments: ICML Workshop on Interactive Machine Learning

arXiv:1605.07159 [pdf, other]

Complexity of Consistent Query Answering in Databases under Cardinality-Based and Incremental Repair Semantics (extended version)

Authors: Andrei Lopatenko, Leopoldo Bertossi

Abstract: A database D may be inconsistent wrt a given set IC of integrity constraints. Consistent Query Answering (CQA) is the problem of computing from D the answers to a query that are consistent wrt IC. Consistent answers are invariant under all the repairs of D, i.e. the consistent instances that minimally depart from D. Three classes of repair have been considered in the literature: those that minimize set-theoretically the set of tuples in the symmetric difference; those that minimize the changes of attribute values; and those that minimize the cardinality of the set of tuples in the symmetric difference. The latter class has not been systematically investigated. In this paper we obtain algorithmic and complexity theoretic results for CQA under this cardinality-based repair semantics. We do this in the usual, static setting, but also in a dynamic framework where a consistent database is affected by a sequence of updates, which may make it inconsistent. We also establish comparative results with the other two kinds of repairs in the dynamic case.

Submitted 23 May, 2016; originally announced May 2016.

Comments: This paper, without the proofs provided here, arXiv:cs/0604002, appeared in the Proc. of ICDT 2007. This version contains all the proofs in correlation with the results reported in the ICDT paper (as opposed to a previous arXiv CoRR posting related to the same paper). One proof was corrected, and a corollary was added.

arXiv:cs/0604002 [pdf, ps, other]

Complexity of Consistent Query Answering in Databases under Cardinality-Based and Incremental Repair Semantics

Authors: Andrei Lopatenko, Leopoldo Bertossi

Abstract: Consistent Query Answering (CQA) is the problem of computing from a database the answers to a query that are consistent with respect to certain integrity constraints that the database, as a whole, may fail to satisfy. Consistent answers have been characterized as those that are invariant under certain minimal forms of restoration of the database consistency. We investigate algorithmic and complexity theoretic issues of CQA under database repairs that minimally depart (wrt the cardinality of the symmetric difference) from the original database. We obtain first tight complexity bounds. We also address the problem of incremental complexity of CQA, which naturally occurs when an originally consistent database becomes inconsistent after the execution of a sequence of update operations. Tight bounds on incremental complexity are provided for various semantics under denial constraints. Fixed parameter tractability is also investigated in this dynamic context, where the size of the update sequence becomes the relevant parameter.

Submitted 1 April, 2006; originally announced April 2006.

Comments: 26 pages, 2 figures

arXiv:cs/0503032 [pdf, ps, other]

Complexity and Approximation of Fixing Numerical Attributes in Databases Under Integrity Constraints

Authors: L. Bertossi, L. Bravo, E. Franconi, A. Lopatenko

Abstract: Consistent query answering is the problem of computing the answers from a database that are consistent with respect to certain integrity constraints that the database as a whole may fail to satisfy. Those answers are characterized as those that are invariant under minimal forms of restoring the consistency of the database. In this context, we study the problem of repairing databases by fixing integer numerical values at the attribute level with respect to denial and aggregation constraints. We introduce a quantitative definition of database fix, and investigate the complexity of several decision and optimization problems, including DFP, i.e. the existence of fixes within a given distance from the original instance, and CQA, i.e. deciding consistency of answers to aggregate conjunctive queries under different semantics. We provide sharp complexity bounds, identify relevant tractable cases, and introduce approximation algorithms for some of those that are intractable. More specifically, we obtain results like undecidability of existence of fixes for aggregation constraints; MAXSNP-hardness of DFP, but a good approximation algorithm for a relevant special case; and intractability but good approximation for CQA for aggregate queries for one database atom denials (plus built-ins).

Submitted 28 October, 2005; v1 submitted 14 March, 2005; originally announced March 2005.

Comments: 35 pages. Extended version of the camera ready version to appear in Proc. of the Databases Programming Languages Conference (DBPL 05), Springer LNCS volume 3774

arXiv:cs/0308013 [pdf, ps, other]

A Robust and Computational Characterisation of Peer-to-Peer Database Systems

Authors: Enrico Franconi, Gabriel Kuper, Andrei Lopatenko, Luciano Serafini

Abstract: In this paper we give a robust logical and computational characterisation of peer-to-peer database systems. We first define a precise model-theoretic semantics of a peer-to-peer system, which allows for local inconsistency handling. We then characterise the general computational properties for the problem of answering queries to such a peer-to-peer system. Finally, we devise tight complexity bounds and distributed procedures for the problem of answering queries in a few relevant special cases.

Submitted 6 August, 2003; originally announced August 2003.

Comments: 13 pages

ACM Class: H.2.4; H.2.5; C.2.4

Journal ref: "International Workshop On Databases, Information Systems and Peer-to-Peer Computing", 2003

arXiv:cs/0110026 [pdf]

Information retrieval in Current Research Information Systems

Authors: Andrei Lopatenko

Abstract: In this paper we describe the requirements for research information systems and the problems that arise in the development of such systems. We show which problems could be solved by using knowledge markup technologies. An ontology for research information systems is offered. An architecture for collecting research data and providing access to it is described.

Submitted 10 October, 2001; originally announced October 2001.

Comments: 8 pages, ontology description included, position paper at the Workshop on Knowledge Markup and Semantic Annotation at K-CAP'2001

ACM Class: H.3.3; H.3.4; H.3.7

arXiv:cs/0107035 [pdf]

Semantic Web Content Accessibility Guidelines for Current Research Information Systems (CRIS)

Authors: A. Lopatenko

Abstract: The most exciting challenge for CRIS is to create a service for research information which should be widespread, distributed and up to date like Google, but at the same time structured and trusted, with complex search and navigation similar to today's CRIS applications. The core technology for such a "new" CRIS is semantic web technology, which integrates database contents with HTML and XML web pages to be provided to the research-interested public. One possible way (at the moment the best) is to use RDF (Resource Description Framework), which is also recommended by the W3 consortium.

Submitted 29 July, 2001; originally announced July 2001.

Comments: 25 pages

ACM Class: D.2.12; E.2; H.2.4

Journal ref: Second Interim Report of Extension Centre, Vienna University of Technology, 2001

Showing 1–8 of 8 results for author: Lopatenko, A