
A Probabilistic Approach to Personalize Type-based Facet Ranking for POI Suggestion

Authors: Esraa Ali, Annalina Caputo, Séamus Lawless, Owen Conlan

Abstract: Faceted Search Systems (FSS) have become one of the main search interfaces used in vertical search systems, offering users meaningful facets to refine their search query and narrow down the results quickly to find the intended search target. This work focuses on the problem of ranking type-based facets. In a structured information space, type-based facets (t-facets) indicate the category to which each object belongs. When they belong to a large multi-level taxonomy, it is desirable to rank them separately before ranking other facet groups. This helps the searcher in filtering the results according to their type first. This also makes it easier to rank the rest of the facets once the type of the intended search target is selected. Existing research employs the same ranking methods for different facet groups. In this research, we propose a two-step approach to personalize t-facet ranking. The first step assigns a relevance score to each individual leaf-node t-facet. The score is generated using probabilistic models and it reflects t-facet relevance to the query and the user profile. In the second step, this score is used to re-order and select the sub-tree to present to the user. We investigate the usefulness of the proposed method to a Point Of Interest (POI) suggestion task. Our evaluation aims at capturing the user effort required to fulfil her search needs by using the ranked facets. The proposed approach achieved better results than other existing personalized baselines.

Submitted 10 May, 2021; originally announced May 2021.

Comments: Accepted at ICWE 2021
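
The two-step idea in the abstract above (score leaf-node t-facets, then use those scores to re-order and select sub-trees) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the leaf scores are given as plain numbers rather than produced by the paper's probabilistic models, the function and parameter names (`rank_subtrees`, `parent_of`, `top_k`) are hypothetical, and aggregating a sub-tree by its best leaf score is one possible choice, not necessarily the authors' strategy.

```python
from collections import defaultdict


def rank_subtrees(leaf_scores, parent_of, top_k):
    """Group scored leaf t-facets under their parent and rank the parents.

    leaf_scores: dict mapping leaf name -> relevance score (assumed given).
    parent_of:   dict mapping leaf name -> parent facet name.
    top_k:       number of sub-trees to keep.
    Aggregation by max leaf score is an illustrative assumption.
    """
    children = defaultdict(list)
    for leaf, score in leaf_scores.items():
        children[parent_of[leaf]].append((leaf, score))

    ranked = []
    for parent, leaves in children.items():
        # Order the leaves inside each sub-tree by descending score.
        leaves.sort(key=lambda pair: pair[1], reverse=True)
        best = leaves[0][1]
        ranked.append((parent, best, [leaf for leaf, _ in leaves]))

    # Order the sub-trees by their aggregated score and keep the top ones.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


leaf_scores = {"sushi": 0.9, "pizza": 0.4, "museum": 0.6,
               "gallery": 0.2, "park": 0.1}
parent_of = {"sushi": "food", "pizza": "food", "museum": "arts",
             "gallery": "arts", "park": "outdoors"}
top = rank_subtrees(leaf_scores, parent_of, top_k=2)
```

Other aggregation choices, such as the mean or sum of leaf scores, would change which sub-trees are selected.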

arXiv:1912.07441 [pdf]

Multi-stream Data Analytics for Enhanced Performance Prediction in Fantasy Football

Authors: Nicholas Bonello, Joeran Beel, Seamus Lawless, Jeremy Debattista

Abstract: Fantasy Premier League (FPL) performance predictors tend to base their algorithms purely on historical statistical data. The main problem with this approach is that external factors such as injuries, managerial decisions and other tournament match statistics can never be factored into the final predictions. In this paper, we present a new method for predicting future player performances by automatically incorporating human feedback into our model. Through statistical data analysis such as previous performances, upcoming fixture difficulty ratings, betting market analysis, opinions of the general public and experts alike via social media and web articles, we can improve our understanding of who is likely to perform well in upcoming matches. When tested on the English Premier League 2018/19 season, the model outperformed regular statistical predictors by over 300 points, an average of 11 points per week, ranking within the top 0.5% of players (rank 30,000 out of over 6.5 million players).

Submitted 16 December, 2019; originally announced December 2019.

Journal ref: 27th AIAI Irish Conference on Artificial Intelligence and Cognitive Science. 2019

arXiv:1712.08360 [pdf]

Triple Scoring Using Paragraph Vector - The Gailan Triple Scorer at WSDM Cup 2017

Authors: Esraa Ali, Annalina Caputo, Séamus Lawless

Abstract: In this paper we describe our solution to the WSDM Cup 2017 Triple Scoring task. Our approach generates a relevance score based on the textual description of the triple's subject and value (Object). It measures how similar (related) the text description of the subject is to the text description of its values. The generated similarity score can then be used to rank the multiple values associated with this subject. We utilize the Paragraph Vector algorithm to represent the unstructured text as fixed-length vectors. The fixed-length representation is then employed to calculate the similarity (relevance) score between the subject and its multiple values. Our experimental results have shown that the suggested approach is promising and suitable to solve this problem.

Submitted 22 December, 2017; originally announced December 2017.

Comments: Triple Scorer at WSDM Cup 2017, see arXiv:1712.08081

ACM Class: H.3
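
The ranking step described in the Triple Scoring abstract above, scoring each value by the similarity between its fixed-length vector and the subject's, can be sketched as follows. This is a minimal illustration under stated assumptions: the vectors are supplied by hand rather than learned with Paragraph Vector, cosine similarity is assumed as the similarity measure (the abstract does not name one), and `cosine` and `rank_values` are hypothetical helper names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_values(subject_vec, value_vecs):
    """Rank a subject's values by similarity to the subject, best first.

    subject_vec: vector for the subject's text description (assumed given).
    value_vecs:  dict mapping value name -> vector for its description.
    """
    scored = [(name, cosine(subject_vec, vec))
              for name, vec in value_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy two-dimensional vectors standing in for learned paragraph vectors.
subject = [1.0, 0.0]
values = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_values(subject, values)
```

In practice the vectors would come from a trained document-embedding model, and the raw similarity would still need mapping onto whatever score range the task expects.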

arXiv:1511.08411 [pdf]

doi 10.1109/ICDMW.2015.6

OntoSeg: a Novel Approach to Text Segmentation using Ontological Similarity

Authors: Mostafa Bayomi, Killian Levacher, M. Rami Ghorab, Séamus Lawless

Abstract: Text segmentation (TS) aims at dividing long text into coherent segments which reflect the subtopic structure of the text. It is beneficial to many natural language processing tasks, such as Information Retrieval (IR) and document summarisation. Current approaches to text segmentation are similar in that they all use word-frequency metrics to measure the similarity between two regions of text, so that a document is segmented based on the lexical cohesion between its words. Various NLP tasks are now moving towards the semantic web and ontologies, such as ontology-based IR systems, to capture the conceptualizations associated with user needs and contents. Text segmentation based on lexical cohesion between words is hence not sufficient anymore for such tasks. This paper proposes OntoSeg, a novel approach to text segmentation based on the ontological similarity between text blocks. The proposed method uses ontological similarity to explore conceptual relations between text segments and a Hierarchical Agglomerative Clustering (HAC) algorithm to represent the text as a tree-like hierarchy that is conceptually structured. The rich structure of the created tree further allows the segmentation of text in a linear fashion at various levels of granularity. The proposed method was evaluated on a well-known dataset, and the results show that using ontological similarity in text segmentation is very promising. We also enhance the proposed method by combining ontological similarity with lexical similarity, and the results show an enhancement of the segmentation quality.

Submitted 26 November, 2015; originally announced November 2015.

Comments: 10 pages, IEEE ICDMW 2015 (SENTIRE Workshop)
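
The OntoSeg abstract above describes agglomerative clustering over text blocks to obtain linear segments at a chosen granularity. A rough sketch of that general idea follows, with loud assumptions: Jaccard overlap of lowercase word sets stands in for ontological similarity (which would require an ontology lookup), only adjacent blocks are merged so that segments stay contiguous, and `jaccard` and `segment` are hypothetical names. It is not the authors' algorithm.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; a stand-in for ontological similarity."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def segment(blocks, num_segments):
    """Merge the most similar adjacent blocks until num_segments remain.

    blocks:       list of text blocks in document order.
    num_segments: desired number of contiguous segments (granularity).
    Returns a list of segments, each a list of the original blocks.
    """
    segments = [[block] for block in blocks]
    concepts = [set(block.lower().split()) for block in blocks]

    while len(segments) > max(num_segments, 1):
        # Pick the adjacent pair with the highest similarity (first on ties).
        best = max(range(len(segments) - 1),
                   key=lambda i: jaccard(concepts[i], concepts[i + 1]))
        # Replace the pair with its merged segment and merged concept set.
        segments[best:best + 2] = [segments[best] + segments[best + 1]]
        concepts[best:best + 2] = [concepts[best] | concepts[best + 1]]

    return segments


blocks = ["cats purr", "cats sleep", "stocks rise", "stocks fall"]
result = segment(blocks, num_segments=2)
```

Stopping the merge loop at different values of `num_segments` corresponds to cutting the hierarchy at different levels of granularity.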

Showing 1–4 of 4 results for author: Lawless, S