
Showing 1–50 of 51 results for author: Bie, T

  1. arXiv:2406.12953  [pdf, other]

    cs.GR cs.HC cs.LG

    Pattern or Artifact? Interactively Exploring Embedding Quality with TRACE

    Authors: Edith Heiter, Liesbet Martens, Ruth Seurinck, Martin Guilliams, Tijl De Bie, Yvan Saeys, Jefrey Lijffijt

    Abstract: This paper presents TRACE, a tool to analyze the quality of 2D embeddings generated through dimensionality reduction techniques. Dimensionality reduction methods often prioritize preserving either local neighborhoods or global distances, but insights from visual structures can be misleading if the objective has not been achieved uniformly. TRACE addresses this challenge by providing a scalable and…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 4 pages, 3 figures, Accepted at ECML-PKDD 2024. For a demo video, see https://youtu.be/mtyFzXt51Jw. Code is available at https://github.com/aida-ugent/TRACE

  2. arXiv:2405.18941  [pdf, other]

    cs.IR cs.LG

    Content-Agnostic Moderation for Stance-Neutral Recommendation

    Authors: Nan Li, Bo Kang, Tijl De Bie

    Abstract: Personalized recommendation systems often drive users towards more extreme content, exacerbating opinion polarization. While (content-aware) moderation has been proposed to mitigate these effects, such approaches risk curtailing the freedom of speech and of information. To address this concern, we propose and explore the feasibility of \emph{content-agnostic} moderation as an alternative approach…

    Submitted 29 May, 2024; originally announced May 2024.

  3. Gaussian Embedding of Temporal Networks

    Authors: Raphaël Romero, Jefrey Lijffijt, Riccardo Rastelli, Marco Corneli, Tijl De Bie

    Abstract: Representing the nodes of continuous-time temporal graphs in a low-dimensional latent space has wide-ranging applications, from prediction to visualization. Yet, analyzing continuous-time relational data with timestamped interactions introduces unique challenges due to its sparsity. Merely embedding nodes as trajectories in the latent space overlooks this sparsity, emphasizing the need to quantify…

    Submitted 27 May, 2024; originally announced May 2024.

    Journal ref: IEEE Access ( Volume: 11, 2023) Page(s): 117971 - 117983

  4. Exploring the Performance of Continuous-Time Dynamic Link Prediction Algorithms

    Authors: Raphaël Romero, Maarten Buyl, Tijl De Bie, Jefrey Lijffijt

    Abstract: Dynamic Link Prediction (DLP) addresses the prediction of future links in evolving networks. However, accurately portraying the performance of DLP algorithms poses challenges that might impede progress in the field. Importantly, common evaluation pipelines usually calculate ranking or binary classification metrics, where the scores of observed interactions (positives) are compared with those of ra…

    Submitted 27 May, 2024; originally announced May 2024.

    Journal ref: Appl. Sci. 2024, 14(8), 3516

  5. arXiv:2404.17597  [pdf, other]

    cs.IR

    KamerRaad: Enhancing Information Retrieval in Belgian National Politics through Hierarchical Summarization and Conversational Interfaces

    Authors: Alexander Rogiers, Maarten Buyl, Bo Kang, Tijl De Bie

    Abstract: KamerRaad is an AI tool that leverages large language models to help citizens interactively engage with Belgian political information. The tool extracts and concisely summarizes key excerpts from parliamentary proceedings, followed by the potential for interaction based on generative AI that allows users to steadily build up their understanding. KamerRaad's front-end, built with Streamlit, facilit…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 4 pages, 2 figures, submitted to 2024 ECML-PKDD demo track

    ACM Class: H.3.3

  6. arXiv:2311.18486  [pdf, other]

    cs.SI cs.AI

    New Perspectives on the Evaluation of Link Prediction Algorithms for Dynamic Graphs

    Authors: Raphaël Romero, Tijl De Bie, Jefrey Lijffijt

    Abstract: There is a fast-growing body of research on predicting future links in dynamic networks, with many new algorithms. Some benchmark data exists, and performance evaluations commonly rely on comparing the scores of observed network events (positives) with those of randomly generated ones (negatives). These evaluation measures depend on both the predictive ability of the model and, crucially, the type…

    Submitted 30 November, 2023; originally announced November 2023.

  7. arXiv:2311.04542  [pdf, other]

    cs.IR cs.LG

    FEIR: Quantifying and Reducing Envy and Inferiority for Fair Recommendation of Limited Resources

    Authors: Nan Li, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: In settings such as e-recruitment and online dating, recommendation involves distributing limited opportunities, calling for novel approaches to quantify and enforce fairness. We introduce \emph{inferiority}, a novel (un)fairness measure quantifying a user's competitive disadvantage for their recommended items. Inferiority complements \emph{envy}, a fairness notion measuring preference for others'…

    Submitted 8 November, 2023; originally announced November 2023.

  8. arXiv:2310.17256  [pdf, other]

    cs.LG

    fairret: a Framework for Differentiable Fairness Regularization Terms

    Authors: Maarten Buyl, MaryBeth Defrance, Tijl De Bie

    Abstract: Current fairness toolkits in machine learning only admit a limited range of fairness definitions and have seen little integration with automatic differentiation libraries, despite the central role these libraries play in modern machine learning pipelines. We introduce a framework of fairness regularization terms (fairrets) which quantify bias as modular, flexible objectives that are easily integ…

    Submitted 10 April, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: Presented at ICLR 2024

  9. arXiv:2309.09708  [pdf, other]

    cs.CL cs.AI

    LLM4Jobs: Unsupervised occupation extraction and standardization leveraging Large Language Models

    Authors: Nan Li, Bo Kang, Tijl De Bie

    Abstract: Automated occupation extraction and standardization from free-text job postings and resumes are crucial for applications like job recommendation and labor market policy formation. This paper introduces LLM4Jobs, a novel unsupervised methodology that taps into the capabilities of large language models (LLMs) for occupation coding. LLM4Jobs uniquely harnesses both the natural language understanding…

    Submitted 19 September, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

  10. arXiv:2308.09516  [pdf, other]

    cs.IR

    ReCon: Reducing Congestion in Job Recommendation using Optimal Transport

    Authors: Yoosof Mashayekhi, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Recommender systems may suffer from congestion, meaning that there is an unequal distribution of the items in how often they are recommended. Some items may be recommended much more than others. Recommenders are increasingly used in domains where items have limited availability, such as the job market, where congestion is especially problematic: Recommending a vacancy -- for which typically only o…

    Submitted 18 August, 2023; originally announced August 2023.

  11. arXiv:2304.11060  [pdf, other]

    cs.CL cs.AI

    SkillGPT: a RESTful API service for skill extraction and standardization using a Large Language Model

    Authors: Nan Li, Bo Kang, Tijl De Bie

    Abstract: We present SkillGPT, a tool for skill extraction and standardization (SES) from free-style job descriptions and user profiles with an open-source Large Language Model (LLM) as backbone. Most previous methods for similar tasks either need supervision or rely on heavy data-preprocessing and feature engineering. Directly prompting the latest conversational LLM for standard skills, however, is slow, c…

    Submitted 18 October, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

  12. arXiv:2304.06057  [pdf, other]

    cs.CY cs.LG

    Maximal Fairness

    Authors: MaryBeth Defrance, Tijl De Bie

    Abstract: Fairness in AI has garnered quite some attention in research, and increasingly also in society. The so-called "Impossibility Theorem" has been one of the more striking research results with both theoretical and practical consequences, as it states that satisfying a certain combination of fairness measures is impossible. To date, this negative result has not yet been complemented with a positive on…

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted at FAccT 2023

  13. arXiv:2301.03338  [pdf, other]

    cs.LG

    Topologically Regularized Data Embeddings

    Authors: Edith Heiter, Robin Vandaele, Tijl De Bie, Yvan Saeys, Jefrey Lijffijt

    Abstract: Unsupervised representation learning methods are widely used for gaining insight into high-dimensional, unstructured, or structured data. In some cases, users may have prior topological knowledge about the data, such as a known cluster structure or the fact that the data is known to lie along a tree- or graph-structured topology. However, generic methods to ensure such structure is salient in the…

    Submitted 7 November, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

    Comments: 52 pages, preprint, under review

  14. Inherent Limitations of AI Fairness

    Authors: Maarten Buyl, Tijl De Bie

    Abstract: As the real-world impact of Artificial Intelligence (AI) systems has been steadily growing, so too have these systems come under increasing scrutiny. In response, the study of AI fairness has rapidly developed into a rich field of research with links to computer science, social science, law, and philosophy. Many technical solutions for measuring and achieving AI fairness have been proposed, yet th…

    Submitted 9 June, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: Accepted for publication at the Communications of the ACM

  15. arXiv:2211.04734  [pdf, other]

    cs.LG cs.CR q-bio.QM

    Framework Construction of an Adversarial Federated Transfer Learning Classifier

    Authors: Hang Yi, Tongxuan Bie, Tongjiang Yan

    Abstract: As the Internet grows in popularity, more and more classification jobs, such as IoT, finance industry and healthcare field, rely on mobile edge computing to advance machine learning. In the medical industry, however, good diagnostic accuracy necessitates the combination of large amounts of labeled data to train the model, which is difficult and expensive to collect and risks jeopardizing patients'…

    Submitted 9 November, 2022; originally announced November 2022.

  16. arXiv:2209.08064  [pdf, other]

    cs.LG cs.SI

    A Systematic Evaluation of Node Embedding Robustness

    Authors: Alexandru Mara, Jefrey Lijffijt, Stephan Günnemann, Tijl De Bie

    Abstract: Node embedding methods map network nodes to low dimensional vectors that can be subsequently used in a variety of downstream prediction tasks. The popularity of these methods has grown significantly in recent years, yet, their robustness to perturbations of the input data is still poorly understood. In this paper, we assess the empirical robustness of node embedding models to random and adversaria…

    Submitted 30 November, 2022; v1 submitted 16 September, 2022; originally announced September 2022.

  17. arXiv:2209.05112  [pdf, ps, other]

    cs.IR

    A challenge-based survey of e-recruitment recommendation systems

    Authors: Yoosof Mashayekhi, Nan Li, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: E-recruitment recommendation systems recommend jobs to job seekers and job seekers to recruiters. The recommendations are generated based on the suitability of the job seekers for the positions as well as the job seekers' and the recruiters' preferences. Therefore, e-recruitment recommendation systems could greatly impact job seekers' careers. Moreover, by affecting the hiring processes of the com…

    Submitted 20 October, 2023; v1 submitted 12 September, 2022; originally announced September 2022.

  18. arXiv:2203.07260  [pdf, other]

    cs.LG

    Graph-Survival: A Survival Analysis Framework for Machine Learning on Temporal Networks

    Authors: Raphaël Romero, Bo Kang, Tijl De Bie

    Abstract: Continuous time temporal networks are attracting increasing attention due to their omnipresence in real-world datasets and their manifold applications. While static network models have been successful in capturing static topological regularities, they often fail to model effects coming from the causal nature that explain the generation of networks. Exploiting the temporal aspect of networks has thus b…

    Submitted 15 March, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

  19. arXiv:2202.12270  [pdf, other]

    cs.CV cs.LG

    Evaluating Feature Attribution Methods in the Image Domain

    Authors: Arne Gevaert, Axel-Jan Rousseau, Thijs Becker, Dirk Valkenborg, Tijl De Bie, Yvan Saeys

    Abstract: Feature attribution maps are a popular approach to highlight the most important pixels in an image for a given prediction of a model. Despite a recent growth in popularity and available methods, little attention is given to the objective evaluation of such attribution maps. Building on previous work in this domain, we investigate existing metrics and propose new variants of metrics for the evaluat…

    Submitted 22 February, 2022; originally announced February 2022.

  20. arXiv:2202.03814  [pdf, other]

    cs.LG stat.ML

    Optimal Transport of Classifiers to Fairness

    Authors: Maarten Buyl, Tijl De Bie

    Abstract: In past work on fairness in machine learning, the focus has been on forcing the prediction of classifiers to have similar statistical properties for people of different demographics. To reduce the violation of these properties, fairness methods usually simply rescale the classifier scores, ignoring similarities and dissimilarities between members of different groups. Yet, we hypothesize that such…

    Submitted 29 November, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

  21. An Earth Mover's Distance Based Graph Distance Metric For Financial Statements

    Authors: Sander Noels, Benjamin Vandermarliere, Ken Bastiaensen, Tijl De Bie

    Abstract: Quantifying the similarity between a group of companies has proven to be useful for several purposes, including company benchmarking, fraud detection, and searching for investment opportunities. This exercise can be done using a variety of data sources, such as company activity data and financial data. However, ledger account data is widely available and is standardized to a large extent. Such led…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: 8 pages, 5 figures

    Journal ref: 2022 IEEE Symposium on Computational Intelligence for Financial Engineering and Economics (CIFEr)

  22. arXiv:2110.09193  [pdf, other]

    cs.LG stat.ML

    Topologically Regularized Data Embeddings

    Authors: Robin Vandaele, Bo Kang, Jefrey Lijffijt, Tijl De Bie, Yvan Saeys

    Abstract: Unsupervised feature learning often finds low-dimensional embeddings that capture the structure of complex data. For tasks for which prior expert topological knowledge is available, incorporating this into the learned representation may lead to higher quality embeddings. For example, this may help one to embed the data into a given number of clusters, or to accommodate for noise that prevents one…

    Submitted 7 March, 2022; v1 submitted 18 October, 2021; originally announced October 2021.

  23. arXiv:2109.10569  [pdf, other]

    cs.LG stat.ML

    The Curse Revisited: When are Distances Informative for the Ground Truth in Noisy High-Dimensional Data?

    Authors: Robin Vandaele, Bo Kang, Tijl De Bie, Yvan Saeys

    Abstract: Distances between data points are widely used in machine learning applications. Yet, when corrupted by noise, these distances -- and thus the models based upon them -- may lose their usefulness in high dimensions. Indeed, the small marginal effects of the noise may then accumulate quickly, shifting empirical closest and furthest neighbors away from the ground truth. In this paper, we exactly chara…

    Submitted 7 March, 2022; v1 submitted 22 September, 2021; originally announced September 2021.

  24. arXiv:2107.01936  [pdf, other]

    cs.SI cs.LG

    Adversarial Robustness of Probabilistic Network Embedding for Link Prediction

    Authors: Xi Chen, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: In today's networked society, many real-world problems can be formalized as predicting links in networks, such as Facebook friendship suggestions, e-commerce recommendations, and the prediction of scientific collaborations in citation networks. Increasingly often, the link prediction problem is tackled by means of network embedding methods, owing to their state-of-the-art performance. However, these m…

    Submitted 5 July, 2021; originally announced July 2021.

  25. arXiv:2105.05699  [pdf, other]

    cs.DB cs.LG

    Automating Data Science: Prospects and Challenges

    Authors: Tijl De Bie, Luc De Raedt, José Hernández-Orallo, Holger H. Hoos, Padhraic Smyth, Christopher K. I. Williams

    Abstract: Given the complexity of typical data science projects and the associated demand for human expertise, automation has the potential to transform the data science process. Key insights: * Automation in data science aims to facilitate and transform the work of data scientists, not to replace them. * Important parts of data science are already being automated, especially in the modeling stages, w…

    Submitted 28 February, 2022; v1 submitted 12 May, 2021; originally announced May 2021.

    Comments: 19 pages, 3 figures. v1 accepted for publication (April 2021) in Communications of the ACM

    Journal ref: Communications of the ACM 65(3) 76-87 (2022)

  26. arXiv:2103.01846  [pdf, other]

    cs.LG

    The KL-Divergence between a Graph Model and its Fair I-Projection as a Fairness Regularizer

    Authors: Maarten Buyl, Tijl De Bie

    Abstract: Learning and reasoning over graphs is increasingly done by means of probabilistic models, e.g. exponential random graph models, graph embedding models, and graph neural networks. When graphs are modeling relations between people, however, they will inevitably reflect biases, prejudices, and other forms of inequity and inequality. An important challenge is thus to design accurate graph modeling app…

    Submitted 27 June, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

  27. arXiv:2005.10701  [pdf, other]

    cs.SI cs.LG stat.ML

    CSNE: Conditional Signed Network Embedding

    Authors: Alexandru Mara, Yoosof Mashayekhi, Jefrey Lijffijt, Tijl De Bie

    Abstract: Signed networks are mathematical structures that encode positive and negative relations between entities such as friend/foe or trust/distrust. Recently, several papers studied the construction of useful low-dimensional representations (embeddings) of these networks for the prediction of missing relations or signs. Existing embedding methods for sign prediction generally enforce different notions o…

    Submitted 25 May, 2020; v1 submitted 19 May, 2020; originally announced May 2020.

  28. Benchmarking Network Embedding Models for Link Prediction: Are We Making Progress?

    Authors: Alexandru Mara, Jefrey Lijffijt, Tijl De Bie

    Abstract: Network embedding methods map a network's nodes to vectors in an embedding space, in such a way that these representations are useful for estimating some notion of similarity or proximity between pairs of nodes in the network. The quality of these node representations is then showcased through results of downstream prediction tasks. Commonly used benchmark tasks such as link prediction, however, p…

    Submitted 3 September, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

  29. arXiv:2002.11442  [pdf, other]

    cs.LG stat.ML

    DeBayes: a Bayesian Method for Debiasing Network Embeddings

    Authors: Maarten Buyl, Tijl De Bie

    Abstract: As machine learning algorithms are increasingly deployed for high-impact automated decision making, ethical and increasingly also legal standards demand that they treat all individuals fairly, without discrimination based on their age, gender, race or other sensitive traits. In recent years much progress has been made on ensuring fairness and reducing bias in standard machine learning settings. Ye…

    Submitted 30 April, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

  30. arXiv:2002.10127  [pdf, other]

    cs.LG cs.CL stat.ML

    FONDUE: A Framework for Node Disambiguation Using Network Embeddings

    Authors: Ahmad Mel, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Real-world data often presents itself in the form of a network. Examples include social networks, citation networks, biological networks, and knowledge graphs. In their simplest form, networks represent real-life entities (e.g. people, papers, proteins, concepts) as nodes, and describe them in terms of their relations with other entities by means of edges between these nodes. This can be valuable…

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: 11 pages, 3 figures

  31. Block-Approximated Exponential Random Graphs

    Authors: Florian Adriaens, Alexandru Mara, Jefrey Lijffijt, Tijl De Bie

    Abstract: An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs. By utilizing fast matrix block-approximation techniques, we propose an approximative framework to such non-trivial ERGs that result in dyadic independence (i.e., edge independent) distributions, while being able to meaningfully model both local information of the graph (e.g.,…

    Submitted 26 August, 2020; v1 submitted 14 February, 2020; originally announced February 2020.

    Comments: Accepted for DSAA 2020 conference

  32. arXiv:2002.01227  [pdf, other]

    cs.LG cs.IT stat.ML

    ALPINE: Active Link Prediction using Network Embedding

    Authors: Xi Chen, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Many real-world problems can be formalized as predicting links in a partially observed network. Examples include Facebook friendship suggestions, consumer-product recommendations, and the identification of hidden interactions between actors in a crime network. Several link prediction algorithms, notably those recently introduced using network embedding, are capable of doing this by just relying on…

    Submitted 4 February, 2020; originally announced February 2020.

  33. arXiv:2002.00793  [pdf, other]

    cs.SI cs.LG stat.ML

    Explainable Subgraphs with Surprising Densities: A Subgroup Discovery Approach

    Authors: Junning Deng, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: The connectivity structure of graphs is typically related to the attributes of the nodes. In social networks for example, the probability of a friendship between two people depends on their attributes, such as their age, address, and hobbies. The connectivity of a graph can thus possibly be understood in terms of patterns of the form 'the subgroup of individuals with properties X are often (or rar…

    Submitted 10 January, 2020; originally announced February 2020.

  34. FACE: Feasible and Actionable Counterfactual Explanations

    Authors: Rafael Poyiadzi, Kacper Sokol, Raul Santos-Rodriguez, Tijl De Bie, Peter Flach

    Abstract: Work in Counterfactual Explanations tends to focus on the principle of "the closest possible world" that identifies small changes leading to the desired outcome. In this paper we argue that while this approach might initially seem intuitively appealing it exhibits shortcomings not addressed in the current literature. First, a counterfactual example generated by the state-of-the-art systems is not…

    Submitted 24 February, 2020; v1 submitted 20 September, 2019; originally announced September 2019.

    Comments: Presented at AAAI/ACM Conference on AI, Ethics, and Society 2020

  35. Discovering Interesting Cycles in Directed Graphs

    Authors: Florian Adriaens, Cigdem Aslay, Tijl De Bie, Aristides Gionis, Jefrey Lijffijt

    Abstract: Cycles in graphs often signify interesting processes. For example, cyclic trading patterns can indicate inefficiencies or economic dependencies in trade networks, cycles in food webs can identify fragile dependencies in ecosystems, and cycles in financial transaction networks can be an indication of money laundering. Identifying such interesting cycles, which can also be constrained to contain a g…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: Accepted for CIKM'19

  36. arXiv:1905.10086  [pdf, other]

    cs.LG stat.ML

    Conditional t-SNE: Complementary t-SNE embeddings through factoring out prior information

    Authors: Bo Kang, Darío García García, Jefrey Lijffijt, Raúl Santos-Rodríguez, Tijl De Bie

    Abstract: Dimensionality reduction and manifold learning methods such as t-Distributed Stochastic Neighbor Embedding (t-SNE) are routinely used to map high-dimensional data into a 2-dimensional space to visualize and explore the data. However, two dimensions are typically insufficient to capture all structure in the data, the salient structure is often already known, and it is not obvious how to extract the…

    Submitted 24 May, 2019; originally announced May 2019.

  37. arXiv:1905.03040  [pdf, other]

    cs.SI

    Mining Subjectively Interesting Attributed Subgraphs

    Authors: Anes Bendimerad, Ahmad Mel, Jefrey Lijffijt, Marc Plantevit, Céline Robardet, Tijl De Bie

    Abstract: Community detection in graphs, data clustering, and local pattern mining are three mature fields of data mining and machine learning. In recent years, attributed subgraph mining is emerging as a new powerful data mining task in the intersection of these areas. Given a graph and a set of attributes for each vertex, attributed subgraph mining aims to find cohesive subgraphs for which (a subset of) t…

    Submitted 19 April, 2019; originally announced May 2019.

    Comments: International Workshop On Mining And Learning With Graphs, held with SIGKDD 2018

  38. arXiv:1904.12694  [pdf, other]

    cs.LG stat.ML

    ExplaiNE: An Approach for Explaining Network Embedding-based Link Predictions

    Authors: Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Networks are powerful data structures, but are challenging to work with for conventional machine learning methods. Network Embedding (NE) methods attempt to resolve this by learning vector representations for the nodes, for subsequent use in downstream machine learning tasks. Link Prediction (LP) is one such downstream machine learning task that is an important use case and popular benchmark for…

    Submitted 22 April, 2019; originally announced April 2019.

  39. arXiv:1903.11535  [pdf, other]

    cs.SI

    Opinion Dynamics with Backfire Effect and Biased Assimilation

    Authors: Xi Chen, Panayiotis Tsaparas, Jefrey Lijffijt, Tijl De Bie

    Abstract: The democratization of AI tools for content generation, combined with unrestricted access to mass media for all (e.g. through microblogging and social media), makes it increasingly hard for people to distinguish fact from fiction. This raises the question of how individual opinions evolve in such a networked environment without grounding in a known reality. The dominant approach to studying this p…

    Submitted 27 March, 2019; originally announced March 2019.

  40. EvalNE: A Framework for Evaluating Network Embeddings on Link Prediction

    Authors: Alexandru Mara, Jefrey Lijffijt, Tijl De Bie

    Abstract: In this paper we present EvalNE, a Python toolbox for evaluating network embedding methods on link prediction tasks. Link prediction is one of the most popular choices for evaluating the quality of network embeddings. However, the complexity of this task requires a carefully designed evaluation pipeline in order to provide consistent, reproducible and comparable results. EvalNE simplifies this pro…

    Submitted 22 January, 2019; originally announced January 2019.

  41. arXiv:1805.07544  [pdf, other]

    stat.ML cs.IT cs.LG

    Conditional Network Embeddings

    Authors: Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Network Embeddings (NEs) map the nodes of a given network into $d$-dimensional Euclidean space $\mathbb{R}^d$. Ideally, this mapping is such that `similar' nodes are mapped onto nearby points, such that the NE can be used for purposes such as link prediction (if `similar' means being `more likely to be connected') or classification (if `similar' means `being more likely to have the same label'). I…

    Submitted 16 October, 2018; v1 submitted 19 May, 2018; originally announced May 2018.

  42. From acquaintance to best friend forever: robust and fine-grained inference of social tie strengths

    Authors: Florian Adriaens, Tijl De Bie, Aristides Gionis, Jefrey Lijffijt, Polina Rozenshtein

    Abstract: Social networks often provide only a binary perspective on social ties: two individuals are either connected or not. While sometimes external information can be used to infer the strength of social ties, access to such information may be restricted or impractical. Sintos and Tsaparas (KDD 2014) first suggested to infer the strength of social ties from the topology of the network alone, by leveragi…

    Submitted 18 September, 2018; v1 submitted 10 February, 2018; originally announced February 2018.

    Journal ref: Data Min. Knowl. Discov. 34(3): 611-651 (2020)

  43. arXiv:1710.08167  [pdf, other]

    stat.ML cs.IT cs.LG

    Interactive Visual Data Exploration with Subjective Feedback: An Information-Theoretic Approach

    Authors: Kai Puolamäki, Emilia Oikarinen, Bo Kang, Jefrey Lijffijt, Tijl De Bie

    Abstract: Visual exploration of high-dimensional real-valued datasets is a fundamental task in exploratory data analysis (EDA). Existing methods use predefined criteria to choose the representation of data. There is a lack of methods that (i) elicit from the user what she has learned from the data and (ii) show patterns that she does not know yet. We construct a theoretical model where identified patterns c…

    Submitted 23 October, 2017; originally announced October 2017.

    Comments: 12 pages, 9 figures, 2 tables, conference submission

    Journal ref: Data Mining and Knowledge Discovery 34 (2020) 21-49

  44. Subjectively Interesting Subgroup Discovery on Real-valued Targets

    Authors: Jefrey Lijffijt, Bo Kang, Wouter Duivesteijn, Kai Puolamäki, Emilia Oikarinen, Tijl De Bie

    Abstract: Deriving insights from high-dimensional data is one of the core problems in data mining. The difficulty mainly stems from the fact that there are exponentially many variable combinations to potentially consider, and there are infinitely many if we consider weighted combinations, even for linear combinations. Hence, an obvious question is whether we can automate the search for interesting patterns…

    Submitted 12 October, 2017; originally announced October 2017.

    Comments: 12 pages, 10 figures, 2 tables, conference submission

  45. arXiv:1511.08762  [pdf, other]

    cs.LG cs.IR math.ST

    Informative Data Projections: A Framework and Two Examples

    Authors: Tijl De Bie, Jefrey Lijffijt, Raul Santos-Rodriguez, Bo Kang

    Abstract: Methods for Projection Pursuit aim to facilitate the visual exploration of high-dimensional data by identifying interesting low-dimensional projections. A major challenge is the design of a suitable quality metric of projections, commonly referred to as the projection index, to be maximized by the Projection Pursuit algorithm. In this paper, we introduce a new information-theoretic strategy for ta…

    Submitted 27 November, 2015; originally announced November 2015.

  46. arXiv:1109.0420  [pdf, ps, other]

    cs.IR

    Meta-song evaluation for chord recognition

    Authors: Yizhao Ni, Matt Mcvicar, Raul Santos-Rodriguez, Tijl De Bie

    Abstract: We present a new approach to evaluate chord recognition systems on songs which do not have full annotations. The principle is to use online chord databases to generate high accurate "pseudo annotations" for these songs and compute "pseudo accuracies" of test systems. Statistical models that model the relationship between "pseudo accuracy" and real performance are then applied to estimate test syst…

    Submitted 2 September, 2011; originally announced September 2011.

    Comments: technique report and preparation for conference

  47. arXiv:1107.4969  [pdf, ps, other]

    cs.SD cs.AI cs.MM

    An end-to-end machine learning system for harmonic analysis of music

    Authors: Yizhao Ni, Matt Mcvicar, Raul Santos-Rodriguez, Tijl De Bie

    Abstract: We present a new system for simultaneous estimation of keys, chords, and bass notes from music audio. It makes use of a novel chromagram representation of audio that takes perception of loudness into account. Furthermore, it is fully based on machine learning (instead of expert knowledge), such that it is potentially applicable to a wider range of genres as long as training data is available. As c…

    Submitted 25 July, 2011; originally announced July 2011.

    Comments: MIREX report and preparation of Journal submission

  48. arXiv:1106.4475  [pdf, ps, other]

    cs.DB cs.DS cs.SI

    Interesting Multi-Relational Patterns

    Authors: Eirini Spyropoulou, Tijl De Bie

    Abstract: Mining patterns from multi-relational data is a problem attracting increasing interest within the data mining community. Traditional data mining approaches are typically developed for highly simplified types of data, such as an attribute-value table or a binary database, such that those methods are not directly applicable to multi-relational data. Nevertheless, multi-relational data is a more trut…

    Submitted 12 September, 2011; v1 submitted 22 June, 2011; originally announced June 2011.

    Comments: Accepted at ICDM'11

  49. arXiv:1008.3314  [pdf, ps, other]

    cs.AI

    Maximum entropy models and subjective interestingness: an application to tiles in binary databases

    Authors: Tijl De Bie

    Abstract: Recent research has highlighted the practical benefits of subjective interestingness measures, which quantify the novelty or unexpectedness of a pattern when contrasted with any prior information of the data miner (Silberschatz and Tuzhilin, 1995; Geng and Hamilton, 2006). A key challenge here is the formalization of this prior information in a way that lends itself to the definition of an inter…

    Submitted 19 August, 2010; originally announced August 2010.

    Comments: 43 pages, submitted

    Report number: University of Bristol Tech. Rep. 125861

  50. arXiv:1006.0849  [pdf, ps, other]

    cs.DS stat.ML

    Reconstruction of Causal Networks by Set Covering

    Authors: Nick Fyson, Tijl De Bie, Nello Cristianini

    Abstract: We present a method for the reconstruction of networks, based on the order of nodes visited by a stochastic branching process. Our algorithm reconstructs a network of minimal size that ensures consistency with the data. Crucially, we show that global consistency with the data can be achieved through purely local considerations, inferring the neighbourhood of each node in turn. The optimisation pro…

    Submitted 4 June, 2010; originally announced June 2010.

    Comments: Under consideration for the ECML PKDD 2010 conference