Deep learning-based auto-segmentation of paraganglioma for growth monitoring

Authors: E. M. C. Sijben, J. C. Jansen, M. de Ridder, P. A. N. Bosman, T. Alderliesten

Abstract: Volume measurement of a paraganglioma (a rare neuroendocrine tumor that typically forms along major blood vessels and nerve pathways in the head and neck region) is crucial for monitoring and modeling tumor growth in the long term. However, in clinical practice, using available tools to do these measurements is time-consuming and suffers from tumor-shape assumptions and observer-to-observer variation. Growth modeling could play a significant role in solving a decades-old dilemma (stemming from uncertainty regarding how the tumor will develop over time). By giving paraganglioma patients treatment, severe symptoms can be prevented. However, treating patients who do not actually need it comes at the cost of unnecessary possible side effects and complications. Improved measurement techniques could enable growth model studies with a large amount of tumor volume data, possibly giving valuable insights into how these tumors develop over time. Therefore, we propose an automated tumor volume measurement method based on a deep learning segmentation model using no-new-UNnet (nnUNet). We assess the performance of the model based on visual inspection by a senior otorhinolaryngologist and several quantitative metrics by comparing model outputs with manual delineations, including a comparison with variation in manual delineation by multiple observers. Our findings indicate that the automatic method performs at least as well as manual delineation. Finally, using the created model and a linking procedure that we propose to track the tumor over time, we show how additional volume measurements affect the fit of known growth functions.

Submitted 19 March, 2024; originally announced April 2024.

arXiv:2402.18144 [pdf]

Random Silicon Sampling: Simulating Human Sub-Population Opinion Using a Large Language Model Based on Group-Level Demographic Information

Authors: Seungjong Sun, Eungu Lee, Dongyan Nan, Xiangying Zhao, Wonbyung Lee, Bernard J. Jansen, Jang Hyun Kim

Abstract: Large language models exhibit societal biases associated with demographic information, including race, gender, and others. Endowing such language models with personalities based on demographic data can enable generating opinions that align with those of humans. Building on this idea, we propose "random silicon sampling," a method to emulate the opinions of the human population sub-group. Our study analyzed 1) a language model that generates the survey responses that correspond with a human group based solely on its demographic distribution and 2) the applicability of our methodology across various demographic subgroups and thematic questions. Through random silicon sampling and using only group-level demographic information, we discovered that language models can generate response distributions that are remarkably similar to the actual U.S. public opinion polls. Moreover, we found that the replicability of language models varies depending on the demographic group and topic of the question, and this can be attributed to inherent societal biases in the models. Our findings demonstrate the feasibility of mirroring a group's opinion using only demographic distribution and elucidate the effect of social biases in language models on such simulations.

Submitted 28 February, 2024; originally announced February 2024.

Comments: 25 pages, 4 figures, 19 Tables

ACM Class: I.2.7
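
The idea in the abstract above, generating synthetic respondents from group-level demographic information alone, can be illustrated with a minimal sketch. This is not the authors' implementation: the marginal distributions, the attribute names, the independence assumption between attributes, and the helper names `sample_personas` and `persona_prompt` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical group-level marginals (made-up numbers, not survey data).
GROUP_MARGINALS = {
    "gender": {"female": 0.52, "male": 0.48},
    "age_group": {"18-29": 0.20, "30-49": 0.35, "50+": 0.45},
}


def sample_personas(marginals, n, seed=0):
    """Draw n synthetic respondents, sampling each attribute independently
    from its group-level marginal distribution (an assumption of this sketch)."""
    rng = random.Random(seed)
    personas = []
    for _ in range(n):
        persona = {}
        for attribute, distribution in marginals.items():
            values = list(distribution.keys())
            weights = list(distribution.values())
            persona[attribute] = rng.choices(values, weights=weights, k=1)[0]
        personas.append(persona)
    return personas


def persona_prompt(persona, question):
    """Format a sampled persona and a survey question as a text prompt
    that could be passed to a language model (the model call is omitted)."""
    description = ", ".join(f"{key}: {value}" for key, value in persona.items())
    return f"Respondent profile ({description}). Question: {question}"


if __name__ == "__main__":
    for p in sample_personas(GROUP_MARGINALS, 3, seed=42):
        print(persona_prompt(p, "Do you approve of policy X?"))
```

Aggregating the model's answers over many sampled personas would then give a response distribution to compare against poll results; how the paper conditions the model and scores the match is not specified here.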

arXiv:2402.12510 [pdf, ps, other]

Function Class Learning with Genetic Programming: Towards Explainable Meta Learning for Tumor Growth Functionals

Authors: E. M. C. Sijben, J. C. Jansen, P. A. N. Bosman, T. Alderliesten

Abstract: Paragangliomas are rare, primarily slow-growing tumors for which the underlying growth pattern is unknown. Therefore, determining the best care for a patient is hard. Currently, if no significant tumor growth is observed, treatment is often delayed, as treatment itself is not without risk. However, by doing so, the risk of (irreversible) adverse effects due to tumor growth may increase. Being able to predict the growth accurately could assist in determining whether a patient will need treatment during their lifetime and, if so, the timing of this treatment. The aim of this work is to learn the general underlying growth pattern of paragangliomas from multiple tumor growth data sets, in which each data set contains a tumor's volume over time. To do so, we propose a novel approach based on genetic programming to learn a function class, i.e., a parameterized function that can be fit anew for each tumor. We do so in a unique, multi-modal, multi-objective fashion to find multiple potentially interesting function classes in a single run. We evaluate our approach on a synthetic and a real-world data set. By analyzing the resulting function classes, we can effectively explain the general patterns in the data.

Submitted 9 April, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

arXiv:2309.12329 [pdf]

Mono/Multi-material Characterization Using Hyperspectral Images and Multi-Block Non-Negative Matrix Factorization

Authors: Mahdiyeh Ghaffari, Gerjen H. Tinnevelt, Marcel C. P. van Eijk, Stanislav Podchezertsev, Geert J. Postma, Jeroen J. Jansen

Abstract: Plastic sorting is an essential step in waste management, especially due to the presence of multilayer plastics. These monomaterial and multimaterial plastics are widely employed to enhance the functional properties of packaging, combining beneficial properties in thickness, mechanical strength, and heat tolerance. However, materials containing multiple polymer species need to be pretreated before they can be recycled as monomaterials and therefore should not end up in monomaterial streams. Industry 4.0 has significantly improved materials sorting of plastic packaging in speed and accuracy compared to manual sorting, specifically through Near Infrared Hyperspectral Imaging (NIRHSI) that provides an automated, fast, and accurate material characterization, without sample preparation. Identification of multimaterials with HSI, however, requires novel dedicated approaches for chemical pattern recognition. Non-Negative Matrix Factorization (NMF) is widely used for the chemical resolution of hyperspectral images. Chemically relevant model constraints may make it specifically valuable to identify multilayer plastics through HSI. Specifically, Multi-Block Non-Negative Matrix Factorization (MBNMF) with a correspondence among different chemical species constraint may be used to evaluate the presence or absence of particular polymer species. To translate the MBNMF model into an evidence-based sorting decision, we extended the model with an F-test to distinguish between monomaterial and multimaterial objects. The benefits of our new approach, MBNMF, were illustrated by the identification of several plastic waste objects.

Submitted 15 August, 2023; originally announced September 2023.

arXiv:2308.14776 [pdf]

Systematic reduction of Hyperspectral Images for high-throughput Plastic Characterization

Authors: Mahdiyeh Ghaffari, Mickey C. J. Lukkien, Nematollah Omidikia, Gerjen H. Tinnevelt, Marcel C. P. van Eijk, Jeroen J. Jansen

Abstract: Hyperspectral Imaging (HSI) combines microscopy and spectroscopy to assess the spatial distribution of spectroscopically active compounds in objects, and has diverse applications in food quality control, pharmaceutical processes, and waste sorting. However, due to the large size of HSI datasets, it can be challenging to analyze and store them within a reasonable digital infrastructure, especially in waste sorting where speed and data storage resources are limited. Additionally, as with most spectroscopic data, there is significant redundancy, making pixel and variable selection crucial for retaining chemical information. Recent high-tech developments in chemometrics enable automated and evidence-based data reduction, which can substantially enhance the speed and performance of Non-Negative Matrix Factorization (NMF), a widely used algorithm for chemical resolution of HSI data. By recovering the pure contribution maps and spectral profiles of distributed compounds, NMF can provide evidence-based sorting decisions for efficient waste management. To improve the quality and efficiency of data analysis on hyperspectral imaging (HSI) data, we apply a convex-hull method to select essential pixels and wavelengths and remove uninformative and redundant information. This process minimizes computational strain and effectively eliminates highly mixed pixels. By reducing data redundancy, data investigation and analysis become more straightforward, as demonstrated in both simulated and real HSI data for plastic sorting.

Submitted 28 August, 2023; originally announced August 2023.

arXiv:2306.02984 [pdf, other]

A Deep Learning Approach Utilizing Covariance Matrix Analysis for the ISBI Edited MRS Reconstruction Challenge

Authors: Julian P. Merkofer, Dennis M. J. van de Sande, Sina Amirrajab, Gerhard S. Drenthen, Mitko Veta, Jacobus F. A. Jansen, Marcel Breeuwer, Ruud J. G. van Sloun

Abstract: This work proposes a method to accelerate the acquisition of high-quality edited magnetic resonance spectroscopy (MRS) scans using machine learning models taking the sample covariance matrix as input. The method is invariant to the number of transients and robust to noisy input data for both synthetic as well as in-vivo scenarios.

Submitted 5 June, 2023; originally announced June 2023.

arXiv:2211.00558 [pdf, other]

Contextual Mixture of Experts: Integrating Knowledge into Predictive Modeling

Authors: Francisco Souza, Tim Offermans, Ruud Barendse, Geert Postma, Jeroen Jansen

Abstract: This work proposes a new data-driven model devised to integrate process knowledge into its structure to increase the human-machine synergy in the process industry. The proposed Contextual Mixture of Experts (cMoE) explicitly uses process knowledge along the model learning stage to mold the historical data to represent operators' context related to the process through possibility distributions. This model was evaluated in two real case studies for quality prediction, including a sulfur recovery unit and a polymerization process. The contextual mixture of experts was employed to represent different contexts in both experiments. The results indicate that integrating process knowledge has increased predictive performance while improving interpretability by providing insights into the variables affecting the process's different regimes.

Submitted 1 November, 2022; originally announced November 2022.

arXiv:2205.04906 [pdf, other]

Evaluating the Impact of Tiled User-Adaptive Real-Time Point Cloud Streaming on VR Remote Communication

Authors: Shishir Subramanyam, Irene Viola, Jack Jansen, Evangelos Alexiou, Alan Hanjalic, Pablo Cesar

Abstract: Remote communication has rapidly become a part of everyday life in both professional and personal contexts. However, popular video conferencing applications present limitations in terms of quality of communication, immersion and social meaning. VR remote communication applications offer a greater sense of co-presence and mutual sensing of emotions between remote users. Previous research on these applications has shown that realistic point cloud user reconstructions offer better immersion and communication as compared to synthetic user avatars. However, photorealistic point clouds require a large volume of data per frame and are challenging to transmit over bandwidth-limited networks. Recent research has demonstrated significant improvements to perceived quality by optimizing the usage of bandwidth based on the position and orientation of the user's viewport with user-adaptive streaming. In this work, we developed a real-time VR communication application with an adaptation engine that features tiled user-adaptive streaming based on user behaviour. The application also supports traditional network adaptive streaming. The contribution of this work is to evaluate the impact of tiled user-adaptive streaming on quality of communication, visual quality, system performance and task completion in a functional live VR remote communication system. We perform a subjective evaluation with 33 users to compare the different streaming conditions with a neck exercise training task. As a baseline, we use uncompressed streaming requiring ca. 300 Mbps and our solution achieves similar visual quality with tiled adaptive streaming at 14 Mbps. We also demonstrate statistically significant gains to the quality of interaction and improvements to system performance and CPU consumption with tiled adaptive streaming as compared to the more traditional network adaptive streaming.

Submitted 10 May, 2022; originally announced May 2022.

arXiv:2203.02200 [pdf]

doi 10.1108/IntR-10-2017-0377

Aggregate effects of advertising decisions: a complex systems look at search engine advertising via an experimental study

Authors: Yanwu Yang, Xin Li, Bernard J. Jansen, Daniel Zeng

Abstract: Purpose: We model group advertising decisions, which are the collective decisions of every single advertiser within the set of advertisers who are competing in the same auction or vertical industry, and examine resulting market outcomes, via a proposed simulation framework named EXP-SEA (Experimental Platform for Search Engine Advertising) supporting experimental studies of collective behaviors in the context of search engine advertising. Design: We implement the EXP-SEA to validate the proposed simulation framework, and conduct three experimental studies on the aggregate impact of electronic word-of-mouth, the competition level, and strategic bidding behaviors. EXP-SEA supports heterogeneous participants, various auction mechanisms, and also ranking and pricing algorithms. Findings: Findings from our three experiments show that (a) both the market profit and advertising indexes such as number of impressions and number of clicks are larger when the eWOM effect is present, meaning social media certainly has some effect on search engine advertising outcomes, (b) the competition level has a monotonic increasing effect on the market performance, thus search engines have an incentive to encourage both the eWOM among search users and competition among advertisers, and (c) given the market-level effect of the percentage of advertisers employing a dynamic greedy bidding strategy, there is a cut-off point for strategic bidding behaviors. Originality: This is one of the first research works to explore collective group decisions and resulting phenomena in the complex context of search engine advertising via developing and validating a simulation framework that supports assessments of various advertising strategies and estimations of the impact of mechanisms on the search market.

Submitted 4 March, 2022; originally announced March 2022.

Comments: 26 pages, 7 figures, 5 tables

MSC Class: 68Txx ACM Class: I.2.6

Journal ref: Internet Research, 28(4), 1079-1102 (2018)

arXiv:2202.13506 [pdf]

doi 10.1109/MIS.2019.2893590

Keyword Optimization in Sponsored Search Advertising: A Multi-Level Computational Framework

Authors: Yanwu Yang, Bernard J. Jansen, Yinghui Yang, Xunhua Guo, Daniel Zeng

Abstract: In sponsored search advertising, keywords serve as an essential bridge linking advertisers, search users and search engines. Advertisers have to deal with a series of keyword decisions throughout the entire lifecycle of search advertising campaigns. This paper proposes a multi-level and closed-form computational framework for keyword optimization (MKOF) to support various keyword decisions. Based on this framework, we develop corresponding optimization strategies for keyword targeting, keyword assignment and keyword grouping at different levels (e.g., market, campaign and adgroup). With two real-world datasets obtained from past search advertising campaigns, we conduct computational experiments to evaluate our keyword optimization framework and instantiated strategies. Experimental results show that our method can approach the optimal solution in a steady way, and it outperforms two baseline keyword strategies commonly used in practice. The proposed MKOF framework also provides a valid experimental environment to implement and assess various keyword strategies in sponsored search advertising.

Submitted 27 February, 2022; originally announced February 2022.

Comments: 21 pages, 3 figures, 1 table

MSC Class: 68Txx ACM Class: I.2.6

Journal ref: IEEE Intelligent Systems, 34(1), 32 - 42 (2019)

arXiv:2008.06809 [pdf, other]

How Search Engine Advertising Affects Sales over Time: An Empirical Investigation

Authors: Yanwu Yang, Kang Zhao, Daniel Zeng, Bernard Jim Jansen

Abstract: As a mainstream marketing channel on the Internet, Search Engine Advertising (SEA) has a huge business impact and attracts a plethora of attention from both academia and industry. One important goal of advertising is to increase sales. Nevertheless, while previous research has studied multiple factors that are potentially related to the outcome of SEA campaigns, effects of these factors on actual sales generated by SEA remain understudied. It is also unclear whether and how such effects change over time in highly dynamic SEA campaigns. As the first empirical investigation of the dynamic advertisement-sales relationship in SEA, this study builds an advertising response model within a time-varying coefficient (TVC) modeling framework, and estimates the model using a unique dataset from a large E-Commerce retailer in the United States. Results reveal the effects of the advertising expenditure, consumer behaviors and advertisement characteristics on realized sales, and demonstrate that such effects on sales do change over time in non-linear ways. More importantly, we find that carryover has a stronger effect in generating sales than direct response does, conversion rate is much more important than click-through rate, and ad position does not have significant effects on sales. These findings have direct implications for advertisers to launch more effective SEA campaigns.

Submitted 15 August, 2020; originally announced August 2020.

arXiv:1910.06635 [pdf, other]

doi 10.1117/1.JMI.6.4.044003

Liver segmentation and metastases detection in MR images using convolutional neural networks

Authors: Mariëlle J. A. Jansen, Hugo J. Kuijf, Maarten Niekel, Wouter B. Veldhuis, Frank J. Wessels, Max A. Viergever, Josien P. W. Pluim

Abstract: Primary tumors have a high likelihood of developing metastases in the liver and early detection of these metastases is crucial for patient outcome. We propose a method based on convolutional neural networks (CNN) to detect liver metastases. First, the liver was automatically segmented using the six phases of abdominal dynamic contrast enhanced (DCE) MR images. Next, DCE-MR and diffusion weighted (DW) MR images are used for metastases detection within the liver mask. The liver segmentations have a median Dice similarity coefficient of 0.95 compared with manual annotations. The metastases detection method has a sensitivity of 99.8% with a median of 2 false positives per image. The combination of the two MR sequences in a dual pathway network is proven valuable for the detection of liver metastases. In conclusion, a high quality liver segmentation can be obtained in which we can successfully detect liver metastases.

Submitted 15 October, 2019; originally announced October 2019.

Journal ref: J. Med. Imag. 6(4), 044003 (2019)
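
The Dice similarity coefficient reported in the abstract above measures the overlap between a predicted mask and a reference mask as 2|A ∩ B| / (|A| + |B|). A minimal sketch follows, assuming binary masks flattened to sequences of 0/1 values; the function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are choices made for this illustration, not details taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 (or False/True) values of equal length.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground size of the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1]
    reference = [1, 0, 0, 1, 1]
    print(dice_coefficient(predicted, reference))
```

In the example, the masks share two foreground voxels and contain three each, so the coefficient is 2·2 / (3 + 3) = 2/3.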

arXiv:1908.08254 [pdf, other]

doi 10.1117/12.2253842

Motion correction of dynamic contrast enhanced MRI of the liver

Authors: Mariëlle J. A. Jansen, Wouter B. Veldhuis, Maarten S. van Leeuwen, Josien P. W. Pluim

Abstract: Motion correction of dynamic contrast enhanced magnetic resonance images (DCE-MRI) is a challenging task, due to changes in image appearance. In this study a groupwise registration, using a principal component analysis (PCA) based metric, is evaluated for clinical DCE MRI of the liver. The groupwise registration transforms the images to a common space, rather than to a reference volume as conventional pairwise methods do, and computes the similarity metric on all volumes simultaneously. This groupwise registration method is compared to a pairwise approach using a mutual information metric. Clinical DCE MRI of the abdomen of eight patients were included. Per patient one lesion in the liver was manually segmented in all temporal images (N=16). The registered images were compared for accuracy, spatial and temporal smoothness after transformation, and lesion volume change. Compared to a pairwise method or no registration, groupwise registration provided better alignment. In our recently started clinical study groupwise registered clinical DCE MRI of the abdomen of nine patients were scored by three radiologists. Groupwise registration increased the assessed quality of alignment. The gain in reading time for the radiologist was estimated to vary from no difference to almost a minute. A slight increase in reader confidence was also observed. Registration had no added value for images with little motion. In conclusion, the groupwise registration of DCE MR images results in better alignment than achieved by pairwise registration, which is beneficial for clinical assessment.

Submitted 22 August, 2019; originally announced August 2019.

arXiv:1908.08251 [pdf, other]

doi 10.1117/12.2506770

Optimal input configuration of dynamic contrast enhanced MRI in convolutional neural networks for liver segmentation

Authors: Mariëlle J. A. Jansen, Hugo J. Kuijf, Josien P. W. Pluim

Abstract: Most MRI liver segmentation methods use a structural 3D scan as input, such as a T1 or T2 weighted scan. Segmentation performance may be improved by utilizing both structural and functional information, as contained in dynamic contrast enhanced (DCE) MR series. Dynamic information can be incorporated in a segmentation method based on convolutional neural networks in a number of ways. In this study, the optimal input configuration of DCE MR images for convolutional neural networks (CNNs) is studied. The performance of three different input configurations for CNNs is studied for a liver segmentation task. The three configurations are I) one phase image of the DCE-MR series as input image; II) the separate phases of the DCE-MR as input images; and III) the separate phases of the DCE-MR as channels of one input image. The three input configurations are fed into a dilated fully convolutional network and into a small U-net. The CNNs were trained using 19 annotated DCE-MR series and tested on another 19 annotated DCE-MR series. The performance of the three input configurations for both networks is evaluated against manual annotations. The results show that both neural networks perform better when the separate phases of the DCE-MR series are used as channels of an input image in comparison to one phase as input image or the separate phases as input images. No significant difference between the performances of the two network architectures was found for the separate phases as channels of an input image.

Submitted 22 August, 2019; originally announced August 2019.

Comments: Submitted to SPIE Medical Imaging 2019

arXiv:1902.08899 [pdf, other]

The ARIEL-CMU Systems for LoReHLT18

Authors: Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W Black, Jaime Carbonell, Graham V. Horwood, et al. (5 additional authors not shown)

Abstract: This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).

Submitted 24 February, 2019; originally announced February 2019.

arXiv:1810.00358 [pdf]

Use Cases and Outlooks for Automatic Analytics

Authors: Joni Salminen, Bernard J. Jansen

Abstract: The landscape of analytics is changing rapidly. Much of online user analytics, however, is based on collection of various user analytics numbers. Understanding these numbers, and then relating them to higher numerical analysis for the evaluation of key performance indicators (KPIs) can be quite challenging, especially with large volumes of data. There is a plethora of tools and software packages that one can employ. However, these tools and packages require a quantitative competence and analytical sophistication that average end users often do not possess. Additionally, they often do little to reduce the complexity of numerical data in a manner that allows ease of use in decision making and communication. Dealing with numbers poses cognitive challenges for individuals, who often cannot recall many numbers at a time. Here, we explore the concept of automatic analytics by demonstrating use case examples and discussing the current state and future of automated insights.

Submitted 30 September, 2018; originally announced October 2018.

arXiv:1802.06437 [pdf, other]

What We Read, What We Search: Media Attention and Public Attention Among 193 Countries

Authors: Haewoon Kwak, Jisun An, Joni Salminen, Soon-Gyo Jung, Bernard J. Jansen

Abstract: We investigate the alignment of international attention of news media organizations within 193 countries with the expressed international interests of the public within those same countries from March 7, 2016 to April 14, 2017. We collect fourteen months of longitudinal data of online news from Unfiltered News and web search volume data from Google Trends and build a multiplex network of media attention and public attention in order to study its structural and dynamic properties. Structurally, the media attention and the public attention are both similar and different depending on the resolution of the analysis. For example, we find that 63.2% of the country-specific media and the public pay attention to different countries, but local attention flow patterns, which are measured by network motifs, are very similar. We also show that there are strong regional similarities with both media and public attention that are only disrupted by significantly major worldwide incidents (e.g., Brexit). Using Granger causality, we show that there are a substantial number of countries where media attention and public attention are dissimilar by topical interest. Our findings show that the media and public attention toward specific countries are often at odds, indicating that the public within these countries may be ignoring their country-specific news outlets and seeking other online sources to address their media needs and desires.

Submitted 18 February, 2018; originally announced February 2018.

Comments: Will be presented in the Web Conference 2018 (WWW2018)

arXiv:1608.05609 [pdf, ps, other]

Implementing a Relevance Tracker Module

Authors: Joachim Jansen, Jo Devriendt, Bart Bogaerts, Gerda Janssens, Marc Denecker

Abstract: PC(ID) extends propositional logic with inductive definitions: rule sets under the well-founded semantics. Recently, a notion of relevance was introduced for this language. This notion determines the set of undecided literals that can still influence the satisfiability of a PC(ID) formula in a given partial assignment. The idea is that the PC(ID) solver can make decisions only on relevant literals without losing soundness and thus safely ignore irrelevant literals. One important insight is that the relevance of a literal is completely determined by the current solver state. During search, changes to the solver state affect the relevance of literals. In this paper, we discuss an incremental, lightweight implementation of a relevance tracker module that can be added to and interact with an out-of-the-box SAT(ID) solver.

Submitted 19 August, 2016; originally announced August 2016.

Comments: Paper presented at the 9th Workshop on Answer Set Programming and Other Computing Paradigms (ASPOCP 2016), New York City, USA, 16 October 2016

arXiv:1405.4206 [pdf, other]

Model revision inference for extensions of first order logic

Authors: Joachim Jansen

Abstract: I am Joachim Jansen and this is my research summary, part of my application to the Doctoral Consortium at ICLP'14. I am a PhD student in the Knowledge Representation and Reasoning (KRR) research group, a subgroup of the Declarative Languages and Artificial Intelligence (DTAI) group at the department of Computer Science at KU Leuven. I started my PhD in September 2012. My promotor is prof. dr. ir. Gerda Janssens and my co-promotor is prof. dr. Marc Denecker. I can be contacted at [email protected] or at: Room 01.167, Celestijnenlaan 200A, 3001 Heverlee, Belgium. An extended abstract / full version of a paper accepted to be presented at the Doctoral Consortium of the 30th International Conference on Logic Programming (ICLP 2014), July 19-22, Vienna, Austria.

Submitted 16 May, 2014; originally announced May 2014.

arXiv:1405.1523 [pdf, ps, other]

doi 10.1017/S1471068414000155

Simulating dynamic systems using Linear Time Calculus theories

Authors: Bart Bogaerts, Joachim Jansen, Maurice Bruynooghe, Broes De Cat, Joost Vennekens, Marc Denecker

Abstract: To appear in Theory and Practice of Logic Programming (TPLP). Dynamic systems play a central role in fields such as planning, verification, and databases. Fragmented throughout these fields, we find a multitude of languages to formally specify dynamic systems and a multitude of systems to reason on such specifications. Often, such systems are bound to one specific language and one specific inference task. It is troublesome that performing several inference tasks on the same knowledge requires translations of your specification to other languages. In this paper we study whether it is possible to perform a broad set of well-studied inference tasks on one specification. More concretely, we extend IDP3 with several inferences from fields concerned with dynamic specifications.

Submitted 17 June, 2024; v1 submitted 7 May, 2014; originally announced May 2014.

Journal ref: Theory and Practice of Logic Programming 14 (2014) 477-492

arXiv:1309.6883 [pdf, other]

doi 10.1017/S147106841400009X

Predicate Logic as a Modeling Language: Modeling and Solving some Machine Learning and Data Mining Problems with IDP3

Authors: Maurice Bruynooghe, Hendrik Blockeel, Bart Bogaerts, Broes De Cat, Stef De Pooter, Joachim Jansen, Anthony Labarre, Jan Ramon, Marc Denecker, Sicco Verwer

Abstract: This paper provides a gentle introduction to problem solving with the IDP3 system. The core of IDP3 is a finite model generator that supports first order logic enriched with types, inductive definitions, aggregates and partial functions. It offers its users a modeling language that is a slight extension of predicate logic and allows them to solve a wide range of search problems. Apart from a small introductory example, applications are selected from problems that arose within machine learning and data mining research. These research areas have recently shown a strong interest in declarative modeling and constraint solving as opposed to algorithmic approaches. The paper illustrates that the IDP3 system can be a valuable tool for researchers with such an interest. The first problem is in the domain of stemmatology, a domain of philology concerned with the relationship between surviving variant versions of text. The second problem is about a somewhat related problem within biology where phylogenetic trees are used to represent the evolution of species. The third and final problem concerns the classical problem of learning a minimal automaton consistent with a given set of strings. For this last problem, we show that the performance of our solution comes very close to that of a state-of-the-art solution. For each of these applications, we analyze the problem, illustrate the development of a logic-based model and explore how alternatives can affect the performance.

Submitted 28 March, 2014; v1 submitted 26 September, 2013; originally announced September 2013.

Comments: To appear in Theory and Practice of Logic Programming (TPLP)

Journal ref: Theory and Practice of Logic Programming 15 (2014) 783-817

arXiv:0908.0764 [pdf]

Learning about Potential Users of Collaborative Information Retrieval Systems

Authors: Madhu Reddy, Bernard J. Jansen

Abstract: One of the key components of designing usable and useful collaborative information retrieval systems is to understand the needs of the users of these systems. Our research team has been exploring collaborative information behavior in a variety of organizational settings. Our research goals have been two-fold: first, to develop a conceptual understanding of collaborative information behavior and second, to gather requirements for the design of collaborative information retrieval systems. In this paper, we present a brief overview of our fieldwork in three different organizational settings, discuss our methodology for collecting data on collaborative information behavior, and highlight some lessons that we are learning about potential users of collaborative information retrieval systems in these domains.

Submitted 5 August, 2009; originally announced August 2009.

Comments: Presented at 1st Intl Workshop on Collaborative Information Seeking, 2008 (arXiv:0908.0583)

Report number: JCDL2008CIRWS/2008/redjan ACM Class: H.3.3; H.5.2; H.5.3

Showing 1–22 of 22 results for author: Jansen, J