Prediction of Solar Radiation Based on Spatial and Temporal Embeddings for Solar Generation Forecast

Authors: Mohammad Alqudah, Tatjana Dokic, Mladen Kezunovic, Zoran Obradovic

Abstract: A novel method is proposed for real-time solar generation forecasting using weather data while exploiting both spatial and temporal structural dependencies. The network observed over time is projected to a lower-dimensional representation, where a variety of weather measurements are used to train a structured regression model, while the weather forecast is used at the inference stage. Experiments were conducted at 288 locations in the San Antonio, TX area on data obtained from the National Solar Radiation Database. The model predicts solar irradiance with good accuracy (R2 of 0.91 for the summer, 0.85 for the winter, and 0.89 for the global model). The best accuracy was obtained by the Random Forest Regressor. Multiple experiments were conducted to characterize the influence of missing data and of different time horizons, providing evidence that the new algorithm is robust to data missing not only completely at random but also when the missingness mechanism is spatial and temporal.

Submitted 17 June, 2022; originally announced June 2022.

Comments: Proceedings of the 53rd IEEE Hawaii International Conference on System Sciences (HICSS 2020)

Report number: https://hdl.handle.net/10125/64105

arXiv:2103.06130 [pdf, other]

Stay on Topic, Please: Aligning User Comments to the Content of a News Article

Authors: Jumanah Alshehri, Marija Stanojevic, Eduard Dragut, Zoran Obradovic

Abstract: Social scientists have shown that up to 50% of the content posted to a news article has no relation to its journalistic content. In this study we propose a classification algorithm to categorize user comments posted to a news article based on their alignment to its content. The alignment seeks to match user comments to an article based on similarity of content, entities in discussion, and topic. We propose BERTAC, a BERT-based approach that jointly learns article-comment embeddings and infers the relevance class of comments. We introduce an ordinal classification loss that penalizes the difference between the predicted and true label. We conduct a thorough study to show the influence of the proposed loss on the learning process. The results on five representative news outlets show that our approach can learn the comment class with up to 36% average accuracy improvement compared to the baselines, and up to 25% compared to the BA-BC model. BA-BC is our approach that consists of two models aimed at capturing disjointly the formal language of news articles and the informal language of comments. We also conduct a user study to evaluate human labeling performance and to understand the difficulty of the classification task. The user agreement on comment-article alignment is "moderate" per Krippendorff's alpha score, which suggests that the classification task is difficult.

Submitted 3 March, 2021; originally announced March 2021.

Comments: Accepted as a full paper at the 43rd European Conference on Information Retrieval

arXiv:2102.02924 [pdf, ps, other]

Another estimation of Laplacian spectrum of the Kronecker product of graphs

Authors: Milan Bašić, Branko Arsić, Zoran Obradović

Abstract: The relationships between eigenvalues and eigenvectors of a product graph and those of its factor graphs have been known for the standard products, while characterization of Laplacian eigenvalues and eigenvectors of the Kronecker product of graphs using the Laplacian spectra and eigenvectors of the factors turned out to be quite challenging and has remained an open problem to date. Several approaches for the estimation of the Laplacian spectrum of the Kronecker product of graphs have been proposed in recent years. However, it turns out that not all the methods are practical to apply in network science models, particularly in the context of multilayer networks. Here we develop a practical and computationally efficient method to estimate the Laplacian spectrum of this graph product from spectral properties of its factor graphs, which is more stable than the alternatives proposed in the literature. We emphasize that the median of the percentage errors of our estimated Laplacian spectrum almost coincides with the $x$-axis, unlike the alternatives, which have sudden jumps at the beginning followed by a gradual decrease in the percentage errors. The percentage errors are confined (confidence of the estimations) to within $\pm$10% for all considered approximations, depending on the graph density. Moreover, we theoretically prove that the percentage errors become smaller when the network grows or the edge density level increases. Additionally, some novel theoretical results concerning exact formulas and lower bounds related to certain correlation coefficients corresponding to the estimated eigenvectors are presented.

Submitted 4 February, 2021; originally announced February 2021.

arXiv:2005.00950 [pdf]

Extracting Entities and Topics from News and Connecting Criminal Records

Authors: Quang Pham, Marija Stanojevic, Zoran Obradovic

Abstract: The goal of this paper is to summarize methodologies used in extracting entities and topics from a database of criminal records and from a database of newspapers. Statistical models have been used successfully in studying the topics of roughly 300,000 New York Times articles. In addition, these models have also been used to successfully analyze entities related to people, organizations, and places (D Newman, 2006). Additionally, analytical approaches, especially hotspot mapping, have been used in some studies with the aim of predicting future crime locations and circumstances, and those approaches have been tested quite successfully (S Chainey, 2008). Based on these two notions, this research was performed with the intention of applying data science techniques to analyze a large amount of data, select valuable intelligence, cluster violations by type of crime, and create a crime graph that changes through time. In this research, the task was to download criminal datasets from Kaggle and a collection of news articles from Kaggle and EAGER project databases, and then to merge these datasets into one general dataset. The most important goal of this project was to apply statistical and natural language processing methods to extract entities and topics, as well as to group similar data points into correct clusters, in order to better understand public data about U.S.-related crimes.

Submitted 2 May, 2020; originally announced May 2020.

Comments: This is a report submitted by an undergraduate student as preliminary work on this problem

arXiv:1909.00853 [pdf, ps, other]

Further results on structured regression for multi-scale networks

Authors: Milan Bašić, Branko Arsić, Zoran Obradović

Abstract: Gaussian Conditional Random Fields (GCRF), as a structured regression model, is designed to achieve higher regression accuracy than unstructured predictors at the expense of execution time, taking into account the similarities between objects and the outputs of unstructured predictors simultaneously. Like most structural models, the GCRF model does not scale well to large networks. One approach is to perform calculations on factor graphs (where possible) rather than on the full graph, which is more computationally efficient. The Kronecker product of graphs appears to be a natural choice for a graph decomposition. However, this idea is not straightforwardly applicable to GCRF, since characterizing the Laplacian spectrum of the Kronecker product of graphs, on which GCRF is based, from the spectra of its factor graphs has remained an open problem. In this paper we apply new estimations for the Laplacian eigenvalues and eigenvectors, and achieve high prediction accuracy of the proposed models, while the computational complexity of the models, compared to the original GCRF model, is improved from $O(n_{1}^{3}n_{2}^{3})$ to $O(n_{1}^{3} + n_{2}^{3})$. Furthermore, we study the GCRF model with a non-Kronecker graph, where the model consists of finding the nearest Kronecker product of graphs for an initial graph. Although the proposed models are more complex, they achieve high prediction accuracy too, while the execution time is still much better compared to the original GCRF model. The effectiveness of the proposed models is characterized on three types of random networks, where the proposed models were consistently more accurate than the previously presented GCRF model for multiscale networks [Jesse Glass and Zoran Obradovic. Structured regression on multiscale networks. IEEE Intelligent Systems, 32(2):23-30, 2017.].

Submitted 2 September, 2019; originally announced September 2019.

arXiv:1804.03240 [pdf, other]

Deep Attention Model for Triage of Emergency Department Patients

Authors: Djordje Gligorijevic, Jelena Stojanovic, Wayne Satz, Ivan Stojkovic, Kathrin Schreyer, Daniel Del Portal, Zoran Obradovic

Abstract: Optimization of patient throughput and wait time in emergency departments (ED) is an important task for hospital systems. For that reason, the Emergency Severity Index (ESI) system for patient triage was introduced to help guide manual estimation of acuity levels, which is used by nurses to rank the patients and organize hospital resources. However, despite the improvements that it brought to managing medical resources, such a triage system depends greatly on the nurse's subjective judgment and is thus prone to human error. Here, we propose a novel deep model based on the word attention mechanism designed for predicting the number of resources an ED patient would need. Our approach incorporates routinely available continuous and nominal (structured) data with medical text (unstructured) data, including the patient's chief complaint, past medical history, medication list, and nurse assessment, collected for 338,500 ED visits over three years in a large urban hospital. Using both structured and unstructured data, the proposed approach achieves an AUC of $\sim 88\%$ for the task of identifying resource-intensive patients (binary classification), and an accuracy of $\sim 44\%$ for predicting the exact category of the number of resources (multi-class classification task), giving an estimated lift over nurses' performance of 16\% in accuracy. Furthermore, the attention mechanism of the proposed model provides interpretability by assigning attention scores to nurses' notes, which is crucial for decision making and for implementation of such approaches in real systems working on human health.

Submitted 28 March, 2018; originally announced April 2018.

Comments: Proceedings of the 2018 SIAM International Conference on Data Mining (SDM 2018), San Diego, CA, May 2018. *Authors contributed equally

arXiv:1803.11462 [pdf, other]

Improving confidence while predicting trends in temporal disease networks

Authors: Djordje Gligorijevic, Jelena Stojanovic, Zoran Obradovic

Abstract: For highly sensitive real-world predictive analytic applications such as healthcare and medicine, having good prediction accuracy alone is often not enough. These kinds of applications require a decision-making process which uses uncertainty estimation as input whenever possible. The quality of uncertainty estimation is a matter of over- or under-confident prediction, which is often not addressed in many models. In this paper we show several extensions to the Gaussian Conditional Random Fields model, which aim to provide higher-quality uncertainty estimation. These extensions are applied to the temporal disease graph built from the State Inpatient Database (SID) of California, acquired from the HCUP. Our experiments demonstrate the benefits of using graph information in modeling temporal disease properties, as well as improvements in uncertainty estimation provided by the given extensions of the Gaussian Conditional Random Fields method.

Submitted 28 March, 2018; originally announced March 2018.

Comments: Proceedings of the 4th Workshop on Data Mining for Medicine and Healthcare, 2015 SIAM International Conference on Data Mining, Vancouver, Canada, April 30 - May 02, 2015

arXiv:1803.10799 [pdf, other]

Modeling Customer Engagement from Partial Observations

Authors: Jelena Stojanovic, Djordje Gligorijevic, Zoran Obradovic

Abstract: It is of high interest for a company to identify customers expected to bring the largest profit in the upcoming period. Knowing as much as possible about each customer is crucial for such predictions. However, their demographic data, preferences, and other information that might be useful for building loyalty programs is often missing. Additionally, modeling relations among different customers as a network can be beneficial for predictions at an individual level, as similar customers tend to have similar purchasing patterns. We address this problem by proposing a robust framework for structured regression on deficient data in evolving networks with supervised representation learning based on neural feature embedding. The new method is compared to several unstructured and structured alternatives for predicting customer behavior (e.g. purchasing frequency and customer ticket) on user networks generated from customer databases of two companies from different industries. The obtained results show $4\%$ to $130\%$ improvement in accuracy over alternatives when all customer information is known. Additionally, the robustness of our method is demonstrated when up to $80\%$ of demographic information was missing, where it was up to several times more accurate than alternatives that either ignore cases with missing values or learn their feature representation in an unsupervised manner.

Submitted 28 March, 2018; originally announced March 2018.

Comments: Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM 2016), Indianapolis, United States October 24 - 28, 2016

arXiv:1803.10739 [pdf, other]

Deeply Supervised Semantic Model for Click-Through Rate Prediction in Sponsored Search

Authors: Jelena Gligorijevic, Djordje Gligorijevic, Ivan Stojkovic, Xiao Bai, Amit Goyal, Zoran Obradovic

Abstract: In sponsored search it is critical to match ads that are relevant to a query and to accurately predict their likelihood of being clicked. Commercial search engines typically use machine learning models for both query-ad relevance matching and click-through-rate (CTR) prediction. However, matching models are based on the similarity between a query and an ad, ignoring the fact that a retrieved ad may not attract clicks, while click models rely on click history, being of limited use for new queries and ads. We propose a deeply supervised architecture that jointly learns the semantic embeddings of a query and an ad as well as their corresponding CTR. We also propose a novel cohort negative sampling technique for learning implicit negative signals. We trained the proposed architecture using one billion query-ad pairs from a major commercial web search engine. This architecture improves the best-performing baseline deep neural architectures by 2\% of AUC for CTR prediction and by a statistically significant 0.5\% of NDCG for query-ad matching.

Submitted 28 March, 2018; originally announced March 2018.

Comments: The first and second authors listed are co-first authors

arXiv:1803.10705 [pdf, other]

Semi-supervised learning for structured regression on partially observed attributed graphs

Authors: Jelena Stojanovic, Milos Jovanovic, Djordje Gligorijevic, Zoran Obradovic

Abstract: Conditional probabilistic graphical models provide a powerful framework for structured regression in spatio-temporal datasets with complex correlation patterns. However, in real-life applications a large fraction of observations is often missing, which can severely limit the representational power of these models. In this paper we propose a Marginalized Gaussian Conditional Random Fields (m-GCRF) structured regression model for dealing with missing labels in partially observed temporal attributed graphs. This method is aimed at learning with both labeled and unlabeled parts and effectively predicting future values in a graph. The method is even capable of learning from nodes for which the response variable is never observed in history, which poses problems for many state-of-the-art models that can handle missing data. The proposed model is characterized for various missingness mechanisms on 500 synthetic graphs. The benefits of the new method are also demonstrated on a challenging application for predicting precipitation based on partial observations of climate variables in a temporal graph that spans the entire continental US. We also show that the method can be useful for optimizing the costs of data collection in climate applications via active reduction of the number of weather stations to consider. In experiments on these real-world and synthetic datasets we show that the proposed model is consistently more accurate than alternative semi-supervised structured models, as well as models that either use imputation to deal with missing values or simply ignore them altogether.

Submitted 28 March, 2018; originally announced March 2018.

Comments: Proceedings of the 2015 SIAM International Conference on Data Mining (SDM 2015) Vancouver, Canada, April 30 - May 02, 2015

Showing 1–10 of 10 results for author: Obradovic, Z