Search | arXiv e-print repository

doi 10.1145/3626183.3659964

Cost-Driven Data Replication with Predictions

Authors: Tianyu Zuo, Xueyan Tang, Bu Sung Lee

Abstract: This paper studies an online replication problem for distributed data access. The goal is to dynamically create and delete data copies in a multi-server system as time passes to minimize the total storage and network cost of serving access requests. We study the problem in the emergent learning-augmented setting, assuming simple binary predictions about inter-request times at individual servers. We develop an online algorithm and prove that it is ($\frac{5+\alpha}{3}$)-consistent (competitiveness under perfect predictions) and ($1 + \frac{1}{\alpha}$)-robust (competitiveness under terrible predictions), where $\alpha \in (0, 1]$ is a hyper-parameter representing the level of distrust in the predictions. We also study the impact of mispredictions on the competitive ratio of the proposed algorithm and adapt it to achieve a bounded robustness while retaining its consistency. We further establish a lower bound of $\frac{3}{2}$ on the consistency of any deterministic learning-augmented algorithm. Experimental evaluations are carried out to evaluate our algorithms using real data access traces.

Submitted 25 April, 2024; originally announced April 2024.

Comments: The formal version of this draft will appear in the ACM SPAA'24 conference
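
The consistency and robustness bounds quoted in the abstract above can be tabulated for a few values of the distrust parameter to see the trade-off. This is only a minimal sketch that evaluates the two closed-form expressions from the abstract; the function names are hypothetical and nothing here reproduces the paper's algorithm.

```python
from fractions import Fraction


def consistency_bound(alpha):
    """(5 + alpha) / 3: competitive ratio under perfect predictions."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    return (5 + alpha) / 3


def robustness_bound(alpha):
    """1 + 1 / alpha: competitive ratio under arbitrarily bad predictions."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    return 1 + 1 / alpha


if __name__ == "__main__":
    # Exact rational arithmetic keeps the printed values free of rounding noise.
    for alpha in (Fraction(1, 10), Fraction(1, 2), Fraction(1)):
        print(alpha, consistency_bound(alpha), robustness_bound(alpha))
```

Smaller values of the parameter push the consistency bound toward 5/3 while the robustness bound grows without limit; at the largest allowed value, 1, both bounds equal 2.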

arXiv:2311.18061 [pdf, other]

TransNAS-TSAD: Harnessing Transformers for Multi-Objective Neural Architecture Search in Time Series Anomaly Detection

Authors: Ijaz Ul Haq, Byung Suk Lee, Donna M. Rizzo

Abstract: The surge in real-time data collection across various industries has underscored the need for advanced anomaly detection in both univariate and multivariate time series data. This paper introduces TransNAS-TSAD, a framework that synergizes the transformer architecture with neural architecture search (NAS), enhanced through NSGA-II algorithm optimization. This approach effectively tackles the complexities of time series data, balancing computational efficiency with detection accuracy. Our evaluation reveals that TransNAS-TSAD surpasses conventional anomaly detection models due to its tailored architectural adaptability and the efficient exploration of complex search spaces, leading to marked improvements in diverse data scenarios. We also introduce the Efficiency-Accuracy-Complexity Score (EACS) as a new metric for assessing model performance, emphasizing the balance between accuracy and computational resources. TransNAS-TSAD sets a new benchmark in time series anomaly detection, offering a versatile, efficient solution for complex real-world applications. This research highlights the potential of TransNAS-TSAD across a wide range of industry applications and paves the way for future developments in the field.

Submitted 4 March, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

Comments: 32 pages, 4 figures. It will be submitted to a journal

arXiv:2309.07992 [pdf, other]

An Automated Machine Learning Approach for Detecting Anomalous Peak Patterns in Time Series Data from a Research Watershed in the Northeastern United States Critical Zone

Authors: Ijaz Ul Haq, Byung Suk Lee, Donna M. Rizzo, Julia N Perdrial

Abstract: This paper presents an automated machine learning framework designed to assist hydrologists in detecting anomalies in time series data generated by sensors in a research watershed in the northeastern United States critical zone. The framework specifically focuses on identifying peak-pattern anomalies, which may arise from sensor malfunctions or natural phenomena. However, the use of classification methods for anomaly detection poses challenges, such as the requirement for labeled data as ground truth and the selection of the most suitable deep learning model for the given task and dataset. To address these challenges, our framework generates labeled datasets by injecting synthetic peak patterns into synthetically generated time series data and incorporates an automated hyperparameter optimization mechanism. This mechanism generates an optimized model instance with the best architectural and training parameters from a pool of five selected models, namely Temporal Convolutional Network (TCN), InceptionTime, MiniRocket, Residual Networks (ResNet), and Long Short-Term Memory (LSTM). The selection is based on the user's preferences regarding anomaly detection accuracy and computational cost. The framework employs Time-series Generative Adversarial Networks (TimeGAN) as the synthetic dataset generator. The generated model instances are evaluated using a combination of accuracy and computational cost metrics, including training time and memory, during the anomaly detection process.
Performance evaluation of the framework was conducted using a dataset from a watershed, demonstrating consistent selection of the most fitting model instance that satisfies the user's preferences.

Submitted 5 December, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

Comments: This document is the result of a research project funded by the National Science Foundation. Preprint submitted to Machine Learning with Applications, December 5, 2023

arXiv:2308.10918 [pdf, other]

Label-based Graph Augmentation with Metapath for Graph Anomaly Detection

Authors: Hwan Kim, Junghoon Kim, Byung Suk Lee, Sungsu Lim

Abstract: Graph anomaly detection has attracted considerable attention in recent years from various domains ranging from network security to finance. Because labeling is very costly, existing methods are predominantly developed in an unsupervised manner. However, the detected anomalies may turn out to be uninteresting instances due to the absence of prior knowledge regarding the anomalies being sought. This issue may be solved by using a few labeled anomalies as prior knowledge. In real-world scenarios, we can easily obtain a few labeled anomalies. Efficiently leveraging labeled anomalies as prior knowledge is crucial for graph anomaly detection; however, this process remains challenging due to the inherently limited number of anomalies available. To address the problem, we propose a novel approach that leverages metapath to embed actual connectivity patterns between anomalous and normal nodes. To further efficiently exploit context information from the metapath-based anomaly subgraph, we present a new framework, Metapath-based Graph Anomaly Detection (MGAD), incorporating GCN layers in both the dual encoders and decoders to efficiently propagate context information between abnormal and normal nodes. Specifically, MGAD employs a GNN-based graph autoencoder as its backbone network. Moreover, dual encoders capture the complex interactions and metapath-based context information between labeled and unlabeled nodes both globally and locally.
Through a comprehensive set of experiments conducted on seven real-world networks, this paper demonstrates the superiority of the MGAD method compared to state-of-the-art techniques. The code is available at https://github.com/missinghwan/MGAD.

Submitted 11 April, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

arXiv:2308.03929 [pdf, other]

Fact-Checking Generative AI: Ontology-Driven Biological Graphs for Disease-Gene Link Verification

Authors: Ahmed Abdeen Hamed, Byung Suk Lee, Alessandro Crimi, Magdalena M. Misiak

Abstract: Since the launch of various generative AI tools, scientists have been striving to evaluate their capabilities and contents, in the hope of establishing trust in their generative abilities. Regulations and guidelines are emerging to verify generated contents and identify novel uses. We aspire to demonstrate how ChatGPT claims are checked computationally using the rigor of network models. We aim to achieve fact-checking of the knowledge embedded in biological graphs that were contrived from ChatGPT contents at the aggregate level. We adopted a biological networks approach that enables the systematic interrogation of ChatGPT's linked entities. We designed an ontology-driven fact-checking algorithm that compares biological graphs constructed from approximately 200,000 PubMed abstracts with counterparts constructed from a dataset generated using the ChatGPT-3.5 Turbo model. In 10 samples of 250 randomly selected records from a ChatGPT dataset of 1000 "simulated" articles, the fact-checking link accuracy ranged from 70% to 86%. This study demonstrated high accuracy of aggregate disease-gene link relationships found in ChatGPT-generated texts.

Submitted 8 April, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: Accepted at the 24th International Conference on Computational Science (ICCS'24) on April 1st, 2024. Will appear in the Springer LNCS proceedings as a short paper

ACM Class: I.2

arXiv:2209.14930 [pdf, other]

Graph Anomaly Detection with Graph Neural Networks: Current Status and Challenges

Authors: Hwan Kim, Byung Suk Lee, Won-Yong Shin, Sungsu Lim

Abstract: Graphs are used widely to model complex systems, and detecting anomalies in a graph is an important task in the analysis of complex systems. Graph anomalies are patterns in a graph that do not conform to normal patterns expected of the attributes and/or structures of the graph. In recent years, graph neural networks (GNNs) have been studied extensively and have successfully performed difficult machine learning tasks in node classification, link prediction, and graph classification thanks to the highly expressive capability via message passing in effectively learning graph representations. To solve the graph anomaly detection problem, GNN-based methods leverage information about the graph attributes (or features) and/or structures to learn to score anomalies appropriately. In this survey, we review the recent advances made in detecting graph anomalies using GNN models. Specifically, we summarize GNN-based methods according to the graph type (i.e., static and dynamic), the anomaly type (i.e., node, edge, subgraph, and whole graph), and the network architecture (e.g., graph autoencoder, graph convolutional network). To the best of our knowledge, this survey is the first comprehensive review of graph anomaly detection methods based on GNNs.

Submitted 4 October, 2022; v1 submitted 29 September, 2022; originally announced September 2022.

Comments: 9 pages, 2 figures, 1 table; to appear in IEEE Access (Please cite our journal version.)

arXiv:2206.04792 [pdf, other]

doi 10.1145/3534678.3539348

Adaptive Model Pooling for Online Deep Anomaly Detection from a Complex Evolving Data Stream

Authors: Susik Yoon, Youngjun Lee, Jae-Gil Lee, Byung Suk Lee

Abstract: Online anomaly detection from a data stream is critical for the safety and security of many applications but is facing severe challenges due to complex and evolving data streams from IoT devices and cloud-based infrastructures. Unfortunately, existing approaches fall too short for these challenges; online anomaly detection methods bear the burden of handling the complexity while offline deep anomaly detection methods suffer from the evolving data distribution. This paper presents a framework for online deep anomaly detection, ARCUS, which can be instantiated with any autoencoder-based deep anomaly detection method. It handles the complex and evolving data streams using an adaptive model pooling approach with two novel techniques: concept-driven inference and drift-aware model pool update; the former detects anomalies with a combination of models most appropriate for the complexity, and the latter adapts the model pool dynamically to fit the evolving data streams. In comprehensive experiments with ten data sets which are both high-dimensional and concept-drifted, ARCUS improved the anomaly detection accuracy of the streaming variants of state-of-the-art autoencoder-based methods and that of the state-of-the-art streaming anomaly detection methods by up to 22% and 37%, respectively.

Submitted 9 June, 2022; originally announced June 2022.

Comments: Accepted by KDD 2022 Research Track

arXiv:2108.11523 [pdf, other]

SOMTimeS: Self Organizing Maps for Time Series Clustering and its Application to Serious Illness Conversations

Authors: Ali Javed, Donna M. Rizzo, Byung Suk Lee, Robert Gramling

Abstract: There is an increasing demand for scalable algorithms capable of clustering and analyzing large time series datasets. The Kohonen self-organizing map (SOM) is a type of unsupervised artificial neural network for visualizing and clustering complex data, reducing the dimensionality of data, and selecting influential features. Like all clustering methods, the SOM requires a measure of similarity between input data (in this work time series). Dynamic time warping (DTW) is one such measure, and a top performer given that it accommodates the distortions when aligning time series. Despite its use in clustering, DTW is limited in practice because it is quadratic in runtime complexity with the length of the time series data. To address this, we present a new DTW-based clustering method, called SOMTimeS (a Self-Organizing Map for TIME Series), that scales better and runs faster than other DTW-based clustering algorithms, and has similar performance accuracy. The computational performance of SOMTimeS stems from its ability to prune unnecessary DTW computations during the SOM's training phase. We also implemented a similar pruning strategy for K-means for comparison with one of the top performing clustering algorithms. We evaluated the pruning effectiveness, accuracy, execution time and scalability on 112 benchmark time series datasets from the University of California, Riverside classification archive.
We showed that for similar accuracy, the speed-up achieved for SOMTimeS and K-means was 1.8x on average; however, rates varied between 1x and 18x depending on the dataset. SOMTimeS and K-means pruned 43% and 50% of the total DTW computations, respectively. We applied SOMTimeS to natural language conversation data collected as part of a large healthcare cohort study of patient-clinician serious illness conversations to demonstrate the algorithm's utility with complex, temporally sequenced phenomena.

Submitted 25 August, 2021; originally announced August 2021.

Comments: 36 pages
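
The quadratic cost of dynamic time warping mentioned in the abstract above comes from filling a table with one cell per pair of time steps. The following is a minimal sketch of the textbook DTW recurrence, assuming an absolute-difference local cost and no warping window; the function name is hypothetical, and the code does not implement the SOMTimeS pruning strategy.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences in O(len(a) * len(b)) time."""
    n, m = len(a), len(b)
    # prev[j] holds the best alignment cost of a[:i-1] with b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The nested loops visit every cell of the table once, which is why each pairwise comparison avoided during clustering saves a full quadratic computation.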

arXiv:2004.09546 [pdf, other]

doi 10.1016/j.mlwa.2020.100001

A Benchmark Study on Time Series Clustering

Authors: Ali Javed, Byung Suk Lee, Donna M. Rizzo

Abstract: This paper presents the first time series clustering benchmark utilizing all time series datasets currently available in the University of California Riverside (UCR) archive -- the state-of-the-art repository of time series data. Specifically, the benchmark examines eight popular clustering methods representing three categories of clustering algorithms (partitional, hierarchical and density-based) and three types of distance measures (Euclidean, dynamic time warping, and shape-based). We lay out six restrictions with special attention to making the benchmark as unbiased as possible. A phased evaluation approach was then designed for summarizing dataset-level assessment metrics and discussing the results. The benchmark study presented can be a useful reference for the research community on its own; and the dataset-level assessment metrics reported may be used for designing evaluation frameworks to answer different research questions.

Submitted 26 April, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

Comments: Typos corrected, figures resolution changed

Journal ref: Machine Learning with Applications, 1:100001, 2020

arXiv:1911.12466 [pdf, other]

doi 10.1016/j.jhydrol.2020.125802

Analysis of Hydrological and Suspended Sediment Events from Mad River Watershed using Multivariate Time Series Clustering

Authors: Ali Javed, Scott D. Hamshaw, Donna M. Rizzo, Byung Suk Lee

Abstract: Hydrological storm events are a primary driver for transporting water quality constituents such as turbidity, suspended sediments and nutrients. Analyzing the concentration (C) of these water quality constituents in response to increased streamflow discharge (Q), particularly when monitored at high temporal resolution during a hydrological event, helps to characterize the dynamics and flux of such constituents. A conventional approach to storm event analysis is to reduce the C-Q time series to two-dimensional (2-D) hysteresis loops and analyze these 2-D patterns. While effective and informative to some extent, this hysteresis loop approach has limitations because projecting the C-Q time series onto a 2-D plane obscures detail (e.g., temporal variation) associated with the C-Q relationships. In this paper, we address this issue using a multivariate time series clustering approach. Clustering is applied to sequences of river discharge and suspended sediment data (acquired through turbidity-based monitoring) from six watersheds located in the Lake Champlain Basin in the northeastern United States. While clusters of the hydrological storm events using the multivariate time series approach were found to be correlated to 2-D hysteresis loop classifications and watershed locations, the clusters differed from the 2-D hysteresis classifications.
Additionally, using available meteorological data associated with storm events, we examine the characteristics of computational clusters of storm events in the study watersheds and identify the features driving the clustering approach.

Submitted 20 March, 2020; v1 submitted 27 November, 2019; originally announced November 2019.

Comments: Corrected typo in title

Journal ref: Journal of Hydrology, 593:125802, 2021

arXiv:1508.06976 [pdf, ps, other]

Real-time Top-K Predictive Query Processing over Event Streams

Authors: Saurav Acharya, Byung Suk Lee, Paul Hines

Abstract: This paper addresses the problem of predicting the k events that are most likely to occur next, over historical real-time event streams. Existing approaches to causal prediction queries have a number of limitations. First, they exhaustively search over an acyclic causal network to find the most likely k effect events; however, data from real event streams frequently reflect cyclic causality. Second, they contain conservative assumptions intended to exclude all possible non-causal links in the causal network; this leads to the omission of many less-frequent but important causal links. We overcome these limitations by proposing a novel event precedence model and a run-time causal inference mechanism. The event precedence model constructs a first order absorbing Markov chain incrementally over event streams, where an edge between two events signifies a temporal precedence relationship between them, which is a necessary condition for causality. Then, the run-time causal inference mechanism learns causal relationships dynamically during query processing. This is done by removing some of the temporal precedence relationships that do not exhibit causality in the presence of other events in the event precedence model. This paper presents two query processing algorithms -- one performs exhaustive search on the model and the other performs a more efficient reduced search with early termination.
Experiments using two real datasets (cascading blackouts in power systems and web page views) verify the effectiveness of the probabilistic top-k prediction queries and the efficiency of the algorithms. Specifically, the reduced search algorithm reduced runtime, relative to exhaustive search, by 25-80% (depending on the application) with only a small reduction in accuracy.

Submitted 26 August, 2015; originally announced August 2015.

arXiv:1203.1185 [pdf, ps, other]

doi 10.1109/TVT.2012.2197768

A Self-Organization Framework for Wireless Ad Hoc Networks as Small Worlds

Authors: Abhik Banerjee, Rachit Agarwal, Vincent Gauthier, Chai Kiat Yeo, Hossam Afifi, Bu Sung Lee

Abstract: Motivated by the benefits of small world networks, we propose a self-organization framework for wireless ad hoc networks. We investigate the use of directional beamforming for creating long-range short cuts between nodes. Using simulation results for randomized beamforming as a guideline, we identify crucial design issues for algorithm design. Our results show that, while significant path length reduction is achievable, this is accompanied by the problem of asymmetric paths between nodes. Subsequently, we propose a distributed algorithm for small world creation that achieves path length reduction while maintaining connectivity. We define a new centrality measure that estimates the structural importance of nodes based on traffic flow in the network, which is used to identify the optimum nodes for beamforming. We show, using simulations, that this leads to significant reduction in path length while maintaining connectivity.

Submitted 6 March, 2012; originally announced March 2012.

Comments: Submitted to IEEE Transactions on Vehicular Technology

arXiv:1111.4807 [pdf, ps, other]

Achieving Small World Properties using Bio-Inspired Techniques in Wireless Networks

Authors: Rachit Agarwal, Abhik Banerjee, Vincent Gauthier, Monique Becker, Chai Kiat Yeo, Bu Sung Lee

Abstract: It is highly desirable and challenging for a wireless ad hoc network to have self-organization properties in order to achieve network wide characteristics. Studies have shown that Small World properties, primarily low average path length and high clustering coefficient, are desired properties for networks in general. However, due to the spatial nature of the wireless networks, achieving small world properties remains highly challenging. Studies also show that wireless ad hoc networks with small world properties show a degree distribution that lies between geometric and power law. In this paper, we show that in a wireless ad hoc network with non-uniform node density with only local information, we can significantly reduce the average path length and retain the clustering coefficient. To achieve our goal, our algorithm first identifies logical regions using Lateral Inhibition technique, then identifies the nodes that beamform and finally the beam properties using Flocking. We use Lateral Inhibition and Flocking because they enable us to use local state information as opposed to other techniques. We support our work with simulation results and analysis, which show that a reduction of up to 40% can be achieved for a high-density network. We also show the effect of hopcount used to create regions on average path length, clustering coefficient and connectivity.

Submitted 3 March, 2012; v1 submitted 21 November, 2011; originally announced November 2011.

Comments: Accepted for publication: Special Issue on Security and Performance of Networks and Clouds (The Computer Journal)

arXiv:1109.5959 [pdf, other]

doi 10.1109/GLOCOMW.2011.6162587

Self-organization of Nodes using Bio-Inspired Techniques for Achieving Small World Properties

Authors: Rachit Agarwal, Abhik Banerjee, Vincent Gauthier, Monique Becker, Chai Kiat Yeo, Bu Sung Lee

Abstract: In an autonomous wireless sensor network, self-organization of the nodes is essential to achieve network wide characteristics. We believe that connectivity in wireless autonomous networks can be increased and overall average path length can be reduced by using beamforming and bio-inspired algorithms. Recent works on the use of beamforming in wireless networks mostly assume the knowledge of the network in aggregation to either heterogeneous or hybrid deployment. We propose that without the global knowledge or the introduction of any special feature, the average path length can be reduced with the help of inspiration from nature and simple interactions between neighboring nodes. Our algorithm also reduces the number of disconnected components within the network. Our results show that reduction in the average path length and the number of disconnected components can be achieved using very simple local rules and without the full network knowledge.

Submitted 27 September, 2011; originally announced September 2011.

Comments: Accepted to Joint workshop on complex networks and pervasive group communication (CCNet/PerGroup), in conjunction with IEEE Globecom 2011

arXiv:1109.5338 [pdf, ps, other]

doi 10.1109/GLOCOMW.2011.6162373

Self-Organization of Wireless Ad Hoc Networks as Small Worlds Using Long Range Directional Beams

Authors: Abhik Banerjee, Rachit Agarwal, Vincent Gauthier, Chai Kiat Yeo, Hossam Afifi, Bu Sung Lee

Abstract: We study how long range directional beams can be used for self-organization of a wireless network to exhibit small world properties. Using simulation results for randomized beamforming as a guideline, we identify crucial design issues for algorithm design. Subsequently, we propose an algorithm for deterministic creation of small worlds. We define a new centrality measure that estimates the structural importance of nodes based on traffic flow in the network, which is used to identify the optimum nodes for beamforming. This results in significant reduction in path length while maintaining connectivity.

Submitted 25 September, 2011; originally announced September 2011.

Comments: Accepted to Joint workshop on complex networks and pervasive group communication (CCNet/PerGroup), in conjunction with IEEE Globecom 2011

arXiv:1103.5046 [pdf, other]

From Linked Data to Relevant Data -- Time is the Essence

Authors: Markus Kirchberg, Ryan K L Ko, Bu Sung Lee

Abstract: The Semantic Web initiative puts emphasis not primarily on putting data on the Web, but rather on creating links in a way that both humans and machines can explore the Web of data. When such users access the Web, they leave a trail as Web servers maintain a history of requests. Web usage mining approaches have been studied since the beginning of the Web given the log's huge potential for purposes such as resource annotation, personalization, forecasting etc. However, the impact of any such efforts has not really gone beyond generating statistics detailing who, when, and how Web pages maintained by a Web server were visited.

Submitted 25 March, 2011; originally announced March 2011.

Comments: 1st International Workshop on Usage Analysis and the Web of Data (USEWOD2011) in the 20th International World Wide Web Conference (WWW2011), Hyderabad, India, March 28th, 2011

Report number: WWW2011USEWOD/2011/kirkolee

Showing 1–16 of 16 results for author: Lee, B S