
doi 10.1145/3664190.3672530

Evaluation of Temporal Change in IR Test Collections

Authors: Jüri Keller, Timo Breuer, Philipp Schaer

Abstract: Information retrieval systems have been evaluated using the Cranfield paradigm for many years. This paradigm allows a systematic, fair, and reproducible evaluation of different retrieval methods in fixed experimental environments. However, real-world retrieval systems must cope with dynamic environments and temporal changes that affect the document collection, topical trends, and the individual user's perception of what is considered relevant. Yet, the temporal dimension in IR evaluations is still understudied. To this end, this work investigates how the temporal generalizability of effectiveness evaluations can be assessed. As a conceptual model, we generalize Cranfield-type experiments to the temporal context by classifying the change in the essential components according to the create, update, and delete operations of persistent storage known from CRUD. From the different types of change different evaluation scenarios are derived and it is outlined what they imply. Based on these scenarios, renowned state-of-the-art retrieval systems are tested and it is investigated how the retrieval effectiveness changes on different levels of granularity. We show that the proposed measures can be well adapted to describe the changes in the retrieval results. The experiments conducted confirm that the retrieval effectiveness strongly depends on the evaluation scenario investigated. We find that not only the average retrieval performance of single systems but also the relative system performance are strongly affected by the components that change and to what extent these components changed.

Submitted 1 July, 2024; originally announced July 2024.

Journal ref: Proceedings of the 2024 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR '24), July 13, 2024, Washington, DC, USA

arXiv:2404.08785 [pdf, other]

Under pressure: learning-based analog gauge reading in the wild

Authors: Maurits Reitsma, Julian Keller, Kenneth Blomqvist, Roland Siegwart

Abstract: We propose an interpretable framework for reading analog gauges that is deployable on real world robotic systems. Our framework splits the reading task into distinct steps, such that we can detect potential failures at each step. Our system needs no prior knowledge of the type of gauge or the range of the scale and is able to extract the units used. We show that our gauge reading algorithm is able to extract readings with a relative reading error of less than 2%.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 7 pages, 8 figures, accepted for presentation at the 2024 IEEE International Conference on Robotics and Automation (ICRA) and for inclusion in the conference proceedings, finalist for the IEEE ICRA 2024 Best Paper Award in Automation, source code https://github.com/ethz-asl/analog_gauge_reader, Autonomous Systems Lab, ETH Zurich

arXiv:2310.07346 [pdf, other]

Preliminary Results of a Scientometric Analysis of the German Information Retrieval Community 2020-2023

Authors: Philipp Schaer, Svetlana Myshkina, Jüri Keller

Abstract: The German Information Retrieval community is located in two different sub-fields: Information and computer science. There are no current studies that investigate these communities on a scientometric level. Available studies only focus on the information scientific part of the community. We generated a data set of 401 recent IR-related publications extracted from six core IR conferences from a mainly computer scientific background. We analyze this data set at the institutional and researcher level. The data set is publicly released, and we also demonstrate a mapping use case.

Submitted 11 October, 2023; originally announced October 2023.

Comments: Data available at https://github.com/irgroup/LWDA2023-IR-community

Journal ref: M. Leyer, Wichmann, J. (Eds.): Proceedings of the LWDA 2023 Workshops: BIA, DB, IR, KDML and WM. Marburg, Germany, 09.-11. October 2023

arXiv:2308.10549 [pdf, other]

Evaluating Temporal Persistence Using Replicability Measures

Authors: Jüri Keller, Timo Breuer, Philipp Schaer

Abstract: In real-world Information Retrieval (IR) experiments, the Evaluation Environment (EE) is exposed to constant change. Documents are added, removed, or updated, and the information need and the search behavior of users is evolving. Simultaneously, IR systems are expected to retain a consistent quality. The LongEval Lab seeks to investigate the longitudinal persistence of IR systems, and in this work, we describe our participation. We submitted runs of five advanced retrieval systems, namely a Reciprocal Rank Fusion (RRF) approach, ColBERT, monoT5, Doc2Query, and E5, to both sub-tasks. Further, we cast the longitudinal evaluation as a replicability study to better understand the temporal change observed. As a result, we quantify the persistence of the submitted runs and see great potential in this evaluation method.

Submitted 21 August, 2023; originally announced August 2023.

Comments: To be published in Proceedings of the Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum, Thessaloniki, Greece 18 - 21, 2023

arXiv:2307.15202 [pdf, other]

doi 10.1109/LRA.2023.3342553

Multi-Robot Multi-Room Exploration with Geometric Cue Extraction and Circular Decomposition

Authors: Seungchan Kim, Micah Corah, John Keller, Graeme Best, Sebastian Scherer

Abstract: This work proposes an autonomous multi-robot exploration pipeline that coordinates the behaviors of robots in an indoor environment composed of multiple rooms. Contrary to simple frontier-based exploration approaches, we aim to enable robots to methodically explore and observe an unknown set of rooms in a structured building, keeping track of which rooms are already explored and sharing this information among robots to coordinate their behaviors in a distributed manner. To this end, we propose (1) a geometric cue extraction method that processes 3D point cloud data and detects the locations of potential cues such as doors and rooms, (2) a circular decomposition for free spaces used for target assignment. Using these two components, our pipeline effectively assigns tasks among robots, and enables a methodical exploration of rooms. We evaluate the performance of our pipeline using a team of up to 3 aerial robots, and show that our method outperforms the baseline by 33.4% in simulation and 26.4% in real-world experiments.

Submitted 4 December, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

arXiv:2307.07607 [pdf, other]

SubT-MRS Dataset: Pushing SLAM Towards All-weather Environments

Authors: Shibo Zhao, Yuanjun Gao, Tianhao Wu, Damanpreet Singh, Rushan Jiang, Haoxiang Sun, Mansi Sarawata, Yuheng Qiu, Warren Whittaker, Ian Higgins, Yi Du, Shaoshu Su, Can Xu, John Keller, Jay Karhade, Lucas Nogueira, Sourojit Saha, Ji Zhang, Wenshan Wang, Chen Wang, Sebastian Scherer

Abstract: Simultaneous localization and mapping (SLAM) is a fundamental task for numerous applications such as autonomous navigation and exploration. Although many SLAM datasets have been released, current SLAM solutions still struggle to have sustained and resilient performance. One major issue is the absence of high-quality datasets including diverse all-weather conditions and a reliable metric for assessing robustness. This limitation significantly restricts the scalability and generalizability of SLAM technologies, impacting their development, validation, and deployment. To address this problem, we present SubT-MRS, an extremely challenging real-world dataset designed to push SLAM towards all-weather environments to pursue the most robust SLAM performance. It contains multi-degraded environments including over 30 diverse scenes such as structureless corridors, varying lighting conditions, and perceptual obscurants like smoke and dust; multimodal sensors such as LiDAR, fisheye camera, IMU, and thermal camera; and multiple locomotions like aerial, legged, and wheeled robots. We develop accuracy and robustness evaluation tracks for SLAM and introduce novel robustness metrics. Comprehensive studies are performed, revealing new observations, challenges, and opportunities for future research.

Submitted 29 May, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

Journal ref: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, WA, June 2024

arXiv:2307.05263 [pdf, other]

Pegasus Simulator: An Isaac Sim Framework for Multiple Aerial Vehicles Simulation

Authors: Marcelo Jacinto, João Pinto, Jay Patrikar, John Keller, Rita Cunha, Sebastian Scherer, António Pascoal

Abstract: Developing and testing novel control and motion planning algorithms for aerial vehicles can be a challenging task, with the robotics community relying more than ever on 3D simulation technologies to evaluate the performance of new algorithms in a variety of conditions and environments. In this work, we introduce the Pegasus Simulator, a modular framework implemented as an NVIDIA Isaac Sim extension that enables real-time simulation of multiple multirotor vehicles in photo-realistic environments, while providing out-of-the-box integration with the widely adopted PX4-Autopilot and ROS2 through its modular implementation and intuitive graphical user interface. To demonstrate some of its capabilities, a nonlinear controller was implemented and simulation results for two drones performing aggressive flight maneuvers are presented. Code and documentation for this framework are also provided as supplementary material.

Submitted 15 April, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

Comments: This paper has been accepted for publication in the 2024 International Conference on Unmanned Aircraft Systems (ICUAS), IEEE

arXiv:2302.12131 [pdf, other]

Automated Statement Extraction from Press Briefings

Authors: Jüri Keller, Meik Bittkowski, Philipp Schaer

Abstract: Scientific press briefings are a valuable information source. They consist of alternating expert speeches, questions from the audience and their answers. Therefore, they can contribute to scientific and fact-based media coverage. Even though press briefings are highly informative, extracting statements relevant to individual journalistic tasks is challenging and time-consuming. To support this task, an automated statement extraction system is proposed. Claims are used as the main feature to identify statements in press briefing transcripts. The statement extraction task is formulated as a four-step procedure. First, the press briefings are split into sentences and passages, then claim sentences are identified through sequence classification. Subsequently, topics are detected, and the sentences are filtered to improve the coherence and assess the length of the statements. The results indicate that claim detection can be used to identify statements in press briefings. While many statements can be extracted automatically with this system, they are not always as coherent as needed to be understood without context and may need further review by knowledgeable persons.

Submitted 24 February, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

Comments: Datenbanksysteme für Business, Technologie und Web (BTW 2023)

ACM Class: H.3.3

arXiv:2212.11850 [pdf, other]

doi 10.1109/TDSC.2024.3410679

DYST (Did You See That?): An Amplified Covert Channel That Points To Previously Seen Data

Authors: Steffen Wendzel, Tobias Schmidbauer, Sebastian Zillien, Jörg Keller

Abstract: Covert channels are stealthy communication channels that enable manifold adversary and legitimate scenarios, ranging from malware communications to the exchange of confidential information by journalists and censorship circumvention. We introduce a new class of covert channels that we call history covert channels. We further present a new paradigm: covert channel amplification. All covert channels described until now need to craft seemingly legitimate flows or need to modify third-party flows, mimicking unsuspicious behavior. In contrast, history covert channels can communicate by pointing to unaltered legitimate traffic created by regular network nodes. Only a negligible fraction of the covert communication process requires the transfer of covert information by the covert channel's sender. This information can be sent through different protocols/channels. Our approach allows an amplification of the covert channel's message size, i.e., minimizing the fraction of actually transferred secret data by a covert channel's sender in relation to the overall secret data being exchanged. Further, we extend the current taxonomy for covert channels to show how history channels can be categorized. We describe multiple scenarios in which history covert channels can be realized, analyze the characteristics of these channels, and show how their configuration can be optimized.

Submitted 7 June, 2024; v1 submitted 22 December, 2022; originally announced December 2022.

Comments: 20 pages. IEEE Transactions on Dependable and Secure Computing (TDSC), 2024

arXiv:2209.14843 [pdf, other]

doi 10.1007/978-3-031-13643-6_11

Evaluating Research Dataset Recommendations in a Living Lab

Authors: Jüri Keller, Leon Paul Mondrian Munz

Abstract: The search for research datasets is as important as it is laborious. Due to the importance of the choice of research data in further research, this decision must be made carefully. Additionally, because of the growing amounts of data in almost all areas, research data is already a central artifact in empirical sciences. Consequently, research dataset recommendations can beneficially supplement scientific publication searches. We formulated the recommendation task as a retrieval problem by focussing on broad similarities between research datasets and scientific publications. In a multistage approach, initial recommendations were retrieved by the BM25 ranking function and dynamic queries. Subsequently, the initial ranking was re-ranked utilizing click feedback and document embeddings. The proposed system was evaluated live on real user interaction data using the STELLA infrastructure in the LiLAS Lab at CLEF 2021. Our experimental system could efficiently be fine-tuned before the live evaluation by pre-testing the system with a pseudo test collection based on prior user interaction data from the live system. The results indicate that the experimental system outperforms the other participating systems.

Submitted 30 September, 2022; v1 submitted 29 September, 2022; originally announced September 2022.

Comments: Best of 2021 Labs: LiLAS

ACM Class: H.3.3

Journal ref: Lab Experimental IR Meets Multilinguality, Multimodality, and Interaction - 13th International Conference of the CLEF Association, CLEF 2022, Bologna, Italy, September 5-8, 2022, Proceedings

arXiv:2209.03878 [pdf, other]

doi 10.1109/ICMLA55696.2022.00032

Histogram Layers for Synthetic Aperture Sonar Imagery

Authors: Joshua Peeples, Alina Zare, Jeffrey Dale, James Keller

Abstract: Synthetic aperture sonar (SAS) imagery is crucial for several applications, including target recognition and environmental segmentation. Deep learning models have led to much success in SAS analysis; however, the features extracted by these approaches may not be suitable for capturing certain textural information. To address this problem, we present a novel application of histogram layers on SAS imagery. The addition of histogram layer(s) within the deep learning models improved performance by incorporating statistical texture information on both synthetic and real-world datasets.

Submitted 8 September, 2022; originally announced September 2022.

Comments: 7 pages, 9 Figures, Accepted to IEEE International Conference on Machine Learning and Applications (ICMLA) 2022

arXiv:2207.08922 [pdf, other]

doi 10.1145/3477495.3531738

ir_metadata: An Extensible Metadata Schema for IR Experiments

Authors: Timo Breuer, Jüri Keller, Philipp Schaer

Abstract: The information retrieval (IR) community has a strong tradition of making the computational artifacts and resources available for future reuse, allowing the validation of experimental results. Besides the actual test collections, the underlying run files are often hosted in data archives as part of conferences like TREC, CLEF, or NTCIR. Unfortunately, the run data itself does not provide much information about the underlying experiment. For instance, the single run file is not of much use without the context of the shared task's website or the run data archive. In other domains, like the social sciences, it is good practice to annotate research data with metadata. In this work, we introduce ir_metadata - an extensible metadata schema for TREC run files based on the PRIMAD model. We propose to align the metadata annotations to PRIMAD, which considers components of computational experiments that can affect reproducibility. Furthermore, we outline important components and information that should be reported in the metadata and give evidence from the literature. To demonstrate the usefulness of these metadata annotations, we implement new features in repro_eval that support the outlined metadata schema for the use case of reproducibility studies. Additionally, we curate a dataset with run files derived from experiments with different instantiations of PRIMAD components and annotate these with the corresponding metadata. In the experiments, we cover reproducibility experiments that are identified by the metadata and classified by PRIMAD. With this work, we enable IR researchers to annotate TREC run files and improve the reuse value of experimental artifacts even further.

Submitted 18 July, 2022; originally announced July 2022.

Comments: Resource paper

Journal ref: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '22), July 11-15, 2022, Madrid, Spain

arXiv:2204.03140 [pdf, other]

Off-Policy Evaluation with Online Adaptation for Robot Exploration in Challenging Environments

Authors: Yafei Hu, Junyi Geng, Chen Wang, John Keller, Sebastian Scherer

Abstract: Autonomous exploration has many important applications. However, classic information gain-based or frontier-based exploration relies only on the robot's current state to determine the immediate exploration goal, which lacks the capability of predicting the value of future states and thus leads to inefficient exploration decisions. This paper presents a method to learn how "good" states are, measured by the state value function, to provide guidance for robot exploration in real-world challenging environments. We formulate our work as an off-policy evaluation (OPE) problem for robot exploration (OPERE). It consists of offline Monte-Carlo training on real-world data and performs Temporal Difference (TD) online adaptation to optimize the trained value estimator. We also design an intrinsic reward function based on sensor information coverage to enable the robot to gain more information with sparse extrinsic rewards. Results show that our method enables the robot to predict the value of future states so as to better guide robot exploration. The proposed algorithm achieves better prediction and exploration performance compared with state-of-the-art methods. To the best of our knowledge, this work is the first to demonstrate value function prediction on a real-world dataset for robot exploration in challenging subterranean and urban environments. More details and demo videos can be found at https://jeffreyyh.github.io/opere/.

Submitted 24 May, 2023; v1 submitted 6 April, 2022; originally announced April 2022.

Comments: Published in RA-L 2023

arXiv:2110.07433 [pdf, other]

doi 10.1117/12.2305178

Possibilistic Fuzzy Local Information C-Means with Automated Feature Selection for Seafloor Segmentation

Authors: Joshua Peeples, Daniel Suen, Alina Zare, James Keller

Abstract: The Possibilistic Fuzzy Local Information C-Means (PFLICM) method is presented as a technique to segment side-look synthetic aperture sonar (SAS) imagery into distinct regions of the sea-floor. In this work, we investigate and present the results of an automated feature selection approach for SAS image segmentation. The chosen features and resulting segmentation from the image will be assessed based on a select quantitative clustering validity criterion and the subset of the features that reach a desired threshold will be used for the segmentation process.

Submitted 14 October, 2021; originally announced October 2021.

Comments: Proc. SPIE 10628, Detection and Sensing of Mines, Explosive Objects, and Obscured Targets XXIII (30 April 2018), 14 pages, 7 figures, 5 tables

arXiv:2106.08654 [pdf, other]

doi 10.1145/3465481.3470069

A Revised Taxonomy of Steganography Embedding Patterns

Authors: Steffen Wendzel, Luca Caviglione, Wojciech Mazurczyk, Aleksandra Mileva, Jana Dittmann, Christian Krätzer, Kevin Lamshöft, Claus Vielhauer, Laura Hartmann, Jörg Keller, Tom Neubert

Abstract: Steganography embraces several hiding techniques which span multiple domains. However, the related terminology is not unified among the different domains, such as digital media steganography, text steganography, cyber-physical systems steganography, network steganography (network covert channels), local covert channels, and out-of-band covert channels. To cope with this, a first attempt was made in 2015, with the introduction of the so-called hiding patterns, which allow hiding techniques to be described in a more abstract manner. Despite significant enhancements, the main limitation of such a taxonomy is that it only considers the case of network steganography. Therefore, this paper reviews both the terminology and the taxonomy of hiding patterns so as to make them more general. Specifically, hiding patterns are split into those that describe the embedding and the representation of hidden data within the cover object. As a first research action, we focus on embedding hiding patterns and we show how they can be applied to multiple domains of steganography instead of being limited to the network scenario. Additionally, we exemplify representation patterns using network steganography. Our pattern collection is available under https://patterns.ztt.hs-worms.de.

Submitted 16 June, 2021; originally announced June 2021.

Journal ref: Proc. of the 16th International Conference on Availability, Reliability and Security (ARES'21), August 17--20, 2021, Vienna, Austria

arXiv:2105.05784 [pdf, other]

Particle-Based Assembly Using Precise Global Control

Authors: Jakob Keller, Christian Rieck, Christian Scheffer, Arne Schmidt

Abstract: In micro- and nano-scale systems, particles can be moved by using an external force like gravity or a magnetic field. In the presence of adhesive particles that can attach to each other, the challenge is to decide whether a shape is constructible. Previous work provides a class of shapes for which constructibility can be decided efficiently when particles move maximally into the same direction induced by a global signal. In this paper we consider the single step model, i.e., a model in which each particle moves one unit step into the given direction. We restrict the assembly process such that at each single time step actually one particle is added to and moved within the workspace. We prove that deciding constructibility is NP-complete for three-dimensional shapes, and that a maximum constructible shape can be approximated. The same approximation algorithm applies for 2D. We further present linear-time algorithms to decide whether or not a tree-shape in 2D or 3D is constructible. Scaling a shape yields constructibility; in particular we show that the $2$-scaled copy of every non-degenerate polyomino is constructible. In the three-dimensional setting we show that the $3$-scaled copy of every non-degenerate polycube is constructible.

Submitted 15 June, 2022; v1 submitted 12 May, 2021; originally announced May 2021.

Comments: 25 pages, 14 figures, full version of an extended abstract that appeared in the proceedings of the 17th Algorithms and Data Structures Symposium (WADS 2021); revised version with clearer model/problem description and some additional related work

ACM Class: F.2.2

arXiv:2103.16829 [pdf, other]

Graph-Based Topological Exploration Planning in Large-Scale 3D Environments

Authors: Fan Yang, Dung-Han Lee, John Keller, Sebastian Scherer

Abstract: Currently, state-of-the-art exploration methods maintain high-resolution map representations in order to optimize exploration goals in each step that maximizes information gain. However, during exploration, those "optimal" selections could quickly become obsolete due to the influx of new information, especially in large-scale environments, and result in high-frequency re-planning that hinders the overall exploration efficiency. In this paper, we propose a graph-based topological planning framework, building a sparse topological map in three-dimensional (3D) space to guide exploration steps with high-level intents so as to render consistent exploration maneuvers. Specifically, this work presents a novel method to estimate 3D space's geometry with convex polyhedrons. Then, the geometry information is utilized to group space into distinctive regions. Those regions are then added as nodes into the topological map, directing the exploration process. We compared our method with the state-of-the-art in simulated environments. The proposed method achieves higher space coverage and exceeds the state of the art in exploration efficiency by more than 40% during experiments. Finally, a field experiment was conducted to further evaluate the applicability of our method to empower efficient and robust exploration in real-world environments.

Submitted 31 March, 2021; originally announced March 2021.

Comments: Preprint version for ICRA2021 final submission

MSC Class: 68T40 ACM Class: I.2.9

arXiv:2103.04872 [pdf, ps, other]

The Weakly-Labeled Rand Index

Authors: Dylan Stewart, Anna Hampton, Alina Zare, Jeff Dale, James Keller

Abstract: Synthetic Aperture Sonar (SAS) surveys produce imagery with large regions of transition between seabed types. Due to these regions, it is difficult to label and segment the imagery and, furthermore, challenging to score the image segmentations appropriately. While there are many approaches to quantify performance in standard crisp segmentation schemes, drawing hard boundaries in remote sensing imagery where gradients and regions of uncertainty exist is inappropriate. These cases warrant weak labels and an associated appropriate scoring approach. In this paper, a labeling approach and associated modified version of the Rand index for weakly-labeled data is introduced to address these issues. Results are evaluated with the new index and compared to traditional segmentation evaluation methods. Experimental results on a SAS data set containing must-link and cannot-link labels show that our Weakly-Labeled Rand index scores segmentations appropriately in reference to qualitative performance and is more suitable than traditional quantitative metrics for scoring weakly-labeled data.

Submitted 8 March, 2021; v1 submitted 8 March, 2021; originally announced March 2021.
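
For orientation, the standard (crisp) Rand index that the abstract above takes as its starting point can be sketched as follows. This is a minimal illustration, not the weakly-labeled index from the paper: `rand_index` implements only the textbook pair-counting definition, and `constraint_agreement` is a hypothetical helper, introduced here as an assumption, that scores a segmentation against must-link and cannot-link pairs by counting satisfied constraints.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Standard Rand index: fraction of sample pairs on which two crisp
    labelings agree (both put the pair together, or both keep it apart)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two samples")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


def constraint_agreement(labels, must_link, cannot_link):
    """Hypothetical helper (an assumption, not the paper's index):
    fraction of must-link / cannot-link pairs that a labeling satisfies."""
    total = len(must_link) + len(cannot_link)
    if total == 0:
        raise ValueError("need at least one constraint")
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    return satisfied / total
```

The Rand index depends only on how samples are grouped, so renaming the clusters leaves the score unchanged: `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` evaluates to 1.0.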

arXiv:2103.00433 [pdf, other]

doi 10.1016/j.future.2018.12.047

Countering Adaptive Network Covert Communication with Dynamic Wardens

Authors: Wojciech Mazurczyk, Steffen Wendzel, Mehdi Chourib, Jörg Keller

Abstract: Network covert channels are hidden communication channels in computer networks. They influence several factors of the cybersecurity economy. For instance, by improving the stealthiness of botnet communications, they aid and preserve the value of darknet botnet sales. Covert channels can also be used to secretly exfiltrate confidential data out of organizations, potentially resulting in loss of market/research advantage. Considering the above, efforts are needed to develop effective countermeasures against such threats. Thus in this paper, based on the introduced novel warden taxonomy, we present and evaluate a new concept of a dynamic warden. Its main novelty lies in the modification of the warden's behavior over time, making it difficult for the adaptive covert communication parties to infer its strategy and perform a successful hidden data exchange. Obtained experimental results indicate the effectiveness of the proposed approach.

Submitted 28 February, 2021; originally announced March 2021.

Journal ref: Elsevier FGCS, Volume 94, May 2019, Pages 712-725

arXiv:2101.03134 [pdf, other]

Explainable Systematic Analysis for Synthetic Aperture Sonar Imagery

Authors: Sarah Walker, Joshua Peeples, Jeff Dale, James Keller, Alina Zare

Abstract: In this work, we present an in-depth and systematic analysis using tools such as local interpretable model-agnostic explanations (LIME) (arXiv:1602.04938) and divergence measures to analyze what changes lead to improvement in performance in fine tuned models for synthetic aperture sonar (SAS) data. We examine the sensitivity to factors in the fine tuning process such as class imbalance. Our findings show not only an improvement in seafloor texture classification, but also provide greater insight into what features play critical roles in improving performance as well as a knowledge of the importance of balanced data for fine tuning deep learning models for seafloor classification in SAS imagery.

Submitted 16 March, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

Comments: IGARSS 2021

arXiv:2012.15764 [pdf, other]

doi 10.1109/LGRS.2022.3156532

Divergence Regulated Encoder Network for Joint Dimensionality Reduction and Classification

Authors: Joshua Peeples, Sarah Walker, Connor McCurley, Alina Zare, James Keller, Weihuang Xu

Abstract: Feature representation is an important aspect of remote-sensing based image classification. While deep convolutional neural networks are able to effectively amalgamate information, large numbers of parameters often make learned features inscrutable and difficult to transfer to alternative models. In order to better represent statistical texture information for remote-sensing image classification, in this paper, we investigate performing joint dimensionality reduction and classification using a novel histogram neural network. Motivated by a popular dimensionality reduction approach, t-Distributed Stochastic Neighbor Embedding (t-SNE), our proposed method incorporates a classification loss computed on samples in a low-dimensional embedding space. We compare the learned sample embeddings against coordinates found by t-SNE in terms of classification accuracy and qualitative assessment. We also explore use of various divergence measures in the t-SNE objective. The proposed method has several advantages such as readily embedding out-of-sample points and reducing feature dimensionality while retaining class discriminability. Our results show that the proposed approach maintains and/or improves classification performance and reveals characteristics of features produced by neural networks that may be helpful for other applications.

Submitted 3 March, 2022; v1 submitted 31 December, 2020; originally announced December 2020.

Comments: 8 pages (5 main pages and 3 supplemental pages), 2 figures, accepted to IEEE Geoscience and Remote Sensing Letters

arXiv:2010.00635 [pdf]

StreamSoNG: A Soft Streaming Classification Approach

Authors: Wenlong Wu, James M. Keller, Jeffrey Dale, James C. Bezdek

Abstract: Examining most streaming clustering algorithms leads to the understanding that they are actually incremental classification models. They model existing and newly discovered structures via summary information that we call footprints. Incoming data is normally assigned a crisp label (into one of the structures) and that structure's footprint is incrementally updated. There is no reason that these assignments need to be crisp. In this paper, we propose a new streaming classification algorithm that uses Neural Gas prototypes as footprints and produces a possibilistic label vector (of typicalities) for each incoming vector. These typicalities are generated by a modified possibilistic k-nearest neighbor algorithm. The approach is tested on synthetic and real image datasets. We compare our approach to three other streaming classifiers based on the Adaptive Random Forest, Very Fast Decision Rules, and the DenStream algorithm with excellent results.

Submitted 13 July, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

arXiv:2006.11062 [pdf, other]

Influence of Incremental Constraints on Energy Consumption and Static Scheduling Time for Moldable Tasks with Deadline

Authors: Jörg Keller, Sebastian Litzinger

Abstract: Static scheduling of independent, moldable tasks on parallel machines with frequency scaling comprises decisions on core allocation, assignment, frequency scaling and ordering, to meet a deadline and minimize energy consumption. Constraining some of these decisions reduces the solution space, i.e. may increase energy consumption, but may also reduce scheduling time or give the chance to tackle larger task sets. We investigate the influence of different constraints that lead from an unrestricted scheduler via two intermediate steps to the crown scheduler, by presenting integer linear programs for all four schedulers. We compare scheduling time and energy consumption for a benchmark suite of synthetic task sets of different sizes. Our results indicate that the final step towards the crown scheduler -- the execution order constraint -- is responsible for faster scheduling when task sets are small, and lower energy consumption when we deal with large task sets.

Submitted 19 June, 2020; originally announced June 2020.

Comments: Presented at the 13th International Workshop on Programmability and Architectures for Heterogeneous Multicores, 2020 (arXiv:2005.07619)

Report number: MULTIPROG/2020/5

arXiv:1912.02259 [pdf, other]

Extending the Morphological Hit-or-Miss Transform to Deep Neural Networks

Authors: Muhammad Aminul Islam, Bryce Murray, Andrew Buck, Derek T. Anderson, Grant Scott, Mihail Popescu, James Keller

Abstract: While most deep learning architectures are built on convolution, alternative foundations like morphology are being explored for purposes like interpretability and its connection to the analysis and processing of geometric structures. The morphological hit-or-miss operation has the advantage that it takes into account both foreground and background information when evaluating target shape in an image. Herein, we identify limitations in existing hit-or-miss neural definitions and we formulate an optimization problem to learn the transform relative to deeper architectures. To this end, we model the semantically important condition that the intersection of the hit and miss structuring elements (SEs) should be empty and we present a way to express Don't Care (DNC), which is important for denoting regions of an SE that are not relevant to detecting a target pattern. Our analysis shows that convolution, in fact, acts like a hit-miss transform through semantic interpretation of its filter differences. On these premises, we introduce an extension that outperforms conventional convolution on benchmark data. Quantitative experiments are provided on synthetic and benchmark data, showing that the direct encoding hit-or-miss transform provides better interpretability on learned shapes consistent with objects whereas our morphologically inspired generalized convolution yields higher classification accuracy. Last, qualitative hit and miss filter visualizations are provided relative to single morphological layer.

Submitted 27 September, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

arXiv:1910.04874 [pdf, other]

A Stereo Algorithm for Thin Obstacles and Reflective Objects

Authors: John Keller, Sebastian Scherer

Abstract: Stereo cameras are a popular choice for obstacle avoidance for outdoor lightweight, low-cost robotics applications. However, they are unable to sense thin and reflective objects well. Currently, many algorithms are tuned to perform well on indoor scenes like the Middlebury dataset. When navigating outdoors, reflective objects, like windows and glass, and thin obstacles, like wires, are not well handled by most stereo disparity algorithms. Reflections, repeating patterns and objects parallel to the cameras' baseline cause mismatches between image pairs, which leads to bad disparity estimates. Thin obstacles are difficult for many sliding window based disparity methods to detect because they do not take up large portions of the pixels in the sliding window. We use a trinocular camera setup and micropolarizer camera capable of detecting reflective objects to overcome these issues. We present a hierarchical disparity algorithm that reduces noise, separately identify wires using semantic object triangulation in three images, and use information about the polarization of light to estimate the disparity of reflective objects. We evaluate our approach on outdoor data that we collected. Our method contained an average of 9.27% of bad pixels compared to a typical stereo algorithm's 18.4% of bad pixels in scenes containing reflective objects. Our trinocular and semantic wire disparity methods detected 53% of wire pixels, whereas a typical two camera stereo algorithm detected 5%.

Submitted 3 October, 2019; originally announced October 2019.

Comments: 6 pages, 5 figures

arXiv:1905.04394 [pdf, other]

doi 10.1109/TFUZZ.2019.2917124

Enabling Explainable Fusion in Deep Learning with Fuzzy Integral Neural Networks

Authors: Muhammad Aminul Islam, Derek T. Anderson, Anthony J. Pinar, Timothy C. Havens, Grant Scott, James M. Keller

Abstract: Information fusion is an essential part of numerous engineering systems and biological functions, e.g., human cognition. Fusion occurs at many levels, ranging from the low-level combination of signals to the high-level aggregation of heterogeneous decision-making processes. While the last decade has witnessed an explosion of research in deep learning, fusion in neural networks has not observed the same revolution. Specifically, most neural fusion approaches are ad hoc, are not understood, are distributed versus localized, and/or explainability is low (if present at all). Herein, we prove that the fuzzy Choquet integral (ChI), a powerful nonlinear aggregation function, can be represented as a multi-layer network, referred to hereafter as ChIMP. We also put forth an improved ChIMP (iChIMP) that leads to a stochastic gradient descent-based optimization in light of the exponential number of ChI inequality constraints. An additional benefit of ChIMP/iChIMP is that it enables eXplainable AI (XAI). Synthetic validation experiments are provided and iChIMP is applied to the fusion of a set of heterogeneous architecture deep models in remote sensing. We show an improvement in model accuracy and our previously established XAI indices shed light on the quality of our data, model, and its decisions.

Submitted 10 May, 2019; originally announced May 2019.

Comments: IEEE Transactions on Fuzzy Systems
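
As background for the abstract above, the discrete Choquet integral itself (not the ChIMP/iChIMP network) can be sketched in a few lines. The function name `choquet_integral`, the dictionary-based encoding of the fuzzy measure, and the two-source example are assumptions made for illustration. The measure is expected to be monotone, with value 0 on the empty set and 1 on the full set of sources.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of `values` (source -> evidence) with
    respect to a fuzzy measure `measure` (frozenset of sources -> weight).

    Sources are visited in order of decreasing evidence; each one is
    weighted by how much the measure grows when it joins the coalition.
    """
    order = sorted(values, key=values.get, reverse=True)
    total = 0.0
    previous = 0.0  # measure of the empty coalition
    coalition = frozenset()
    for source in order:
        coalition = coalition | {source}
        current = measure[coalition]
        total += values[source] * (current - previous)
        previous = current
    return total


# Illustrative two-source example (hypothetical numbers).
evidence = {"a": 0.2, "b": 0.8}
mean_like = {frozenset("a"): 0.5, frozenset("b"): 0.5, frozenset("ab"): 1.0}
max_like = {frozenset("a"): 1.0, frozenset("b"): 1.0, frozenset("ab"): 1.0}
min_like = {frozenset("a"): 0.0, frozenset("b"): 0.0, frozenset("ab"): 1.0}
```

With these measures the integral reduces to the mean, the maximum, and the minimum of the inputs respectively, which illustrates why the abstract calls it a nonlinear aggregation function: the choice of measure selects the aggregation behaviour. The measure has one entry per subset of sources, hence the exponential number of constraints the abstract mentions.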

arXiv:1904.12059 [pdf, other]

ARCHANGEL: Tamper-proofing Video Archives using Temporal Content Hashes on the Blockchain

Authors: Tu Bui, Daniel Cooper, John Collomosse, Mark Bell, Alex Green, John Sheridan, Jez Higgins, Arindra Das, Jared Keller, Olivier Thereaux, Alan Brown

Abstract: We present ARCHANGEL; a novel distributed ledger based system for assuring the long-term integrity of digital video archives. First, we describe a novel deep network architecture for computing compact temporal content hashes (TCHs) from audio-visual streams with durations of minutes or hours. Our TCHs are sensitive to accidental or malicious content modification (tampering) but invariant to the codec used to encode the video. This is necessary due to the curatorial requirement for archives to format shift video over time to ensure future accessibility. Second, we describe how the TCHs (and the models used to derive them) are secured via a proof-of-authority blockchain distributed across multiple independent archives. We report on the efficacy of ARCHANGEL within the context of a trial deployment in which the national government archives of the United Kingdom, Estonia and Norway participated.

Submitted 26 April, 2019; originally announced April 2019.

Comments: Accepted to CVPR Blockchain Workshop 2019

arXiv:1904.01795 [pdf, other]

MAVNet: an Effective Semantic Segmentation Micro-Network for MAV-based Tasks

Authors: Ty Nguyen, Shreyas S. Shivakumar, Ian D. Miller, James Keller, Elijah S. Lee, Alex Zhou, Tolga Ozaslan, Giuseppe Loianno, Joseph H. Harwood, Jennifer Wozencraft, Camillo J. Taylor, Vijay Kumar

Abstract: Real-time semantic image segmentation on platforms subject to size, weight and power (SWaP) constraints is a key area of interest for air surveillance and inspection. In this work, we propose MAVNet: a small, light-weight, deep neural network for real-time semantic segmentation on micro Aerial Vehicles (MAVs). MAVNet, inspired by ERFNet, features 400 times fewer parameters and achieves comparable performance with some reference models in empirical experiments. Our model achieves a trade-off between speed and accuracy, achieving up to 48 FPS on an NVIDIA 1080Ti and 9 FPS on the NVIDIA Jetson Xavier when processing high resolution imagery. Additionally, we provide two novel datasets that represent challenges in semantic segmentation for real-time MAV tracking and infrastructure inspection tasks and verify MAVNet on these datasets. Our algorithm and datasets are made publicly available.

Submitted 8 June, 2019; v1 submitted 3 April, 2019; originally announced April 2019.

Comments: 8 pages, 9 figures

arXiv:1904.01014 [pdf, other]

doi 10.1117/12.2519484

Comparison of Possibilistic Fuzzy Local Information C-Means and Possibilistic K-Nearest Neighbors for Synthetic Aperture Sonar Image Segmentation

Authors: Joshua Peeples, Matthew Cook, Daniel Suen, Alina Zare, James Keller

Abstract: Synthetic aperture sonar (SAS) imagery can generate high resolution images of the seafloor. Thus, segmentation algorithms can be used to partition the images into different seafloor environments. In this paper, we compare two possibilistic segmentation approaches. Possibilistic approaches allow for the ability to detect novel or outlier environments as well as well known classes. The Possibilistic Fuzzy Local Information C-Means (PFLICM) algorithm has been previously applied to segment SAS imagery. Additionally, the Possibilistic K-Nearest Neighbors (PKNN) algorithm has been used in other domains such as landmine detection and hyperspectral imagery. In this paper, we compare the segmentation performance of a semi-supervised approach using PFLICM and a supervised method using Possibilistic K-NN. We include final segmentation results on multiple SAS images and a quantitative assessment of each algorithm.

Submitted 1 April, 2019; originally announced April 2019.

Journal ref: Proc. SPIE 110120, Detection and Sensing of Mines, Explosive Objects, and Obscured Targets XXIV (10 May 2019)

arXiv:1809.06576 [pdf, other]

U-Net for MAV-based Penstock Inspection: an Investigation of Focal Loss in Multi-class Segmentation for Corrosion Identification

Authors: Ty Nguyen, Tolga Ozaslan, Ian D. Miller, James Keller, Giuseppe Loianno, Camillo J. Taylor, Daniel D. Lee, Vijay Kumar, Joseph H. Harwood, Jennifer Wozencraft

Abstract: Periodical inspection and maintenance of critical infrastructure such as dams, penstocks, and locks are of significant importance to prevent catastrophic failures. Conventional manual inspection methods require inspectors to climb along a penstock to spot corrosion, rust and crack formation which is unsafe, labor-intensive, and requires intensive training. This work presents an alternative approach using a Micro Aerial Vehicle (MAV) that autonomously flies to collect imagery which is then fed into a pretrained deep-learning model to identify corrosion. Our simplified U-Net trained with less than 40 image samples can do inference at 12 fps on a single GPU. We analyze different loss functions to solve the class imbalance problem, followed by a discussion on choosing proper metrics and weights for object classes. Results obtained with the dataset collected from Center Hill Dam, TN show that the focal loss function, combined with a proper set of class weights, yields better segmentation results than the base loss, softmax cross entropy. Our method can be used in combination with a planning algorithm to offer a complete, safe and cost-efficient solution to autonomous infrastructure inspection.

Submitted 18 September, 2018; originally announced September 2018.

Comments: 8 Pages, 4 figures
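
The focal loss with class weights that the abstract above compares against softmax cross entropy can be sketched for a single pixel as follows. This is a minimal illustration of the commonly used definition, -w_t (1 - p_t)^gamma log(p_t), and not the authors' training code; the function name, the default `gamma`, and the clamping constant are assumptions.

```python
import math


def focal_loss(probs, target, gamma=2.0, class_weights=None):
    """Focal loss for one sample.

    probs         -- softmax probabilities, one per class
    target        -- index of the true class
    gamma         -- focusing parameter; gamma=0 recovers cross entropy
    class_weights -- optional per-class weights (assumed, for class imbalance)
    """
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if class_weights is None else class_weights[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(probs, target):
    """Plain softmax cross entropy for one sample, for comparison."""
    return -math.log(max(probs[target], 1e-12))
```

For a confidently correct prediction such as `probs = [0.9, 0.1]` with `target = 0`, the factor `(1 - p_t) ** gamma` shrinks the loss well below the cross-entropy value, so easy pixels contribute less and rare, hard classes dominate the gradient.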

arXiv:1709.10180 [pdf, other]

Possibilistic Fuzzy Local Information C-Means for Sonar Image Segmentation

Authors: Alina Zare, Nicholas Young, Daniel Suen, Thomas Nabelek, Aquila Galusha, James Keller

Abstract: Side-look synthetic aperture sonar (SAS) can produce very high quality images of the sea-floor. When viewing this imagery, a human observer can often easily identify various sea-floor textures such as sand ripple, hard-packed sand, sea grass and rock. In this paper, we present the Possibilistic Fuzzy Local Information C-Means (PFLICM) approach to segment SAS imagery into sea-floor regions that exhibit these various natural textures. The proposed PFLICM method incorporates fuzzy and possibilistic clustering methods and leverages (local) spatial information to perform soft segmentation. Results are shown on several SAS scenes and compared to alternative segmentation approaches.

Submitted 28 September, 2017; originally announced September 2017.

Comments: 8 pages, 11 figures, to appear in the 2017 IEEE Symposium Series on Computational Intelligence (SSCI) Proceedings

arXiv:1510.02055 [pdf, other]

Diverse Large-Scale ITS Dataset Created from Continuous Learning for Real-Time Vehicle Detection

Authors: Justin A. Eichel, Akshaya Mishra, Nicholas Miller, Nicholas Jankovic, Mohan A. Thomas, Tyler Abbott, Douglas Swanson, Joel Keller

Abstract: In traffic engineering, vehicle detectors are trained on limited datasets resulting in poor accuracy when deployed in real world applications. Annotating large-scale high quality datasets is challenging. Typically, these datasets have limited diversity; they do not reflect the real-world operating environment. There is a need for a large-scale, cloud based positive and negative mining (PNM) process and a large-scale learning and evaluation system for the application of traffic event detection. The proposed positive and negative mining process addresses the quality of crowd sourced ground truth data through machine learning review and human feedback mechanisms. The proposed learning and evaluation system uses a distributed cloud computing framework to handle data-scaling issues associated with large numbers of samples and a high-dimensional feature space. The system is trained using AdaBoost on $1,000,000$ Haar-like features extracted from $70,000$ annotated video frames. The trained real-time vehicle detector achieves an accuracy of at least $95\%$ for $1/2$ and about $78\%$ for $19/20$ of the time when tested on approximately $7,500,000$ video frames. At the end of 2015, the dataset is expected to have over one billion annotated video frames.

Submitted 7 October, 2015; originally announced October 2015.

Comments: 13 pages, 11 figures

arXiv:1508.05228 [pdf, other]

A Case Study on Covert Channel Establishment via Software Caches in High-Assurance Computing Systems

Authors: Wolfgang Schmidt, Michael Hanspach, Jörg Keller

Abstract: Covert channels can be utilized to secretly deliver information from high privileged processes to low privileged processes in the context of a high-assurance computing system. In this case study, we investigate the possibility of covert channel establishment via software caches in the context of a framework for component-based operating systems. While component-based operating systems offer security through the encapsulation of system service processes, complete isolation of these processes is not reasonably feasible. This limitation is practically demonstrated with our concept of a specific covert timing channel based on file system caching. The stability of the covert channel is evaluated and a methodology to disrupt the covert channel transmission is presented. While these kinds of attacks are not limited to high-assurance computing systems, our study practically demonstrates that even security-focused computing systems with a minimal trusted computing base are vulnerable for such kinds of attacks and careful design decisions are necessary for secure operating system architectures.

Submitted 21 August, 2015; originally announced August 2015.

Comments: 12 pages, based upon the master's thesis of Schmidt

arXiv:1505.07757 [pdf, other]

Micro protocol engineering for unstructured carriers: On the embedding of steganographic control protocols into audio transmissions

Authors: Matthias Naumann, Steffen Wendzel, Wojciech Mazurczyk, Jörg Keller

Abstract: Network steganography conceals the transfer of sensitive information within unobtrusive data in computer networks. So-called micro protocols are communication protocols placed within the payload of a network steganographic transfer. They enrich this transfer with features such as reliability, dynamic overlay routing, or performance optimization --- just to mention a few. We present different design approaches for the embedding of hidden channels with micro protocols in digitized audio signals under consideration of different requirements. On the basis of experimental results, our design approaches are compared, and introduced into a protocol engineering approach for micro protocols.

Submitted 28 May, 2015; originally announced May 2015.

Comments: 20 pages, 7 figures, 4 tables

arXiv:1403.1165 [pdf, other]

A Taxonomy for Attack Patterns on Information Flows in Component-Based Operating Systems

Authors: Michael Hanspach, Jörg Keller

Abstract: We present a taxonomy and an algebra for attack patterns on component-based operating systems. In a multilevel security scenario, where isolation of partitions containing data at different security classifications is the primary security goal and security breaches are mainly defined as undesired disclosure or modification of classified data, strict control of information flows is the ultimate goal. In order to prevent undesired information flows, we provide a classification of information flow types in a component-based operating system and, by this, possible patterns to attack the system. The systematic consideration of information flows reveals a specific type of operating system covert channel, the covert physical channel, which connects two formerly isolated partitions by emitting physical signals into the computer's environment and receiving them at another interface.

Submitted 5 March, 2014; originally announced March 2014.

Comments: 9 pages

Journal ref: In Proceedings of the 7th Layered Assurance Workshop, New Orleans, LA, USA, December 2013

arXiv:1211.4414 [pdf, ps, other]

Towards a Scalable Dynamic Spatial Database System

Authors: Joaquín Keller, Raluca Diaconu, Mathieu Valero

Abstract: With the rise of GPS-enabled smartphones and other similar mobile devices, massive amounts of location data are available. However, no scalable solutions for soft real-time spatial queries on large sets of moving objects have yet emerged. In this paper we explore and measure the limits of existing algorithms and implementations regarding different application scenarios. Finally, we propose a novel distributed architecture to solve the scalability issues.

Submitted 19 November, 2012; originally announced November 2012.

Comments: (2012)

arXiv:1210.6411 [pdf, other]

doi 10.4204/EPTCS.99.4

A structural analysis of the A5/1 state transition graph

Authors: Andreas Beckmann, Jaroslaw Fedorowicz, Jörg Keller, Ulrich Meyer

Abstract: We describe efficient algorithms to analyze the cycle structure of the graph induced by the state transition function of the A5/1 stream cipher used in GSM mobile phones and report on the results of the implementation. The analysis is performed in five steps utilizing HPC clusters, GPGPU and external memory computation. A great reduction of this huge state transition graph of 2^64 nodes is achieved by focusing on special nodes in the first step and removing leaf nodes that can be detected with limited effort in the second step. This step does not break the overall structure of the graph and keeps at least one node on every cycle. In the third step the nodes of the reduced graph are connected by weighted edges. Since the number of nodes is still huge, an efficient bitslice approach is presented that is implemented with NVIDIA's CUDA framework and executed on several GPUs concurrently. An external memory algorithm based on the STXXL library and its parallel pipelining feature further reduces the graph in the fourth step. The result is a graph containing only cycles that can be further analyzed in internal memory to count the number and size of the cycles. This full analysis, which previously would take months, can now be completed within a few days and allows structural results for the full graph to be presented for the first time. The structure of the A5/1 graph deviates notably from the theoretical results for random mappings.

Submitted 23 October, 2012; originally announced October 2012.

Comments: In Proceedings GRAPHITE 2012, arXiv:1210.6118

Journal ref: EPTCS 99, 2012, pp. 5-19

arXiv:0810.0852 [pdf, ps, other]

Evaluation of Authors and Journals

Authors: Joseph B. Keller

Abstract: A method is presented for evaluating authors on the basis of citations. It assigns to each author a citation score which depends upon the number of times he is cited, and upon the scores of the citers. The scores are found to be the components of an eigenvector of a normalized citation matrix. The same method can be applied to citation of journals by other journals, to evaluating teams in a league [1], etc.

Submitted 5 October, 2008; originally announced October 2008.

Comments: 6 pages

arXiv:cs/0105028 [pdf, ps, other]

When being Weak is Brave: Privacy in Recommender Systems

Authors: Naren Ramakrishnan, Benjamin J. Keller, Batul J. Mirza, Ananth Y. Grama, George Karypis

Abstract: We explore the conflict between personalization and privacy that arises from the existence of weak ties. A weak tie is an unexpected connection that provides serendipitous recommendations. However, information about weak ties could be used in conjunction with other sources of data to uncover identities and reveal other personal information. In this article, we use a graph-theoretic model to study the benefit and risk from weak ties.

Submitted 18 May, 2001; originally announced May 2001.

ACM Class: H.4.2

arXiv:cs/0104009 [pdf, ps, other]

Evaluating Recommendation Algorithms by Graph Analysis

Authors: Batul J. Mirza, Benjamin J. Keller, Naren Ramakrishnan

Abstract: We present a novel framework for evaluating recommendation algorithms in terms of the 'jumps' that they make to connect people to artifacts. This approach emphasizes reachability via an algorithm within the implicit graph structure underlying a recommender dataset, and serves as a complement to evaluation in terms of predictive accuracy. The framework allows us to consider questions relating algorithmic parameters to properties of the datasets. For instance, given a particular algorithm 'jump', what is the average path length from a person to an artifact? Or, what choices of minimum ratings and jumps maintain a connected graph? We illustrate the approach with a common jump called the 'hammock' using movie recommender datasets.

Submitted 3 April, 2001; originally announced April 2001.

ACM Class: H.4.2

Showing 1–40 of 40 results for author: Keller, J